Systemd

Author	SHA1	Message	Date
Lennart Poettering	99be45a46f	fs-util: rename path_is_safe() → path_is_normalized() Already, path_is_safe() refused paths containing the "." dir. Doing that isn't strictly necessary to be "safe" by most definitions of the word. But it is necessary in order to consider a path "normalized". Hence, "path_is_safe()" is slightly misleading a name, but "path_is_normalized()" is more descriptive, hence let's rename things accordingly. No functional changes.	2017-11-17 11:13:44 +01:00
Lennart Poettering	5afe510c89	core: add a new unit file setting CollectMode= for tweaking the GC logic Right now, the option only takes one of two possible values "inactive" or "inactive-or-failed", the former being the default, and exposing same behaviour as the status quo ante. If set to "inactive-or-failed" units may be collected by the GC logic when in the "failed" state too. This logic should be a nicer alternative to using the "-" modifier for ExecStart= and friends, as the exit data is collected and logged about and only removed when the GC comes along. This should be useful in particular for per-connection socket-activated services, as well as "systemd-run" command lines that shall leave no artifacts in the system. I was thinking about whether to expose this as a boolean, but opted for an enum instead, as I have the suspicion other tweaks like this might be added later on, in which case we extend this setting instead of having to add yet another one. Also, let's add some documentation for the GC logic.	2017-11-16 14:38:36 +01:00
Lennart Poettering	7eb2a8a125	unit: rework a bit how we keep the service fdstore from being destroyed during service restart When preparing for a restart we quickly go through the DEAD/INACTIVE service state before entering AUTO_RESTART. When doing this, we need to make sure we don't destroy the FD store. Previously this was done by checking the failure state of the unit, and keeping the FD store around when the unit failed, under the assumption that the restart logic will then get into action. This is not entirely correct however, as there might be failure states that will not result in restarts. With this commit we slightly alter the logic: a ref counter for the fd store is added, that is increased right before we handle the restart logic, and decreased again right-after. This should ensure that the fdstore lives exactly as long as it needs. Follow-up for `f0bfbfac43`.	2017-11-16 14:37:33 +01:00
Lennart Poettering	d3070fbdf6	core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald And let's make use of it to implement two new unit settings with it: 1. LogLevelMax= is a new per-unit setting that may be used to configure log priority filtering: set it to LogLevelMax=notice and only messages of level "notice" and lower (i.e. more important) will be processed, all others are dropped. 2. LogExtraFields= is a new per-unit setting for configuring per-unit journal fields, that are implicitly included in every log record generated by the unit's processes. It takes field/value pairs in the form of FOO=BAR. Also, related to this, one existing unit setting is ported to this new facility: 3. The invocation ID is now pulled from /run/systemd/units/ instead of cgroupfs xattrs. This substantially relaxes requirements of systemd on the kernel version and the privileges it runs with (specifically, cgroupfs xattrs are not available in containers, since they are stored in kernel memory, and hence are unsafe to permit to lesser privileged code). /run/systemd/units/ is a new directory, which contains a number of files and symlinks encoding the above information. PID 1 creates and manages these files, and journald reads them from there. Note that this is supposed to be a direct path between PID 1 and the journal only, due to the special runtime environment the journal runs in. Normally, today we shouldn't introduce new interfaces that (mis-)use a file system as IPC framework, and instead just use an IPC system, but this is very hard to do between the journal and PID 1, as long as the IPC system is a subject PID 1 manages, and itself a client to the journal. This patch cleans up a couple of types used in journal code: specifically we switch to size_t for a couple of memory-sizing values, as size_t is the right choice for everything that is memory. Fixes: #4089 Fixes: #3041 Fixes: #4441	2017-11-16 12:40:17 +01:00
Lennart Poettering	0263828039	core: rework the Delegate= unit file setting to take a list of controller names Previously it was not possible to select which controllers to enable for a unit where Delegate=yes was set, as all controllers were enabled. With this change, this is made configurable, and thus delegation units can pick specifically what they want to manage themselves, and what they don't care about.	2017-11-13 10:49:15 +01:00
Lennart Poettering	c999cf385a	core: add internal API to remove dependencies again, based on dependency mask Let's make use of the dependency mask, and add internal API to remove dependencies again, based on bits in the dependency mask.	2017-11-10 19:45:29 +01:00
Lennart Poettering	eef85c4a3f	core: track why unit dependencies came to be This replaces the dependencies Set* objects by Hashmap* objects, where the key is the depending Unit, and the value is a bitmask encoding why the specific dependency was created. The bitmask contains a number of different, defined bits, that indicate why dependencies exist, for example whether they are created due to explicitly configured deps in files, by udev rules or implicitly. Note that memory usage is not increased by this change, even though we store more information, as we manage to encode the bit mask inside the value pointer each Hashmap entry contains. Why this all? When we know how a dependency came to be, we can update dependencies correctly when a configuration source changes but others are left unaltered. Specifically: 1. We can fix UDEV_WANTS dependency generation: so far we kept adding dependencies configured that way, but if a device lost such a dependency we couldn't remove them again as there was no scheme for removing dependencies in place. 2. We can implement "pin-pointed" reload of unit files. If we know what dependencies were created as result of configuration in a unit file, then we know what to flush out when we want to reload it. 3. It's useful for debugging: "systemd-analyze dump" now shows this information, helping substantially with understanding how systemd's dependency tree came to be the way it came to be.	2017-11-10 19:45:29 +01:00
Lubomir Rintel	19a44dfe45	core: fragments of masked units ought not be considered for NeedDaemonReload (#7060 ) The units that are not loaded don't have dropin_paths set. This currently results in units that have fragments to always have NeedDaemonReload=true when masked: $ find {/usr/lib,/run/user/8086}/systemd/user/meh.service* |xargs ls -ld lrwxrwxrwx. 1 lkundrak lkundrak 9 Oct 11 11:19 /run/user/8086/systemd/user/meh.service -> /dev/null -rw-rw-r--. 1 root root 49 Oct 11 10:16 /usr/lib/systemd/user/meh.service drwxrwxr-x. 2 root root 4096 Oct 11 10:50 /usr/lib/systemd/user/meh.service.d -rw-rw-r--. 1 root root 666 Oct 11 10:50 /usr/lib/systemd/user/meh.service.d/override.conf $ systemctl --user daemon-reload $ busctl --user get-property org.freedesktop.systemd1 \ /org/freedesktop/systemd1/unit/meh_2eservice \ org.freedesktop.systemd1.Unit NeedDaemonReload b true	2017-10-18 08:38:50 +02:00
Yu Watanabe	4c70109600	tree-wide: use IN_SET macro (#6977 )	2017-10-04 16:01:32 +02:00
Lennart Poettering	72fd17682d	core: usually our enum's _INVALID and _MAX special values are named after the full type In most cases we followed the rule that the special _INVALID and _MAX values we use in our enums use the full type name as prefix (in contrast to regular values that we often make shorter), do so for ExecDirectoryType as well. No functional changes, just a little bit of renaming to make this code more like the rest.	2017-10-02 17:41:43 +02:00
Andreas Rammhold	ec2ce0c5d7	tree-wide: use `!IN_SET(..)` for `a != b && a != c && …` The included cocci was used to generate the changes. Thanks to @flo-wer for pointing this case out.	2017-10-02 13:09:56 +02:00
Andreas Rammhold	3742095b27	tree-wide: use IN_SET where possible In addition to the changes from #6933 this handles cases that could be matched with the included cocci file.	2017-10-02 13:09:54 +02:00
Lennart Poettering	ed77d407d3	core: log unit failure with type-specific result code This slightly changes how we log about failures. Previously, service_enter_dead() would log that a service unit failed along with its result code, and unit_notify() would do this again but without the result code. For other unit types only the latter would take effect. This cleans this up: we keep the message in unit_notify() only for debug purposes, and add type-specific log lines to all our unit types that can fail, and always place them before unit_notify() is invoked. Or in other words: the duplicate log message for service units is removed, and all other unit types get a more useful line with the precise result code.	2017-09-27 18:26:18 +02:00
Lennart Poettering	84b26d5149	core: free_and_strdup() FTW!	2017-09-27 18:26:18 +02:00
Lennart Poettering	09e2465407	cgroup: after determining that a cgroup is empty, asynchronously dispatch this This makes sure that if we learn via inotify or another event source that a cgroup is empty, and we checked that this is indeed the case (as we might get spurious notifications through inotify, as the inotify logic through the "cgroups.event" is pretty unspecific and might be triggered for a variety of reasons), then we'll enqueue a defer event for it, at a priority lower than SIGCHLD handling, so that we know for sure that if there's waitid() data for a process we used it before considering the cgroup empty notification. Fixes: #6608	2017-09-27 18:26:18 +02:00
Lennart Poettering	91a6073ef7	core: rename cgroup_queue → cgroup_realize_queue We are about to add a second cgroup-related queue, called "cgroup_empty_queue", hence let's rename "cgroup_queue" to "cgroup_realize_queue" (as that is its purpose) to minimize confusion about the two queues. Just a rename, no functional changes.	2017-09-27 17:59:25 +02:00
Zbigniew Jędrzejewski-Szmek	2e4025c0f9	core/cgroup: add a helper macro for a common pattern (#6926 )	2017-09-27 17:54:06 +02:00
Jan Synacek	0cde65e263	test-cpu-set-util.c: fix typo in comment (#6916 )	2017-09-26 16:07:34 +02:00
Lennart Poettering	7960b0c704	cgroup: make use of unit_cgroup_delegate() where useful It's an easy-to-use wrapper, so let's take benefit of it.	2017-09-22 20:02:23 +02:00
Lennart Poettering	915b1d0174	core: whenever a unit terminates, log its consumed resources to the journal This adds a new recognizable log message for each unit invocation that contains structured information about consumed resources of the unit as a whole after it terminated. This is particularly useful for apps that want to figure out what the resource consumption of a unit given a specific invocation ID was. The log message is only generated for units that have at least one XyzAccounting= property turned on, and currently only covers IP traffic and CPU time metrics.	2017-09-22 15:28:05 +02:00
Lennart Poettering	f1c50becda	core: make sure to log invocation ID of units also when doing structured logging	2017-09-22 15:24:55 +02:00
Lennart Poettering	58d83430e1	core: when coming back from reload/reexec, reapply all cgroup properties With this change we'll invalidate all cgroup settings after coming back from a daemon reload/reexec, so that the new settings are instantly applied. This is useful for the BPF case, because we don't serialize/deserialize the BPF program fd, and hence have to install a new, updated BPF program when coming back from the reload/reexec. However, this is also useful for the rest of the cgroup settings, as it ensures that user configuration really takes effect wherever we can.	2017-09-22 15:24:55 +02:00
Lennart Poettering	6b659ed87e	core: serialize/deserialize IP accounting across daemon reload/reexec Make sure the current IP accounting counters aren't lost during reload/reexec. Note that we destroy all BPF file objects during a reload: the BPF programs, the access and the accounting maps. The former two need to be regenerated anyway with the newly loaded configuration data, but the latter one needs to survive reloads/reexec. In this implementation I opted to only save/restore the accounting map content instead of the map itself. While this opens a (theoretic) window where IP traffic is still accounted to the old map after we read it out, and we thus miss a few bytes this has the benefit that we can alter the map layout between versions should the need arise.	2017-09-22 15:24:55 +02:00
Lennart Poettering	a79279c7fd	core: when creating the socket fds for a socket unit, join socket's cgroup first Let's make sure that a socket unit's IPAddressAllow=/IPAddressDeny= settings are in effect on all socket fds associated with it. In order to make this happen we need to make sure the cgroup the fds are associated with are the socket unit's cgroup. The only way to do that is invoking socket()+accept() in them. Since we really don't want to migrate PID 1 around we do this by forking off a helper process, which invokes socket()/accept() and sends the newly created fd to PID 1. Ugly, but works, and there's apparently no better way right now. This generalizes forking off per-unit helper processes in a new function unit_fork_helper_process(), which is then also used by the NSS chown() code of socket units.	2017-09-22 15:24:55 +02:00
Daniel Mack	377bfd2d49	manager: hook up IP accounting defaults	2017-09-22 15:24:55 +02:00
Daniel Mack	906c06f64a	cgroup, unit, fragment parser: make use of new firewall functions	2017-09-22 15:24:55 +02:00
Daniel Mack	6a48d82f02	cgroup: add fields to accommodate eBPF related details Add pointers for compiled eBPF programs as well as list heads for allowed and denied hosts for both directions.	2017-09-22 15:24:54 +02:00
Lennart Poettering	b1edf4456e	core: add new per-unit setting KeyringMode= for controlling kernel keyring setup Usually, it's a good thing that we isolate the kernel session keyring for the various services and disconnect them from the user keyring. However, in case of the cryptsetup key caching we actually want that multiple instances of the cryptsetup service can share the keys in the root user's user keyring, hence we need to be able to disable this logic for them. This adds KeyringMode=inherit|private|shared: inherit: don't do any keyring magic (this is the default in systemd --user) private: a private keyring as before (default in systemd --system) shared: the new setting	2017-09-15 16:53:35 +02:00
Jérémy Rosen	f54bcca5c1	unit : allow any unit which propagates reloads to be reloaded	2017-09-10 18:53:26 +02:00
Yu Watanabe	ada5e27657	core: StateDirectory= and friends imply RequiresMountsFor=	2017-08-31 18:19:35 +09:00
Lennart Poettering	f0d477979e	core: introduce unit_set_exec_params() The new unit_set_exec_params() call is to units what manager_set_exec_params() is to the manager object: it initializes the various fields from the relevant generic properties set.	2017-08-10 15:02:50 +02:00
Zbigniew Jędrzejewski-Szmek	0742986650	core: properly handle deserialization of unknown unit types (#6476 ) We just abort startup, without printing any error. Make sure we always print something, and when we cannot deserialize some unit, just ignore it and continue. Fixup for `4bc5d27b94`. Without this, we would hang in daemon-reexec after upgrade.	2017-07-31 08:05:35 +02:00
Martin Pitt	9fcaa574f0	Merge pull request #6465 from keszybz/drop-kdbus Drop kdbus-dependent code	2017-07-28 09:29:07 +02:00
Zbigniew Jędrzejewski-Szmek	4bc5d27b94	Drop busname unit type Since busname units are only useful with kdbus, they weren't actively used. This was dead code, only compile-tested. If busname units are ever added back, it'll be cleaner to start from scratch (possibly reverting parts of this patch).	2017-07-23 09:29:02 -04:00
Zbigniew Jędrzejewski-Szmek	9e4ea9cc34	Revert "core: don't load dropin data multiple times for the same unit (#5139 )" This reverts commit `2d058a87ff`. When we add another name to a unit (by following an alias), we need to reload all drop-ins. This is necessary to load any additional dropins found in the dirs created from the alias name. Fixes #6334.	2017-07-22 16:03:00 -04:00
Zbigniew Jędrzejewski-Szmek	13ddc3fc2b	systemd: do not stop units bound to inactive units while coldplugging (#6316 ) When running systemd-analyze verify I would get a random subset of warnings (sometimes none, sometimes one or two): dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.swap: Unit is bound to inactive unit dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device. Stopping, too. home.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device. Stopping, too. boot.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-56c56bfd\x2d93f0\x2d48fb\x2dbc4b\x2d90aa67144ea5.device. Stopping, too. When running with debug on, it's pretty obvious what is happening: home.mount: Changed dead -> mounted home.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device. Stopping, too. home.mount: Trying to enqueue job home.mount/stop/fail home.mount: Installed new job home.mount/stop as 27 home.mount: Enqueued job home.mount/stop as 27 ... dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Installed new job dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device/start as 47 dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Changed dead -> plugged dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Job dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device/start finished, result=done Fixes #2206, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=808151.	2017-07-11 10:45:03 +02:00
Michal Koutný	b007626897	core: dbus: Interpret released names properly (#6175 ) When a DBus name is released, NameOwnerChanged signal contains an empty string as new_owner. Commit `bbc2908` changed interpretation of the empty string to a valid name, which is not consistent with values that are sent by dbus-daemon. As a side effect, this masks symptoms of systemd-logind dbus disconnections (#2925) by completely restarting it so it can freshly reconnect to dbus.	2017-06-22 20:26:04 -04:00
Franck Bui	8b108bd0ef	core: when deserializing a unit, fully restore its cgroup state The state of a unit was not fully restored, especially the "cgroup_realized_mask/cgroup_enabled_mask" fields were missing. This could be seen with the following sequence: $ systemctl show -p TasksCurrent sshd TasksCurrent=1 $ systemctl daemon-reload $ systemctl show -p TasksCurrent sshd TasksCurrent=18446744073709551615 This was also visible with the "status" command: "Tasks: " row wasn't showed in status of a service after a "daemon-reload" command.	2017-05-04 09:41:23 +02:00
Franck Bui	aae7e17f9c	core: introduce cg_mask_from_string()/cg_mask_to_string()	2017-05-04 09:41:19 +02:00
Lennart Poettering	db7076bf78	Merge pull request #5164 from Werkov/ordering-for-_netdev-devices Ordering for _netdev devices	2017-04-29 18:40:19 +02:00
Michal Koutný	a2df3ea4ae	job: add JobRunningTimeoutSec for JOB_RUNNING state Unit.JobTimeoutSec starts when a job is enqueued in a transaction. The introduced distinct Unit.JobRunningTimeoutSec starts only when the job starts running (e.g. it groups all Exec* commands of a service or spans waiting for a device period.) Unit.JobRunningTimeoutSec is intended to be used by default instead of Unit.JobTimeoutSec for device units where such behavior causes less confusion (consider a job for a _netdev mount device, with this change the timeout will start ticking only after the network is ready).	2017-04-25 18:00:29 +02:00
Zbigniew Jędrzejewski-Szmek	ba360bb05c	tree-wide: mark log_struct with _printf_ and fix fallout log_struct takes multiple format strings, each one followed by arguments. The _printf_ annotation is not sufficiently flexible to express this, but we can still annotate the first format string, though not its arguments (because their number is unknown). With the annotation, the places which specified the message id or similar as the first pattern cause a warning from -Wformat-nonliteral. This can be trivially fixed by putting the MESSAGE= first. This change will help find issues where a non-literal is erroneously used as the pattern.	2017-04-21 13:37:04 -04:00
Lennart Poettering	77969722aa	core: when a unit's SourcePath points to API VFS pretend we are never out-of-date (#5487 ) If the unit's SourcePath is below /proc then it's a unit generated from a kernel resource (such as a .mount or .swap unit). And those we watch anyway, and hence should never be out-of-date. Fixes: #5461	2017-03-01 10:25:08 -05:00
Lennart Poettering	ae572acd62	core: always consider clients that pinned a unit to be subscribers If a client pins a unit, then it makes sense to also implicitly make it a subscriber. This is useful for clients that just want to watch one specific unit: they can pin it and receive its messages.	2017-02-28 18:34:58 +01:00
Zbigniew Jędrzejewski-Szmek	78e4f19ebc	Merge pull request #5444 from poettering/cgroups-revert-no-error Revert "core: simplify cg_[all_]unified()" and more.	2017-02-24 18:48:57 -05:00
AsciiWolf	13e785f7a0	Fix missing space in comments (#5439 )	2017-02-24 18:14:02 +01:00
Lennart Poettering	c22800e40e	cgroup: rename cg_unified() → cg_unified_controller() cg_unified() is a bit generic a name, let's make clear that it checks whether a specified controller is in unified mode.	2017-02-24 18:00:04 +01:00
Lennart Poettering	b4cccbc13a	cgroup: change cg_unified() to possibly return errors again We use our cgroup APIs in various contexts, including from our libraries sd-login, sd-bus. As we don't control those environments we can't rely that the unified cgroup setup logic succeeds, and hence really shouldn't assert on it. This more or less reverts `415fc41cea`.	2017-02-24 17:52:58 +01:00
Tejun Heo	415fc41cea	core: simplify cg_[all_]unified() cg_[all_]unified() test whether a specific controller or all controllers are on the unified hierarchy. While what's being asked is a simple binary question, the callers must assume that the functions may fail any time, which unnecessarily complicates their usages. This complication is unnecessary. Internally, the test result is cached anyway and there are only a few places where the test actually needs to be performed. This patch simplifies cg_[all_]unified(). * cg_[all_]unified() are updated to return bool. If the result can't be decided, assertion failure is triggered. Error handlings from their callers are dropped. * cg_unified_flush() is updated to calculate the new result synchronously and return whether it succeeded or not. Places which need to flush the test result are updated to test for failure. This ensures that all the following cg_[all_]unified() tests succeed. * Places which expected possible cg_[all_]unified() failures are updated to call and test cg_unified_flush() before calling cg_[all_]unified(). This includes functions used while setting up mounts during boot and manager_setup_cgroup().	2017-02-18 17:51:13 -05:00
Lennart Poettering	2fe917fe91	Merge pull request #4526 from keszybz/coredump-python Collect interpreter backtraces in systemd-coredump	2017-02-16 11:24:03 +01:00
Zbigniew Jędrzejewski-Szmek	2b0445262a	tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID Embedding sd_id128_t's in constant strings was rather cumbersome. We had SD_ID128_CONST_STR which returned a const char[], but it had two problems: - it wasn't possible to statically concatenate this array with a normal string - gcc wasn't really able to optimize this, and generated code to perform the "conversion" at runtime. Because of this, even our own code in coredumpctl wasn't using SD_ID128_CONST_STR. Add a new macro to generate a constant string: SD_ID128_MAKE_STR. It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition of the numbers, but in practice it is more convenient to use, and allows gcc to generate smarter code: $ size .libs/systemd{,-logind,-journald}{.old,} text data bss dec hex filename 1265204 149564 4808 1419576 15a938 .libs/systemd.old 1260268 149564 4808 1414640 1595f0 .libs/systemd 246805 13852 209 260866 3fb02 .libs/systemd-logind.old 240973 13852 209 255034 3e43a .libs/systemd-logind 146839 4984 34 151857 25131 .libs/systemd-journald.old 146391 4984 34 151409 24f71 .libs/systemd-journald It is also much easier to check if a certain binary uses a certain MESSAGE_ID: $ strings .libs/systemd.old|grep MESSAGE_ID MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x $ strings .libs/systemd|grep MESSAGE_ID MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27 MESSAGE_ID=b07a249cd024414a82dd00cd181378ff MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7 MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f MESSAGE_ID=d34d037fff1847e6ae669a370e694725 MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5 MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7 MESSAGE_ID=39f53479d3a045ac8e11786248231fbf MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d MESSAGE_ID=7b05ebc668384222baa8881179cfda54 MESSAGE_ID=9d1aaa27d60140bd96365438aad20286	2017-02-15 00:45:12 -05:00
Lennart Poettering	631b676bb7	core: explicitly verify that BindsTo= deps are in order before dispatch start operation of a unit Let's make sure we verify that all BindsTo= are in order before we actually go and dispatch a start operation to a unit. Normally the job queue should already have made sure all deps are in order, but this might not have been sufficient in two cases: a) when the user changes deps during runtime and reloads the daemon, and b) when the user placed BindsTo= dependencies without matching After= dependencies, so that we don't actually wait for the bound to unit to be up before upping also the binding unit. See: #4725	2017-02-14 13:38:24 +01:00
Lennart Poettering	8367fea557	core: make sure to destroy all name watching bus slots when we are kicked off the bus (#5294 ) Fixes: #4528	2017-02-09 21:54:48 -05:00
Lennart Poettering	915e6d1676	core: add RootImage= setting for using a specific image file as root directory for a service This is similar to RootDirectory= but mounts the root file system from a block device or loopback file instead of another directory. This reuses the image dissector code now used by nspawn and gpt-auto-discovery.	2017-02-07 12:19:42 +01:00
Franck Bui	2d058a87ff	core: don't load dropin data multiple times for the same unit (#5139 ) When an alias is loaded, we resolve this alias to its final unit first to load the dropin data. However if the final unit was already loaded, there's no point in reloading the dropin data a second time. This patch optimizes this case. Also this allows the dropin loading code to assume that only units not yet loaded are passed down. This assumption is not yet used but might be in the future. [zj: invert the condition in the if]	2017-01-24 08:29:57 -05:00
Lennart Poettering	d71f050599	core: implicitly order units with PrivateTmp= after systemd-tmpfiles-setup.service Preparation for fixing #4401.	2016-12-27 23:25:24 +01:00
Franck Bui	ebc8968bc0	core: make mount units from /proc/self/mountinfo possibly bind to a device (#4515 ) Since commit `9d06297`, mount units from mountinfo are not bound to their devices anymore (they use the "Requires" dependency instead). This has the following drawback: if a media is mounted and the eject button is pressed then the media is unconditionally ejected leaving some inconsistent states. Since udev is the component that is reacting (no matter if the device is used or not) to the eject button, users expect that udev at least tries to unmount the media properly. This patch introduces a new property "SYSTEMD_MOUNT_DEVICE_BOUND". When set on a block device, all units that require this device will see their "Requires" dependency upgraded to a "BindsTo" one. This is currently only used by cdrom devices. This patch also gives the user the possibility to restore the previous behavior, that is, to bind a mount unit to a device. This is achieved by passing the "x-systemd.device-bound" option to mount(8). Please note that currently this is not working because libmount treats the x-* options as comments, therefore they're not available in utab for later application retrievals.	2016-12-16 17:13:58 +01:00
Zbigniew Jędrzejewski-Szmek	59ec09a83e	pid1: simplify the logic in two statements related to killing processes Generally non-inverted conditions are nicer, and ternary operators with complex conditions are a bit hard to read. No functional change.	2016-12-09 13:53:31 -05:00
Lennart Poettering	c9d5c9c0e1	core: make unit_free() accept NULL pointers We generally try to make our destructors robust regarding NULL pointers, much in the same way as glibc's free(). Do this also for unit_free(). Follow-up for #4748.	2016-12-01 00:25:51 +01:00
Lennart Poettering	2e6dbc0fcd	Merge pull request #4538 from fbuihuu/confirm-spawn-fixes Confirm spawn fixes/enhancements	2016-11-18 11:08:06 +01:00
Franck Bui	c891efaf8a	core: confirm_spawn: always accept units with same_pgrp set for now For some reasons units remaining in the same process group as PID 1 (same_pgrp=true) fail to acquire the console even if it's not taken by anyone. So always accept for units with same_pgrp set for now.	2016-11-17 18:16:51 +01:00
Lennart Poettering	c5a97ed132	core: GC redundant device jobs from the run queue In contrast to all other unit types device units when queued just track external state, they cannot effect state changes on their own. Hence unless a client or other job waits for them there's no reason to keep them in the job queue. This adds a concept of GC'ing jobs of this type as soon as no client or other job waits for them anymore. To ensure this works correctly we need to track which clients actually reference a job (i.e. which ones enqueued it). Unfortunately that's pretty nasty to do for direct connections, as sd_bus_track doesn't work for them. For now, work around this, by simply remembering in a boolean that a job was requested by a direct connection, and reset it when we notice the direct connection is gone. This means the GC logic works fine, except that jobs are not immediately removed when direct connections disconnect. In the longer term, a rework of the bus logic should fix this properly. For now this should be good enough, as GC works fine for all cases except this one, and thus is a clear improvement over the previous behaviour. Fixes: #1921	2016-11-16 15:03:26 +01:00
Lennart Poettering	a2d72e265a	core: drop n_in_gc_queue field of Manager structure We count the units in the GC queue with this, but actually never make use of it, hence drop it.	2016-11-16 15:03:26 +01:00
Djalal Harouni	c92e8afebd	core: improve the logic that implies no new privileges The no_new_privileged_set variable is not used any more since commit `9b232d3241` that fixed another thing. So remove it. Also no need to check if we are under user manager, remove that part too.	2016-11-15 15:04:31 +01:00
Zbigniew Jędrzejewski-Szmek	f97b34a629	Rename formats-util.h to format-util.h We don't have plural in the name of any other -util files and this inconsistency trips me up every time I try to type this file name from memory. "formats-util" is even hard to pronounce.	2016-11-07 10:15:08 -05:00
Lennart Poettering	493fd52f1a	Merge pull request #4510 from keszybz/tree-wide-cleanups Tree wide cleanups	2016-11-03 13:59:20 -06:00
Zbigniew Jędrzejewski-Szmek	e68eedbbdc	Revert some uses of xsprintf This reverts some changes introduced in `d054f0a4d4`. xsprintf should be used in cases where we calculated the right buffer size by hand (using DECIMAL_STRING_MAX and such), and never in cases where we are printing externally specified strings of arbitrary length. Fixes #4534.	2016-11-02 22:36:29 -04:00
Zbigniew Jędrzejewski-Szmek	7fa6328cc4	Merge pull request #4481 from poettering/perpetual Add "perpetual" unit concept, sysctl fixes, networkd fixes, systemctl color fixes, nspawn discard.	2016-11-02 21:03:26 -04:00
Lennart Poettering	a581e45ae8	unit: unify some code with new unit_new_for_name() call	2016-11-02 11:29:59 -06:00
Lennart Poettering	f5869324e3	core: rework the "no_gc" unit flag to become a more generic "perpetual" flag So far "no_gc" was set on -.slice and init.scope, i.e. on units that are always running, cannot be stopped and never exist in an "inactive" state. Since these units are the only users of this flag, let's remodel it and rename it "perpetual" and let's derive more functionality off it. Specifically, refuse enqueuing stop jobs for these units, and report that they are "unstoppable" in the CanStop bus property.	2016-11-02 11:29:59 -06:00
Zbigniew Jędrzejewski-Szmek	f0bfbfac43	core: when restarting services, don't close fds We would close all the stored fds in service_release_resources(), which of course broke the whole concept of storing fds over service restart. Fixes #4408.	2016-11-01 21:20:21 -04:00
Zbigniew Jędrzejewski-Szmek	605405c6cc	tree-wide: drop NULL sentinel from strjoin This makes strjoin and strjoina more similar and avoids the useless final argument. spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/.c) git grep -e '\bstrjoin\b.NULL' -l\|xargs sed -i -r 's/strjoin$(.*), NULL$/strjoin(\1)/' This might have missed a few cases (spatch has a really hard time dealing with _cleanup_ macros), but that's no big issue, they can always be fixed later.	2016-10-23 11:43:27 -04:00
Lukas Nykryn	87a47f99bc	failure-action: generalize failure action to emergency action	2016-10-21 15:13:50 +02:00
Luca Bruno	52c239d770	core/exec: add a named-descriptor option ("fd") for streams (#4179 ) This commit adds a `fd` option to `StandardInput=`, `StandardOutput=` and `StandardError=` properties in order to connect standard streams to externally named descriptors provided by some socket units. This option looks for a file descriptor named as the corresponding stream. Custom names can be specified, separated by a colon. If multiple name-matches exist, the first matching fd will be used.	2016-10-17 20:05:49 -04:00
Zbigniew Jędrzejewski-Szmek	ba25d39e44	pid1: do not use mtime==0 as sign of masking (#4388 ) It is allowed for unit files to have an mtime==0, so instead of assuming that any file that had mtime==0 was masked, use the load_state to filter masked units. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1384150.	2016-10-17 07:15:03 +02:00
Zbigniew Jędrzejewski-Szmek	6b430fdb7c	tree-wide: use mfree more	2016-10-16 23:35:39 -04:00
Djalal Harouni	2cd0a73547	core:sandbox: remove CAP_SYS_RAWIO on PrivateDevices=yes The rawio system calls were filtered, but CAP_SYS_RAWIO allows to access raw data through /proc, ioctl and some other exotic system calls...	2016-10-12 13:39:49 +02:00
Djalal Harouni	502d704e5e	core:sandbox: Add ProtectKernelModules= option This is useful to turn off explicit module load and unload operations on modular kernels. This option removes CAP_SYS_MODULE from the capability bounding set for the unit, and installs a system call filter to block module system calls. This option will not prevent the kernel from loading modules using the module auto-load feature which is a system wide operation.	2016-10-12 13:31:21 +02:00
Lennart Poettering	4b58153dd2	core: add "invocation ID" concept to service manager This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from an inactive to an activating or active state. The primary usecase for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as environment variable. It is additionally stored as extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation race-freely from the messages.	2016-10-07 20:14:38 +02:00
Zbigniew Jędrzejewski-Szmek	dd5e7000cb	core: complain if Before= dep on .device is declared [Unit] Before=foobar.device [Service] ExecStart=/bin/true Type=oneshot $ systemd-analyze verify before-device.service before-device.service: Dependency Before=foobar.device ignored (.device units cannot be delayed)	2016-10-01 22:53:17 +02:00
Lennart Poettering	63bb64a056	core: imply ProtectHome=read-only and ProtectSystem=strict if DynamicUser=1 Let's make sure that services that use DynamicUser=1 cannot leave files in the file system should the system accidentally have a world-writable directory somewhere. This effectively ensures that directories need to be whitelisted rather than blacklisted for access when DynamicUser=1 is set.	2016-09-25 10:42:18 +02:00
Lennart Poettering	390bc2b149	core: let's use set_contains() where appropriate	2016-08-22 16:14:21 +02:00
Lennart Poettering	fe700f46ec	core: cache last CPU usage counter, before destroying a cgroup It is useful for clients to be able to read the last CPU usage counter value of a unit even if the unit is already terminated. Hence, before destroying a unit's cgroup, cache the last CPU usage counter and return it if the cgroup is gone.	2016-08-22 16:14:21 +02:00
Lennart Poettering	05a98afd3e	core: add Ref()/Unref() bus calls for units This adds two (privileged) bus calls Ref() and Unref() to the Unit interface. The two calls may be used by clients to pin a unit into memory, so that various runtime properties aren't flushed out by the automatic GC. This is necessary to permit clients to race-freely acquire runtime results (such as process exit status/code or accumulated CPU time) on successful service termination. Ref() and Unref() are fully recursive, hence act like the usual reference counting concept in C. Taking a reference is a privileged operation, as this allows pinning units into memory which consumes resources. Transient units may also gain a reference at the time of creation, via the new AddRef property (that is only defined for transient units at the time of creation).	2016-08-22 16:14:21 +02:00
Zbigniew Jędrzejewski-Szmek	2056ec1927	Merge pull request #3965 from htejun/systemd-controller-on-unified	2016-08-19 19:58:01 -04:00
Lennart Poettering	00d9ef8560	core: add RemoveIPC= setting This adds the boolean RemoveIPC= setting to service, socket, mount and swap units (i.e. all unit types that may invoke processes). If turned on, and the unit's user/group is not root, all IPC objects of the user/group are removed when the service is shut down. The life-cycle of the IPC objects is hence bound to the unit life-cycle. This is particularly relevant for units with dynamic users, as it is essential that no objects owned by the dynamic users survive the service exiting. In fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set. In order to communicate the UID/GID of an executed process back to PID 1 this adds a new "user lookup" socket pair, that is inherited into the forked processes, and closed before the exec(). This is needed since we cannot do NSS from PID 1 due to deadlock risks, but we need to know the used UID/GID in order to clean up IPC owned by it if the unit shuts down.	2016-08-19 00:37:25 +02:00
Tejun Heo	5da38d0768	core: use the unified hierarchy for the systemd cgroup controller hierarchy Currently, systemd uses either the legacy hierarchies or the unified hierarchy. When the legacy hierarchies are used, systemd uses a named legacy hierarchy mounted on /sys/fs/cgroup/systemd without any kernel controllers for process management. Due to the shortcomings in the legacy hierarchy, this involves a lot of workarounds and complexities. Because the unified hierarchy can be mounted and used in parallel to legacy hierarchies, there's no reason for systemd to use a legacy hierarchy for management even if the kernel resource controllers need to be mounted on legacy hierarchies. It can simply mount the unified hierarchy under /sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies. This disables a significant amount of fragile workaround logic and would allow using features which depend on the unified hierarchy membership such as the bpf cgroup v2 membership test. In time, this would also allow deleting the said complexities. This patch updates systemd so that it prefers the unified hierarchy for the systemd cgroup controller hierarchy when legacy hierarchies are used for kernel resource controllers. * cg_unified(@controller) is introduced which tests whether the specific controller is on the unified hierarchy and is used to choose the unified hierarchy code path for process and service management when available. Kernel controller specific operations remain gated by cg_all_unified(). * "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to force the use of legacy hierarchy for systemd cgroup controller. * nspawn: By default nspawn uses the same hierarchies as the host. If UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all. If 0, legacy for all. * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of three options - legacy, only systemd controller on unified, and unified. The value is passed into mount setup functions and controls cgroup configuration. * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount option is moved to mount_legacy_cgroup_hierarchy() so that it can take an appropriate action depending on the configuration of the host. v2: - CGroupUnified enum replaces open coded integer values to indicate the cgroup operation mode. - Various style updates. v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2. v4: Restored legacy container on unified host support and fixed another bug in detect_unified_cgroup_hierarchy().	2016-08-17 17:44:36 -04:00
Tejun Heo	ca2f6384aa	core: rename cg_unified() to cg_all_unified() A following patch will update cgroup handling so that the systemd controller (/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel resource controllers are on the legacy hierarchies. This would require distinguishing whether all controllers are on cgroup v2 or only the systemd controller is. In preparation, this patch renames cg_unified() to cg_all_unified(). This patch doesn't cause any functional changes.	2016-08-15 18:13:36 -04:00
Tejun Heo	66ebf6c0a1	core: add cgroup CPU controller support on the unified hierarchy Unfortunately, due to the disagreements in the kernel development community, CPU controller cgroup v2 support has not been merged and enabling it requires applying two small out-of-tree kernel patches. The situation is explained in the following documentation. https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu While it isn't clear what will happen with CPU controller cgroup v2 support, there are critical features which are possible only on cgroup v2 such as buffered write control making cgroup v2 essential for a lot of workloads. This commit implements systemd CPU controller support on the unified hierarchy so that users who choose to deploy CPU controller cgroup v2 support can easily take advantage of it. On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max" replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config options are added with the usual compat translation. CPU quota settings remain unchanged and apply to both legacy and unified hierarchies. v2: - Error in man page corrected. - CPU config application in cgroup_context_apply() refactored. - CPU accounting now works on unified hierarchy.	2016-08-07 09:45:39 -04:00
Lennart Poettering	29206d4619	core: add a concept of "dynamic" user ids, that are allocated as long as a service is running This adds a new boolean setting DynamicUser= to service files. If set, a new user will be allocated dynamically when the unit is started, and released when it is stopped. The user ID is allocated from the range 61184..65519. The user will not be added to /etc/passwd (but an NSS module to be added later should make it show up in getent passwd). For now, care should be taken that the service writes no files to disk, since this might result in files owned by UIDs that might get assigned dynamically to a different service later on. Later patches will tighten sandboxing in order to ensure that this cannot happen, except for a few selected directories. A simple way to test this is: systemd-run -p DynamicUser=1 /bin/sleep 99999	2016-07-22 15:53:45 +02:00
Lennart Poettering	1d98fef17d	core: when forcibly killing/aborting left-over unit processes log about it Let's log at LOG_NOTICE about any processes that we are going to SIGKILL/SIGABRT because clean termination of them didn't work. This turns the various boolean flag parameters to cg_kill(), cg_migrate() and related calls into a single binary flags parameter, simply because the function now gained even more parameters and the parameter list shouldn't get too long. Logging for killing processes is done either when the kill signal is SIGABRT or SIGKILL, or on explicit request if KILL_TERMINATE_AND_LOG instead of LOG_TERMINATE is passed. This isn't used yet in this patch, but is made use of in a later patch.	2016-07-20 14:35:15 +02:00
Michael Biebl	595bfe7df2	Various fixes for typos found by lintian (#3705 )	2016-07-12 12:52:11 +02:00
Torstein Husebø	61233823aa	treewide: fix typos and remove accidental repetition of words	2016-07-11 16:18:43 +02:00
David Michael	4f952a3f07	core: queue loading transient units after setting their properties (#3676 ) The unit load queue can be processed in the middle of setting the unit's properties, so its load_state would no longer be UNIT_STUB for the check in bus_unit_set_properties(), which would cause it to incorrectly return an error.	2016-07-08 05:43:01 +02:00
Kyle Walker	36f20ae3b2	manager: Only invoke a single sigchld per unit within a cleanup cycle By default, each iteration of manager_dispatch_sigchld() results in a unit level sigchld event being invoked. For scope units, this results in a scope_sigchld_event() which can seemingly stall for workloads that have a large number of PIDs within the scope. The stall exhibits itself as a SIG_0 being initiated for each u->pids entry as a result of pid_is_unwaited(). v2: This patch resolves this condition by only paying the cost of a sigchld in the underlying scope unit once per sigchld iteration. A new "sigchldgen" member resides within the Unit struct. The Manager is incremented via the sd event loop, accessed via sd_event_get_iteration, and the Unit member is set to the same value as the manager each time that a sigchld event is invoked. If the Manager iteration value and Unit member match, the sigchld event is not invoked for that iteration.	2016-06-30 15:16:47 -04:00
Lennart Poettering	fc40065bcd	core: when writing transient unit files, make sure all lines end with a newline This is a fix-up for `2a9a6f8ac0` which covered non-transient units, but missed the case for transient units.	2016-06-23 01:29:33 +02:00
Lennart Poettering	3f71dec5d7	unit: properly comment generated comments in unit files Fix-up for `2a9a6f8ac0`	2016-06-14 20:01:45 +02:00
Zbigniew Jędrzejewski-Szmek	2a9a6f8ac0	core/unit: append newline when writing drop ins unit_write_drop_in{,_private}{,_format} are all affected. We already append a header to the file (and section markers), so those functions can only be used to write a whole file at once. Including the newline at the end feels natural. After this commit newlines will be duplicated. They will be removed in subsequent commit. Also, rewrap the "autogenerated" header to fit within 80 columns.	2016-05-28 16:17:54 -04:00
Lennart Poettering	3103459e90	Merge pull request #3193 from htejun/cgroup-io-controller core: add io controller support on the unified hierarchy	2016-05-16 22:05:27 +02:00
Michal Sekletar	833f92ad39	core: don't log job status message in case job was effectively NOP (#3199 ) We currently generate log message about unit being started even when unit was started already and job didn't do anything. This is because job was requested explicitly and hence became anchor job of the transaction thus we could not eliminate it. That is fine but, let's not pollute journal with useless log messages. $ systemctl start systemd-resolved $ systemctl start systemd-resolved $ systemctl start systemd-resolved Current state: $ journalctl -u systemd-resolved \| grep Started May 05 15:31:42 rawhide systemd[1]: Started Network Name Resolution. May 05 15:31:59 rawhide systemd[1]: Started Network Name Resolution. May 05 15:32:01 rawhide systemd[1]: Started Network Name Resolution. After patch applied: $ journalctl -u systemd-resolved \| grep Started May 05 16:42:12 rawhide systemd[1]: Started Network Name Resolution. Fixes #1723	2016-05-16 11:24:51 -04:00


478 commits