Systemd

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	349cc4a507	build-sys: use #if Y instead of #ifdef Y everywhere The advantage is that if the name is misspelt, cpp will warn us. $ git grep -Ee "conf.set\('(HAVE\|ENABLE)_" -l\|xargs sed -r -i "s/conf.set\('(HAVE\|ENABLE)_/conf.set10('\1_/" $ git grep -Ee '#ifn?def (HAVE\|ENABLE)' -l\|xargs sed -r -i 's/#ifdef (HAVE\|ENABLE)/#if \1/; s/#ifndef (HAVE\|ENABLE)/#if ! \1/;' $ git grep -Ee 'if.*defined\(HAVE' -l\|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g' $ git grep -Ee 'if.*defined\(ENABLE' -l\|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g' + manual changes to meson.build squash! build-sys: use #if Y instead of #ifdef Y everywhere v2: - fix incorrect setting of HAVE_LIBIDN2	2017-10-04 12:09:29 +02:00
Lennart Poettering	c621849539	core: fix special directories for user services The system paths were listed where the user paths should have been listed. Correct that.	2017-10-02 17:41:44 +02:00
Lennart Poettering	091e9efed3	core: fix StateDirectory= (and friends) safety checks when decoding transient unit properties Let's make sure relative directories such as "foo/bar" are accepted, by using the same validation checks as in unit file parsing.	2017-10-02 17:41:44 +02:00
Lennart Poettering	e53c42ca0a	core: pass the correct error to the caller	2017-10-02 17:41:44 +02:00
Lennart Poettering	da50b85af7	core: when looking for a UID to use for a dynamic UID start with the current owner of the StateDirectory= and friends Let's optimize dynamic UID allocation a bit: if a StateDirectory= (or suchlike) is configured, we start our allocation loop from that UID and use it if it currently isn't used otherwise. This is beneficial as it saves us from having to expensively recursively chown() these directories in the typical case (which StateDirectory= does when it notices that the owner of the directory doesn't match the UID picked). With this in place we now have a three-phase logic for allocating a dynamic UID: a) first, we try to use the owning UID of StateDirectory=, CacheDirectory=, LogDirectory= if that exists and is currently otherwise unused. b) if that didn't work out, we hash the UID from the service name c) if that didn't yield an unused UID either, randomly pick new ones until we find a free one.	2017-10-02 17:41:44 +02:00
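The three-phase allocation described in da50b85af7 can be modelled roughly as follows. This is only a sketch of the ordering, not systemd's implementation: the UID range constants, the range check in phase (a), the use of SHA-256 as the name hash, the retry limit, and the names `hash_uid` and `pick_dynamic_uid` are all assumptions made for illustration.

```python
import hashlib
import random

# Assumed bounds of the dynamic UID range; illustrative only.
UID_MIN, UID_MAX = 61184, 65519


def hash_uid(name: str) -> int:
    """Map a service name into the UID range (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(name.encode()).digest()
    return UID_MIN + int.from_bytes(digest[:8], "big") % (UID_MAX - UID_MIN + 1)


def pick_dynamic_uid(name, dir_owner_uids, in_use, rng=random, max_tries=1000):
    """Return a free UID, trying the three phases in order."""
    # a) prefer the current owner of a configured state/cache/log directory,
    #    which avoids a recursive chown() of that directory later on
    for uid in dir_owner_uids:
        if UID_MIN <= uid <= UID_MAX and uid not in in_use:
            return uid
    # b) fall back to a UID derived from the service name
    uid = hash_uid(name)
    if uid not in in_use:
        return uid
    # c) otherwise try random candidates until one is free
    for _ in range(max_tries):
        uid = rng.randint(UID_MIN, UID_MAX)
        if uid not in in_use:
            return uid
    raise RuntimeError("no free dynamic UID found")
```

Calling `pick_dynamic_uid("foo.service", [61500], set())` returns 61500, because phase (a) succeeds and the later phases are never reached.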
Lennart Poettering	6c47cd7d3b	execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage= Let's clean up the interaction of StateDirectory= (and friends) to DynamicUser=1: instead of creating these directories directly below /var/lib, place them in /var/lib/private instead if DynamicUser=1 is set, making that directory 0700 and owned by root:root. This way, if a dynamic UID is later reused, access to the old run's state directory is prohibited for that user. Then, use file system namespacing inside the service to make /var/lib/private a readable tmpfs, hiding all state directories that are not listed in StateDirectory=, and making access to the actual state directory possible. Mount all directories listed in StateDirectory= to the same places inside the service (which means they'll now be mounted into the tmpfs instance). Finally, add a symlink from the state directory name in /var/lib/ to the one in /var/lib/private, so that both the host and the service can access the path under the same location. Here's an example: let's say a service runs with StateDirectory=foo. When DynamicUser=0 is set, it will get the following setup, and no difference between what the unit and what the host sees: /var/lib/foo (created as directory) Now, if DynamicUser=1 is set, we'll instead get this on the host: /var/lib/private (created as directory with mode 0700, root:root) /var/lib/private/foo (created as directory) /var/lib/foo → private/foo (created as symlink) And from inside the unit: /var/lib/private (a tmpfs mount with mode 0755, root:root) /var/lib/private/foo (bind mounted from the host) /var/lib/foo → private/foo (the same symlink as above) This takes inspiration from how container trees are protected below /var/lib/machines: they generally reuse UIDs/GIDs of the host, but because /var/lib/machines itself is set to 0700 host users cannot access files in the container tree even if the UIDs/GIDs are reused. 
However, for this commit we add one further trick: inside and outside of the unit /var/lib/private is a different thing: outside it is a plain, inaccessible directory, and inside it is a world-readable tmpfs mount with only the whitelisted subdirs below it, bind mounted in. This means, from the outside the dir acts as an access barrier, but from the inside it does not. And the symlink created in /var/lib/foo itself points across the barrier in both cases, so that root and the unit's user always have access to these dirs without knowing the details of this mounting magic. This logic resolves a major shortcoming of DynamicUser=1 units: previously they couldn't safely store persistent data. With this change they can have their own private state, log and data directories, which they can write to, but which are protected from UID recycling. With this change, if RootDirectory= or RootImage= are used it is ensured that the specified state/log/cache directories are always mounted in from the host. This change of semantics I think is much preferable since this means the root directory/image logic can be used easily for read-only resource bundling (as all writable data resides outside of the image). Note that this is a change of behaviour, but given that we haven't released any systemd version with StateDirectory= and friends implemented this should be a safe change to make (in particular as previously it wasn't clear what would actually happen when used in combination). Moreover, by making this change we can later add a "+" modifier to these settings too working similar to the same modifier in ReadOnlyPaths= and friends, making specified paths relative to the container itself.	2017-10-02 17:41:44 +02:00
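The host-side layout from the StateDirectory=foo example in 6c47cd7d3b can be reproduced in a scratch directory with a few lines. This is a minimal sketch assuming a POSIX file system. The function name `setup_state_dir` and its parameters are hypothetical. It models only the directories and the symlink on the host, not the tmpfs and bind mounts seen from inside the unit, and it does not handle a pre-existing real directory at the public path.

```python
import os
import tempfile


def setup_state_dir(var_lib: str, name: str, dynamic_user: bool) -> str:
    """Create the host-side layout for StateDirectory=name below var_lib."""
    public = os.path.join(var_lib, name)
    if not dynamic_user:
        # DynamicUser=0: a plain directory, same view for host and unit
        os.makedirs(public, exist_ok=True)
        return public
    private_root = os.path.join(var_lib, "private")
    os.makedirs(private_root, exist_ok=True)
    os.chmod(private_root, 0o700)  # the access barrier on the host
    private = os.path.join(private_root, name)
    os.makedirs(private, exist_ok=True)
    if not os.path.islink(public):
        # relative target, so the link reads the same inside and outside the unit
        os.symlink(os.path.join("private", name), public)
    return private
```

Running `setup_state_dir(tempfile.mkdtemp(), "foo", True)` yields `private/` with mode 0700, `private/foo/`, and a `foo` symlink pointing at `private/foo`, mirroring the host view listed in the commit message.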
Lennart Poettering	a227a4be48	namespace: if we can create the destination of bind and PrivateTmp= mounts When putting together the namespace, always create the file or directory we are supposed to bind mount on, the same way we do it for most other stuff, for example mount units or systemd-nspawn's --bind= option. This has the big benefit that we can use namespace bind mounts on dirs in /tmp or /var/tmp even in conjunction with PrivateTmp=.	2017-10-02 17:41:43 +02:00
Lennart Poettering	e908468b5b	namespace: properly handle bind mounts from the host Before this patch we had an ordering problem: if we have no namespacing enabled except for two bind mounts that intend to swap /a and /b via bind mounts, then we'd execute the bind mount binding /b to /a, followed by the bind mount from /a to /b, thus having the effect that /b is now visible in both /a and /b, which was not intended. With this change, as soon as any bind mount is configured we'll put together the service mount namespace in a temporary directory instead of operating directly in the root. This solves the problem in a straightforward fashion: the source of bind mounts will always refer to the host, and thus be unaffected from the bind mounts we already created.	2017-10-02 17:41:43 +02:00
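The ordering problem in e908468b5b can be illustrated with a toy model in which a file system is a dictionary from path to contents and a bind mount copies one entry over another. Nothing here touches real mounts, and the function names `apply_in_place` and `apply_staged` are hypothetical. The sketch only shows why resolving sources against the untouched host tree gives the intended swap.

```python
def apply_in_place(tree, binds):
    """Apply binds one after another; later sources see earlier binds."""
    view = dict(tree)
    for src, dst in binds:
        view[dst] = view[src]
    return view


def apply_staged(tree, binds):
    """Resolve every source against the untouched host tree."""
    view = dict(tree)
    for src, dst in binds:
        view[dst] = tree[src]
    return view


host = {"/a": "contents of a", "/b": "contents of b"}
swap = [("/b", "/a"), ("/a", "/b")]  # bind /b onto /a, then /a onto /b
```

With these inputs `apply_in_place(host, swap)` shows the contents of /b at both paths, while `apply_staged(host, swap)` yields the intended swap.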
Lennart Poettering	645767d6b5	namespace: create /dev, /proc, /sys when needed We already create /dev implicitly if PrivateTmp=yes is on, if it is missing. Do so too for the other two API VFS, as well as for /dev if PrivateTmp=yes is off but MountAPIVFS=yes is on (i.e. when /dev is bind mounted from the host).	2017-10-02 17:41:43 +02:00
Lennart Poettering	72fd17682d	core: usually our enum's _INVALID and _MAX special values are named after the full type In most cases we followed the rule that the special _INVALID and _MAX values we use in our enums use the full type name as prefix (in contrast to regular values that we often make shorter), do so for ExecDirectoryType as well. No functional changes, just a little bit of renaming to make this code more like the rest.	2017-10-02 17:41:43 +02:00
Lennart Poettering	a1164ae380	core: chown() StateDirectory= and friends recursively when starting a service This is particularly useful when used in conjunction with DynamicUser=1, where the UID might change for every invocation, but is useful in other cases too, for example, when these directories are shared between systems where the UID assignments differ slightly.	2017-10-02 17:41:43 +02:00
Jouke Witteveen	df66b93fe2	service: better detect when a Type=notify service cannot become active anymore (#6959) No need to wait for a timeout when we know things are not going to work out. When the main process goes away and only notifications from the main process are accepted, then we will not receive any notifications anymore.	2017-10-02 16:35:27 +02:00
Zbigniew Jędrzejewski-Szmek	b139c95bc4	Merge pull request #6941 from andir/use-in_set use IN_SET where possible	2017-10-02 15:08:10 +02:00
Andreas Rammhold	ec2ce0c5d7	tree-wide: use `!IN_SET(..)` for `a != b && a != c && …` The included cocci was used to generate the changes. Thanks to @flo-wer for pointing this case out.	2017-10-02 13:09:56 +02:00
Andreas Rammhold	3742095b27	tree-wide: use IN_SET where possible In addition to the changes from #6933 this handles cases that could be matched with the included cocci file.	2017-10-02 13:09:54 +02:00
Lennart Poettering	b13ddbbcf3	service: accept the fact that the three xyz_good() functions return ints Currently, all three of cgroup_good(), main_pid_good(), control_pid_good() all return an "int" (two of them propagate errors). It's a good thing to keep the three functions similar, so let's leave it at that, but then let's clean up the invocation of the three functions so that they always clearly acknowledge that the return value is not a bool, but potentially negative.	2017-10-02 12:58:42 +02:00
Lennart Poettering	019be28676	service: drop _pure_ decorator on static function The compiler should be good enough to figure this out on its own if this is a static function, and it makes control_pid_good() an outlier anyway, and decorators like this tend to bitrot. Hence, to keep things simple and automatic, let's just drop the decorator.	2017-10-02 12:58:42 +02:00
Lennart Poettering	3c751b1bfa	service: a cgroup empty notification isn't reason enough to go down The processes associated with a service are not just the ones in its cgroup, but also the control and main processes, which might possibly live outside of it, for example if they transitioned into their own cgroups because they registered a PAM session of their own. Hence, if we get a cgroup empty notification always check if the main PID is still around before taking action too eagerly. Fixes: #6045	2017-10-02 12:58:42 +02:00
Lennart Poettering	07697d7ec5	service: add explanatory comments to control_pid_good() and cgroup_good() Let's add a similar comment to each as we already have for main_pid_good(), emphasizing that these functions are supposed to behave very similarly.	2017-10-02 12:58:42 +02:00
Lennart Poettering	51894d706f	service: fix main_pid_good() comment We don't actually return -1, don't claim that.	2017-10-02 12:58:37 +02:00
Zbigniew Jędrzejewski-Szmek	9500b9209b	Merge pull request #6928 from poettering/cgroup-empty-race rework cgroup empty notification handling (i.e. a fix for #6608)	2017-09-28 08:48:21 +02:00
Zbigniew Jędrzejewski-Szmek	7e56da12e8	Merge pull request #6922 from poettering/symlink-sockets Fixes for Symlinks= handling in socket units	2017-09-27 19:37:25 +02:00
Lennart Poettering	ed77d407d3	core: log unit failure with type-specific result code This slightly changes how we log about failures. Previously, service_enter_dead() would log that a service unit failed along with its result code, and unit_notify() would do this again but without the result code. For other unit types only the latter would take effect. This cleans this up: we keep the message in unit_notify() only for debug purposes, and add type-specific log lines to all our unit types that can fail, and always place them before unit_notify() is invoked. Or in other words: the duplicate log message for service units is removed, and all other unit types get a more useful line with the precise result code.	2017-09-27 18:26:18 +02:00
Lennart Poettering	84b26d5149	core: free_and_strdup() FTW!	2017-09-27 18:26:18 +02:00
Lennart Poettering	4724964040	cgroup: IN_SET() FTW!	2017-09-27 18:26:18 +02:00
Lennart Poettering	09e2465407	cgroup: after determining that a cgroup is empty, asynchronously dispatch this This makes sure that if we learn via inotify or another event source that a cgroup is empty, and we checked that this is indeed the case (as we might get spurious notifications through inotify, as the inotify logic through the "cgroups.event" is pretty unspecific and might be triggered for a variety of reasons), then we'll enqueue a defer event for it, at a priority lower than SIGCHLD handling, so that we know for sure that if there's waitid() data for a process we used it before considering the cgroup empty notification. Fixes: #6608	2017-09-27 18:26:18 +02:00
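The effect of dispatching the cgroup-empty event at a lower priority than SIGCHLD handling (09e2465407) can be sketched with a small priority queue. This is a model only: the `EventLoop` class, the two priority constants, and the convention that smaller numbers are dispatched first are assumptions for illustration, not the values or API systemd uses.

```python
import heapq
import itertools

# Hypothetical priorities; smaller number = dispatched earlier (assumed).
PRIO_SIGCHLD = -10
PRIO_CGROUP_EMPTY = -5


class EventLoop:
    def __init__(self):
        self._queue = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority

    def enqueue(self, priority, label):
        heapq.heappush(self._queue, (priority, next(self._seq), label))

    def run(self):
        """Dispatch all pending events and return their labels in order."""
        order = []
        while self._queue:
            _, _, label = heapq.heappop(self._queue)
            order.append(label)
        return order
```

Even if the cgroup-empty event is enqueued first, a pending SIGCHLD event is dispatched before it, so any exit status is consumed before the unit reacts to the empty cgroup.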
Lennart Poettering	91a6073ef7	core: rename cgroup_queue → cgroup_realize_queue We are about to add a second cgroup-related queue, called "cgroup_empty_queue", hence let's rename "cgroup_queue" to "cgroup_realize_queue" (as that is its purpose) to minimize confusion about the two queues. Just a rename, no functional changes.	2017-09-27 17:59:25 +02:00
Lennart Poettering	6d330fef4d	unit: remove unused fields from Unit structure	2017-09-27 17:59:25 +02:00
Zbigniew Jędrzejewski-Szmek	2e4025c0f9	core/cgroup: add a helper macro for a common pattern (#6926)	2017-09-27 17:54:06 +02:00
Lennart Poettering	22b20752e2	socket: if RemoveOnStop= is turned on for a socket, try to unlink() pre-existing symlinks Normally, Symlinks= failing is not considered fatal nor destructive. Let's slightly alter behaviour here if RemoveOnStop= is turned on. In that case the user in a way opted for destructive behaviour and we do unlink all sockets and symlinks when the socket unit goes down. And that means we might as well unlink any pre-existing symlinks if this mode is selected. Yeah, it's a bit of a stretch to do this, but @OhNoMoreGit is right: if RemoveOnStop= is on we are destructive regarding any pre-existing symlinks on stop, and it would be quite weird if we wouldn't be on start.	2017-09-27 17:53:00 +02:00
Lennart Poettering	1af87ab7d6	socket: create leading directories for socket symlinks It really doesn't hurt creating prefix directories if necessary, as we tend to do that for other file nodes we create, too. Fixes: #6920	2017-09-27 17:53:00 +02:00
Lennart Poettering	95f7fbbf88	socket: make sure we warn loudly about symlinks we can't create Note that this change does not make symlink creation failing fatal. I am not entirely sure about whether it should be, but I am leaning towards not making it fatal for two reasons: symlinks like this tend to be a compatibility feature, and hence unlikely to be essential for operation, in a way this breaks compatibility, and while doing that is not off the table, we should probably avoid it if we are not entirely sure it's a good thing. Note that this also changes plain symlink() to symlink_idempotent() so that existing symlinks with the right destination are nothing we log about. Fixes: #6920	2017-09-27 17:53:00 +02:00
Zbigniew Jędrzejewski-Szmek	dab9698e1d	Merge pull request #6919 from poettering/ebpf-followup Some minor follow-ups for the ebpf/cgroup PR	2017-09-27 11:23:02 +02:00
Zbigniew Jędrzejewski-Szmek	f30574144d	Merge pull request #6915 from poettering/log-execute make execute.c logging a bit less special	2017-09-27 11:16:24 +02:00
Lennart Poettering	4fe66c8681	core: improve dbus-cgroup error message As suggested by @keszybz in the review of #6764	2017-09-26 23:49:40 +02:00
Lennart Poettering	40a80078d2	execute: let's close glibc syslog channels too Just in case something opened them, let's make sure glibc invalidates them too. Thankfully so far no library opened log channels behind our back, at least as far as I know, hence this is actually a NOP, but let's better be safe than sorry.	2017-09-26 17:52:25 +02:00
Lennart Poettering	12145637e9	execute: normalize logging in execute.c Now that logging can implicitly reopen the log streams when needed we can log errors without any special magic, hence let's normalize things, and log the same way we do everywhere else.	2017-09-26 17:51:22 +02:00
Lennart Poettering	86ffb32560	execute: drop explicit log_open()/log_close() now that it is unnecessary	2017-09-26 17:46:34 +02:00
Lennart Poettering	2c027c62dd	execute: make use of the new logging mode in execute.c	2017-09-26 17:46:34 +02:00
Lennart Poettering	82677ae4c7	execute: downgrade a log message ERR → WARNING, since we proceed ignoring its result	2017-09-26 17:46:33 +02:00
Lennart Poettering	8002fb9747	execute: rework logging in setup_keyring() to include unit info Let's use log_unit_error() instead of log_error() everywhere (and friends).	2017-09-26 17:46:33 +02:00
Lennart Poettering	dedf371909	swap: introduce SWAP_STATE_WITH_PROCESS() similar to MOUNT_STATE_WITH_PROCESS()	2017-09-26 16:17:22 +02:00
Lennart Poettering	50864457e1	swap: adjust swap.c in a similar way to what we just did to mount.c Also drop the redundant states and make all similar changes too. Thankfully the swap.c state engine is much simpler than mount.c's, hence this should be easier to digest.	2017-09-26 16:17:22 +02:00
Lennart Poettering	c634f3d2fc	mount: rename mount_state_active() → MOUNT_STATE_WITH_PROCESS() The function returns true for all states that have a control process running, and each time we call it that's what we want to know, hence let's rename it accordingly. Moreover, the more generic unit states have an ACTIVE state, and it is defined quite differently from the set of states this function returns true for, hence let's avoid confusion and not reuse the word "ACTIVE" here in a different context. Finally, let's uppercase this, since in most ways it's pretty much identical to a macro	2017-09-26 16:17:22 +02:00
Lennart Poettering	22af0e5873	mount: rework mount state engine This changes the mount unit state engine in the following ways: 1. The MOUNT_MOUNTING_SIGTERM and MOUNT_MOUNTING_SIGKILL are removed. They have been pretty much equivalent to MOUNT_UNMOUNTING_SIGTERM and MOUNT_UNMOUNTING_SIGKILL in what they do, and the outcome has been the same as well: the unit is stopped. Hence, let's simplify things a bit, and merge them. Note that we keep MOUNT_REMOUNTING_{SIGTERM\|SIGKILL} however, as those states have a different outcome: the unit remains started. 2. mount_enter_signal() will now honour the SendSIGKILL= option of the mount unit if it was set. This was previously done already when we entered the signal states through a timeout, and was simply missing here. 3. A new helper function mount_enter_dead_or_mounted() is added that places the mount unit in either MOUNT_DEAD or MOUNT_MOUNTED, depending on what the kernel thinks about the mount's state. This function is called at various places now, wherever we finished an operation, and want to make sure our own state reflects again what the kernel thinks. Previously we had very similar code in a number of places and in other places didn't recheck the kernel state. Let's do that with the same logic and function at all relevant places now. 4. Rework mount_stop(): never forget about running control processes. Instead: when we have a start (i.e. a /bin/mount) process running, and are asked to stop, then enter the kill states for it, so that it gets cleaned up. This fixes #6048. Moreover, when we have a reload process running convert the possible states into the relevant unmounting states, so that we can properly execute the requested operation. Fixes #6048	2017-09-26 16:17:22 +02:00
Lennart Poettering	850b741084	mount: clean up reload_result management a bit Let's only collect the first failure in the load result, and let's clear it explicitly when we are about to enter a new reload operation. This makes it more alike the handling of the main result value (which also only stores the first failure), and also the handling of service.c's reload state.	2017-09-26 16:17:22 +02:00
Lennart Poettering	a6951a5079	service: rework service_kill_control_processes() Let's make sure we explicitly also kill any control process we know of, given that it might have moved outside of our control group.	2017-09-26 16:17:22 +02:00
Jan Synacek	0cde65e263	test-cpu-set-util.c: fix typo in comment (#6916)	2017-09-26 16:07:34 +02:00
Lennart Poettering	88af31f922	socket: assign socket units to a default slice unconditionally Due to the chown() logic socket units might end up with processes even if no explicit command is defined for them, hence let's make sure these processes are in the right cgroup, and that means within a slice. Mount, swap and service units unconditionally are assigned to a slice already, let's do the same here, too. (This becomes more important as soon as the ebpf/firewall stuff is merged, as there'll be another reason to fork off processes then)	2017-09-22 20:09:21 +02:00
Lennart Poettering	7960b0c704	cgroup: make use of unit_cgroup_delegate() where useful It's an easy-to-use wrapper, so let's take benefit of it.	2017-09-22 20:02:23 +02:00

3298 commits