Systemd

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	79e221d078	Merge pull request #9158 from poettering/notify-auto-reload trigger OnFailure= only if Restart= is not in effect	2018-06-05 13:51:07 +02:00
Zbigniew Jędrzejewski-Szmek	a1230ff972	basic/log: add the log_struct terminator to macro This way all callers do not need to specify it. Exhaustively tested by running test-log under valgrind ;)	2018-06-04 13:46:03 +02:00
Lennart Poettering	ec5b1452ac	core: go to failure state if the main service process fails and RemainAfterExit=yes (#9159) Previously, we'd not care about failures that were seen earlier and remain in "exited" state. This could be triggered if the main process of a service failed while ExecStartPost= was still running, as in that case we'd not immediately act on the main process failure because we needed to wait for ExecStartPost= to finish, before acting on it. Fixes: #8929	2018-06-04 11:35:25 +02:00
Yu Watanabe	858d36c1ec	path-util: introduce path_simplify() The function is similar to path_kill_slashes() but also removes initial './', trailing '/.', and '/./' in the path. When the second argument of path_simplify() is false, it behaves the same as path_kill_slashes(). Hence, this also replaces path_kill_slashes() with path_simplify().	2018-06-03 23:39:26 +09:00
Lennart Poettering	2ad2e41a72	core: don't trigger OnFailure= deps when a unit is going to restart This adds a flags parameter to unit_notify() which can be used to pass additional notification information to the function. We then make the old reload_failure boolean parameter one of these flags, and then add a new flag that lets unit_notify() know if we are configured to restart the service. Note that this adjusts behaviour of systemd to match what the docs say. Fixes: #8398	2018-06-01 19:08:30 +02:00
Alan Jenkins	4330dc03a0	service: FileDescriptorStoreMax should also imply NotifyAccess Commenting out "WatchdogTimeout=3min" in systemd-logind.service causes NotifyAccess to go from "main" to "none", breaking support for logind restart. Let's fix that.	2018-05-15 12:33:56 +02:00
Lennart Poettering	da6053d0a7	tree-wide: be more careful with the type of array sizes Previously we were a bit sloppy with the index and size types of arrays, we'd regularly use unsigned. While I don't think this ever resulted in real issues I think we should be more careful there and follow a stricter regime: unless there's a strong reason not to use size_t for array sizes and indexes, size_t it should be. Any allocations we do ultimately will use size_t anyway, and converting back and forth between unsigned and size_t will always be a source of problems. Note that on 32bit machines "unsigned" and "size_t" are equivalent, and on 64bit machines our arrays shouldn't grow that large anyway, and if they do we have a problem, however that kind of overly large allocation we have protections for usually, but for overflows we do not have that so much, hence let's add it. So yeah, it's a story of the current code being already "good enough", but I think some extra type hygiene is better. This patch tries to be comprehensive, but it probably isn't and I missed a few cases. But I guess we can cover that later as we notice it. Among smaller fixes, this changes: 1. strv_length()'s return type becomes size_t 2. the unit file changes array size becomes size_t 3. DNS answer and query array sizes become size_t Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745	2018-04-27 14:29:06 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	1cc6c93a95	tree-wide: use TAKE_PTR() and TAKE_FD() macros	2018-04-05 14:26:26 +09:00
Lennart Poettering	12b6b3b7a4	Merge pull request #8562 from keszybz/docs Man page and log message fixes	2018-03-26 15:34:39 +02:00
Zbigniew Jędrzejewski-Szmek	5ce6e7f525	core/service: rework the hold-off time over message "hold-off" is apparently confusing, because we also have HoldoffTimeoutSec=. Let's use RestartSec= directly in the message. Fixes #5472.	2018-03-24 14:22:42 +01:00
Lennart Poettering	ae2a15bc14	macro: introduce TAKE_PTR() macro This macro will read a pointer of any type, return it, and set the pointer to NULL. This is useful as an explicit concept of passing ownership of a memory area between pointers. This takes inspiration from Rust: https://doc.rust-lang.org/std/option/enum.Option.html#method.take and was suggested by Alan Jenkins (@sourcejedi). It drops ~160 lines of code from our codebase, which makes me like it. Also, I think it clarifies passing of ownership, and thus helps readability a bit (at least for the initiated who know the new macro)	2018-03-22 20:21:42 +01:00
Zbigniew Jędrzejewski-Szmek	064c593899	core/service: fix memleak of USBFunctionStrings and USBFunctionDescriptors oss-fuzz #6892.	2018-03-17 09:01:53 +01:00
Lennart Poettering	62d74c78b5	coccinelle: add reallocarray() coccinelle script Let's systematically make use of reallocarray() wherever we invoke realloc() with a product of two values.	2018-03-02 12:39:07 +01:00
Lennart Poettering	00f5ad93b5	core: change KeyringMode= to "shared" by default for non-service units in the system manager (#8172) Before this change all unit types would default to "private" in the system service manager and to "inherit" in the user service manager. With this change this is slightly altered: non-service units of the system service manager are now run with KeyringMode=shared. This appears to be the more appropriate choice as isolation is not as desirable for mount tools, which regularly consume key material. After all mounts are a shared resource themselves as they appear system-wide hence it makes a lot of sense to share their key material too. Fixes: #8159	2018-02-20 08:53:34 +01:00
Lennart Poettering	a94ab7acfd	Merge pull request #8175 from keszybz/gc-cleanup Garbage collection cleanup	2018-02-15 17:47:37 +01:00
Zbigniew Jędrzejewski-Szmek	7f7d01ed58	pid1: include the source unit in UnitRef No functional change. The source unit manages the reference. It allocates the UnitRef structure and registers it in the target unit, and then the reference must be destroyed before the source unit is destroyed. Thus, it should be OK to include the pointer to the source unit, it should be live as long as the reference exists. v2: - rename refs to refs_by_target	2018-02-15 13:27:06 +01:00
Zbigniew Jędrzejewski-Szmek	f2f725e5cc	pid1: rename unit_check_gc to unit_may_gc "check" is unclear: what is true, what is false? Let's rename to "can_gc" and invert the return value ("positive" values are easier to grok). v2: - rename from unit_can_gc to unit_may_gc	2018-02-15 13:04:12 +01:00
Lennart Poettering	004c7f169e	core: fold manager_set_exec_params() into unit_set_exec_params() Let's simplify things a bit: we so far called both functions every single time, let's just merge one into the other, so that we have fewer functions to call.	2018-02-12 11:34:00 +01:00
Lennart Poettering	1d9cc8768f	cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only Currently we allowed delegation for all units with cgroup backing except for slices. Let's make this a bit more strict for now, and only allow this in service and scope units. Let's also add a generic accessor unit_cgroup_delegate() for checking whether a unit has delegation turned on that checks the new bool first. Also, when doing transient units, let's explicitly refuse turning on delegation for unit types that don't support it. This is mostly cosmetic as we wouldn't act on the delegation request anyway, but certainly helpful for debugging.	2018-02-12 11:34:00 +01:00
Lennart Poettering	73969ab61c	service: relax PID file symlink chain checks a bit (#8133) Let's read the PID file after all if there's a potentially unsafe symlink chain in place. But if we do, then refuse taking the PID if it's outside of the cgroup. Fixes: #8085	2018-02-09 17:05:17 +01:00
Yu Watanabe	f2e18ef1a3	core: remove unnecessary initialization	2018-02-09 16:36:37 +09:00
Yu Watanabe	e8a565cb66	core: make ExecRuntime be manager managed object Before this, each ExecRuntime object is owned by a unit. However, it may be shared with other units which enable JoinsNamespaceOf=. Thus, by the serialization/deserialization process, its sharing information, more specifically, reference counter is lost, and causes issue #7790. This makes ExecRuntime objects be managed by manager, and changes the serialization/deserialization process. Fixes #7790.	2018-02-06 16:00:34 +09:00
Yu Watanabe	c9d4169919	core/service: dump more settings	2018-01-30 17:10:47 +09:00
Lennart Poettering	adefcf2821	core: rework how we count the n_on_console counter Let's add a per-unit boolean that tells us whether our unit is currently counted or not. This way it's unlikely we get out of sync again and things are generally more robust. This also allows us to remove the counting logic specific to service units (which was in fact mostly a copy from the generic implementation), in favour of fully generic code. Replaces: #7824	2018-01-24 20:14:51 +01:00
Lennart Poettering	bb2c768545	core: add a new unit_needs_console() call This call determines whether a specific unit currently needs access to the console. It's a fancy wrapper around exec_context_may_touch_console() ultimately, however for service units we'll explicitly exclude the SERVICE_EXITED state from when we report true.	2018-01-24 19:54:26 +01:00
Lennart Poettering	9acac21249	service: simplify condition The left side of the || expression is conditionalized on SERVICE_START, but SERVICE_START is blanket listed on the right side anyway, hence we can drop the left side entirely without any change in behaviour. Moreover, if main_pid is initialized, it should be watched, hence this is even the safe and right thing to do.	2018-01-23 21:29:31 +01:00
Lennart Poettering	eabd3e56a6	service: don't bother with watching PIDs during deserialization service_coldplug() takes care of that anyway, hence drop the unit_watch_pid() invocation entirely during deserialization, it's redundant.	2018-01-23 21:29:31 +01:00
Lennart Poettering	11aef522c1	core: unify call we use to synthesize cgroup empty events when we stopped watching any unit PIDs This code is very similar in scope and service units, let's unify it in one function. This changes little for service units, but for scope units makes sure we go through the cgroup queue, which is something we should do anyway.	2018-01-23 21:22:50 +01:00
Lennart Poettering	5cdabc8d7b	service: don't send out dbus change notifications spuriously on SIGCHLD Let's send them out only if the main or control process exited and we recorded a new exit status that is worth reporting. But if any other service process died this is nothing to report since we don't expose any properties about that anyway.	2018-01-23 21:22:50 +01:00
Jan Klötzke	2a12e32efa	pid1: add option to disable service watchdogs Add a "systemd.service_watchdogs=" option to the command line which disables all service runtime watchdogs and emergency actions.	2018-01-22 18:10:03 +01:00
Zbigniew Jędrzejewski-Szmek	e0b6d3cabe	Merge pull request #7816 from poettering/chase-pid Make MAINPID= and PIDFile= handling more restrictive (and other stuff)	2018-01-15 14:14:34 +04:00
Lennart Poettering	db256aab13	core: be stricter when handling PID files and MAINPID sd_notify() messages Let's be more restrictive when validating PID files and MAINPID= messages: don't accept PIDs that make no sense, and if the configuration source is not trusted, don't accept out-of-cgroup PIDs. A configuration source is considered trusted when the PID file is owned by root, or the message was received from root. This should lock things down a bit, in case service authors write out PID files from unprivileged code or use NotifyAccess=all with unprivileged code. Note that doing so was always problematic, just now it's a bit less problematic. When we open the PID file we'll now use the CHASE_SAFE chase_symlinks() logic, to ensure that we won't follow an unprivileged-owned symlink to a privileged-owned file thinking this was a valid privileged PID file, even though it really isn't. Fixes: #6632	2018-01-11 15:12:16 +01:00
Lennart Poettering	8895eb7815	unit: log when we cannot add a watch on a specific PID	2018-01-11 15:07:14 +01:00
Lennart Poettering	f1d34068ef	tree-wide: add DEBUG_LOGGING macro that checks whether debug logging is on (#7645) This makes things a bit easier to read I think, and also makes sure we always use the _unlikely_ wrapper around it, which so far we used sometimes and other times we didn't. Let's clean that up.	2017-12-15 11:09:00 +01:00
Daniel Black	a327431bd1	core: add EXTEND_TIMEOUT_USEC={usec} - prevent timeouts in startup/runtime/shutdown (#7214) With Type=notify services, EXTEND_TIMEOUT_USEC= messages will delay any startup/runtime/shutdown timeouts. A service that hasn't timed out, i.e. start time < TimeoutStartSec, runtime < RuntimeMaxSec and stop time < TimeoutStopSec, may, by sending EXTEND_TIMEOUT_USEC=, continue beyond the limit for the execution phase (i.e. TimeoutStartSec, RuntimeMaxSec and TimeoutStopSec). EXTEND_TIMEOUT_USEC= must continue to be sent (in the same way as WATCHDOG=1) within the time interval specified to keep preventing the timeout from occurring. Watchdog timeouts are also extended if an EXTEND_TIMEOUT_USEC= value is greater than the remaining time on the watchdog counter. Fixes #5868.	2017-12-14 12:17:43 +01:00
Michal Koutný	deb4e7080d	service: Don't stop unneeded units needed by restarted service (#7526) An auto-restarted unit B may depend on unit A with StopWhenUnneeded=yes. If A stops before B's restart timeout expires, it'll be started again as part of B's dependent jobs. However, if stopping takes longer than the timeout, A's running stop job collides with the start job, which also cancels B's start job. The result is that neither A nor B is active. Currently, when a service with automatic restarting fails, it transitions through the following states: 1) SERVICE_FAILED or SERVICE_DEAD to indicate the failure, 2) SERVICE_AUTO_RESTART while the restart timer is running. The StopWhenUnneeded= check takes place in service_enter_dead between the two states mentioned above. We temporarily store the auto restart flag to query it during the check. Because we don't return control to the main event loop, this new service unit flag needn't be serialized. This patch prevents the pathological situation when the service with Restart= won't restart automatically. As a side effect it also avoids restarting the dependency unit with StopWhenUnneeded=yes. Fixes: #7377	2017-12-05 16:51:19 +01:00
Lennart Poettering	f3b900311f	service: shortcut operations if the MAINPID= doesn't actually cause a change	2017-11-27 17:04:57 +01:00
Lennart Poettering	2fa40742a4	service: use parse_errno() for parsing error numbers Let's always use the same logic when parsing error numbers, i.e. use parse_errno() here too, to unify some code, and tighten the checks a bit. This also allows clients to pass errors as symbolic names. Probably nothing we want to advertise too eagerly (since new daemons generating this on old service managers won't understand), but still pretty useful I think, in particular in scripting languages and such, where the numeric error numbers might not be readily available.	2017-11-27 17:04:57 +01:00
Lennart Poettering	e78ee06de1	core: add a new sd_notify() message for removing fds from the FD store again Currently the only way to remove fds from the fdstore is to fully stop the service, or to somehow trigger POLLERR/POLLHUP on the fd, in which case systemd will remove the fd automatically. Let's add another way: a new message that can be sent to remove fds explicitly, given their name.	2017-11-27 17:04:04 +01:00
Lennart Poettering	cc2b7b11b4	core: only process one of READY=1, STOPPING=1 or RELOADING=1 in sd_notify() handling Of course, it's not really a valid sd_notify() message if multiple of these fields are used in one, but let's handle this somewhat gracefully, by only processing one of them, and ignoring the rest.	2017-11-27 17:01:00 +01:00
Lennart Poettering	c45d11cb30	service: reorder sd_notify() handling a bit Let's keep handling of WATCHDOG= and WATCHDOG_USEC= together. No functional changes.	2017-11-27 16:59:52 +01:00
Lennart Poettering	e328523777	service: split out sd_notify() message authorization code into a function of its own Let's shorten service_notify_message() a bit, and do the authentication outside of the main function body. No functional changes.	2017-11-27 16:59:52 +01:00
Lennart Poettering	9711848ff1	core: only log about sd_notify() message contents, when debug logging is on Let's optimize things a bit for the non-debug case. No change in behaviour. Main reason to do this is not so much the speed benefit though, but merely to isolate the code from its surroundings more.	2017-11-27 16:39:43 +01:00
Lennart Poettering	a4634b214c	core: warn about left-over processes in cgroup on unit start Now that we don't kill control processes anymore, let's at least warn about any processes left-over in the unit cgroup at the moment of starting the unit.	2017-11-25 17:08:21 +01:00
Lennart Poettering	e98b2fbbe9	core: generalize the cgroup empty check on GC Let's move the cgroup empty check for all unit types into the generic unit_check_gc() call, out of the per-unit-type _check_gc() type. This not only allows us to share some code, but also hooks up mount and socket units with this kind of check, for free, as it was missing there previously.	2017-11-25 17:08:21 +01:00
Lennart Poettering	e9a4f67609	cgroup: remove logic for maintaining /control subcgroup for the service unit type Previously, in the service unit type we ran all control processes in a special subcgroup /control of the unit's main cgroup. Remove that, and run the control program in the main cgroup instead. The concept conflicts with cgroupv2's logic of "no processes in inner nodes": if a unit has a main daemon process running in the main cgroup, and a reload control process would be started in the /control subcgroup, then this would necessarily fail, as the main daemon process would become an inner node process that way. We could in theory continue to support this in cgroupv1, but in the interest of keeping behaviour similar in both hierarchies, let's drop this altogether. Philosophically maybe it wasn't the greatest idea anyway to just go berserk and SIGKILL all those processes; loud warning logging might have sufficed, too.	2017-11-25 17:08:21 +01:00
Zbigniew Jędrzejewski-Szmek	ffb70e4424	Merge pull request #7381 from poettering/cgroup-unified-delegate-rework Fix delegation in the unified hierarchy + more cgroup work	2017-11-22 07:42:08 +01:00
Zbigniew Jędrzejewski-Szmek	82a27ba821	Merge pull request #7389 from shawnl/warning tree-wide: adjust fall through comments so that gcc is happy	2017-11-22 07:38:51 +01:00
Lennart Poettering	3c7416b6ca	core: unify common code for preparing for forking off unit processes This introduces a new function unit_prepare_exec() that encapsulates a number of calls we do in preparation for spawning off some processes in all our unit types that do so. This allows us to neatly unify a bit of code between unit types and shorten our code.	2017-11-21 11:54:08 +01:00
Shawn Landden	4831981d89	tree-wide: adjust fall through comments so that gcc is happy Distcc removes comments, making the comment silencing not work. I know there was a decision against a macro in commit `ec251fe7d5`	2017-11-20 13:06:25 -08:00
Lennart Poettering	53c35a766f	core: generalize FailureAction= move it from service to unit All kinds of units can fail, hence it makes sense to offer this as generic concept for all unit types.	2017-11-20 16:37:22 +01:00
Lennart Poettering	0133d5553a	Merge pull request #7198 from poettering/stdin-stdout Add StandardInput=data, StandardInput=file:... and more	2017-11-19 19:49:11 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	f56e7bfe2b	core: be more defensive if we can't determine per-connection socket peer (#7329) Let's handle gracefully if a client disconnects very early on. This builds on #4120, but relaxes the condition checks further, since getpeername() might already fail during ExecStartPre= and friends. Fixes: #7172	2017-11-17 15:22:11 +01:00
Lennart Poettering	08f3be7a38	core: add two new unit file settings: StandardInputData= + StandardInputText= Both permit configuring data to pass through STDIN to an invoked process. StandardInputText= accepts a line of text (possibly with embedded C-style escapes as well as unit specifiers), which is appended to the buffer to pass as stdin, followed by a single newline. StandardInputData= is similar, but accepts arbitrary base64 encoded data, and will not resolve specifiers or C-style escapes, nor append newlines. This may be used to pass input/configuration data to services, directly in-line from unit files, either in a cooked or in a more raw format.	2017-11-17 11:13:44 +01:00
Lennart Poettering	7eb2a8a125	unit: rework a bit how we keep the service fdstore from being destroyed during service restart When preparing for a restart we quickly go through the DEAD/INACTIVE service state before entering AUTO_RESTART. When doing this, we need to make sure we don't destroy the FD store. Previously this was done by checking the failure state of the unit, and keeping the FD store around when the unit failed, under the assumption that the restart logic will then get into action. This is not entirely correct however, as there might be failure states that will not result in restarts. With this commit we slightly alter the logic: a ref counter for the fd store is added, that is increased right before we handle the restart logic, and decreased again right after. This should ensure that the fdstore lives exactly as long as it needs. Follow-up for `f0bfbfac43`.	2017-11-16 14:37:33 +01:00
Lennart Poettering	d3070fbdf6	core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald And let's make use of it to implement two new unit settings with it: 1. LogLevelMax= is a new per-unit setting that may be used to configure log priority filtering: set it to LogLevelMax=notice and only messages of level "notice" and lower (i.e. more important) will be processed, all others are dropped. 2. LogExtraFields= is a new per-unit setting for configuring per-unit journal fields, that are implicitly included in every log record generated by the unit's processes. It takes field/value pairs in the form of FOO=BAR. Also, related to this, one existing unit setting is ported to this new facility: 3. The invocation ID is now pulled from /run/systemd/units/ instead of cgroupfs xattrs. This substantially relaxes requirements of systemd on the kernel version and the privileges it runs with (specifically, cgroupfs xattrs are not available in containers, since they are stored in kernel memory, and hence are unsafe to permit to lesser privileged code). /run/systemd/units/ is a new directory, which contains a number of files and symlinks encoding the above information. PID 1 creates and manages these files, and journald reads them from there. Note that this is supposed to be a direct path between PID 1 and the journal only, due to the special runtime environment the journal runs in. Normally, today we shouldn't introduce new interfaces that (mis-)use a file system as IPC framework, and instead just use an IPC system, but this is very hard to do between the journal and PID 1, as long as the IPC system is a subject PID 1 manages, and itself a client to the journal. This patch cleans up a couple of types used in journal code: specifically we switch to size_t for a couple of memory-sizing values, as size_t is the right choice for everything that is memory. Fixes: #4089 Fixes: #3041 Fixes: #4441	2017-11-16 12:40:17 +01:00
Lennart Poettering	eef85c4a3f	core: track why unit dependencies came to be This replaces the dependencies Set* objects by Hashmap* objects, where the key is the depending Unit, and the value is a bitmask encoding why the specific dependency was created. The bitmask contains a number of different, defined bits, that indicate why dependencies exist, for example whether they are created due to explicitly configured deps in files, by udev rules or implicitly. Note that memory usage is not increased by this change, even though we store more information, as we manage to encode the bit mask inside the value pointer each Hashmap entry contains. Why this all? When we know how a dependency came to be, we can update dependencies correctly when a configuration source changes but others are left unaltered. Specifically: 1. We can fix UDEV_WANTS dependency generation: so far we kept adding dependencies configured that way, but if a device lost such a dependency we couldn't remove them again as there was no scheme for removing dependencies in place. 2. We can implement "pin-pointed" reload of unit files. If we know what dependencies were created as result of configuration in a unit file, then we know what to flush out when we want to reload it. 3. It's useful for debugging: "systemd-analyze dump" now shows this information, helping substantially with understanding how systemd's dependency tree came to be the way it came to be.	2017-11-10 19:45:29 +01:00
Alan Jenkins	4cd9fa8176	core: failure to spawn ExecStartPost should not run ExecStop Failure to spawn ExecStartPost was being handled differently to e.g. EXIT_FAILURE returned by ExecStartPost. It looks like this was an oversight. Fix to match documented behaviour. `man systemd.service`: > Note that if any of the commands specified in ExecStartPre=, ExecStart=, > or ExecStartPost= fail (and are not prefixed with "-", see above) or time > out before the service is fully up, execution continues with commands > specified in ExecStopPost=, the commands in ExecStop= are skipped.	2017-11-01 15:28:50 +00:00
Yu Watanabe	4c70109600	tree-wide: use IN_SET macro (#6977)	2017-10-04 16:01:32 +02:00
Jouke Witteveen	df66b93fe2	service: better detect when a Type=notify service cannot become active anymore (#6959) No need to wait for a timeout when we know things are not going to work out. When the main process goes away and only notifications from the main process are accepted, then we will not receive any notifications anymore.	2017-10-02 16:35:27 +02:00
Zbigniew Jędrzejewski-Szmek	b139c95bc4	Merge pull request #6941 from andir/use-in_set use IN_SET where possible	2017-10-02 15:08:10 +02:00
Andreas Rammhold	3742095b27	tree-wide: use IN_SET where possible In addition to the changes from #6933 this handles cases that could be matched with the included cocci file.	2017-10-02 13:09:54 +02:00
Lennart Poettering	b13ddbbcf3	service: accept the fact that the three xyz_good() functions return ints Currently, all three of cgroup_good(), main_pid_good(), control_pid_good() all return an "int" (two of them propagate errors). It's a good thing to keep the three functions similar, so let's leave it at that, but then let's clean up the invocation of the three functions so that they always clearly acknowledge that the return value is not a bool, but potentially negative.	2017-10-02 12:58:42 +02:00
Lennart Poettering	019be28676	service: drop _pure_ decorator on static function The compiler should be good enough to figure this out on its own if this is a static function, and it makes control_pid_good() an outlier anyway, and decorators like this tend to bitrot. Hence, to keep things simple and automatic, let's just drop the decorator.	2017-10-02 12:58:42 +02:00
Lennart Poettering	3c751b1bfa	service: a cgroup empty notification isn't reason enough to go down The processes associated with a service are not just the ones in its cgroup, but also the control and main processes, which might possibly live outside of it, for example if they transitioned into their own cgroups because they registered a PAM session of their own. Hence, if we get a cgroup empty notification always check if the main PID is still around before taking action too eagerly. Fixes: #6045	2017-10-02 12:58:42 +02:00
Lennart Poettering	07697d7ec5	service: add explanatory comments to control_pid_good() and cgroup_good() Let's add a similar comment to each as we already have for main_pid_good(), emphasizing that these functions are supposed to behave very similarly.	2017-10-02 12:58:42 +02:00
Lennart Poettering	51894d706f	service: fix main_pid_good() comment We don't actually return -1, don't claim that.	2017-10-02 12:58:37 +02:00
Lennart Poettering	ed77d407d3	core: log unit failure with type-specific result code This slightly changes how we log about failures. Previously, service_enter_dead() would log that a service unit failed along with its result code, and unit_notify() would do this again but without the result code. For other unit types only the latter would take effect. This cleans this up: we keep the message in unit_notify() only for debug purposes, and add type-specific log lines to all our unit types that can fail, and always place them before unit_notify() is invoked. Or in other words: the duplicate log message for service units is removed, and all other unit types get a more useful line with the precise result code.	2017-09-27 18:26:18 +02:00
Lennart Poettering	09e2465407	cgroup: after determining that a cgroup is empty, asynchronously dispatch this This makes sure that if we learn via inotify or another event source that a cgroup is empty, and we checked that this is indeed the case (as we might get spurious notifications through inotify, as the inotify logic through the "cgroup.events" file is pretty unspecific and might be triggered for a variety of reasons), then we'll enqueue a defer event for it, at a priority lower than SIGCHLD handling, so that we know for sure that if there's waitid() data for a process we used it before considering the cgroup empty notification. Fixes: #6608	2017-09-27 18:26:18 +02:00
Lennart Poettering	a6951a5079	service: rework service_kill_control_processes() Let's make sure we explicitly also kill any control process we know of, given that it might have moved outside of our control group.	2017-09-26 16:17:22 +02:00
Lennart Poettering	f1c50becda	core: make sure to log invocation ID of units also when doing structured logging	2017-09-22 15:24:55 +02:00
Daniel Mack	906c06f64a	cgroup, unit, fragment parser: make use of new firewall functions	2017-09-22 15:24:55 +02:00
Lennart Poettering	18f573aaf9	core: make sure to dump cgroup context when unit_dump() is called for all unit types For some reason we didn't dump the cgroup context for a number of unit types, including service units. Not sure how this wasn't noticed before... Add this in.	2017-09-22 15:24:54 +02:00
Evgeny Vereshchagin	7f92388482	core: serialize n-restarts and flush-n-restarts correctly (#6736) This makes n-restarts and flush-n-restarts survive `systemctl daemon-[reload|reexec]`.	2017-09-04 15:36:01 +02:00
Lennart Poettering	0f52f8e552	core: disable the effect of Restart= if there's a stop job pending for a service (#6581) We shouldn't undo the job already enqueued, under any circumstances. Fixes: #6504	2017-08-26 22:07:23 +09:00
Yu Watanabe	2c5ad0fd6d	Merge pull request #6577 from poettering/more-exec-flags add ! and !! ExecStart= flags to make ambient caps useful	2017-08-26 21:49:05 +09:00
Michal Sekletar	b58aeb70db	service: attempt to execute next main command only for oneshot services (#6619) This commit fixes the crash described in https://github.com/systemd/systemd/issues/6533 Multiple ExecStart lines are allowed only for oneshot services anyway, so it doesn't make sense to call service_run_next_main() with services of a type other than SERVICE_ONESHOT. Referring back to the reproducer from the issue, previously we didn't observe this problem because s->main_command was reset after daemon-reload, hence we never reached the assert statement in service_run_next_main(). Fixes #6533	2017-08-25 16:36:10 +03:00
Lennart Poettering	1703fa41a7	core: rename EXEC_APPLY_PERMISSIONS → EXEC_APPLY_SANDBOXING "Permissions" was a bit of a misnomer, as it suggests that UNIX file permission bits are adjusted, which aren't really changed here. Instead, this is about UNIX credentials such as users or groups, as well as namespacing, hence let's use a more generic term here, without any misleading reference to UNIX file permissions: "sandboxing", which shall refer to all kinds of sandboxing technologies, including UID/GID dropping, selinux relabelling, namespacing, seccomp, and so on.	2017-08-10 15:02:50 +02:00
Lennart Poettering	f0d477979e	core: introduce unit_set_exec_params() The new unit_set_exec_params() call is to units what manager_set_exec_params() is to the manager object: it initializes the various fields from the relevant generic properties set.	2017-08-10 15:02:50 +02:00
Lennart Poettering	19bbdd985e	core: manager_set_exec_params() cannot fail, hence make it void Let's simplify things a bit.	2017-08-10 15:02:50 +02:00
Lennart Poettering	584b8688d1	execute: also fold the cgroup delegate bit into ExecFlags	2017-08-10 15:02:50 +02:00
Lennart Poettering	ac6479781e	execute: also control the SYSTEMD_NSS_BYPASS_BUS through an ExecFlags field Also, correct the logic while we are at it: the variable is only required for system services, not user services.	2017-08-10 15:02:49 +02:00
Lennart Poettering	5bf7569cf8	service: let's set EXEC_NEW_KEYRING through SET_FLAG() Not that it really matters, but it matches how we set the flags in manager_set_exec_params() too.	2017-08-10 15:02:49 +02:00
Lennart Poettering	3ed0cd26ea	execute: replace command flag bools by a flags field This way, we can extend it later on in an easier way, and can pass it along nicely.	2017-08-10 14:44:58 +02:00
Lennart Poettering	7a0019d373	core: introduce a restart counter (#6495) This adds a per-service restart counter. Each time an automatic restart is scheduled (due to Restart=) it is increased by one. Its current value is exposed over the bus as NRestarts=. It is also logged (in a structured, recognizable way) on each restart. Note that this really only counts automatic starts triggered by Restart= (which it nicely complements). Manual restarts will reset the counter, as will explicit calls to "systemctl reset-failed". It's supposed to be a tool for measuring the automatic restart feature, and nothing else. Fixes: #4126	2017-08-09 21:12:55 +02:00
Jouke Witteveen	15d167f8a3	core: propagate reload from RELOADING=1 notification (#6550)	2017-08-07 11:27:24 +02:00
Zbigniew Jędrzejewski-Szmek	a132bef023	Drop kdbus bits Some kdbus_flag and memfd related parts are left behind, because they are entangled with the "legacy" dbus support. test-bus-benchmark is switched to "manual". It was already broken before (in the non-kdbus mode) but apparently nobody noticed. Hopefully it can be fixed later.	2017-07-23 12:01:54 -04:00
Lennart Poettering	df0ff12775	tree-wide: make use of getpid_cached() wherever we can This moves pretty much all uses of getpid() over to getpid_cached(). I didn't specifically check whether the optimization is worth it for each replacement, but in order to keep things simple and systematic I switched over everything at once.	2017-07-20 20:27:24 +02:00
Yu Watanabe	3536f49e8f	core: add {State,Cache,Log,Configuration}Directory= (#6384) This introduces {State,Cache,Log,Configuration}Directory=, which are similar to RuntimeDirectory=. They create the directories under /var/lib, /var/cache, /var/log, or /etc, respectively, with the mode specified in {State,Cache,Log,Configuration}DirectoryMode=. This also fixes #6391.	2017-07-18 14:34:52 +02:00
Yu Watanabe	53f47dfc7b	core: allow preserving contents of RuntimeDirectory= over process restart This introduces RuntimeDirectoryPreserve= option which takes a boolean argument or 'restart'. Closes #6087.	2017-07-17 16:22:25 +09:00
Lennart Poettering	9efb9df9e3	core: make NotifyAccess= and FileDescriptorStoreMax= available to transient services This is helpful for debugging/testing #5606.	2017-06-26 15:14:41 +02:00
Lennart Poettering	3ceb72e558	core: permit FDSTORE=1 messages with non-pollable fds This also alters the documentation to recommend memfds rather than /run for serializing state across reboots. That's because /run doesn't actually have the same lifecycle as the fd store, as it is cleared out on restarts. Fixes: #5606	2017-06-26 15:14:41 +02:00
Franck Bui	4c47affcf1	core: remove the redundancy of 'n_fds' and 'n_storage_fds' in ExecParameters struct 'n_fds' field in the ExecParameters structure was counting the total number of file descriptors to be passed to a unit. This counter also includes the number of passed socket fds which is counted by 'n_socket_fds' already. This patch removes that redundancy by replacing 'n_fds' with 'n_storage_fds'. The new field only counts the fds passed via the storage store mechanism. That way each fd is counted at one place only. Subsequently the patch makes sure to fix code that used 'n_fds' and also wanted to iterate through all of them by explicitly adding 'n_socket_fds' + 'n_storage_fds'. Suggested by Lennart.	2017-06-08 16:21:35 +02:00
Franck Bui	9b1419111a	core: only apply NonBlocking= to fds passed via socket activation Make sure to only apply the O_NONBLOCK flag to the fds passed via socket activation. Previously the flag was also applied to the fds which came from the fd store but this was incorrect since services, after being restarted, expect that these passed fds have their flags unchanged and can be reused as before. The documentation was a bit unclear about this so clarify it.	2017-06-06 22:42:50 +02:00
Thomas Hindoe Paaboel Andersen	6eeec374c1	tree-wide: remove unused variables	2017-04-28 23:56:44 +02:00
Lennart Poettering	8ea9aa9e88	Merge pull request #5354 from msekletar/issue-518 service: serialize information about currently executing command	2017-04-24 19:51:34 +02:00
Zbigniew Jędrzejewski-Szmek	ba360bb05c	tree-wide: mark log_struct with _printf_ and fix fallout log_struct takes multiple format strings, each one followed by arguments. The _printf_ annotation is not sufficiently flexible to express this, but we can still annotate the first format string, though not its arguments (because their number is unknown). With the annotation, the places which specified the message id or similar as the first pattern cause a warning from -Wformat-nonliteral. This can be trivially fixed by putting the MESSAGE= first. This change will help find issues where a non-literal is erroneously used as the pattern.	2017-04-21 13:37:04 -04:00
Michal Sekletar	e266c068b5	service: serialize information about currently executing command Stored information will help us to resume execution after the daemon-reload. This commit implements following scheme, * On serialization: - we count rank of the currently executing command - we store command type, its rank and command line arguments * On deserialization: - configuration is parsed and loaded - we deserialize stored data, command type, rank and arguments - we look at the given rank in the list and if command there has same arguments then we restore execution at that point - otherwise we search respective command list and we look for command that has the same arguments - if both methods fail we do not resume execution at all To better illustrate how the above scheme works, please consider the following cases (<<< denotes position where we resume execution after reload) ; Original unit file [Service] ExecStart=/bin/true <<< ExecStart=/bin/false ; Swapped commands ; Second command is not going to be executed [Service] ExecStart=/bin/false ExecStart=/bin/true <<< ; Commands added before ; Same commands are problematic and execution could be restarted at wrong place [Service] ExecStart=/bin/foo ExecStart=/bin/bar ExecStart=/bin/true <<< ExecStart=/bin/false ; Commands added after ; Same commands are not an issue in this case [Service] ExecStart=/bin/true <<< ExecStart=/bin/false ExecStart=/bin/foo ExecStart=/bin/bar ; New commands interleaved with old commands ; Some new commands will be executed while others won't ExecStart=/bin/foo ExecStart=/bin/true <<< ExecStart=/bin/bar ExecStart=/bin/false As you can see, the above scheme has some drawbacks. However, in most cases (we assume that in the most common case the unit file command list is not changed while some other command is running for the same unit) it should cause systemd to do the right thing, which is restoring execution exactly at the point we were at before daemon-reload. Fixes #518	2017-04-11 09:22:25 +02:00
Lennart Poettering	6939ce648a	service: refuse using PID 1 as MAINPID for a service	2017-02-28 16:08:40 +01:00
Lennart Poettering	e8b509d3be	service: make use of log_unit_warning_errno()'s return value	2017-02-28 16:08:21 +01:00
Lennart Poettering	7c102d6092	core: use PID_FMT where appropriate	2017-02-28 16:07:56 +01:00
Lennart Poettering	c22800e40e	cgroup: rename cg_unified() → cg_unified_controller() cg_unified() is a bit generic a name, let's make clear that it checks whether a specified controller is in unified mode.	2017-02-24 18:00:04 +01:00
Lennart Poettering	b4cccbc13a	cgroup: change cg_unified() to possibly return errors again We use our cgroup APIs in various contexts, including from our libraries sd-login, sd-bus. As we don't control those environments we can't rely on the unified cgroup setup logic succeeding, and hence really shouldn't assert on it. This more or less reverts `415fc41cea`.	2017-02-24 17:52:58 +01:00
Tejun Heo	415fc41cea	core: simplify cg_[all_]unified() cg_[all_]unified() test whether a specific controller or all controllers are on the unified hierarchy. While what's being asked is a simple binary question, the callers must assume that the functions may fail any time, which unnecessarily complicates their usages. This complication is unnecessary. Internally, the test result is cached anyway and there are only a few places where the test actually needs to be performed. This patch simplifies cg_[all_]unified(). * cg_[all_]unified() are updated to return bool. If the result can't be decided, assertion failure is triggered. Error handling in their callers is dropped. * cg_unified_flush() is updated to calculate the new result synchronously and return whether it succeeded or not. Places which need to flush the test result are updated to test for failure. This ensures that all the following cg_[all_]unified() tests succeed. * Places which expected possible cg_[all_]unified() failures are updated to call and test cg_unified_flush() before calling cg_[all_]unified(). This includes functions used while setting up mounts during boot and manager_setup_cgroup().	2017-02-18 17:51:13 -05:00
Stefan Hajnoczi	359a5bcf78	core: add AF_VSOCK support to socket units Accept AF_VSOCK listen addresses in socket unit files. Both guest and host can now take advantage of socket activation. The QEMU guest agent has recently been modified to support socket activation and can run over AF_VSOCK with this patch.	2017-01-10 15:29:04 +00:00
Stefan Hajnoczi	882ac6e769	socket-util: introduce port argument in sockaddr_port() sockaddr_port() either returns a >= 0 port number or a negative errno. This works for AF_INET and AF_INET6 because port ranges are only 16-bit. In AF_VSOCK ports are 32-bit so an int cannot represent all port numbers and negative errnos. Separate the port and the return code.	2017-01-10 15:29:04 +00:00
Lennart Poettering	74dd6b515f	core: run each system service with a fresh session keyring This patch ensures that each system service gets its own session kernel keyring automatically, and implicitly. Without this a keyring is allocated for it on-demand, but is then linked with the user's kernel keyring, which is OK behaviour for logged in users, but not so much for system services. With this change each service gets a session keyring that is specific to the service and ceases to exist when the service is shut down. The session keyring is not linked up with the user keyring, and keys are hence only searched within the session boundaries by default. (This is useful in a later commit to store per-service material in the keyring, for example the invocation ID) (With input from David Howells)	2016-12-13 20:59:10 +01:00
Zbigniew Jędrzejewski-Szmek	1ac7a93574	Merge pull request #4835 from poettering/unit-name-printf Various specifier resolution fixes.	2016-12-10 01:29:52 -05:00
Lennart Poettering	5125e76243	core: move specifier expansion out of service.c/socket.c This monopolizes unit file specifier expansion in load-fragment.c, and removes it from socket.c + service.c. This way expansion becomes an operation done exclusively at time of loading unit files. Previously specifiers were resolved for all settings during loading of unit files with the exception of ExecStart= and friends which were resolved in socket.c and service.c. With this change the latter is also moved to the loading of unit files. Fixes: #3061	2016-12-07 18:47:32 +01:00
Jouke Witteveen	c3fda31da3	service: go through stop_post on failure (#4770)	2016-12-06 14:02:36 +01:00
Jouke Witteveen	6375bd2007	service: new NotifyAccess= value for control processes (#4212) Setting NotifyAccess=exec allows notifications coming directly from any control process.	2016-11-29 23:20:04 +01:00
Jouke Witteveen	3c9512c71d	service: prevent registering control pids as the main pid We assume a process can be only one of the two in service_sigchld_event.	2016-11-29 10:34:33 +01:00
Jouke Witteveen	71e529fcf1	service: only fail notify services on empty cgroup during start We stay in the SERVICE_START while no READY=1 notification message has been received. When we are in the SERVICE_START_POST state, we have already received a ready notification. Hence we should not fail when the cgroup becomes empty in that state.	2016-11-29 10:34:33 +01:00
Jouke Witteveen	3d474ef7a6	service: fix main processes exit behavior for type notify services Before this commit, when the main process of a Type=notify service exits the service would enter a running state without passing through the startup post state. This prevented ExecStartPost= from being executed and allowed follow-up units to start too early (before the ready notification). Additionally, when RemainAfterExit=yes is used on a Type=notify service, the exit status of the main process would be disregarded. After this commit, an unsuccessful exit of the main process of a Type=notify service puts the unit in a failed state. A successful exit is inconsequential in case RemainAfterExit=yes. Otherwise, when no ready notification has been received, the unit is put in a failed state because it has never been active. When all processes in the cgroup of a Type=notify service are gone and no ready notification has been received yet, the unit is also put in a failed state.	2016-11-22 17:54:27 +01:00
Jouke Witteveen	c35755fb87	service: introduce protocol error type Introduce a SERVICE_FAILURE_PROTOCOL error type for when a service does not follow the protocol. This error type is used when a pid file is expected, but not delivered.	2016-11-22 17:54:27 +01:00
Franck Bui	7d5ceb6416	core: allow to redirect confirmation messages to a different console It's rather hard to parse the confirmation messages (enabled with systemd.confirm_spawn=true) amongst the status messages and the kernel ones (if enabled). This patch gives the possibility to the user to redirect the confirmation message to a different virtual console, either by giving its name or its path, so those messages are separated from the other ones and easier to read.	2016-11-17 18:16:16 +01:00
Zbigniew Jędrzejewski-Szmek	f97b34a629	Rename formats-util.h to format-util.h We don't have plural in the name of any other -util files and this inconsistency trips me up every time I try to type this file name from memory. "formats-util" is even hard to pronounce.	2016-11-07 10:15:08 -05:00
Lennart Poettering	493fd52f1a	Merge pull request #4510 from keszybz/tree-wide-cleanups Tree wide cleanups	2016-11-03 13:59:20 -06:00
Zbigniew Jędrzejewski-Szmek	b09246352f	pid1: fix fd memleak when we hit FileDescriptorStoreMax limit Since service_add_fd_store() already does the check, remove the redundant check from service_add_fd_store_set(). Also, print a warning when repopulating FDStore after daemon-reexec and we hit the limit. This is a user visible issue, so we should not discard fds silently. (Note that service_deserialize_item is impacted by the return value from service_add_fd_store(), but we rely on the general error message, so the caller does not need to be modified, and does not show up in the diff.)	2016-11-02 15:07:17 -04:00
Zbigniew Jędrzejewski-Szmek	f0bfbfac43	core: when restarting services, don't close fds We would close all the stored fds in service_release_resources(), which of course broke the whole concept of storing fds over service restart. Fixes #4408.	2016-11-01 21:20:21 -04:00
Zbigniew Jędrzejewski-Szmek	16f70d6362	pid1: nicely log when doing operation on stored fds Should help with debugging #4408.	2016-10-28 22:45:05 -04:00
Zbigniew Jędrzejewski-Szmek	9021ff17e2	pid1: only log about added fd if it was really added If it was a duplicate, log nothing.	2016-10-28 22:45:05 -04:00
Zbigniew Jędrzejewski-Szmek	605405c6cc	tree-wide: drop NULL sentinel from strjoin This makes strjoin and strjoina more similar and avoids the useless final argument. spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c) git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/' This might have missed a few cases (spatch has a really hard time dealing with _cleanup_ macros), but that's no big issue, they can always be fixed later.	2016-10-23 11:43:27 -04:00
Zbigniew Jędrzejewski-Szmek	7d78f7cea8	Merge pull request #4428 from lnykryn/ctrl_v2 rename failure-action to emergency-action and use it for ctrl+alt+del burst	2016-10-22 23:16:11 -04:00
Lukas Nykryn	87a47f99bc	failure-action: generalize failure action to emergency action	2016-10-21 15:13:50 +02:00
Lennart Poettering	47fffb3530	core: if the start command vanishes during runtime don't hit an assert This can happen when the configuration is changed and reloaded while we are executing a service. Let's not hit an assert in this case. Fixes: #4444	2016-10-21 12:27:46 +02:00
Lennart Poettering	5368222db6	core: let's upgrade the log level for service processes dying of signal (#4415) As suggested in https://github.com/systemd/systemd/pull/4367#issuecomment-253670328	2016-10-19 19:48:35 -04:00
Zbigniew Jędrzejewski-Szmek	3b319885c4	tree-wide: introduce free_and_replace helper It's a common pattern, so add a helper for it. A macro is necessary because a function that takes a pointer to a pointer would be type specific, similarly to cleanup functions. Seems better to use a macro.	2016-10-16 23:35:39 -04:00
Zbigniew Jędrzejewski-Szmek	b744e8937c	Merge pull request #4067 from poettering/invocation-id Add an "invocation ID" concept to the service manager	2016-10-11 13:40:50 -04:00
Lennart Poettering	1f0958f640	core: when determining whether a process exit status is clean, consider whether it is a command or a daemon SIGTERM should be considered a clean exit code for daemons (i.e. long-running processes, as a daemon without SIGTERM handler may be shut down without issues via SIGTERM still) while it should not be considered a clean exit code for commands (i.e. short-running processes). Let's add two different clean checking modes for this, and use the right one at the appropriate places. Fixes: #4275	2016-10-10 22:57:01 +02:00
Lennart Poettering	41e2036eb8	exit-status: kill is_clean_exit_lsb(), move logic to sysv-generator Let's get rid of is_clean_exit_lsb(), let's move the logic for the special handling of the two LSB exit codes into the sysv-generator by writing out appropriate SuccessExitStatus= lines if the LSB header exists. This is not only semantically more correct, but also fixes a bug, as the code in service.c that chose between is_clean_exit_lsb() and is_clean_exit() based this check on whether a native unit file was available for the unit. However, that check has been bogus for a long time, since the SysV generator was introduced and native SysV script support was removed from PID 1, as in that case a unit file always existed.	2016-10-10 21:48:08 +02:00
Lennart Poettering	4b58153dd2	core: add "invocation ID" concept to service manager This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from an inactive to an activating or active state. The primary usecase for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as environment variable. It is additionally stored as extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation ID race-freely from the messages.	2016-10-07 20:14:38 +02:00
Kyle Russell	7dd736abec	service: fixup ExecStop for socket-activated shutdown (#4120) Previous fix didn't consider handling multiple ExecStop commands.	2016-09-10 08:55:36 +03:00
Kyle Russell	f2dbd059a6	service: Continue shutdown on socket activated unit on termination (#4108) ENOTCONN may be a legitimate return code if the endpoint disappeared, but the service should still attempt to shut down cleanly.	2016-09-09 05:34:43 +03:00
Zbigniew Jędrzejewski-Szmek	2056ec1927	Merge pull request #3965 from htejun/systemd-controller-on-unified	2016-08-19 19:58:01 -04:00
Lennart Poettering	00d9ef8560	core: add RemoveIPC= setting This adds the boolean RemoveIPC= setting to service, socket, mount and swap units (i.e. all unit types that may invoke processes). If turned on, and the unit's user/group is not root, all IPC objects of the user/group are removed when the service is shut down. The life-cycle of the IPC objects is hence bound to the unit life-cycle. This is particularly relevant for units with dynamic users, as it is essential that no objects owned by the dynamic users survive the service exiting. In fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set. In order to communicate the UID/GID of an executed process back to PID 1 this adds a new "user lookup" socket pair, that is inherited into the forked processes, and closed before the exec(). This is needed since we cannot do NSS from PID 1 due to deadlock risks, but we need to know the used UID/GID in order to clean up IPC owned by it if the unit shuts down.	2016-08-19 00:37:25 +02:00
Tejun Heo	5da38d0768	core: use the unified hierarchy for the systemd cgroup controller hierarchy Currently, systemd uses either the legacy hierarchies or the unified hierarchy. When the legacy hierarchies are used, systemd uses a named legacy hierarchy mounted on /sys/fs/cgroup/systemd without any kernel controllers for process management. Due to the shortcomings in the legacy hierarchy, this involves a lot of workarounds and complexities. Because the unified hierarchy can be mounted and used in parallel to legacy hierarchies, there's no reason for systemd to use a legacy hierarchy for management even if the kernel resource controllers need to be mounted on legacy hierarchies. It can simply mount the unified hierarchy under /sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies. This disables a significant amount of fragile workaround logic and would allow using features which depend on the unified hierarchy membership, such as the bpf cgroup v2 membership test. In time, this would also allow deleting the said complexities. This patch updates systemd so that it prefers the unified hierarchy for the systemd cgroup controller hierarchy when legacy hierarchies are used for kernel resource controllers. * cg_unified(@controller) is introduced which tests whether the specific controller is on the unified hierarchy and is used to choose the unified hierarchy code path for process and service management when available. Kernel controller specific operations remain gated by cg_all_unified(). * "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to force the use of legacy hierarchy for systemd cgroup controller. * nspawn: By default nspawn uses the same hierarchies as the host. If UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all. If 0, legacy for all. * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of three options - legacy, only systemd controller on unified, and unified. The value is passed into mount setup functions and controls cgroup configuration. * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount option is moved to mount_legacy_cgroup_hierarchy() so that it can take an appropriate action depending on the configuration of the host. v2: - CGroupUnified enum replaces open coded integer values to indicate the cgroup operation mode. - Various style updates. v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2. v4: Restored legacy container on unified host support and fixed another bug in detect_unified_cgroup_hierarchy().	2016-08-17 17:44:36 -04:00
Tejun Heo	ca2f6384aa	core: rename cg_unified() to cg_all_unified() A following patch will update cgroup handling so that the systemd controller (/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel resource controllers are on the legacy hierarchies. This would require distinguishing whether all controllers are on cgroup v2 or only the systemd controller is. In preparation, this patch renames cg_unified() to cg_all_unified(). This patch doesn't cause any functional changes.	2016-08-15 18:13:36 -04:00
Zbigniew Jędrzejewski-Szmek	3bb81a80bd	Merge pull request #3818 from poettering/exit-status-env beef up /var/tmp and /tmp handling; set $SERVICE_RESULT/$EXIT_CODE/$EXIT_STATUS on ExecStop= and make sure root/nobody are always resolvable	2016-08-05 20:55:08 -04:00
Zbigniew Jędrzejewski-Szmek	3ebcd323bd	systemd: do not serialize peer, bump count when deserializing socket instead	2016-08-05 08:16:31 -04:00
Zbigniew Jędrzejewski-Szmek	9dfb64f87d	core/service: serialize and deserialize accept_socket This fixes an issue during reexec, where the count of connections would be lost: [zbyszek@fedora-rawhide ~]$ systemctl status testlimit.socket | grep Connected Accepted: 1; Connected: 1 [zbyszek@fedora-rawhide ~]$ sudo systemctl daemon-reexec [zbyszek@fedora-rawhide ~]$ systemctl status testlimit.socket | grep Connected Accepted: 1; Connected: 0 With the patch, Connected count is preserved. Also add "Accept Socket" to the dump output for services.	2016-08-05 08:16:31 -04:00
Lennart Poettering	b08af3b127	core: only set the watchdog variables in ExecStart= lines	2016-08-04 23:08:05 +02:00
Lennart Poettering	a0fef983ab	core: remember first unit failure, not last unit failure Previously, the result value of a unit was overridden with each failure that took place, so that the result always reported the last failure that took place. With this commit this is changed, so that the first failure taking place is stored instead. This should normally not matter much as multiple failures are sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT to a service due to a watchdog timeout, then this currently would be reported as "coredump" failure, rather than the "watchdog" failure it really is. Hence, in order to report information about the type of the failure, and not about the effect of it, let's change this for all unit types to store the first, not the last failure. This addresses the issue pointed out here: https://github.com/systemd/systemd/pull/3818#discussion_r73433520	2016-08-04 23:08:05 +02:00
Lennart Poettering	136dc4c435	core: set $SERVICE_RESULT, $EXIT_CODE and $EXIT_STATUS in ExecStop=/ExecStopPost= commands This should simplify monitoring tools for services, by passing the most basic information about service result/exit information via environment variables, thus making it unnecessary to retrieve them explicitly via the bus.	2016-08-04 23:08:05 +02:00
Lennart Poettering	9c1a61adba	core: move masking of chroot/permission masking into service_spawn() Let's fix up the flags fields in service_spawn() rather than its callers, in order to simplify things a bit.	2016-08-04 16:27:07 +02:00
Lennart Poettering	c39f1ce24d	core: turn various execution flags into a proper flags parameter The ExecParameters structure contains a number of bit-flags, that were so far exposed as bool:1, change this to a proper, single binary bit flag field. This makes things a bit more expressive, and is helpful as we add more flags, since these booleans are passed around in various callers, for example service_spawn(), whose signature can be made much shorter now. Not all bit booleans from ExecParameters are moved into the flags field for now, but this can be added later.	2016-08-04 16:27:07 +02:00
Susant Sahani	9d56542764	socket: add support to control no. of connections from one source (#3607) Introduce MaxConnectionsPerSource=, which is the number of concurrent connections allowed per IP. RFE: 1939	2016-08-02 13:48:23 -04:00
Lennart Poettering	29206d4619	core: add a concept of "dynamic" user ids, that are allocated as long as a service is running This adds a new boolean setting DynamicUser= to service files. If set, a new user will be allocated dynamically when the unit is started, and released when it is stopped. The user ID is allocated from the range 61184..65519. The user will not be added to /etc/passwd (but an NSS module to be added later should make it show up in getent passwd). For now, care should be taken that the service writes no files to disk, since this might result in files owned by UIDs that might get assigned dynamically to a different service later on. Later patches will tighten sandboxing in order to ensure that this cannot happen, except for a few selected directories. A simple way to test this is: systemd-run -p DynamicUser=1 /bin/sleep 99999	2016-07-22 15:53:45 +02:00

535 commits