Systemd

Author	SHA1	Message	Date
Franck Bui	f75f613d25	core: reduce the number of stalled PIDs from the watched processes list when possible Some PIDs can remain in the watched list even though their processes exited long ago. It can easily happen if the main process of a forking service manages to spawn a child before the control process exits, for example. However when a pid is about to be mapped to a unit by calling unit_watch_pid(), the caller usually knows if the pid should belong to this unit exclusively: if we just forked() off a child, then we can be sure that its PID is otherwise unused. In this case we take this opportunity to remove any stalled PIDs from the watched process list. If we learnt about a PID in any other form (for example via PID file, via searching, MAINPID= and so on), then we can't assume anything.	2019-03-20 10:51:49 +01:00
Lennart Poettering	9adb695987	core: split error list in comment for unit_start() in two	2019-03-18 16:06:36 +01:00
Lennart Poettering	36c4dc089e	core: change emergency_action() to return void The function so far always returned -ECANCELLED, which is ignored in all cases the function is invoked, except one: in unit_test_start_limit() where -ECANCELLED is returned when the start limit is hit, which is part of unit_start()'s protocol of return values. Since the emergency_action() logic should be relatively generic and is used in many places, let's drop the return value from it, since it's constant anyway, and in all cases useless. Instead, let's return it in unit_test_start_limit(), where it's part of the protocol. No change in behaviour.	2019-03-18 16:06:36 +01:00
Lennart Poettering	2de9b9793b	core: check start limit on condition checks too Let's add a safety precaution: if the start condition checks for a unit are tested too often and fail each time, let's rate limit this too. This should add extra safety in case people define .path, .timer or .automount units that trigger a service that has a condition that always fails.	2019-03-18 16:06:36 +01:00
Lennart Poettering	5766aca8d2	core: modernize unit_start() a bit No change in behaviour, just a re-line-breaking of the various comments to our current coding style, and some use of SYNTHETIC_ERRNO().	2019-03-18 16:06:36 +01:00
Lennart Poettering	a4191c9fb5	core: unify code for checking whether unit to trigger is loaded	2019-03-18 16:06:36 +01:00
Lennart Poettering	97a3f4ee05	core: rename unit_{start_limit|condition|assert}_test() to unit_test_xyz() Just some renaming, no change in behaviour. Background: I'd like to add more functions unit_test_xyz() that test various things, hence let's streamline the naming a bit.	2019-03-18 16:06:36 +01:00
Lennart Poettering	9e30cf74ce	core: add comment explaining ECOMM return value of unit_start() we explain all other return values, explain these ones too.	2019-03-18 16:06:36 +01:00
Stephane Chazelas	106bf8e445	remove "." path components from required mount paths unit_require_mounts_for may be passed path arguments that contain "." components like for user's home directories where "." is sometimes used to specify some form of anchor point. This change stops considering such path as an error and removes the "." components instead. Closes: #11910	2019-03-07 10:12:03 +01:00
Lennart Poettering	5bcffb4b54	Merge pull request #11457 from grooverdan/sendsigkill_no service: killmode=cgroup|mixed, SendSIGKILL=no services are not multiprocess	2019-02-18 13:41:52 +01:00
Daniel Black	c53d2d54bd	service: make killmode=cgroup|mixed, SendSIGKILL=no services singletons KillMode=mixed and control group are used to indicate that all processes should be killed off. SendSIGKILL is used for services that require a clean shutdown. These are typically database services where a SigKilled process would result in a lengthy recovery and whose shutdown or startup time is quite variable (so Timeout settings aren't of use). Here we take these two factors and refuse to start a service if there are existing processes within a control group. Databases generally have some protection against multiple instances running, but let's not stress the rigor of these. Also ExecStartPre parts of the service aren't as rigorously written to protect against multiple use. closes #8630	2019-01-29 15:35:59 +11:00
Jonathon Kowalski	6255af75d7	Return -EAGAIN instead of -EALREADY from unit_reload Fixes: #11499 Let's return -EAGAIN so that on state change, unit_process_job tries to add our job to run_queue again so that all the reloads that coalesced into the installed reload (which itself merged into a running one) initially run at least once. This should ensure service picks up all config changes reliably. See the issue being fixed for a detailed explanation.	2019-01-20 22:12:24 +00:00
Lennart Poettering	2d41e9b7a0	Merge pull request #11143 from keszybz/enable-symlink Runtime mask symlink confusion fix	2018-12-16 12:37:07 +01:00
Zbigniew Jędrzejewski-Szmek	58d9d89b4b	pid1: fix free of uninitialized pointer in unit_fail_if_noncanonical() https://bugzilla.redhat.com/show_bug.cgi?id=1653068	2018-12-14 11:21:16 +01:00
Zbigniew Jędrzejewski-Szmek	303ee60151	Mark data and userdata params to specifier_printf() as const It would be very wrong if any of the specifier printf calls modified any of the objects or data being printed. Let's mark all arguments as const (primarily to make it easier for the reader to see where modifications cannot occur).	2018-12-12 16:45:33 +01:00
Lennart Poettering	a1c7334b61	core: when a unit state changes only propagate to jobs after reloading is complete Previously, we'd immediately propagate unit state changes into any jobs pending for them, always. With this we only do this if the manager is out of the "reload" state. This fixes the problem #8803 tried to address, by simply not completing jobs until after the reload (and thus reestablishment of the dbus connection) is complete. Note that there's no need to later on explicitly catch up with the missed job state changes (i.e. there's no need to call unit_process_job() later on explicitly). That's because for jobs in JOB_WAITING state on deserialization all jobs are requeued into the run queue anyway, and thus checked again if they can complete now. And for JOB_RUNNING jobs the unit_catchup() phase is going to trigger missed out state changes after the reload completes anyway (after all that's what distinguishes it from unit_coldplug()). Replaces: #8803	2018-12-12 11:15:07 +01:00
Lennart Poettering	16c74914d2	core: split out all logic that updates a Job on a unit's unit_notify() invocation Just some refactoring, no change in behaviour.	2018-12-12 11:15:07 +01:00
Lennart Poettering	b17c9620c8	core: rework how we deserialize jobs Let's add a helper call unit_deserialize_job() for this purpose, and let's move registration in the global jobs hash table into job_install_deserialized() so that it is done after all superficial checks are done, and before transitioning into installed states, so that rollback code is not necessary anymore.	2018-12-12 11:15:07 +01:00
Zbigniew Jędrzejewski-Szmek	4cb06c5949	Use VLA instead of alloca The test is the same, but an array is more readable.	2018-12-10 11:57:26 +01:00
Zbigniew Jędrzejewski-Szmek	2d479ff1cc	Merge pull request #10963 from poettering/bus-force-state-change-signal force PropertiesChanged bus signal on all unit state changes	2018-12-06 16:42:21 +01:00
Lennart Poettering	e4de72876e	util-lib: split out all temporary file related calls into tmpfiles-util.c This splits out a bunch of functions from fileio.c that have to do with temporary files. Simply to make the header files a bit shorter, and to group things more nicely. No code changes, just some rearranging of source files.	2018-12-02 13:22:29 +01:00
Lennart Poettering	ee228be10c	util-lib: don't include fileio.h from fileio-label.h There's no reason for doing that, hence simply don't.	2018-12-02 13:22:29 +01:00
Lennart Poettering	3c4832ada4	core: enqueue unit earlier when state changes Previously, we'd enqueue a unit to the dbus queue whenever the state changed, after we processed the state change fully. This commit moves this to the beginning of the state change. This has the benefit that when the state change causes a job to complete the unit is already in the dbus queue, and thus we get the guarantee that any unit change can be sent out to clients before the job change.	2018-12-01 12:53:26 +01:00
Lennart Poettering	af92c603bb	core: send out unit change events when a new invocation ID is acquired It's free, as this generally coincides with unit_start(), but let's make this clean and explicit.	2018-12-01 12:53:26 +01:00
Lennart Poettering	e18f8852f3	core: invalidate individual Assert/Condition properties when sending out change messages Let's inform the clients about assert/condition property changes as they happen, it's basically for free because assert/condition property changes generally coincide with other unit state changes (after all these checks are done on unit_start())	2018-12-01 12:53:26 +01:00
Lennart Poettering	37d0b962ef	core: when we manage to resolve a user, only enqueue dbus event, don't send out message right-away Let's only enqueue the dbus signal generation, let's not do it right-away, after all we want coalescing to take effect here.	2018-12-01 12:53:26 +01:00
Zbigniew Jędrzejewski-Szmek	8b4e51a60e	Merge pull request #10797 from poettering/run-generator add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch	2018-11-28 22:40:55 +01:00
Lennart Poettering	7af67e9a8b	core: allow to set exit status when using SuccessAction=/FailureAction=exit in units This adds SuccessActionExitStatus= and FailureActionExitStatus= that may be used to configure the exit status to propagate when SuccessAction=exit or FailureAction=exit is used. When not specified let's also propagate the exit status of the main process we fork off for the unit.	2018-11-27 09:44:40 +01:00
Lennart Poettering	5b262f74e4	unit: tweak status output a bit Let's highlight the unit description string in the status updates, to separate them a bit more from the English sentence they are part of, and thus make the different casing less surprising.	2018-11-26 18:24:12 +01:00
Lennart Poettering	b8b6f32104	cgroup: when we unload a unit, also update all its parent's members mask This way we can correctly ensure that when a unit that requires some controller goes away, we propagate the removal of it all the way up, so that the controller is turned off in all the parents too.	2018-11-23 13:41:37 +01:00
Lennart Poettering	5af8805872	cgroup: drastically simplify caching of cgroups members mask Previously we tried to be smart: when a new unit appeared and it only added controllers to the cgroup mask we'd update the cached members mask in all parents by ORing in the controller flags in their cached values. Unfortunately this was quite broken, as we missed some conditions when this cache had to be reset (for example, when a unit got unloaded), moreover the optimization doesn't work when a controller is removed anyway (as in that case there's no other way for the parent to iterate through all children if any other, remaining child unit still needs it). Hence, let's simplify the logic substantially: instead of updating the cache on the right events (which we didn't get right), let's simply invalidate the cache, and generate it lazily when we encounter it later. This should actually result in better behaviour as we don't have to calculate the new members mask for a whole subtree whenever we have the suspicion something changed, but can delay it to the point where we actually need the members mask. This allows us to simplify things quite a bit, which is good, since validating this cache for correctness is hard enough. Fixes: #9512	2018-11-23 13:41:37 +01:00
Lennart Poettering	0adf88b68c	cgroup: dump delegation mask too	2018-11-23 12:24:37 +01:00
Lennart Poettering	00e7b3c8e5	unit: minor optimization, use stack over heap, when we can	2018-11-23 00:46:56 +01:00
Lennart Poettering	66fa4bdd70	core: add two minor comments (#10890)	2018-11-23 06:25:27 +09:00
Lennart Poettering	6e64994d69	core: make unit_start() return a distinguishable error code in case conditions didn't hold Ideally we'd even propagate this all the way to the client, by having a separate JobType enum value for this. But it's hard to add this without breaking compat, hence for now let's at least internally propagate this case differently from the case "already on it". This is then used to call job_finish_and_invalidate() slightly differently, with the already= parameter false, as in the failed condition case no message was likely produced so far.	2018-11-16 15:22:48 +01:00
Lennart Poettering	523ee2d414	core: log a recognizable message when a unit succeeds, too We already are doing it on failure, let's do it on success, too. Fixes: #10265	2018-11-16 15:22:48 +01:00
Lennart Poettering	91bbd9b796	core: make log messages about unit processes exiting recognizable	2018-11-16 15:22:48 +01:00
Lennart Poettering	7c047d7443	core: make log messages about units entering a 'failed' state recognizable Let's make this recognizable, and carry result information in a structured fashion.	2018-11-16 15:22:48 +01:00
Lennart Poettering	33a3fdd978	core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c This call is only used by job.c and very specific to job handling. Moreover the very similar logic of job_emit_status_message() is already in job.c. Hence, let's clean this up, and move both sets of functions to job.c, and rename them a bit so that they express precisely what they do: 1. unit_status_emit_starting_stopping_reloading() → job_emit_begin_status_message() 2. job_emit_status_message() → job_emit_done_status_message() The first call is after all what we call when we begin with the execution of a job, and the second call what we call when we are done with it. Just some moving and renaming, no other changes, and hence no change in behaviour.	2018-11-16 15:22:48 +01:00
Lennart Poettering	8204470252	unit: don't claim there was no IP traffic generated by a unit when we don't know Only if we have some IP traffic accounting at all we should claim that.	2018-11-14 09:53:50 +01:00
Lennart Poettering	6eb65e7ca4	core: split out audit message generation from unit_notify() Just some refactoring, no change in behaviour.	2018-11-14 09:51:47 +01:00
INSUN PYO	8724defeae	core: use local variable m instead of u->manager	2018-11-13 10:39:35 +01:00
Tommi Rantala	429926e9cc	core: include unit name in emergency_action() reason message Add unit name in StartLimitAction=, FailureAction= and SuccessAction= emergency_action() reason messages, so that the problematic unit is easily visible, for example: "unit dbus.service failed"	2018-11-12 16:36:03 +01:00
Lennart Poettering	1ad6e8b302	core: split environment block maintained by PID 1's Manager object in two This splits the "environment" field of Manager into two: transient_environment and client_environment. The former is generated from configuration file, kernel cmdline, environment generators. The latter is the one the user can control with "systemctl set-environment" and similar. Both sets are merged transparently whenever needed. Separating the two sets has the benefit that we can safely flush out the former while keeping the latter during daemon reload cycles, so that env var settings from env generators or configuration files do not accumulate, but dynamic API changes are kept around. Note that this change is not entirely transparent to users: if the user first uses "set-environment" to override a transient variable, and then uses "unset-environment" to unset it again things will revert to the original transient variable now, while previously the variable was fully removed. This change in behaviour should not matter too much though I figure. Fixes: #9972	2018-10-31 18:00:53 +01:00
Lennart Poettering	d68c645bd3	core: rework serialization Let's be more careful with what we serialize: let's ensure we never serialize strings that are longer than LONG_LINE_MAX, so that we know we can read them back with read_line(…, LONG_LINE_MAX, …) safely. In order to implement this all serialization functions are moved to serialize.[ch], and internally will do line size checks. We'd rather skip a serialization line (with a loud warning) than write an overly long line out. Of course, this is just a second level protection, after all the data we serialize shouldn't be this long in the first place. While we are at it also clean up logging: while serializing make sure to always log about errors immediately. Also, (void)ify all calls we don't expect errors in (or catch errors as part of the general fflush_and_check() at the end).	2018-10-26 10:52:41 +02:00
Lennart Poettering	8948b3415d	core: when deserializing state always use read_line(…, LONG_LINE_MAX, …) This should be much better than fgets(), as we can read substantially longer lines and overly long lines result in proper errors. Fixes a vulnerability discovered by Jann Horn at Google. CVE-2018-15686 LP: #1796402 https://bugzilla.redhat.com/show_bug.cgi?id=1639071	2018-10-26 10:40:01 +02:00
Martin Wilck	e1e74614aa	core: don't create Requires for workdir if "missing ok" Don't add an implicit RequiresMountsFor dependency for the WorkingDirectory of a unit if the "-" character was used to indicate that "a missing working directory is not considered fatal" (see systemd.exec(5)). Otherwise systemd might fail the unit because of missing dependencies.	2018-10-25 11:35:59 +02:00
Yu Watanabe	ec9d636b37	core: use ascii_toupper() instead of everytime judging whether it is the first message	2018-10-24 04:58:08 +09:00
Lennart Poettering	a87b1faad3	core: beautify per-unit consumed resources log message a bit. (#10390) Shorten message to say "no IP traffic" if there is no IP traffic, rather than "received 0B IP traffic, sent 0B IP traffic". Fixes: #9816	2018-10-19 09:04:12 +09:00
Anita Zhang	90fc172e19	core: implement per unit journal rate limiting Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for services. If provided, these values get passed to the journald client context, and those values are used in the rate limiting function in the journal over the journald.conf values. Part of #10230	2018-10-18 09:56:20 +02:00
Zbigniew Jędrzejewski-Szmek	c7adcb1af9	core: do not "warn" about mundane emergency actions For example in a container we'd log: Oct 17 17:01:10 rawhide systemd[1]: Started Power-Off. Oct 17 17:01:10 rawhide systemd[1]: Forcibly powering off: unit succeeded Oct 17 17:01:10 rawhide systemd[1]: Reached target Power-Off. Oct 17 17:01:10 rawhide systemd[1]: Shutting down. and on the console we'd write (in red) [ !! ] Forcibly powering off: unit succeeded This is not useful in any way, and the fact that we're calling an "emergency action" is an internal implementation detail. Let's log about c-a-d and the watchdog actions only.	2018-10-17 19:32:09 +02:00
Zbigniew Jędrzejewski-Szmek	1710d4beff	core: limit service-watchdogs=no to actual "watchdog" commands The setting is now only looked at when considering an action for a job timeout or unit start limit. It is ignored for ctrl-alt-del, SuccessAction, FailureAction. v2: turn the parameter into a flag field v3: rename Options to Flags	2018-10-17 19:31:50 +02:00
Lennart Poettering	93d4cb09d5	core: fix unfortunate typo in unit_is_unneeded() Follow-up for `a3c1168ac2`.	2018-10-13 13:01:08 +02:00
Zbigniew Jędrzejewski-Szmek	f436470ae1	Merge pull request #10343 from poettering/manager-state-fix various fixes for PID1's Manager object	2018-10-10 12:36:16 +02:00
Lennart Poettering	3316429f19	Merge pull request #10062 from rgushchin/device Support cgroup v2 bpf-based device controller	2018-10-09 23:29:27 +02:00
Lennart Poettering	5f616d5feb	core: add missing 'continue' statement	2018-10-09 21:11:06 +02:00
Lennart Poettering	638cece45d	core: clean up test run flags Let's make them typesafe, and let's add a nice macro helper for checking if we are in a test run, which should make testing for this much easier to read for most cases.	2018-10-09 19:43:43 +02:00
Roman Gushchin	084c700780	core: support cgroup v2 device controller Cgroup v2 provides the eBPF-based device controller, which isn't currently supported by systemd. This commit aims to provide such support. There are no user-visible changes, just the device policy and whitelist start working if cgroup v2 is used.	2018-10-09 09:47:51 -07:00
Roman Gushchin	17f149556a	core: refactor bpf firewall support into a pseudo-controller The idea is to introduce a concept of bpf-based pseudo-controllers to make adding new bpf-based features easier.	2018-10-09 09:46:08 -07:00
Lennart Poettering	0e699122b7	core: properly serialize "in_audit" per-unit boolean Fixes: #9962	2018-10-09 10:09:39 +02:00
Lennart Poettering	256f65d045	core: rearrange conditions in unit_notify() a bit This shouldn't change control flow, with one exception: we won't send notifications for boot progress to plymouth anymore during reload, which is something we really shouldn't.	2018-10-09 10:09:39 +02:00
Lennart Poettering	334415b16e	Merge pull request #10094 from keszybz/wants-loading Fix bogus fragment paths in units in .wants/.requires	2018-10-05 17:36:31 +02:00
Anita Zhang	c87700a133	Make Watchdog Signal Configurable Allows configuring the watchdog signal (with a default of SIGABRT). This allows an alternative to SIGABRT when coredumps are not desirable. Appropriate references to SIGABRT or aborting were renamed to reflect more liberal watchdog signals. Closes #8658	2018-09-26 16:14:29 +02:00
Zbigniew Jędrzejewski-Szmek	23e8c79665	pid1: drop now-unused path parameter to resolve_template()	2018-09-15 20:03:32 +02:00
Zbigniew Jędrzejewski-Szmek	5a72417084	pid1: drop unused path parameter to add_two_dependencies_by_name()	2018-09-15 20:02:00 +02:00
Zbigniew Jędrzejewski-Szmek	35d8c19ace	pid1: drop now-unused path parameter to add_dependency_by_name()	2018-09-15 19:57:52 +02:00
Zbigniew Jędrzejewski-Szmek	fda09318e3	core: rename function to better reflect semantics	2018-08-20 10:43:31 +02:00
Lennart Poettering	a3c1168ac2	core: rework StopWhenUnneeded= logic Previously, we'd act immediately on StopWhenUnneeded= when a unit state changes. With this rework we'll maintain a queue instead: whenever there's the chance that StopWhenUnneeded= might have an effect we enqueue the unit, and process it later when we have nothing better to do. This should make the implementation a bit more reliable, as the unit notify event cannot immediately enqueue tons of side-effect jobs that might contradict each other, but we do so only in a strictly ordered fashion, from the main event loop. This slightly changes the check when to consider a unit "unneeded". Previously, we'd assume that a unit in "deactivating" state could also be cleaned up. With this new logic we'll only consider units unneeded that are fully up and have no job queued. This means that whenever there's something pending for a unit we won't clean it up.	2018-08-10 16:19:01 +02:00
Yu Watanabe	fe65e88ba6	namespace: implicitly adds DeviceAllow= when RootImage= is set RootImage= may require the following settings ``` DeviceAllow=/dev/loop-control rw DeviceAllow=block-loop rwm DeviceAllow=block-blkext rwm ``` This adds the following settings implicitly when RootImage= is specified. Fixes #9737.	2018-08-06 14:02:31 +09:00
Jon Ringle	fbb48d4c66	Make final kill signal configurable Usecase is to allow changing the final kill from SIGKILL to SIGQUIT which should create a core dump useful for debugging why the service didn't stop with the SIGTERM	2018-07-23 13:44:54 +02:00
Chris Lamb	3fe910794b	Correct a number of trivial typos.	2018-06-18 22:44:44 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Lennart Poettering	6f40aa4547	core: add a couple of more error cases that should result in "bad-setting" This changes a number of EINVAL cases to ENOEXEC, so that we enter "bad-setting" state if they fail.	2018-06-11 12:53:12 +02:00
Lennart Poettering	c4555ad8f6	core: introduce a new load state "bad-setting" Since `bb28e68477` parsing failures of certain unit file settings will result in load failures of units. This introduces a new load state "bad-setting" that is entered in precisely this case. With this addition error messages on bad settings should be a lot more explicit, as we don't have to show some generic "errno" error in that case, but can explicitly say that a bad setting is at fault. Internally this unit load state is entered as soon as any configuration loader call returns ENOEXEC. Hence: config parser calls should return ENOEXEC now for such essential unit file settings. Turns out, they generally already do. Fixes: #9107	2018-06-11 12:53:12 +02:00
Lennart Poettering	f0831ed2a0	core: add a new unit method "catchup()" This is very similar to the existing unit method coldplug() but is called a bit later. The idea is that coldplug() restores the unit state from before any prior reload/restart, i.e. puts the deserialized state in effect. The catchup() call is then called a bit later, to catch up with the system state for which we missed notifications while we were reloading. This is only really useful for mount, swap and device mount points where we should be careful to generate all missing unit state change events (i.e. call unit_notify() appropriately) for everything that happened while we were reloading.	2018-06-07 15:28:50 +02:00
Lennart Poettering	50be4f4a46	core: rework how we track service and scope PIDs This reworks how systemd tracks processes on cgroupv1 systems where cgroup notification is not reliable. Previously, whenever we had reason to believe that new processes showed up or got removed we'd scan the cgroup of the scope or service unit for new processes, and would tidy up the list of PIDs previously watched. This scanning is relatively slow, and does not scale well. With this change behaviour is changed: instead of scanning for new/removed processes right away we do this work in a per-unit deferred event loop job. This event source is scheduled at a very low priority, so that it is executed when we have time but does not starve other event sources. This has two benefits: this expensive work is coalesced, if events happen in quick succession, and we won't delay SIGCHLD handling for too long. This patch basically replaces all direct invocation of unit_watch_all_pids() in scope.c and service.c with invocations of the new unit_enqueue_rewatch_pids() call which just enqueues a request of watching/tidying up the PID sets (with one exception: in scope_enter_signal() and service_enter_signal() we'll still do unit_watch_all_pids() synchronously first, since we really want to know all processes we are about to kill so that we can track them properly). Moreover, all direct invocations of unit_tidy_watch_pids() and unit_synthesize_cgroup_empty_event() are removed too, when the unit_enqueue_rewatch_pids() call is invoked, as the queued job will run those operations too. All of this is done on cgroupsv1 systems only, and is disabled on cgroupsv2 systems as cgroup-empty notifications are reliable there, and we do not need SIGCHLD events to track processes there. Fixes: #9138	2018-06-05 22:06:48 +02:00
Zbigniew Jędrzejewski-Szmek	79e221d078	Merge pull request #9158 from poettering/notify-auto-reload trigger OnFailure= only if Restart= is not in effect	2018-06-05 13:51:07 +02:00
Zbigniew Jędrzejewski-Szmek	a1230ff972	basic/log: add the log_struct terminator to macro This way all callers do not need to specify it. Exhaustively tested by running test-log under valgrind ;)	2018-06-04 13:46:03 +02:00
Zbigniew Jędrzejewski-Szmek	d94a24ca2e	Add macro for checking if some flags are set This way we don't need to repeat the argument twice. I didn't replace all instances. I think it's better to leave out: - asserts - comparisons like x & y == x, which are mathematically equivalent, but here we aren't checking if flags are set, but if the argument fits in the flags.	2018-06-04 11:50:44 +02:00
Yu Watanabe	858d36c1ec	path-util: introduce path_simplify() The function is similar to path_kill_slashes() but also removes initial './', trailing '/.', and '/./' in the path. When the second argument of path_simplify() is false, then it behaves as the same as path_kill_slashes(). Hence, this also replaces path_kill_slashes() with path_simplify().	2018-06-03 23:39:26 +09:00
Lennart Poettering	2ad2e41a72	core: don't trigger OnFailure= deps when a unit is going to restart This adds a flags parameter to unit_notify() which can be used to pass additional notification information to the function. We then make the old reload_failure boolean parameter one of these flags, and then add a new flag that lets unit_notify() know if we are configured to restart the service. Note that this adjusts behaviour of systemd to match what the docs say. Fixes: #8398	2018-06-01 19:08:30 +02:00
Lennart Poettering	7f66b026bb	core: when we can't enqueue OnFailure= job show full error message Let's ask for the full error message and show it, there's really no reason to just show the crappy errno error.	2018-06-01 19:04:37 +02:00
Lennart Poettering	6f8fa29465	Merge pull request #8981 from keszybz/ratelimit-and-dbus Ratelimit renaming and dbus error message fix	2018-05-18 21:38:30 +02:00
Felipe Sateler	57b7a260c2	core: undo the dependency inversion between unit.h and all unit types	2018-05-15 14:24:34 -04:00
Yu Watanabe	af4fa99d6a	core: use _cleanup_set_free_ instead of _cleanup_(set_freep)	2018-05-14 14:13:57 +09:00
Zbigniew Jędrzejewski-Szmek	7994ac1d85	Rename ratelimit_test to ratelimit_below When I see "test", I have to think three times what the return value means. With "below" this is immediately clear. ratelimit_below(&limit) sounds almost like English and is imho immediately obvious. (I also considered ratelimit_ok, but this strongly implies that being under the limit is somehow better. Most of the times this is true, but then we use the ratelimit to detect triple-c-a-d, and "ok" doesn't fit so well there.) C.f. `a1bcaa07`.	2018-05-13 22:08:30 +02:00
David Tardon	95f14a3e21	core: use automatic cleanup more	2018-05-12 18:29:41 +02:00
Lennart Poettering	d4fd1cf208	core: enforce that scope units can be started only once Scope units are populated from PIDs specified by the bus client. We do that when a scope is started. We really shouldn't allow scopes to be started multiple times, as the PIDs then might be heavily out of date. Moreover, clients should have the guarantee that any scope they allocate has a clear runtime cycle which is not repetitive.	2018-04-27 21:52:45 +02:00
Lennart Poettering	7a9a0c05d4	Merge pull request #8765 from poettering/test-fixes some short fixes for the tests	2018-04-19 16:18:46 +02:00
Lennart Poettering	5d13a15b1d	tree-wide: drop spurious newlines (#8764) Double newlines (i.e. one empty line) are great to structure code. But let's avoid triple newlines (i.e. two empty lines), quadruple newlines, quintuple newlines, …, that's just spurious whitespace. It's an easy way to drop 121 lines of code, and keeps the coding style of our sources a bit tighter.	2018-04-19 12:13:23 +02:00
Lennart Poettering	8f63253149	core: don't export per-unit metadata files in test mode We shouldn't clobber the host's /run directories with metadata we export for our units when we run in test mode.	2018-04-19 11:30:18 +02:00
Lennart Poettering	4d09e1c8ba	Merge pull request #8676 from keszybz/drop-license-boilerplate Drop license boilerplate	2018-04-10 14:53:31 +02:00
Zbigniew Jędrzejewski-Szmek	e9e8cbc83a	core: minor comment update	2018-04-07 20:05:58 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	1cc6c93a95	tree-wide: use TAKE_PTR() and TAKE_FD() macros	2018-04-05 14:26:26 +09:00
Michal Sekletar	19496554e2	core: delay adding target dependencies until all units are loaded and aliases resolved (#8381 ) Currently we add target dependencies while we are loading units. This can create ordering loops even if configuration doesn't contain any loop. Take for example following configuration, $ systemctl get-default multi-user.target $ cat /etc/systemd/system/test.service [Unit] After=default.target [Service] ExecStart=/bin/true [Install] WantedBy=multi-user.target If we encounter such unit file early during manager start-up (e.g. load queue is dispatched while enumerating devices due to SYSTEMD_WANTS in udev rules) we would add stub unit default.target and we order it Before test.service. At the same time we add implicit Before to multi-user.target. Later we merge two units and we create ordering cycle in the process. To fix the issue we will now never add any target dependencies until we loaded all the unit files and resolved all the aliases.	2018-03-23 15:28:06 +01:00
Lennart Poettering	ae2a15bc14	macro: introduce TAKE_PTR() macro This macro will read a pointer of any type, return it, and set the pointer to NULL. This is useful as an explicit concept of passing ownership of a memory area between pointers. This takes inspiration from Rust: https://doc.rust-lang.org/std/option/enum.Option.html#method.take and was suggested by Alan Jenkins (@sourcejedi). It drops ~160 lines of code from our codebase, which makes me like it. Also, I think it clarifies passing of ownership, and thus helps readability a bit (at least for the initiated who know the new macro)	2018-03-22 20:21:42 +01:00
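The ownership hand-off described in the TAKE_PTR() entry above can be sketched roughly as follows. This is an illustrative reconstruction, not the macro from the systemd tree: the names TAKE_PTR_SKETCH and dup_string_sketch are hypothetical, and the macro assumes the GCC/Clang statement-expression and __typeof__ extensions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical TAKE_PTR()-style macro: read the pointer, set the
 * original variable to NULL, and yield the old value. Assumes the
 * GCC/Clang statement-expression and __typeof__ extensions. */
#define TAKE_PTR_SKETCH(ptr)                    \
        ({                                      \
                __typeof__(ptr) _taken = (ptr); \
                (ptr) = NULL;                   \
                _taken;                         \
        })

/* Hypothetical helper: returns a heap-allocated copy of s, or NULL
 * if the allocation fails. */
static char *dup_string_sketch(const char *s) {
        char *copy;

        assert(s);

        copy = malloc(strlen(s) + 1);
        if (!copy)
                return NULL;
        strcpy(copy, s);

        /* Pass ownership to the caller. The local variable is left
         * NULL, so a cleanup handler attached to it would have
         * nothing to free. */
        return TAKE_PTR_SKETCH(copy);
}
```

After `char *q = TAKE_PTR_SKETCH(p);` the variable p is NULL and q owns the allocation, which is the explicit transfer of ownership the commit message refers to.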
Lennart Poettering	31dc1ca3bf	move MANAGER_IS_RELOADING() check into manager_recheck_{dbus\|journal}() (#8510 ) Let's better check this inside of the call than before it, so that we never issue this while reloading, even should these calls be called due to other reasons than just the unit notify. This makes sure the reload state is unset a bit earlier in manager_reload() so that we can safely call this function from there and they do the right thing. Follow-up for `e63ebf71ed`.	2018-03-21 12:03:45 +01:00
Evgeny Vereshchagin	e4711004d6	Merge pull request #8461 from keszybz/oss-fuzz-fixes Oss fuzz fixes	2018-03-19 00:06:44 +03:00
Zbigniew Jędrzejewski-Szmek	ca8700e922	core/unit: delay creating a stack variable until after length has been checked path_is_normalized() will reject paths longer than 4095 bytes, so it's better to not create a stack variable of unbounded size, but instead do the check first and only then do that allocation. Also use _cleanup_ to make things a bit shorter. https://oss-fuzz.com/v2/issue/5424177403133952/7000	2018-03-18 21:07:01 +01:00
Zbigniew Jędrzejewski-Szmek	e63ebf71ed	core: when reloading, delay any actions on journal and dbus connections manager_recheck_journal() and manager_recheck_dbus() would be called too early while we were deserializing units, before the systemd-journald.service and dbus.service have been deserialized. In effect we'd disable logging to the journald and close the bus connection. The first is not very noticeable, it mostly means that logs emitted during deserialization are lost. The second is more noticeable, because manager_recheck_dbus() would call bus_done_api() and bus_done_system() and close dbus connections. Logging and bus connection would then be restored later after the respective units have been deserialized. This is easily reproduced by calling: $ sudo gdbus call --system --dest org.freedesktop.systemd1 --object-path /org/freedesktop/systemd1 --method "org.freedesktop.systemd1.Manager.Reload" which works fine before `8559b3b75c`, and then starts failing with: Error: GDBus.Error:org.freedesktop.DBus.Error.NoReply: Remote peer disconnected None of this should happen, and we should delay changing state until after deserialization is complete when reloading. manager_reload() already included the calls to manager_recheck_journal() and manager_recheck_dbus(), so the connection state will be updated after deserialization during reloading is done. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1554578.	2018-03-16 23:14:04 +01:00
Zbigniew Jędrzejewski-Szmek	dc409696cf	Introduce _cleanup_(unit_freep)	2018-03-11 16:33:58 +01:00
Zbigniew Jędrzejewski-Szmek	bea28c5adb	core/unit: voidify one snprintf statement One more follow-up for `f810b631cd`.	2018-02-26 15:49:27 +01:00
Zbigniew Jędrzejewski-Szmek	f810b631cd	Revert "Replace use of snprintf with xsprintf" This reverts commit `a7419dbc59`. _All_ changes in that commit were wrong. Fixes #8211.	2018-02-23 00:13:52 +01:00
Lennart Poettering	aa2b6f1d2b	bpf: rework how we keep track and attach cgroup bpf programs So, the kernel's management of cgroup/BPF programs is a bit misdesigned: if you attach a BPF program to a cgroup and close the fd for it it will stay pinned to the cgroup with no chance of ever removing it again (or otherwise getting ahold of it again), because the fd is used for selecting which BPF program to detach. The only way to get rid of the program again is to destroy the cgroup itself. This is particularly bad for the root cgroup (and in fact any other cgroup that we cannot realistically remove during runtime, such as /system.slice, /init.scope or /system.slice/dbus.service) as getting rid of the program only works by rebooting the system. To counter this let's closely keep track to which cgroup a BPF program is attached and let's implicitly detach the BPF program when we are about to close the BPF fd. This hence changes the bpf_program_cgroup_attach() function to track where we attached the program and changes bpf_program_cgroup_detach() to use this information. Moreover bpf_program_unref() will now implicitly call bpf_program_cgroup_detach(). In order to simplify things, bpf_program_cgroup_attach() will now implicitly invoke bpf_program_load_kernel() when necessary, simplifying the caller's side. Finally, this adds proper reference counting to BPF programs. This is useful for working with two BPF programs in parallel: the BPF program we are preparing for installation and the BPF program we so far installed, shortening the window when we detach the old one and reattach the new one.	2018-02-21 16:43:36 +01:00
Lennart Poettering	00f5ad93b5	core: change KeyringMode= to "shared" by default for non-service units in the system manager (#8172) Before this change all unit types would default to "private" in the system service manager and to "inherit" in the user service manager. With this change this is slightly altered: non-service units of the system service manager are now run with KeyringMode=shared. This appears to be the more appropriate choice as isolation is not as desirable for mount tools, which regularly consume key material. After all mounts are a shared resource themselves as they appear system-wide hence it makes a lot of sense to share their key material too. Fixes: #8159	2018-02-20 08:53:34 +01:00
Lennart Poettering	30663b6c25	Merge pull request #8199 from keszybz/small-things Sundry small cleanups	2018-02-19 16:55:10 +01:00
Zbigniew Jędrzejewski-Szmek	f4aa0bde1c	core: drop obsolete comment https://github.com/systemd/systemd/pull/8125#pullrequestreview-96894581	2018-02-19 15:18:54 +01:00
Lennart Poettering	a94ab7acfd	Merge pull request #8175 from keszybz/gc-cleanup Garbage collection cleanup	2018-02-15 17:47:37 +01:00
Zbigniew Jędrzejewski-Szmek	648461c07d	Merge pull request #8125 from poettering/cgroups-migrate Trivial merge conflict resolved locally.	2018-02-15 16:15:45 +01:00
Zbigniew Jędrzejewski-Szmek	1bdf279002	pid1: properly remove references to the unit from gc queue during final cleanup When various references to the unit were dropped during cleanup in unit_free(), add_to_gc_queue() could be called on this unit. If the unit was previously in the gc queue (at the time when unit_free() was called on it), this wouldn't matter, because it'd have in_gc_queue still set even though it was already removed from the queue. But if it wasn't set, then the unit could be added to the queue. Then after unit_free() would deallocate the unit, we would be left with a dangling pointer in gc_queue. A unit could be added to the gc queue in two places called from unit_free(): in the job_install calls, and in unit_ref_unset(). The first was OK, because it was above the LIST_REMOVE(gc_queue,...) call, but the second was not, because it was after that. Move all the LIST_REMOVE() calls down.	2018-02-15 14:03:53 +01:00
Zbigniew Jędrzejewski-Szmek	a946fa9bb9	pid1: free basic unit information at the very end, before freeing the unit We would free stuff like the names of the unit first, and then recurse into other structures to remove the unit from there. Technically this was OK, since the code did not access the name, but this makes debugging harder. And if any log messages are added in any of those functions, they are likely to access u->id and such other basic information about the unit. So let's move the removal of this "basic" information towards the end of unit_free().	2018-02-15 13:32:59 +01:00
Zbigniew Jędrzejewski-Szmek	2641f02e23	pid1: fix collection of cycles of units which reference one another A .socket will reference a .service unit, by registering a UnitRef with the .service unit. If this .service unit has the .socket unit listed in Wants or Sockets or such, a cycle will be created. We would not free this cycle properly, because we treated any unit with non-empty refs as uncollectable. To solve this issue, treat refs with UnitRef in u->refs_by_target similarly to the refs in u->dependencies, and check if the "other" unit is known to be needed. If it is not needed, do not treat the reference from it as preventing the unit we are looking at from being freed.	2018-02-15 13:32:53 +01:00
Zbigniew Jędrzejewski-Szmek	7f7d01ed58	pid1: include the source unit in UnitRef No functional change. The source unit manages the reference. It allocates the UnitRef structure and registers it in the target unit, and then the reference must be destroyed before the source unit is destroyed. Thus, it should be OK to include the pointer to the source unit, it should be live as long as the reference exists. v2: - rename refs to refs_by_target	2018-02-15 13:27:06 +01:00
Zbigniew Jędrzejewski-Szmek	f2f725e5cc	pid1: rename unit_check_gc to unit_may_gc "check" is unclear: what is true, what is false? Let's rename to "can_gc" and revert the return value ("positive" values are easier to grok). v2: - rename from unit_can_gc to unit_may_gc	2018-02-15 13:04:12 +01:00
Lennart Poettering	6592b9759c	core: add new bus call for migrating foreign processes to scope/service units This adds a new bus call to service and scope units called AttachProcesses() that moves arbitrary processes into the cgroup of the unit. The primary user for this new API is systemd itself: the systemd --user instance uses this call of the systemd --system instance to migrate processes if itself gets the request to migrate processes and the kernel refuses this due to access restrictions. The primary use-case of this is to make "systemd-run --scope --user …" invoked from user session scopes work correctly on pure cgroupsv2 environments. There, the kernel refuses to migrate processes between two unprivileged-owned cgroups unless the requestor as well as the ownership of the closest parent cgroup all match. This however is not the case between the session-XYZ.scope unit of a login session and the user@ABC.service of the systemd --user instance. The new logic always tries to move the processes on its own, but if that doesn't work when being the user manager, then the system manager is asked to do it instead. The new operation is relatively restrictive: it will only allow to move the processes like this if the caller is root, or the UID of the target unit, caller and process all match. Note that this means that unprivileged users cannot attach processes to scope units, as those do not have "owning" users (i.e. they have no User= field). Fixes: #3388	2018-02-12 11:34:00 +01:00
Lennart Poettering	8559b3b75c	core: rework how we connect to the bus This removes the current bus_init() call, as it had multiple problems: it munged handling of the three bus connections we care about (private, "api" and system) into one, even though the conditions when which was ready are very different. It also added redundant logging, as the individual calls it called all logged on their own anyway. The three calls bus_init_api(), bus_init_private() and bus_init_system() are now made public. A new call manager_dbus_is_running() is added that works much like manager_journal_is_running() and is a lot more careful when checking whether dbus is around. Optionally it checks the unit's deserialized_state rather than state, in order to accommodate cases where we want to connect to the bus before deserializing the "subscribed" list, before coldplugging the units. manager_recheck_dbus() is added, that works a lot like manager_recheck_journal() and is invoked in unit_notify(), i.e. when units change state. All in all this should make handling a bit more alike to journal handling, and it also fixes one major bug: when running in user mode we'll now connect to the system bus early on, without conditionalizing this in any way.	2018-02-12 11:34:00 +01:00
Lennart Poettering	004c7f169e	core: fold manager_set_exec_params() into unit_set_exec_params() Let's simplify things a bit: we so far called both functions every single time, let's just merge one into the other, so that we have fewer functions to call.	2018-02-12 11:34:00 +01:00
Lennart Poettering	1d9cc8768f	cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only Currently we allowed delegation for all units with cgroup backing except for slices. Let's make this a bit more strict for now, and only allow this in service and scope units. Let's also add a generic accessor unit_cgroup_delegate() for checking whether a unit has delegation turned on that checks the new bool first. Also, when doing transient units, let's explicitly refuse turning on delegation for unit types that don't support it. This is mostly cosmetical as we wouldn't act on the delegation request anyway, but certainly helpful for debugging.	2018-02-12 11:34:00 +01:00
Lennart Poettering	548f69375e	tree-wide: use path_hash_ops instead of string_hash_ops whenever we key by a path Let's make use of our new hash_ops!	2018-02-12 11:07:55 +01:00
Franck Bui	9ea3a0e702	core: use id unit when retrieving unit file state (#8038 ) Previous code was using the basename(id->fragment_path) which returned incorrect result if the unit was an instance. For example, assuming that no instances of "template" have been created so far: $ systemctl enable template@1 Created symlink from /etc/systemd/system/multi-user.target.wants/template@1.service to /usr/lib/systemd/system/template@.service. $ systemctl is-enabled template@3.service disabled $ systemctl status template@3.service ● template@3.service - openQA Worker #3 Loaded: loaded (/usr/lib/systemd/system/template@.service; enabled; vendor preset: disabled) [...] Here the unit file states reported by "status" and "is-enabled" were different.	2018-02-07 14:08:02 +01:00
Andrei Gherzan	3f602115b7	core: Avoid empty directory warning when we are bind-mounting a file (#8069 )	2018-02-06 16:35:52 +01:00
Yu Watanabe	e8a565cb66	core: make ExecRuntime be manager managed object Before this, each ExecRuntime object is owned by a unit. However, it may be shared with other units which enable JoinsNamespaceOf=. Thus, by the serialization/deserialization process, its sharing information, more specifically, reference counter is lost, and causes issue #7790. This makes ExecRuntime objects be managed by manager, and changes the serialization/deserialization process. Fixes #7790.	2018-02-06 16:00:34 +09:00
Lennart Poettering	81e9871e87	selinux: make sure we never use /dev/null for making unit selinux access decisions	2018-01-31 19:54:25 +01:00
Lennart Poettering	adefcf2821	core: rework how we count the n_on_console counter Let's add a per-unit boolean that tells us whether our unit is currently counted or not. This way it's unlikely we get out of sync again and things are generally more robust. This also allows us to remove the counting logic specific to service units (which was in fact mostly a copy from the generic implementation), in favour of fully generic code. Replaces: #7824	2018-01-24 20:14:51 +01:00
Lennart Poettering	bb2c768545	core: add a new unit_needs_console() call This call determines whether a specific unit currently needs access to the console. It's a fancy wrapper around exec_context_may_touch_console() ultimately, however for service units we'll explicitly exclude the SERVICE_EXITED state from when we report true.	2018-01-24 19:54:26 +01:00
Lennart Poettering	62a769136d	core: rework how we track which PIDs to watch for a unit Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit interested in SIGCHLD events for them. This scheme allowed a specific PID to be watched by exactly 0, 1 or 2 units. With this rework this is replaced by a single hashmap which is primarily keyed by the PID and points to a Unit interested in it. However, it is optionally also keyed by the negated PID, in which case it points to a NULL terminated array of additional Unit objects also interested. This scheme means arbitrary numbers of Units may now watch the same PID. Runtime and memory behaviour should not be impacted by this change, as for the common case (i.e. each PID only watched by a single unit) behaviour stays the same, but for the uncommon case (a PID watched by more than one unit) we only pay with a single additional memory allocation for the array. Why this all? Primarily, because allowing exactly two units to watch a specific PID is not sufficient for some niche cases, as processes can belong to more than one unit these days: 1. sd_notify() with MAINPID= can be used to attach a process from a different cgroup to multiple units. 2. Similarly, the PIDFile= setting in unit files can be used for similar setups, 3. By creating a scope unit a main process of a service may join a different unit, too. 4. On cgroupsv1 we frequently end up watching all processes remaining in a scope, and if a process opens lots of scopes one after the other it might thus end up being watched by many of them. This patch hence removes the 2-unit-per-PID limit. It also makes a couple of other changes, some of them quite relevant: - manager_get_unit_by_pid() (and the bus call wrapping it) when there's ambiguity will prefer returning the Unit the process belongs to based on cgroup membership, and only check the watch-pids hashmap if that fails. 
This change in logic is probably more in line with what people expect and makes things more stable as each process can belong to exactly one cgroup only. - Every SIGCHLD event is now dispatched to all units interested in its PID. Previously, there was some magic conditionalization: the SIGCHLD would only be dispatched to the unit if it was interested in a single PID only, or the PID belonged to the control or main PID or we didn't dispatch a single SIGCHLD to the unit in the current event loop iteration yet. These rules were quite arbitrary and also redundant as the per-unit handlers would filter the PIDs anyway a second time. With this change we'll hence relax the rules: all we do now is dispatch every SIGCHLD event exactly once to each unit interested in it, and it's up to the unit to then use or ignore this. We use a generation counter in the unit to ensure that we only invoke the unit handler once for each event, protecting us from confusion if a unit is both associated with a specific PID through cgroup membership and through the "watch_pids" logic. It also protects us from being confused if the "watch_pids" hashmap is altered while we are dispatching to it (which is a very likely case). - sd_notify() message dispatching has been reworked to be very similar to SIGCHLD handling now. A generation counter is used for dispatching as well. This also adds a new test that validates that "watch_pid" registration and unregistration works correctly.	2018-01-23 21:29:31 +01:00
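The "dispatch exactly once per event" rule from the PID-watching rework above can be illustrated with a generation counter. This is a minimal sketch under stated assumptions, not the code from the commit: UnitSketch, dispatch_once_sketch() and dispatch_event_sketch() are hypothetical names, and the caller is assumed to use a fresh non-zero generation value for each event.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a unit: remembers the generation of the
 * last event it handled and counts handler invocations. */
typedef struct UnitSketch {
        unsigned last_seen_generation; /* 0 means "no event seen yet" */
        unsigned n_dispatched;
} UnitSketch;

/* Invoke the (simulated) handler only if this unit has not yet seen
 * the event identified by 'generation'. */
static void dispatch_once_sketch(UnitSketch *u, unsigned generation) {
        assert(u);
        assert(generation != 0);

        if (u->last_seen_generation == generation)
                return; /* already handled in this dispatch round */

        u->last_seen_generation = generation;
        u->n_dispatched++;
}

/* Dispatch one event to every interested unit. The same unit may be
 * listed more than once, e.g. when it is found both through cgroup
 * membership and through a PID watch. */
static void dispatch_event_sketch(UnitSketch **units, size_t n, unsigned generation) {
        for (size_t i = 0; i < n; i++)
                dispatch_once_sketch(units[i], generation);
}
```

If a unit appears twice in the list passed to dispatch_event_sketch(), its counter still advances by only one per generation value, which mirrors the protection against duplicate invocation that the commit message describes.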
Alan Jenkins	25cd49647c	mount: forbid mount on path with symlinks It was forbidden to create mount units for a symlink. But the reason is that the mount unit needs to know the real path that will appear in /proc/self/mountinfo. The kernel dereferences all the symlinks in the path at mount time (I checked this with `mount -c` running under `strace`). This will have no effect on most systems. As recommended by docs, most systems use /etc/fstab, as opposed to native mount unit files. fstab-generator dereferences symlinks for backwards compatibility. A relatively minor issue regarding Time Of Check / Time Of Use also exists here. I can't see how to get rid of it entirely. If we pass an absolute path to mount, the racing process can replace it with a symlink. If we chdir() to the mount point and pass ".", the racing process can move the directory. The latter might potentially be nicer, except that it breaks WorkingDirectory=. I'm not saying the race is relevant to security - I just want to consider how bad the effect is. Currently, it can make the mount unit active (and hence the job return success), despite there never being a matching entry in /proc/self/mountinfo. This wart will be removed in the next commit; i.e. it will make the mount unit fail instead.	2018-01-20 22:06:34 +00:00
Lennart Poettering	75152a4d6a	tree-wide: install matches asynchronously Let's remove a number of synchronization points from our service startups: let's drop synchronous match installation, and let's opt for asynchronous instead. Also, let's use sd_bus_match_signal() instead of sd_bus_add_match() where we can.	2018-01-05 13:58:32 +01:00
Lennart Poettering	4c253ed1ca	tree-wide: introduce new safe_fork() helper and port everything over This adds a new safe_fork() wrapper around fork() and makes use of it everywhere. The new wrapper does a couple of things we previously did manually and separately in a safer, more correct and automatic way: 1. Optionally resets signal handlers/mask in the child 2. Sets a name on all processes we fork off right after forking off (and the patch assigns useful names for all processes we fork off now, following a systematic naming scheme: always enclosed in () – in order to indicate that these are not proper, exec()ed processes, but only forked off children, and if the process is long-running with only our own code, without execve()'ing something else, it gets an "sd-" prefix.) 3. Optionally closes all file descriptors in the child 4. Optionally sets a PR_SET_DEATHSIG to SIGTERM in the child, in a safe way so that the parent dying before this happens is handled safely. 5. Optionally reopens the logs 6. Optionally connects stdin/stdout/stderr to /dev/null 7. Debug logs about the forked off processes.	2017-12-25 11:48:21 +01:00
Lennart Poettering	a8ea93a5e2	core: use empty_to_null() where we can	2017-12-07 12:13:00 +01:00
Michal Koutný	deb4e7080d	service: Don't stop unneeded units needed by restarted service (#7526) An auto-restarted unit B may depend on unit A with StopWhenUnneeded=yes. If A stops before B's restart timeout expires, it'll be started again as part of B's dependent jobs. However, if stopping takes longer than the timeout, A's running stop job collides with the start job, which also cancels B's start job. The result is that neither A nor B is active. Currently, when a service with automatic restarting fails, it transitions through the following states: 1) SERVICE_FAILED or SERVICE_DEAD to indicate the failure, 2) SERVICE_AUTO_RESTART while the restart timer is running. The StopWhenUnneeded= check takes place in service_enter_dead between the two states mentioned above. We temporarily store the auto restart flag to query it during the check. Because we don't return control to the main event loop, this new service unit flag needn't be serialized. This patch prevents the pathological situation when the service with Restart= won't restart automatically. As a side effect it also avoids restarting the dependency unit with StopWhenUnneeded=yes. Fixes: #7377	2017-12-05 16:51:19 +01:00
Lennart Poettering	50fb00b707	core: use safe_fclose() where we can	2017-11-29 12:34:12 +01:00
Lennart Poettering	45639f1be5	core: never remove "transient" and "control" directories from unit search path This changes the unit search path logic to never drop the transient and control directories from the unit search path. This is necessary as we add new entries to both during runtime, due to the "systemctl set-property" and transient unit logic. Previously, the "transient" directory was created during early boot to deal with this, but the "control" directories were not covered like that. Creating the control directories early at boot is not possible however, as /etc might be read-only then, and we do define a persistent control directory. Hence, let's create these dirs on-demand when we need them, and make sure the search path clean-up logic never drops them from the search path even if they are initially missing. (Also, always create these paths properly labelled)	2017-11-29 12:34:12 +01:00
Lennart Poettering	0126c8f3f6	core: minor simplification	2017-11-29 12:34:12 +01:00
Lennart Poettering	2e59b241ca	core: add proper escaping to writing of drop-ins/transient unit files This majorly refactors the transient unit file and drop-in writing logic, so that we properly C-escape and specifier-escape (% → %%) everything we write out, so that when we read it back again, no specifiers are parsed that aren't supposed to be parsed. This renames unit_write_drop_in() and friends to unit_write_setting(). The name change is supposed to clarify that the functions are not only used to write drop-in files, but also transient unit files. The previous "mode" parameter to this function is replaced by a more generic "flags", which knows additional flags for implicit C-style and specifier escaping before writing things out. This can cover most properties where either form of escaping is defined. For the cases where this isn't sufficient, we add helpers unit_escape_setting() and unit_concat_strv() for escaping individual strings or strvs properly. While we are at it, we also prettify generation of transient unit files: we try to reduce the number of section headers written out: previously we'd write the right section header out for each setting. With this change we do so only if the setting lives in a different section than the one before. (This should also be considered preparation for when we add proper APIs to systemd to write normal, persistent unit files through the bus API)	2017-11-29 12:34:12 +01:00
Lennart Poettering	a4634b214c	core: warn about left-over processes in cgroup on unit start Now that we don't kill control processes anymore, let's at least warn about any processes left-over in the unit cgroup at the moment of starting the unit.	2017-11-25 17:08:21 +01:00
Lennart Poettering	e98b2fbbe9	core: generalize the cgroup empty check on GC Let's move the cgroup empty check for all unit types into the generic unit_check_gc() call, out of the per-unit-type _check_gc() type. This not only allows us to share some code, but also hooks up mount and socket units with this kind of check, for free, as it was missing there previously.	2017-11-25 17:08:21 +01:00
Lennart Poettering	60c728adf7	unit: initialize bpf cgroup realization state properly Before this patch, the bpf cgroup realization state was implicitly set to "NO", meaning that the bpf configuration was realized but was turned off. That means invalidation requests for the bpf stuff (which we issue in blanket fashion when doing a daemon reload) would actually later result in us re-realizing the unit, under the assumption it was already realized once, even though in reality it never was realized before. This had the effect that after each daemon-reload we'd end up realizing all defined units, even the unloaded ones, populating cgroupfs with lots of unneeded empty cgroups. With this fix we properly set the realization state to "INVALIDATED", i.e. indicating the bpf stuff was never set up for the unit, and hence when we try to invalidate it later we won't do anything.	2017-11-25 17:08:21 +01:00
Daniel Lockyer	a7419dbc59	Replace use of snprintf with xsprintf	2017-11-24 10:36:04 +00:00
Zbigniew Jędrzejewski-Szmek	ffb70e4424	Merge pull request #7381 from poettering/cgroup-unified-delegate-rework Fix delegation in the unified hierarchy + more cgroup work	2017-11-22 07:42:08 +01:00
Lennart Poettering	3c7416b6ca	core: unify common code for preparing for forking off unit processes This introduces a new function unit_prepare_exec() that encapsulates a number of calls we do in preparation for spawning off some processes in all our unit types that do so. This allows us to neatly unify a bit of code between unit types and shorten our code.	2017-11-21 11:54:08 +01:00
Lennart Poettering	e7dfbb4e74	core: introduce SuccessAction= as unit file property SuccessAction= is similar to FailureAction= but declares what to do on success of a unit, rather than on failure. This is useful for running commands in qemu/nspawn images, that shall power down on completion. We frequently see "ExecStopPost=/usr/bin/systemctl poweroff" or so in unit files like this. Offer a simple, more declarative alternative for this. While we are at it, hook up failure action with unit_dump() and transient units too.	2017-11-20 16:37:22 +01:00
Lennart Poettering	53c35a766f	core: generalize FailureAction= move it from service to unit All kinds of units can fail, hence it makes sense to offer this as generic concept for all unit types.	2017-11-20 16:37:22 +01:00
Lennart Poettering	0133d5553a	Merge pull request #7198 from poettering/stdin-stdout Add StandardInput=data, StandardInput=file:... and more	2017-11-19 19:49:11 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	99be45a46f	fs-util: rename path_is_safe() → path_is_normalized() Already, path_is_safe() refused paths containing the "." dir. Doing that isn't strictly necessary to be "safe" by most definitions of the word. But it is necessary in order to consider a path "normalized". Hence, "path_is_safe()" is slightly misleading a name, but "path_is_normalized()" is more descriptive, hence let's rename things accordingly. No functional changes.	2017-11-17 11:13:44 +01:00
Lennart Poettering	5afe510c89	core: add a new unit file setting CollectMode= for tweaking the GC logic Right now, the option only takes one of two possible values "inactive" or "inactive-or-failed", the former being the default, and exposing same behaviour as the status quo ante. If set to "inactive-or-failed" units may be collected by the GC logic when in the "failed" state too. This logic should be a nicer alternative to using the "-" modifier for ExecStart= and friends, as the exit data is collected and logged about and only removed when the GC comes along. This should be useful in particular for per-connection socket-activated services, as well as "systemd-run" command lines that shall leave no artifacts in the system. I was thinking about whether to expose this as a boolean, but opted for an enum instead, as I have the suspicion other tweaks like this might be added later on, in which case we extend this setting instead of having to add yet another one. Also, let's add some documentation for the GC logic.	2017-11-16 14:38:36 +01:00
Lennart Poettering	7eb2a8a125	unit: rework a bit how we keep the service fdstore from being destroyed during service restart When preparing for a restart we quickly go through the DEAD/INACTIVE service state before entering AUTO_RESTART. When doing this, we need to make sure we don't destroy the FD store. Previously this was done by checking the failure state of the unit, and keeping the FD store around when the unit failed, under the assumption that the restart logic will then get into action. This is not entirely correct however, as there might be failure states that will not result in restarts. With this commit we slightly alter the logic: a ref counter for the fd store is added, that is increased right before we handle the restart logic, and decreased again right after. This should ensure that the fdstore lives exactly as long as it needs. Follow-up for `f0bfbfac43`.	2017-11-16 14:37:33 +01:00


675 commits