Systemd

Commit Graph

Author	SHA1	Message	Date
Lennart Poettering	8559b3b75c	core: rework how we connect to the bus This removes the current bus_init() call, as it had multiple problems: it munged handling of the three bus connections we care about (private, "api" and system) into one, even though the conditions under which each is ready are very different. It also added redundant logging, as the individual calls it called all logged on their own anyway. The three calls bus_init_api(), bus_init_private() and bus_init_system() are now made public. A new call manager_dbus_is_running() is added that works much like manager_journal_is_running() and is a lot more careful when checking whether dbus is around. Optionally it checks the unit's deserialized_state rather than state, in order to accommodate cases where we want to connect to the bus before deserializing the "subscribed" list, before coldplugging the units. manager_recheck_dbus() is added, that works a lot like manager_recheck_journal() and is invoked in unit_notify(), i.e. when units change state. All in all this should make handling a bit more alike to journal handling, and it also fixes one major bug: when running in user mode we'll now connect to the system bus early on, without conditionalizing this in any way.	2018-02-12 11:34:00 +01:00
Lennart Poettering	004c7f169e	core: fold manager_set_exec_params() into unit_set_exec_params() Let's simplify things a bit: we so far called both functions every single time, let's just merge one into the other, so that we have fewer functions to call.	2018-02-12 11:34:00 +01:00
Yu Watanabe	e8a565cb66	core: make ExecRuntime be manager managed object Before this, each ExecRuntime object is owned by a unit. However, it may be shared with other units which enable JoinsNamespaceOf=. Thus, by the serialization/deserialization process, its sharing information, more specifically, reference counter is lost, and causes issue #7790. This makes ExecRuntime objects be managed by manager, and changes the serialization/deserialization process. Fixes #7790.	2018-02-06 16:00:34 +09:00
Alan Jenkins	cc2b9e6b20	rationalize interface for opening/closing logging log_open_console() did not switch from stderr to /dev/console, when "always_reopen_console" was set. It was necessary to call log_close_console() first. By contrast, log_open() did switch between e.g. journald and kmsg according to the value of "prohibit_ipc". Let's fix log_open() to respect the values of all the log options, and we can make log_close_*() private. Also log_close_console() is changed. There was some precaution, avoiding closing the console fd if we are not PID 1. I think commit `48a601fe` made a little mistake in leaving this in, and it only served to confuse readers :). Also I changed systemd-shutdown. Now we have log_set_prohibit_ipc(), let's use it to clarify that systemd-shutdown is not expected to try and log via journald (which it is about to kill). We avoided ever asking it to, but it's more convenient for the reader if they don't have to think about that. In that sense, it's similar to using assert() to validate a function's arguments.	2018-01-27 18:01:51 +00:00
Alan Jenkins	ba30753899	pid1: when we can't log to journal, remember our fallback log target If we have to force the logging to close the journal fd, then we can open any fallback log target. E.g. kmsg, if the target was the default JOURNAL_OR_KMSG. This is the behaviour I would expect from the documentation. I couldn't find any justification in the code, for why we would want to start dropping log messages instead of sending them to the fallback target. This means we will match the behaviour of processes which we fork and which set `open_when_needed`, and with generators - which use log_set_prohibit_ipc(true) - which we fork+exec during a reload. IMO this illustrates that the log_open/log_close interface is too clunky. So with the behaviour settled, I will refactor the interface in the next commit :).	2018-01-26 22:47:16 +00:00
Zbigniew Jędrzejewski-Szmek	dc3c9f5e36	core: initialize buffer	2018-01-26 00:59:23 +09:00
Yu Watanabe	dd1db3c288	core: manager logs firmware and loader time when startup finished	2018-01-26 00:59:20 +09:00
Zbigniew Jędrzejewski-Szmek	5eb83fa645	Merge pull request #7991 from poettering/n-on-console a comprehensive fix for the n_on_console miscounting issue	2018-01-25 13:48:08 +03:00
Lennart Poettering	adefcf2821	core: rework how we count the n_on_console counter Let's add a per-unit boolean that tells us whether our unit is currently counted or not. This way it's unlikely we get out of sync again and things are generally more robust. This also allows us to remove the counting logic specific to service units (which was in fact mostly a copy from the generic implementation), in favour of fully generic code. Replaces: #7824	2018-01-24 20:14:51 +01:00
Lennart Poettering	46fb617bf9	manager: minor manager_get_show_status() simplification Since the whole function ultimately is just a fancy getter for the show_status field, let's actually return it as last step literally without an extra needless "if".	2018-01-24 19:52:29 +01:00
Lennart Poettering	5a69973ff2	manager: add some explanatory comments to manager_dispatch_idle_pipe_fd()	2018-01-24 19:52:14 +01:00
Lennart Poettering	d075092f14	pid1: make use of new "prohibit_ipc" logging flag in PID 1 Let's set it initially, and then toggle it only when we know it's safe.	2018-01-24 18:22:56 +01:00
Lennart Poettering	62a769136d	core: rework how we track which PIDs to watch for a unit Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit interested in SIGCHLD events for them. This scheme allowed a specific PID to be watched by exactly 0, 1 or 2 units. With this rework this is replaced by a single hashmap which is primarily keyed by the PID and points to a Unit interested in it. However, it is optionally also keyed by the negated PID, in which case it points to a NULL terminated array of additional Unit objects also interested. This scheme means arbitrary numbers of Units may now watch the same PID. Runtime and memory behaviour should not be impacted by this change, as for the common case (i.e. each PID only watched by a single unit) behaviour stays the same, but for the uncommon case (a PID watched by more than one unit) we only pay with a single additional memory allocation for the array. Why this all? Primarily, because allowing exactly two units to watch a specific PID is not sufficient for some niche cases, as processes can belong to more than one unit these days: 1. sd_notify() with MAINPID= can be used to attach a process from a different cgroup to multiple units. 2. Similarly, the PIDFile= setting in unit files can be used for similar setups. 3. By creating a scope unit a main process of a service may join a different unit, too. 4. On cgroupsv1 we frequently end up watching all processes remaining in a scope, and if a process opens lots of scopes one after the other it might thus end up being watched by many of them. This patch hence removes the 2-unit-per-PID limit. It also makes a couple of other changes, some of them quite relevant: - manager_get_unit_by_pid() (and the bus call wrapping it) when there's ambiguity will prefer returning the Unit the process belongs to based on cgroup membership, and only check the watch-pids hashmap if that fails. This change in logic is probably more in line with what people expect and makes things more stable as each process can belong to exactly one cgroup only. - Every SIGCHLD event is now dispatched to all units interested in its PID. Previously, there was some magic conditionalization: the SIGCHLD would only be dispatched to the unit if it was interested in a single PID only, or the PID belonged to the control or main PID or we didn't dispatch a single SIGCHLD to the unit in the current event loop iteration yet. These rules were quite arbitrary and also redundant as the per-unit handlers would filter the PIDs anyway a second time. With this change we'll hence relax the rules: all we do now is dispatch every SIGCHLD event exactly once to each unit interested in it, and it's up to the unit to then use or ignore this. We use a generation counter in the unit to ensure that we only invoke the unit handler once for each event, protecting us from confusion if a unit is both associated with a specific PID through cgroup membership and through the "watch_pids" logic. It also protects us from being confused if the "watch_pids" hashmap is altered while we are dispatching to it (which is a very likely case). - sd_notify() message dispatching has been reworked to be very similar to SIGCHLD handling now. A generation counter is used for dispatching as well. This also adds a new test that validates that "watch_pid" registration and unregistration works correctly.	2018-01-23 21:29:31 +01:00
Lennart Poettering	575b300b79	pid1: rework how we dispatch SIGCHLD and other signals This fundamentally makes one change: we never process more than one signal or more than one waitid() event per event loop. We'll never tight loop around waitid() or around read() on our signalfd instead, but always return to the main event loop after processing one event. By doing this we put the event prioritization handling into full power again, as we'll always check for higher priority events before looking at the next signal or waitid() again. This introduces a new "defer" event source "sigchld_event". It's enabled as soon as we see SIGCHLD, and disabled as soon as waitid() reported no further children pending. It's running at a relatively high priority, one step higher than signal handling itself, but lower than /proc/self/mountinfo event handling, so that the latter always takes precedence. Since we want to process sd_notify() events at an even higher priority than SIGCHLD (as before) it is moved one priority step up, too. Fixes: #7932 Possibly fixes: #7966	2018-01-23 18:41:40 +01:00
Lennart Poettering	67ae4e8d59	core: move user lookup event priority to -11 This is internal stuff, us talking to ourselves and relatively independent of everything else, let's put this at highest priority hence.	2018-01-23 18:15:16 +01:00
Lennart Poettering	4259d20215	manager: add MANAGER_IS_RUNNING() for checking whether the manager is running This macro is useful as the check is not obvious, and we better abstract this away.	2018-01-23 16:43:56 +01:00
Lennart Poettering	4adf314b77	manager: split out send_ready and basic.target checking into functions of their own Let's shorten manager_check_finished() a bit by splitting out checking of basic.target and the two things we do when we reach it. This should not change behaviour, except for one thing: we now check basic.target's actual state for figuring out whether it is up, instead of generically checking whether it has any job queued. This is arguably more correct, and is what other code does too for similar purposes, for example manager_state()	2018-01-23 16:39:12 +01:00
Jan Klötzke	2a12e32efa	pid1: add option to disable service watchdogs Add a "systemd.service_watchdogs=" option to the command line which disables all service runtime watchdogs and emergency actions.	2018-01-22 18:10:03 +01:00
Zbigniew Jędrzejewski-Szmek	d8eb10d61a	core: delay logging the taint string until after basic.target is reached (#7935) This happens to be almost the same moment as when we send READY=1 in the user instance, but the logic is slightly different, since we log taint when basic.target is reached in the system manager, but we send the notification only in the user manager. So add a separate flag for this and propagate it across reloads. Fixes #7683.	2018-01-21 21:17:54 +09:00
Lennart Poettering	db256aab13	core: be stricter when handling PID files and MAINPID sd_notify() messages Let's be more restrictive when validating PID files and MAINPID= messages: don't accept PIDs that make no sense, and if the configuration source is not trusted, don't accept out-of-cgroup PIDs. A configuration source is considered trusted when the PID file is owned by root, or the message was received from root. This should lock things down a bit, in case service authors write out PID files from unprivileged code or use NotifyAccess=all with unprivileged code. Note that doing so was always problematic, just now it's a bit less problematic. When we open the PID file we'll now use the CHASE_SAFE chase_symlinks() logic, to ensure that we won't follow an unprivileged-owned symlink to a privileged-owned file thinking this was a valid privileged PID file, even though it really isn't. Fixes: #6632	2018-01-11 15:12:16 +01:00
Lennart Poettering	15e23e8cdf	manager: make use of pid_is_valid() where appropriate	2018-01-11 15:12:16 +01:00
Lennart Poettering	007e4b5490	manager: make use of NEWLINE macro where appropriate	2018-01-11 15:12:16 +01:00
Lennart Poettering	da5fb86100	manager: swap order in which we ellipsize/escape sd_notify() messages for debugging If we have to choose between truncated escape sequences and strings exploded to 4 times the desired length by fully escaping, prefer the latter. It's for debug only, hence doesn't really matter much.	2018-01-11 15:12:16 +01:00
Lennart Poettering	47cf8ff206	manager: rework manager_clean_environment() Let's rename it manager_sanitize_environment() which is a more precise name. Moreover, sort the environment implicitly inside it, as all our callers do that anyway afterwards and we can save some code this way. Also, update the list of env vars to drop, i.e. the env vars we manage ourselves and don't want user code to interfere with. Also sort this list to make it easier to update later on.	2018-01-10 18:30:06 +01:00
Lennart Poettering	665dfe9318	io-util: make flush_fd() return how many bytes were flushed This is useful so that callers know whether anything at all and how much was flushed. This patches through users of this function to ensure that the return values > 0 which may be returned now are not propagated in public APIs. Also, users that ignore the return value are changed to do so explicitly now.	2018-01-05 13:55:08 +01:00
Lennart Poettering	f1d34068ef	tree-wide: add DEBUG_LOGGING macro that checks whether debug logging is on (#7645) This makes things a bit easier to read I think, and also makes sure we always use the _unlikely_ wrapper around it, which so far we used sometimes and other times we didn't. Let's clean that up.	2017-12-15 11:09:00 +01:00
Lennart Poettering	e3140015a7	Merge pull request #7640 from keszybz/tainting-updates Tainting updates	2017-12-14 22:57:17 +01:00
Zbigniew Jędrzejewski-Szmek	198ce93248	core: drop taints for nobody user/group names We have a check and warning at compile time. The user cannot do anything about this at runtime, and all other taints are about checks that happen at runtime and are specific to that system (and at least potentially correctable). (The logic in the compilation-time check was updated to treat "nogroup" as OK, but not the runtime check. But I think it's better to remove the runtime check for this altogether, so this becomes moot.)	2017-12-14 22:14:38 +01:00
Lennart Poettering	fbd0b64f44	tree-wide: make use of new STRLEN() macro everywhere (#7639) Let's employ coccinelle to do this for us. Follow-up for #7625.	2017-12-14 19:02:29 +01:00
Lennart Poettering	0d53667334	tree-wide: use __fsetlocking() instead of fxyz_unlocked() Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f, FSETLOCKING_BYCALLER). This turns off locking for the entire FILE, instead of doing individual per-call decision whether to use normal calls or _unlocked() calls. This has various benefits: 1. It's easier to read and easier not to forget 2. It's more comprehensive, as fprintf() and friends are covered too (as these functions have no _unlocked() counterpart) 3. Philosophically, it's a bit more correct, because it's more a property of the file handle really whether we ever pass it on to another thread, not of the operations we then apply to it. This patch reworks all pieces of code that so far used fxyz_unlocked() calls to use __fsetlocking() instead. It also reworks all places that use open_memstream(), i.e. use stdio FILE for string manipulations. Note that this is in some way a revert of `4b61c87511`.	2017-12-14 10:42:25 +01:00
Alan Jenkins	0fd402b012	core: fix undefined behaviour due to uninitialized string buffer (#7597) Failure of systemd to respond on the bus interface was bisected to `af6b0ecc` "core: make "taint" string logic a bit more generic and output it at boot". Failure was presumably caused by trying to append strings to an uninitialized buffer, leading to writing outside the unterminated buffer and hence undefined behaviour.	2017-12-10 19:58:01 +09:00
Zbigniew Jędrzejewski-Szmek	ba60adc623	Merge pull request #7572 from poettering/taint-manager "taint" logic improvements and other minor fixes	2017-12-07 21:06:28 +01:00
Lennart Poettering	90d7464d83	manager: taint the manager if the overflowuid/overflowgid aren't set to 65534	2017-12-07 12:34:46 +01:00
Lennart Poettering	af6b0ecc4c	core: make "taint" string logic a bit more generic and output it at boot The tainting logic existed for a long time, but was hidden inside the bus interfaces. Let's give it a small bit more coverage, by logging its value early at boot during initialization.	2017-12-07 11:27:07 +01:00
Lennart Poettering	e27fe688f2	manager: don't check /usr state of initrd to determine "taint-usr" taint	2017-12-07 11:09:09 +01:00
Lennart Poettering	5eb397cfad	manager: don't bother with creating /run/systemd/units/ in test mode This makes sure running "systemd --test" works again on systems running older systemd versions where the dir doesn't exist yet.	2017-12-07 11:07:55 +01:00
Lennart Poettering	279d81dd46	manager: split out code that sets up run_queue event source into function of its own Let's shorten manager_new() a bit.	2017-12-07 11:02:47 +01:00
Lennart Poettering	45639f1be5	core: never remove "transient" and "control" directories from unit search path This changes the unit search path logic to never drop the transient and control directories from the unit search path. This is necessary as we add new entries to both during runtime, due to the "systemctl set-property" and transient unit logic. Previously, the "transient" directory was created during early boot to deal with this, but the "control" directories were not covered like that. Creating the control directories early at boot is not possible however, as /etc might be read-only then, and we do define a persistent control directory. Hence, let's create these dirs on-demand when we need them, and make sure the search path clean-up logic never drops them from the search path even if they are initially missing. (Also, always create these paths properly labelled)	2017-11-29 12:34:12 +01:00
Lennart Poettering	45a7b16bae	core: don't reference rescue/emergency targets in --user mode They are only defined for system mode, hence let's not check for them in --user mode. Follow-up for #7433	2017-11-29 12:34:12 +01:00
Yu Watanabe	706424c2e2	core/manager: check the existence of the special units (#7433) In the user mode, not all special units exist. So, we need to check whether the units exist or not before operating on the units. Such a check was mistakenly dropped by `e68537f0ba`. Fixes #7426.	2017-11-23 13:25:56 +01:00
Zbigniew Jędrzejewski-Szmek	bfbcf21d75	Merge pull request #7406 from poettering/timestamp-rework timestamping rework	2017-11-22 11:55:04 +01:00
Lennart Poettering	e68537f0ba	core: make use of unit_active_or_pending() where we can Let's make use of unit_active_or_pending() where we can. Note that this change changes behaviour in one specific case: when shutdown.target is active we'll now also return that the system is in "stopping" state, not only when we try to get into it. That makes sense as shutdown.target is ordered before the actual shutdown units such as "systemd-poweroff.service", and if the state is queried between reaching those we should also report "stopping".	2017-11-21 11:01:34 +01:00
Lennart Poettering	49d5666cc5	manager: introduce MANAGER_IS_FINISHED() macro Let's make our finished checks a bit more readable. Checking the timestamp is not entirely obvious, hence let's abstract that a bit by adding a macro that shows what we are doing here, not how we are doing it. This is particularly useful if we want to change the definition of "finished" later on, in particular, when we try to fix #7023.	2017-11-21 11:01:34 +01:00
Lennart Poettering	713f6f901d	manager: add manager_get_dump_string() It's like manager_dump(), but returns a string. This allows us to reduce some duplicate code. Also, while we are at it, turn off stdio locking while we write to the memory FILE *f.	2017-11-21 11:01:34 +01:00
Lennart Poettering	ad75b9e765	core: add manager_dump() call, and make it output timestamp data It's a wrapper around manager_dump_units() and manager_dump_jobs(), and outputs some additional timestamp data. Also, port two users of this over.	2017-11-21 10:22:28 +01:00
Lennart Poettering	9f9f034271	manager: rework the timestamps logic, so that they are an enum-index array This makes things quite a bit more systematic I think, as we can systematically operate on all timestamps, for example for the purpose of serialization/deserialization. This rework doesn't necessarily make things shorter in the individual lines, but it does reduce the line count a bit. (This is useful particularly when we want to add additional timestamps, for example to solve #7023)	2017-11-21 10:22:28 +01:00
Shawn Landden	4831981d89	tree-wide: adjust fall through comments so that gcc is happy Distcc removes comments, making the comment silencing not work. I know there was a decision against a macro in commit `ec251fe7d5`	2017-11-20 13:06:25 -08:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	fd1306121d	core: never apply first boot presets in the initrd Presets are useful to initialize uninitialized /etc, but that doesn't apply to the initrd. Also, let's rename etc_empty → first_boot. After all, the variable doesn't actually reflect whether /etc is really empty, it just reflects whether /etc/machine-id existed originally or not. Moreover, we later on directly initialize manager_set_first_boot() from it, hence let's just name it the same way all through the codepath, to make this all less confusing. See: #7100	2017-11-17 11:28:17 +01:00
Lennart Poettering	d3070fbdf6	core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald And let's make use of it to implement two new unit settings with it: 1. LogLevelMax= is a new per-unit setting that may be used to configure log priority filtering: set it to LogLevelMax=notice and only messages of level "notice" and lower (i.e. more important) will be processed, all others are dropped. 2. LogExtraFields= is a new per-unit setting for configuring per-unit journal fields, that are implicitly included in every log record generated by the unit's processes. It takes field/value pairs in the form of FOO=BAR. Also, related to this, one existing unit setting is ported to this new facility: 3. The invocation ID is now pulled from /run/systemd/units/ instead of cgroupfs xattrs. This substantially relaxes requirements of systemd on the kernel version and the privileges it runs with (specifically, cgroupfs xattrs are not available in containers, since they are stored in kernel memory, and hence are unsafe to permit to lesser privileged code). /run/systemd/units/ is a new directory, which contains a number of files and symlinks encoding the above information. PID 1 creates and manages these files, and journald reads them from there. Note that this is supposed to be a direct path between PID 1 and the journal only, due to the special runtime environment the journal runs in. Normally, today we shouldn't introduce new interfaces that (mis-)use a file system as IPC framework, and instead just use an IPC system, but this is very hard to do between the journal and PID 1, as long as the IPC system is a subject PID 1 manages, and itself a client to the journal. This patch cleans up a couple of types used in journal code: specifically we switch to size_t for a couple of memory-sizing values, as size_t is the right choice for everything that is memory. Fixes: #4089 Fixes: #3041 Fixes: #4441	2017-11-16 12:40:17 +01:00
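
The LogLevelMax= rule described in commit d3070fbdf6 (only messages at the configured level or "lower", i.e. more important, are processed) can be illustrated with a small sketch. This is not systemd's or journald's actual code: the function names `log_level_passes()` and `log_level_example()` are hypothetical, and the sketch only assumes the standard syslog priority numbering from `<syslog.h>`, where a smaller number means a more important message.

```c
#include <assert.h>
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical helper, not taken from systemd: returns true if a message
 * with the given syslog priority would pass a LogLevelMax=-style threshold.
 * Syslog priorities run from LOG_EMERG (0, most important) to
 * LOG_DEBUG (7, least important), so "lower" means "more important". */
bool log_level_passes(int priority, int max_level) {
        /* LOG_PRI() strips any facility bits and leaves only the level. */
        return LOG_PRI(priority) <= max_level;
}

/* Hypothetical usage example for a threshold equivalent to
 * LogLevelMax=notice. */
void log_level_example(void) {
        assert(log_level_passes(LOG_ERR, LOG_NOTICE));     /* more important: kept */
        assert(log_level_passes(LOG_NOTICE, LOG_NOTICE));  /* equal: kept */
        assert(!log_level_passes(LOG_INFO, LOG_NOTICE));   /* less important: dropped */
        assert(!log_level_passes(LOG_DEBUG, LOG_NOTICE));  /* less important: dropped */
}
```

With the threshold at "notice", messages at "err" and "notice" are kept while "info" and "debug" are dropped, which matches the behaviour the commit message describes.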


517 Commits