Systemd

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	c70cac548a	Introduce _cleanup_(manager_freep)	2018-03-11 16:33:57 +01:00
Yu Watanabe	a1d32bac2a	Revert "core: don't setup init.scope in test mode (#8380)" (#8390) This reverts commit `a9e8ecf037`, as it breaks test-path. Fixes #8389.	2018-03-08 15:29:19 +09:00
Michal Sekletar	a9e8ecf037	core: don't setup init.scope in test mode (#8380) Reproducer: $ meson build && cd build $ ninja $ sudo useradd test $ sudo su test $ ./systemd --system --test ... Failed to create /user.slice/user-1000.slice/session-6.scope/init.scope control group: Permission denied Failed to allocate manager object: Permission denied Above error message is caused by the fact that user test didn't have its own session and we tried to set up init.scope already running as user test in the directory owned by different user. Let's skip setting up init.scope altogether since we won't be launching processes anyway.	2018-03-07 16:41:41 +01:00
Lennart Poettering	e0a085811d	core: don't process dbus unit and job queue when there are already too many messages pending We maintain a queue of units and jobs that we are supposed to generate change/new notifications for because they were either just created or some of their property has changed. Let's throttle processing of this queue a bit: as soon as > 1K of bus messages are queued for writing let's skip processing the queue, and then recheck on the next iteration again. Moreover, never process more than 100 units in one go, return to the event loop after that. Both limits together should put effective limits on both space and time usage of the function, delaying further operations until a later moment, when the queue is empty or the event loop is sufficiently idle again. This should keep the number of generated messages much lower than before on busy systems or where some client is hanging. Note that this also means a bad client can slow down message dispatching substantially for up to 90s if it likes to, for all clients. But that should be acceptable as we only allow trusted bus clients, anyway. Fixes: #8166	2018-02-27 19:54:29 +01:00
Lennart Poettering	30663b6c25	Merge pull request #8199 from keszybz/small-things Sundry small cleanups	2018-02-19 16:55:10 +01:00
Zbigniew Jędrzejewski-Szmek	9ecdba8cb7	Move config_parse_join_controllers to shared, add test config_parse_join_controllers would free the destination argument on failure, which is contrary to our normal style, where failed parsing has no effect. Moving it to shared also allows a test to be added.	2018-02-19 15:02:13 +01:00
Lennart Poettering	a94ab7acfd	Merge pull request #8175 from keszybz/gc-cleanup Garbage collection cleanup	2018-02-15 17:47:37 +01:00
Lennart Poettering	476a8618fc	Merge pull request #8150 from poettering/memory-accounting-by-default pid1: turn memory accounting on by default now	2018-02-15 17:22:36 +01:00
Zbigniew Jędrzejewski-Szmek	648461c07d	Merge pull request #8125 from poettering/cgroups-migrate Trivial merge conflict resolved locally.	2018-02-15 16:15:45 +01:00
Zbigniew Jędrzejewski-Szmek	2ab3050f6e	pid1: rename job_check_gc to job_may_gc The reasoning is the same as for unit_can_gc. v2: - rename can_gc to may_gc	2018-02-15 14:09:40 +01:00
Zbigniew Jędrzejewski-Szmek	2641f02e23	pid1: fix collection of cycles of units which reference one another A .socket will reference a .service unit, by registering a UnitRef with the .service unit. If this .service unit has the .socket unit listed in Wants or Sockets or such, a cycle will be created. We would not free this cycle properly, because we treated any unit with non-empty refs as uncollectable. To solve this issue, treat refs with UnitRef in u->refs_by_target similarly to the refs in u->dependencies, and check if the "other" unit is known to be needed. If it is not needed, do not treat the reference from it as preventing the unit we are looking at from being freed.	2018-02-15 13:32:53 +01:00
Zbigniew Jędrzejewski-Szmek	f2f725e5cc	pid1: rename unit_check_gc to unit_may_gc "check" is unclear: what is true, what is false? Let's rename to "can_gc" and invert the return value ("positive" values are easier to grok). v2: - rename from unit_can_gc to unit_may_gc	2018-02-15 13:04:12 +01:00
Zbigniew Jędrzejewski-Szmek	444d586333	meson: add -Dmemory-accounting-default=true|false This makes it easy to set the default for distributions and users which want to default to off because they primarily use older kernels.	2018-02-15 12:02:41 +01:00
Zbigniew Jędrzejewski-Szmek	04a5236233	Merge pull request #8144 from poettering/journal-inotify-fixes various journal fixes	2018-02-14 13:52:17 +01:00
Alan Jenkins	8afabc5090	manager: avoid infinite loop for unexpected waitid() error (#8168) I think if we log the error as being _ignored_, we should also consider the event as handled and clear it. This was the behaviour prior to `575b300b` (PR #7968). I don't think we particularly wanted to change behaviour and keep retrying. Sometimes that's useful, other times you cause more problems by filling the logs. Plus a nearby typo fix.	2018-02-13 19:04:31 +01:00
Lennart Poettering	5f109056d5	core: delay bus name synchronization after reload/reexec into a later event loop iteration Previously, we'd synchronize bus names immediately when we succeeded connecting to the bus, potentially even before coldplugging the units. This was problematic, as synchronizing bus names meant invoking the per-unit name change handler function which might change the unit's state, which will result in inconsistency when done before we coldplug things. With this change we instead enqueue a job for the event loop to resync the names in a later loop iteration, i.e. at a point where we know coldplugging has finished.	2018-02-12 11:34:00 +01:00
Lennart Poettering	cedf508886	core: simplify manager_recheck_journal() a bit No need for an if check if we just pass along a bool anyway.	2018-02-12 11:34:00 +01:00
Lennart Poettering	217677abb0	core: tweak manager_journal_is_running() a bit more Let's also use the journal if it is currently reloading. In that state it should also be able to process our requests. Moreover, we might otherwise end up disconnecting/reconnecting from the journal without really any need to. Hence, relax the check accordingly.	2018-02-12 11:34:00 +01:00
Lennart Poettering	7d814a197a	manager: tweak manager_journal_is_running() a bit regarding test mode In test mode, let's not consider the journal to be up ever: we want all output to go to stderr.	2018-02-12 11:34:00 +01:00
Lennart Poettering	8559b3b75c	core: rework how we connect to the bus This removes the current bus_init() call, as it had multiple problems: it munged handling of the three bus connections we care about (private, "api" and system) into one, even though the conditions when which was ready are very different. It also added redundant logging, as the individual calls it called all logged on their own anyway. The three calls bus_init_api(), bus_init_private() and bus_init_system() are now made public. A new call manager_dbus_is_running() is added that works much like manager_journal_is_running() and is a lot more careful when checking whether dbus is around. Optionally it checks the unit's deserialized_state rather than state, in order to accommodate cases where we want to connect to the bus before deserializing the "subscribed" list, before coldplugging the units. manager_recheck_dbus() is added, that works a lot like manager_recheck_journal() and is invoked in unit_notify(), i.e. when units change state. All in all this should make handling a bit more alike to journal handling, and it also fixes one major bug: when running in user mode we'll now connect to the system bus early on, without conditionalizing this in any way.	2018-02-12 11:34:00 +01:00
Lennart Poettering	004c7f169e	core: fold manager_set_exec_params() into unit_set_exec_params() Let's simplify things a bit: we so far called both functions every single time, let's just merge one into the other, so that we have fewer functions to call.	2018-02-12 11:34:00 +01:00
Lennart Poettering	548f69375e	tree-wide: use path_hash_ops instead of string_hash_ops whenever we key by a path Let's make use of our new hash_ops!	2018-02-12 11:07:55 +01:00
Lennart Poettering	e0c46a7364	pid1: turn memory accounting on by default now After discussions with @htejun it appears it's OK now to enable memory accounting by default for all units without affecting system performance too badly. facebook has made good experiences with deploying memory accounting across their infrastructure. This hence turns MemoryAccounting= from opt-in to opt-out, similar to how TasksAccounting= is already handled. The other accounting options remain off, their performance impact is too big still.	2018-02-09 20:06:33 +01:00
Yu Watanabe	e8a565cb66	core: make ExecRuntime be manager managed object Before this, each ExecRuntime object is owned by a unit. However, it may be shared with other units which enable JoinsNamespaceOf=. Thus, by the serialization/deserialization process, its sharing information, more specifically, reference counter is lost, and causes issue #7790. This makes ExecRuntime objects be managed by manager, and changes the serialization/deserialization process. Fixes #7790.	2018-02-06 16:00:34 +09:00
Alan Jenkins	cc2b9e6b20	rationalize interface for opening/closing logging log_open_console() did not switch from stderr to /dev/console, when "always_reopen_console" was set. It was necessary to call log_close_console() first. By contrast, log_open() did switch between e.g. journald and kmsg according to the value of "prohibit_ipc". Let's fix log_open() to respect the values of all the log options, and we can make log_close_*() private. Also log_close_console() is changed. There was some precaution, avoiding closing the console fd if we are not PID 1. I think commit `48a601fe` made a little mistake in leaving this in, and it only served to confuse readers :). Also I changed systemd-shutdown. Now we have log_set_prohibit_ipc(), let's use it to clarify that systemd-shutdown is not expected to try and log via journald (which it is about to kill). We avoided ever asking it to, but it's more convenient for the reader if they don't have to think about that. In that sense, it's similar to using assert() to validate a function's arguments.	2018-01-27 18:01:51 +00:00
Alan Jenkins	ba30753899	pid1: when we can't log to journal, remember our fallback log target If we have to force the logging to close the journal fd, then we can open any fallback log target. E.g. kmsg, if the target was the default JOURNAL_OR_KMSG. This is the behaviour I would expect from the documentation. I couldn't find any justification in the code, for why we would want to start dropping log messages instead of sending them to the fallback target. This means we will match the behaviour of processes which we fork and which set `open_when_needed`, and with generators - which use log_set_prohibit_ipc(true) - which we fork+exec during a reload. IMO this illustrates that the log_open/log_close interface is too clunky. So with the behaviour settled, I will refactor the interface in the next commit :).	2018-01-26 22:47:16 +00:00
Zbigniew Jędrzejewski-Szmek	dc3c9f5e36	core: initialize buffer	2018-01-26 00:59:23 +09:00
Yu Watanabe	dd1db3c288	core: manager logs firmware and loader time when startup finished	2018-01-26 00:59:20 +09:00
Zbigniew Jędrzejewski-Szmek	5eb83fa645	Merge pull request #7991 from poettering/n-on-console a comprehensive fix for the n_on_console miscounting issue	2018-01-25 13:48:08 +03:00
Lennart Poettering	adefcf2821	core: rework how we count the n_on_console counter Let's add a per-unit boolean that tells us whether our unit is currently counted or not. This way it's unlikely we get out of sync again and things are generally more robust. This also allows us to remove the counting logic specific to service units (which was in fact mostly a copy from the generic implementation), in favour of fully generic code. Replaces: #7824	2018-01-24 20:14:51 +01:00
Lennart Poettering	46fb617bf9	manager: minor manager_get_show_status() simplification Since the whole function ultimately is just a fancy getter for the show_status field, let's actually return it as last step literally without an extra needless "if".	2018-01-24 19:52:29 +01:00
Lennart Poettering	5a69973ff2	manager: add some explanatory comments to manager_dispatch_idle_pipe_fd()	2018-01-24 19:52:14 +01:00
Lennart Poettering	d075092f14	pid1: make use of new "prohibit_ipc" logging flag in PID 1 Let's set it initially, and then toggle it only when we know it's safe.	2018-01-24 18:22:56 +01:00
Lennart Poettering	62a769136d	core: rework how we track which PIDs to watch for a unit Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit interested in SIGCHLD events for them. This scheme allowed a specific PID to be watched by exactly 0, 1 or 2 units. With this rework this is replaced by a single hashmap which is primarily keyed by the PID and points to a Unit interested in it. However, it is optionally also keyed by the negated PID, in which case it points to a NULL terminated array of additional Unit objects also interested. This scheme means arbitrary numbers of Units may now watch the same PID. Runtime and memory behaviour should not be impacted by this change, as for the common case (i.e. each PID only watched by a single unit) behaviour stays the same, but for the uncommon case (a PID watched by more than one unit) we only pay with a single additional memory allocation for the array. Why this all? Primarily, because allowing exactly two units to watch a specific PID is not sufficient for some niche cases, as processes can belong to more than one unit these days: 1. sd_notify() with MAINPID= can be used to attach a process from a different cgroup to multiple units. 2. Similarly, the PIDFile= setting in unit files can be used for similar setups. 3. By creating a scope unit a main process of a service may join a different unit, too. 4. On cgroupsv1 we frequently end up watching all processes remaining in a scope, and if a process opens lots of scopes one after the other it might thus end up being watched by many of them. This patch hence removes the 2-unit-per-PID limit. It also makes a couple of other changes, some of them quite relevant: - manager_get_unit_by_pid() (and the bus call wrapping it) when there's ambiguity will prefer returning the Unit the process belongs to based on cgroup membership, and only check the watch-pids hashmap if that fails. This change in logic is probably more in line with what people expect and makes things more stable as each process can belong to exactly one cgroup only. - Every SIGCHLD event is now dispatched to all units interested in its PID. Previously, there was some magic conditionalization: the SIGCHLD would only be dispatched to the unit if it was interested in a single PID only, or the PID belonged to the control or main PID or we didn't dispatch a single SIGCHLD to the unit in the current event loop iteration yet. These rules were quite arbitrary and also redundant as the per-unit handlers would filter the PIDs anyway a second time. With this change we'll hence relax the rules: all we do now is dispatch every SIGCHLD event exactly once to each unit interested in it, and it's up to the unit to then use or ignore this. We use a generation counter in the unit to ensure that we only invoke the unit handler once for each event, protecting us from confusion if a unit is both associated with a specific PID through cgroup membership and through the "watch_pids" logic. It also protects us from being confused if the "watch_pids" hashmap is altered while we are dispatching to it (which is a very likely case). - sd_notify() message dispatching has been reworked to be very similar to SIGCHLD handling now. A generation counter is used for dispatching as well. This also adds a new test that validates that "watch_pid" registration and unregistration works correctly.	2018-01-23 21:29:31 +01:00
Lennart Poettering	575b300b79	pid1: rework how we dispatch SIGCHLD and other signals This fundamentally makes one change: we never process more than one signal or more than one waitid() event per event loop. We'll never tight loop around waitid() or around read() on our signalfd instead, but always return to the main event loop after processing one event. By doing this we put the event prioritization handling into full power again, as we'll always check for higher priority events before looking at the next signal or waitid() again. This introduces a new "defer" event source "sigchld_event". It's enabled as soon as we see SIGCHLD, and disabled as soon as waitid() reported no further children pending. It's running at a relatively high priority, one step higher than signal handling itself, but lower than /proc/self/mountinfo event handling, so that the latter always takes precedence. Since we want to process sd_notify() events at an even higher priority than SIGCHLD (as before) it is moved one priority step up, too. Fixes: #7932 Possibly fixes: #7966	2018-01-23 18:41:40 +01:00
Lennart Poettering	67ae4e8d59	core: move user lookup event priority to -11 This is internal stuff, us talking to ourselves and relatively independent of everything else, let's put this at highest priority hence.	2018-01-23 18:15:16 +01:00
Lennart Poettering	4259d20215	manager: add MANAGER_IS_RUNNING() for checking whether the manager is running This macro is useful as the check is not obvious, and we better abstract this away.	2018-01-23 16:43:56 +01:00
Lennart Poettering	4adf314b77	manager: split out send_ready and basic.target checking into functions of their own Let's shorten manager_check_finished() a bit by splitting out checking of basic.target and the two things we do when we reach it. This should not change behaviour, except for one thing: we now check basic.target's actual state for figuring out whether it is up, instead of generically checking whether it has any job queued. This is arguably more correct, and is what other code does too for similar purposes, for example manager_state()	2018-01-23 16:39:12 +01:00
Jan Klötzke	2a12e32efa	pid1: add option to disable service watchdogs Add a "systemd.service_watchdogs=" option to the command line which disables all service runtime watchdogs and emergency actions.	2018-01-22 18:10:03 +01:00
Zbigniew Jędrzejewski-Szmek	d8eb10d61a	core: delay logging the taint string until after basic.target is reached (#7935) This happens to be almost the same moment as when we send READY=1 in the user instance, but the logic is slightly different, since we log taint when basic.target is reached in the system manager, but we send the notification only in the user manager. So add a separate flag for this and propagate it across reloads. Fixes #7683.	2018-01-21 21:17:54 +09:00
Lennart Poettering	db256aab13	core: be stricter when handling PID files and MAINPID sd_notify() messages Let's be more restrictive when validating PID files and MAINPID= messages: don't accept PIDs that make no sense, and if the configuration source is not trusted, don't accept out-of-cgroup PIDs. A configuration source is considered trusted when the PID file is owned by root, or the message was received from root. This should lock things down a bit, in case service authors write out PID files from unprivileged code or use NotifyAccess=all with unprivileged code. Note that doing so was always problematic, just now it's a bit less problematic. When we open the PID file we'll now use the CHASE_SAFE chase_symlinks() logic, to ensure that we won't follow an unprivileged-owned symlink to a privileged-owned file thinking this was a valid privileged PID file, even though it really isn't. Fixes: #6632	2018-01-11 15:12:16 +01:00
Lennart Poettering	15e23e8cdf	manager: make use of pid_is_valid() where appropriate	2018-01-11 15:12:16 +01:00
Lennart Poettering	007e4b5490	manager: make use of NEWLINE macro where appropriate	2018-01-11 15:12:16 +01:00
Lennart Poettering	da5fb86100	manager: swap order in which we ellipsize/escape sd_notify() messages for debugging If we have to choose between truncated escape sequences and strings exploded to 4 times the desired length by fully escaping, prefer the latter. It's for debug only, hence doesn't really matter much.	2018-01-11 15:12:16 +01:00
Lennart Poettering	47cf8ff206	manager: rework manager_clean_environment() Let's rename it manager_sanitize_environment() which is a more precise name. Moreover, sort the environment implicitly inside it, as all our callers do that anyway afterwards and we can save some code this way. Also, update the list of env vars to drop, i.e. the env vars we manage ourselves and don't want user code to interfere with. Also sort this list to make it easier to update later on.	2018-01-10 18:30:06 +01:00
Lennart Poettering	665dfe9318	io-util: make flush_fd() return how many bytes were flushed This is useful so that callers know whether anything at all and how much was flushed. This patches through users of this function to ensure that the return values > 0 which may be returned now are not propagated in public APIs. Also, users that ignore the return value are changed to do so explicitly now.	2018-01-05 13:55:08 +01:00
Lennart Poettering	f1d34068ef	tree-wide: add DEBUG_LOGGING macro that checks whether debug logging is on (#7645) This makes things a bit easier to read I think, and also makes sure we always use the _unlikely_ wrapper around it, which so far we used sometimes and other times we didn't. Let's clean that up.	2017-12-15 11:09:00 +01:00
Lennart Poettering	e3140015a7	Merge pull request #7640 from keszybz/tainting-updates Tainting updates	2017-12-14 22:57:17 +01:00
Zbigniew Jędrzejewski-Szmek	198ce93248	core: drop taints for nobody user/group names We have a check and warning at compile time. The user cannot do anything about this at runtime, and all other taints are about checks that happen at runtime and are specific to that system (and at least potentially correctable). (The logic in the compilation-time check was updated to treat "nogroup" as OK, but not the runtime check. But I think it's better to remove the runtime check for this altogether, so this becomes moot.)	2017-12-14 22:14:38 +01:00
Lennart Poettering	fbd0b64f44	tree-wide: make use of new STRLEN() macro everywhere (#7639) Let's employ coccinelle to do this for us. Follow-up for #7625.	2017-12-14 19:02:29 +01:00

538 Commits