Systemd

Commit Graph

Author	SHA1	Message	Date
Michael Marley	61927b9f11	manager: Fix HW watchdog when systemd starts before driver loaded When manager_{set|override}_watchdog is called, set the watchdog timeout regardless of whether the hardware watchdog was successfully initialized. If the watchdog was requested but could not be initialized, then instead of pinging it, attempt to initialize it again. This ensures that the hardware watchdog is initialized even if the kernel module for it isn't loaded when systemd starts (which is quite likely, unless it is compiled in). This builds on work by @danc86 in https://github.com/systemd/systemd/pull/17460, but fixes the issue of not updating the watchdog timeout with the actual value from the hardware. Fixes https://github.com/systemd/systemd/issues/17838 Co-authored-by: Dan Callaghan <djc@djc.id.au> Co-authored-by: Michael Marley <michael@michaelmarley.com>	2020-12-09 11:47:22 +00:00
Yu Watanabe	db9ecf0501	license: LGPL-2.1+ -> LGPL-2.1-or-later	2020-11-09 13:23:58 +09:00
Lennart Poettering	bfeb927a55	pid1: various minor watchdog modernizations Just some clean-ups.	2020-10-30 13:02:06 +01:00
Zbigniew Jędrzejewski-Szmek	69c0807432	Merge pull request #15206 from anitazha/systoomd-v0 systemd-oomd	2020-10-15 14:16:52 +02:00
Lennart Poettering	670eed4c8c	core: debug log about received fds	2020-10-14 16:41:37 +02:00
Anita Zhang	fe8d22fb09	core: systemd-oomd pid1 integration	2020-10-07 17:12:24 -07:00
Zbigniew Jędrzejewski-Szmek	526e3cbbdd	core: don't try to load units from non-absolute paths The error message disagreed with the check that was actually performed. Adjust the check.	2020-09-23 14:49:37 +02:00
Lennart Poettering	c6552f7cd5	Merge pull request #16955 from keszybz/test-execute-cleanup One patch for test-execute and assorted cleanups	2020-09-08 18:33:12 +02:00
Zbigniew Jędrzejewski-Szmek	90e74a66e6	tree-wide: define iterator inside of the macro	2020-09-08 12:14:05 +02:00
Zbigniew Jędrzejewski-Szmek	9978e631cd	core/manager: reindent table for readability	2020-09-04 18:14:26 +02:00
Zbigniew Jędrzejewski-Szmek	5b10116e49	core/{execute, manager}: reduce scope of iterator variables a bit	2020-09-04 18:14:26 +02:00
Christian Göttsche	45ae2f725e	selinux: create systemd/notify socket with default SELinux context	2020-09-01 16:25:06 +02:00
Zbigniew Jędrzejewski-Szmek	c2911d48ff	Rework how we cache mtime to figure out if units changed Instead of assuming that more-recently modified directories have higher mtime, just look for any mtime changes, up or down. Since we don't want to remember individual mtimes, hash them to obtain a single value. This should help us behave properly in the case when the time jumps backwards during boot: various files might have mtimes that are in the future, but we won't care. This fixes the following scenario: We have /etc/systemd/system with T1. T1 is initially far in the past. We have /run/systemd/generator with time T2. The time is adjusted backwards, so T2 will be always in the future for a while. Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'. Nevertheless, T1 < T1' << T2. We would consider our cache to be up-to-date, falsely.	2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek	02103e5716	core: always try to reload not-found unit This check was added in `d904afc730`. It would only apply in the case where the cache hasn't been loaded yet. I think we pretty much always have the cache loaded when we reach this point, but even if we didn't, it seems better to try to reload the unit. So let's drop this check.	2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek	c149d2b491	pid1: use the cache mtime not clock to "mark" load attempts We really only care if the cache has been reloaded between the time when we last attempted to load this unit and now. So instead of recording the actual time we try to load the unit, just store the timestamp of the cache. This has the advantage that we'll notice if the cache mtime jumps forward or backward. Also rename fragment_loadtime to fragment_not_found_time. It only gets set when we failed to load the unit and the old name was suggesting it is always set. In https://bugzilla.redhat.com/show_bug.cgi?id=1871327 (and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1867930 and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1872068) we try to load a non-existent unit over and over from transaction_add_job_and_dependencies(). My understanding is that the clock was in the future during initial boot, so cache_mtime is always in the future (since we don't touch the fs after initial boot), so no matter how many times we try to load the unit and set fragment_loadtime / fragment_not_found_time, it is always higher than cache_mtime, so manager_unit_cache_should_retry_load() always returns true.	2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek	81be23886d	core: rename manager_unit_file_maybe_loadable_from_cache() The name is misleading, since we aren't really loading the unit from cache — if this function returns true, we'll try to load the unit from disk, updating the cache in the process.	2020-08-31 20:53:38 +02:00
Lennart Poettering	bb0c0d6f29	core: add credentials logic Fixes: #15778 #16060	2020-08-25 19:45:35 +02:00
Zbigniew Jędrzejewski-Szmek	2aed63f427	tree-wide: fix spelling of "fallback" Similarly to "setup" vs. "set up", "fallback" is a noun, and "fall back" is the verb. (This is pretty clear when we construct a sentence in the present continuous: "we are falling back" not "we are fallbacking").	2020-08-20 17:45:32 +02:00
Lennart Poettering	39cf0351c5	tree-wide: make use of new relative time events in sd-event.h	2020-07-28 11:24:55 +02:00
Lennart Poettering	8047ac8fdc	core: clean more env vars from env block pid1 receives We generally clean all env vars we use ourselves to communicate with our children. We forgot some more recent additions however. Let's correct that.	2020-07-23 18:30:15 +02:00
Zbigniew Jędrzejewski-Szmek	56a13a495c	pid1: create ro private tmp dirs when /tmp or /var/tmp is read-only Read-only /var/tmp is more likely, because it's backed by a real device. /tmp is (by default) backed by tmpfs, but it doesn't have to be. In both cases the same consideration applies. If we boot with read-only /var/tmp, any unit with PrivateTmp=yes would fail because we cannot create the subdir under /var/tmp to mount the private directory. But many services actually don't require /var/tmp (either because they only use it occasionally, or because they only use /tmp, or even because they don't use the temporary directories at all, and PrivateTmp=yes is used to isolate them from the rest of the system). To handle both cases let's create a read-only directory under /run/systemd and mount it as the private /tmp or /var/tmp. (Read-only to not fool the service into dumping too much data in /run.) $ sudo systemd-run -t -p PrivateTmp=yes bash Running as unit: run-u14.service Press ^] three times within 1s to disconnect TTY. [root@workstation /]# ls -l /tmp/ total 0 [root@workstation /]# ls -l /var/tmp/ total 0 [root@workstation /]# touch /tmp/f [root@workstation /]# touch /var/tmp/f touch: cannot touch '/var/tmp/f': Read-only file system This commit has more changes than I like to put in one commit, but it's touching all the same paths so it's hard to split. exec_runtime_make() was using the wrong cleanup function, so the directory would be left behind on error.	2020-07-14 19:47:15 +02:00
Luca Boccassi	cda667722c	core: refresh unit cache when building a transaction if UNIT_NOT_FOUND When a command asks to load a unit directly and it is in state UNIT_NOT_FOUND, and the cache is outdated, we refresh it and attempt to load again. Use the same logic when building up a transaction and a dependency in UNIT_NOT_FOUND state is encountered. Update the unit test to exercise this code path.	2020-07-07 10:09:24 +02:00
Luca Boccassi	7233e91af0	core: store timestamps of unit load attempts When the system is under heavy load, it can happen that the unit cache is refreshed for an unrelated reason (in the test I simulate this by attempting to start a non-existing unit). The new unit is found and accounted for in the cache, but it's ignored since we are loading something else. When we actually look for it, by attempting to start it, the cache is up to date so no refresh happens, and starting fails although we have it loaded in the cache. When the unit state is set to UNIT_NOT_FOUND, mark the timestamp in u->fragment_loadtime. Then when attempting to load again we can check both if the cache itself needs a refresh, OR if it was refreshed AFTER the last failed attempt that resulted in the state being UNIT_NOT_FOUND. Update the test so that this issue reproduces more often.	2020-06-30 16:50:00 +02:00
Zbigniew Jędrzejewski-Szmek	f83803a649	Merge pull request #16238 from keszybz/set-handling-more Fix handling of cases where a duplicate item is added to a set and related cleanups	2020-06-24 17:42:13 +02:00
Zbigniew Jędrzejewski-Szmek	de7fef4b6e	tree-wide: use set_ensure_put() Patch contains a coccinelle script, but it only works in some cases. Many parts were converted by hand. Note: I did not fix errors in return value handling. This will be done separately to keep the patch comprehensible. No functional change is intended in this patch.	2020-06-22 16:32:37 +02:00
Franck Bui	43bba15ac8	pid1: rename manager_set_{show_status,watchdog}_overridden() into manager_override_{show_status,watchdog} No functional change.	2020-06-11 12:00:32 +02:00
Franck Bui	3ceb347130	pid1: introduce a helper to handle the show-status marker No functional change.	2020-06-11 12:00:16 +02:00
Franck Bui	44a419540e	pid1: rework handling of m->show_status The fact that m->show_status was serialized/deserialized made any further customisation of this setting via system.conf impossible. IOW the value was basically always locked unless it was changed via signals. This patch reworks the handling of m->show_status but also makes sure that if a new value was changed via the signal API then this value is kept and preserved across PID1 reexecuting or reloading. Note: this effectively means that once the value is set via the signal interface, it can be changed again only through the signal API.	2020-06-09 09:16:54 +02:00
Franck Bui	0d6d3cf055	pid1: rename manager_get_show_status() to manager_should_show_status() The name 'manager_get_show_status()' suggests that the function simply reads the property 'show_status' of the manager and hence returns a 'StatusType' value. However it was doing more than that since it contained the logic (based on 'show_status' but also on the state of the manager) to figure out if status message could be emitted to the console. Hence this patch renames the function to 'manager_should_show_status()'. The previous name will be reused in a later patch to effectively return the value of 'show_status' property. No functional change.	2020-06-09 09:16:54 +02:00
Franck Bui	b309078ab9	pid1: make more use of show_status_on() No functional change.	2020-06-09 09:16:54 +02:00
Luca Boccassi	d904afc730	core: reload cache if it's dirty when starting a UNIT_NOT_FOUND unit The time-based cache allows starting a new unit without an expensive daemon-reload, unless there was already a reference to it because of a dependency or ordering from another unit. If the cache is out of date, check again if we can load the fragment.	2020-05-30 16:50:05 +02:00
Zbigniew Jędrzejewski-Szmek	a4ac27c1af	manager: free the jobs hashmap after we have no jobs After a larger transaction, e.g. after bootup, we're left with an empty hashmap with hundreds of buckets. Long-term, it'd be better to size hashmaps down when they are less than 1/4 full, but even if we implement that, jobs hashmap is likely to be empty almost always, so it seems useful to deallocate it once the jobs count reaches 0.	2020-05-28 18:54:20 +02:00
Zbigniew Jędrzejewski-Szmek	3fb2326f3e	shared/unit-file: make sure the old hashmaps and sets are freed upon replacement Possibly fixes #15220. (There might be another leak. I'm still investigating.) The leak would occur when the path cache was rebuilt. So in normal circumstances it wouldn't be too bad, since usually the path cache is not rebuilt too often. But the case in #15220, where new unit files are created in a loop and started, the leak occurs once for each unit file: $ for i in {1..300}; do cp ~/.config/systemd/user/test0001.service ~/.config/systemd/user/test$(printf %04d $i).service; systemctl --user start test$(printf %04d $i).service;done	2020-05-28 18:51:52 +02:00
Zbigniew Jędrzejewski-Szmek	24b4597064	core: minor simplification	2020-05-27 09:02:53 +02:00
Franck Bui	b406c6d128	pid1: make manager_deserialize_{uid,gid}_refs() static No functional change.	2020-05-19 15:48:54 +02:00
Franck Bui	80f605c807	pid1: make manager_serialize_{uid,gid}_refs() static No functional change.	2020-05-19 15:48:54 +02:00
Franck Bui	06a4eb0737	pid1: make manager_vacuum_{uid,gid}_refs() static No functional change.	2020-05-19 15:48:54 +02:00
Franck Bui	1addc46c8c	pid1: make manager_flip_auto_status() static No functional change.	2020-05-19 15:48:54 +02:00
Franck Bui	986935cf6a	pid1: update manager settings on reload too Most complexity of this patch is due to the fact that some manager settings (basically the watchdog properties) can be set at runtime and in this case the runtime values must be retained over daemon-reload or daemon-reexec. For consistency's sake, all watchdog properties now behave the same way, that is: - Values defined by config files can be overridden by writing the new value through their respective D-Bus properties. In this case, these values are preserved over reload/reexec until the special value '0' or USEC_INFINITY is written, which will then restore the last values loaded from the config files. If the restored value is '0' or 'USEC_INFINITY', the watchdogs will be disabled and the corresponding device will be closed. - Reading the properties from a user instance will return the USEC_INFINITY value as these properties are only meaningful for PID1. - Writing to one of the watchdog properties of a user instance will be a NOP. Fixes: #15453	2020-05-19 15:31:55 +02:00
Benjamin Robin	5151b4ccd2	core: Parse the tags list sooner, and use it for multiple functions - Parse the tags list using strv_split_newlines() which removes any unnecessary empty string at the end of the strv. - Use this parsed list for manager_process_barrier_fd() and every call to manager_invoke_notify_message(). - This also allows simplifying the manager_process_barrier_fd() function.	2020-05-13 22:44:12 +02:00
Lennart Poettering	fb29cdbef2	tree-wide: make sure our control buffers are properly aligned We always need to make them unions with a "struct cmsghdr" in them, so that things are properly aligned. Otherwise we might end up at an unaligned address and the counting goes all wrong, possibly making the kernel refuse our buffers. Also, let's make sure we initialize the control buffers to zero when sending, but leave them uninitialized when reading. Both the alignment and the initialization are mentioned in the cmsg(3) man page.	2020-05-07 14:39:44 +02:00
Benjamin Robin	08f468567d	tree-wide: Workaround -Wnonnull GCC bug See issue #6119	2020-05-07 09:43:28 +02:00
Kumar Kartikeya Dwivedi	4f07ddfa9b	Introduce sd_notify_barrier This adds the sd_notify_barrier function, to allow users to synchronize against the reception of sd_notify(3) status messages. It acts as a synchronization point, and a successful return guarantees that all previous messages have been consumed by the manager. This can be used to eliminate race conditions where the sending process exits too early for systemd to associate its PID to a cgroup and attribute the status message to a unit correctly. systemd-notify now uses this function for proper notification delivery and is useful for NotifyAccess=all units again in user mode, or in cases where it doesn't have a control process as parent. Fixes: #2739	2020-05-01 03:22:47 +05:30
Lennart Poettering	3691bcf3c5	tree-wide: use recvmsg_safe() at various places Let's be extra careful whenever we return from recvmsg() and see MSG_CTRUNC set. This generally means we ran into a programming error, as we didn't size the control buffer large enough. It's an error condition we should at least log about, or propagate up. Hence do that. This is particularly important when receiving fds, since for those the control data can be of any size. In particular on stream sockets that's nasty, because if we miss an fd because of control data truncation we cannot recover, we might not even realize that we are one off. (Also, when failing early, if there's any chance the socket might be AF_UNIX let's close all received fds, all the time. We got this right most of the time, but there were a few cases missing. God, UNIX is hard to use)	2020-04-23 09:41:47 +02:00
Lennart Poettering	df3d3bdfe8	core: minor error code handling fixes	2020-04-22 08:56:05 +02:00
Alin Popa	c5f8a179a2	watchdog: reduce watchdog pings in timeout interval The watchdog ping is performed for every iteration of the manager event loop. This results in a lot of ioctls on the watchdog device driver, especially during boot or if services are aggressively using sd_notify. Depending on the watchdog device driver this may have a performance impact on embedded systems. The patch skips sending the watchdog ping to the device driver if the ping is requested before half of the watchdog timeout has elapsed.	2020-04-16 16:32:05 +02:00
root	f9d29f6d06	fix manager_state	2020-04-07 15:27:50 +02:00
Vito Caputo	b46c3e4913	*: use _cleanup_close_ with fdopen() where trivial Also convert these to use take_fdopen().	2020-03-31 06:48:03 -07:00
Zbigniew Jędrzejewski-Szmek	385093b702	Split out generator directory setup to a src/core/generator-setup.c Those functions have only one non-test user, so we can move them to src/core/.	2020-03-27 20:12:44 +01:00
Zbigniew Jędrzejewski-Szmek	51327bcc74	sd-path: rename the two functions I think the two names were both pretty bad. They did not give a proper hint what the difference between the two functions is, and sd_path_home sounds like it is somehow related to /home or home directories or whatever, when in fact both functions return the same set of paths as either a colon-delimited string or a strv. "_strv" suffix is used by various functions in sd-bus, so let's reuse that. Those functions are not public yet, so let's rename.	2020-03-27 20:12:44 +01:00
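
Note on c5f8a179a2 (watchdog: reduce watchdog pings in timeout interval): the rule described in that commit message, skipping a ping that is requested before half of the timeout has elapsed, can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the commit: the WatchdogState struct and the watchdog_should_ping() and watchdog_ping() helpers are hypothetical names, timestamps are assumed to be monotonic microseconds, and the actual device ioctl is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Hypothetical state for illustration; not systemd's actual structures. */
typedef struct WatchdogState {
        usec_t timeout;   /* configured watchdog timeout in microseconds, 0 = disabled */
        usec_t last_ping; /* monotonic timestamp of the last ping sent, 0 = none yet */
} WatchdogState;

/* Decide whether a ping requested at time 'now' should reach the device. */
static bool watchdog_should_ping(const WatchdogState *w, usec_t now) {
        assert(w);

        if (w->timeout == 0)
                return false; /* watchdog disabled, nothing to ping */
        if (w->last_ping == 0)
                return true;  /* no ping sent yet, do not delay the first one */

        /* Skip the ping if less than half of the timeout has elapsed. */
        return now - w->last_ping >= w->timeout / 2;
}

/* Record a ping if one is due; returns true if a ping was "sent". */
static bool watchdog_ping(WatchdogState *w, usec_t now) {
        assert(w);

        if (!watchdog_should_ping(w, now))
                return false;

        /* A real implementation would talk to the watchdog device here. */
        w->last_ping = now;
        return true;
}
```

With a 10 s timeout, a ping requested 2 s after the previous one would be skipped in this sketch, while one requested 5 s after it would go through.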

757 Commits
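
Note on c2911d48ff (Rework how we cache mtime to figure out if units changed): the idea in that commit message, hashing all directory mtimes into a single value and treating any difference as a change, can be sketched roughly as below. This is a minimal illustration, not the code from the commit: hash_mtimes() and cache_is_stale() are hypothetical names, mtimes are assumed to be plain 64-bit integers, and FNV-1a is used only as a stand-in because the commit message does not say which hash is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Combine all mtimes into one value (FNV-1a over the bytes of each mtime).
 * The hash choice is an assumption made for this sketch. */
static uint64_t hash_mtimes(const uint64_t *mtimes, size_t n) {
        uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */

        assert(mtimes || n == 0);

        for (size_t i = 0; i < n; i++)
                for (unsigned b = 0; b < 8; b++) {
                        h ^= (mtimes[i] >> (8 * b)) & 0xff;
                        h *= 1099511628211ULL; /* FNV-1a 64-bit prime */
                }

        return h;
}

/* The cache is stale if the combined hash changed at all, regardless of
 * whether any individual mtime moved forward or backward. */
static bool cache_is_stale(uint64_t cached_hash, const uint64_t *mtimes, size_t n) {
        return hash_mtimes(mtimes, n) != cached_hash;
}
```

In the scenario from the commit message, a check based on the newest mtime would miss an update from T1 to T1' while T2 stays larger than both, whereas a comparison of the combined value changes whenever any single mtime changes (barring hash collisions).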