Systemd

Author	SHA1	Message	Date
Lennart Poettering	62d74c78b5	coccinelle: add reallocarray() coccinelle script Let's systematically make use of reallocarray() wherever we invoke realloc() with a product of two values.	2018-03-02 12:39:07 +01:00
Lennart Poettering	2589472712	Merge pull request #8237 from sourcejedi/timer_suspend core: let OnCalendar= timer units expire during suspend (#8231)	2018-03-02 12:11:06 +01:00
Zbigniew Jędrzejewski-Szmek	671f0f8de0	Remove /sbin from paths if split-bin is false (#8324) Follow-up for `157baa87e4`.	2018-03-01 21:48:36 +01:00
Lennart Poettering	902c8502ad	Merge pull request #8149 from poettering/fake-root-cgroup Properly synthesize CPU+memory accounting data for the root cgroup	2018-03-01 11:10:24 +01:00
Lennart Poettering	649a5ffba8	Merge pull request #8171 from poettering/sd-bus-queue-limit try not to overload pid1's bus message write queue	2018-02-28 18:15:40 +01:00
Alan Jenkins	13f512d324	core: don't freeze OnCalendar= timer units when the clock goes back a lot E.g. if you have a monthly event and you set the computer clock back one year, we can allow the next 12 monthly events to happen naturally. In fact we already do this when you start a Persistent=yes timer, we just need to apply the same logic when it's running and we notice the system clock being set backwards.	2018-02-28 17:00:07 +00:00
Alan Jenkins	9ea9faff78	core: let OnCalendar= timer units expire during suspend (#8231) On timejumps, including suspend, timer_time_change() calls for a re-calculation of the next elapse. Sadly I'm not quite sure what the intended effect of this was! Because it was not managing to fire OnCalendar= timers which fired during the suspend... unless the timer had already fired once before. Reported, entirely correctly as far as I can see, on stackexchange: https://unix.stackexchange.com/questions/351829/systemd-timer-that-expired-while-suspended /* If we know the last time this was * triggered, schedule the job based relative - * to that. If we don't just start from - * now. */ + * to that. If we don't, just start from + * the activation time. */ The same code is called for both the initial calculation and this re-calculation. If we're _not_ already active, then this is before the activation time has been recorded in the unit, so just use the current time as before. The new code is mechanically adapted from the same logic for `OnActiveSec=` (case TIMER_ACTIVE in the code which follows). Tested with `date --set`. Motivations: * Rotate monitoring data from Atop into files which are named per-day. Fedora currently implements this with a cron job that runs at midnight, but that didn't handle suspend correctly either. * unbound-anchor.timer on Fedora is used to update the DNSSEC "root trust anchor" daily, before the TTL expires. It uses OnCalendar=daily AccuracySec=24h. Which is a bit suspect because the TTL is 2 days, but I think it has the right general idea. None of the other timer settings are correct, because they would not account for time spent in suspend. Unless you set WakeSystem (this feature is currently undocumented). * So in general, we can expect to see people using OnCalendar= for the same cases as cron.daily and cron.monthly. Which use anacron to keep track of jobs which should be run even if the system was down at the time. Timers which are configured to run more frequently than that are unlikely to mind if they get run slightly more often than the writer realized, relative to the amount of time the system was really running. * From the user report above: "I only want to use remind to show a desktop notification, it seems excessive to wake up the computer for that. Also, I would like to get the reminder first thing in the morning, so the OnActiveSec doesn't help with that."	2018-02-28 16:12:22 +00:00
Alan Jenkins	60933bb89b	core: timer_enter_waiting(): refactor `base` local variable We have two variables `b` and `base`. `b` is declared within limited scope; `base` is declared at the top of the function. However `base` is actually only used within a scope which is exclusive of `b`. Clarify by moving `base` inside the limited scope as well. (Also `base` doesn't need initializing any more than `b` does. The declaration of `base` is now immediately followed by a case analysis of `v->base`, which serves almost exclusively to determine the value of `base`).	2018-02-28 15:07:30 +00:00
Zbigniew Jędrzejewski-Szmek	bdad9e44e4	Merge pull request #8294 from fsateler/debian-patches Upstreaming some debian patches	2018-02-28 09:10:16 +01:00
Ansgar Burchardt	7486f305cd	Include additional directories in ProtectSystem	2018-02-27 18:56:19 -03:00
Lennart Poettering	13d92c6300	seccomp: rework functions for parsing system call filters This reworks system call filter parsing, and replaces a couple of "bool" function arguments by a single flags parameter. This shouldn't change behaviour, except for one case: when we recursively call our parsing function on our own syscall list, then we'll lower the log level to LOG_DEBUG from LOG_WARNING, because at that point things are just a problem in our own code rather than in the user configuration we are parsing, and we shouldn't hence generate confusing warnings about syntax errors. Fixes: #8261	2018-02-27 19:59:09 +01:00
Lennart Poettering	e0a085811d	core: don't process dbus unit and job queue when there are already too many messages pending We maintain a queue of units and jobs that we are supposed to generate change/new notifications for because they were either just created or some of their property has changed. Let's throttle processing of this queue a bit: as soon as > 1K of bus messages are queued for writing let's skip processing the queue, and then recheck on the next iteration again. Moreover, never process more than 100 units in one go, return to the event loop after that. Both limits together should put effective limits on both space and time usage of the function, delaying further operations until a later moment, when the queue is empty or the event loop is sufficiently idle again. This should keep the number of generated messages much lower than before on busy systems or where some client is hanging. Note that this also means a bad client can slow down message dispatching substantially for up to 90s if it likes to, for all clients. But that should be acceptable as we only allow trusted bus clients, anyway. Fixes: #8166	2018-02-27 19:54:29 +01:00
Lennart Poettering	9fc677e3c9	core: don't bother enqueuing signal messages into busses that aren't ready yet This is an optimization: there's no point in enqueuing unit and job change notification signal messages into bus connections that aren't fully set up yet. This doesn't fix #8166 but should lower the load of messages enqueued but not processed yet a bit.	2018-02-27 19:54:29 +01:00
Lennart Poettering	84df74c6f0	Merge pull request #8284 from keszybz/gcc-warning-fixes Gcc warning fixes	2018-02-26 21:20:13 +01:00
Zbigniew Jędrzejewski-Szmek	aa484f3561	tree-wide: use reallocarray instead of our home-grown realloc_multiply (#8279) There isn't much difference, but in general we prefer to use the standard functions. glibc provides reallocarray since version 2.26. I moved the explicit_bzero configure test to the bottom, so that the two stdlib functions are at the bottom.	2018-02-26 21:20:00 +01:00
Zbigniew Jędrzejewski-Szmek	bea28c5adb	core/unit: voidify one snprintf statement One more follow-up for `f810b631cd`.	2018-02-26 15:49:27 +01:00
Zbigniew Jędrzejewski-Szmek	8012712791	core/path: add one more assert	2018-02-26 15:49:27 +01:00
Zbigniew Jędrzejewski-Szmek	f810b631cd	Revert "Replace use of snprintf with xsprintf" This reverts commit `a7419dbc59`. _All_ changes in that commit were wrong. Fixes #8211.	2018-02-23 00:13:52 +01:00
Zbigniew Jędrzejewski-Szmek	94be6463bd	Merge pull request #8205 from poettering/bpf-multi bpf/cgroup improvements	2018-02-22 14:52:48 +01:00
Lennart Poettering	c5c07649c2	Merge pull request #8243 from poettering/statx-syscall-unfuck statx() syscall macro fix + reboot() handling improvements	2018-02-22 13:15:41 +01:00
Zbigniew Jędrzejewski-Szmek	30c81ce2ce	pid1: when creating service directories, don't chown existing files (#8181 ) This partially reverts `3536f49e8f` and `3536f49e8f`. When the user is dynamic, and we are setting up state, cache, or logs dirs, behaviour is unchanged, we always do a recursive chown. This is necessary because the user number might change between invocations. But when setting up a directory for non-dynamic user, or a runtime directory for a dynamic user, do any ownership or mode changes only when the directory is initially created. Nothing says that the files under those directories have to be all recursively owned by our user. This restores behaviour before `3536f49e8f`, so modifications to the state of the runtime directory persist between ExecStartPre's and ExecStart's, and even longer in case the directory is persistent. I think it _would_ be a nice property if setting a user would automatically propagate to ownership of any Runtime/Logs/Cache directories. But this is incompatible with another nice property, namely preserving changes to those directories made by an admin, and with allowing change of ownership of files in those directories by the service (e.g. to allow other users to access them). Of the two, I think the second property is more important. Also, it's backwards compatible. https://bugzilla.redhat.com/show_bug.cgi?id=1508495 There is no need to chmod a directory we just created, so move that step up into a branch. After that, 'effective' is only used once, so get rid of it too.	2018-02-22 11:30:59 +01:00
Lennart Poettering	1f409a0cbb	shutdown: let's not use exit() needlessly Generally we prefer 'return' from main() over exit() so that automatic cleanups and such work correctly. Let's do that in shutdown.c too, because there's not really any reason not to. With this we are pretty good in consistently using return from main() rather than exit() all across the codebase. Yay!	2018-02-22 10:46:26 +01:00
Lennart Poettering	c01dcddf80	reboot-util: unify reboot with parameter in a single implementation So far, we had two implementations of reboot-with-parameter doing pretty much the same. Let's unify that in a generic implementation used by both. This is particularly nice as it unifies all /run/systemd/reboot-param handling in a single .c file.	2018-02-22 10:46:26 +01:00
Lennart Poettering	e3631d1c80	basic: split out update_reboot_parameter_and_warn() into its own .c/.h files This is primarily preparation for a follow-up commit that adds a common implementation of the other side of the reboot parameter file, i.e. the code that reads the file and issues reboot() for it.	2018-02-22 10:46:12 +01:00
Lennart Poettering	118cf9523b	tree-wide: voidify reboot() invocations We use (void) in most cases for reboot() already, let's add it to the others as well.	2018-02-22 10:42:06 +01:00
Lennart Poettering	c52a937b46	basic: add a common syscall wrapper around reboot() This mimics the raw_clone() call we have in place already and establishes a new syscall wrapper raw_reboot() that wraps the kernel's reboot() system call in a bit more low-level fashion than glibc's reboot() wrapper. The main difference is that the extra "arg" argument is supported. Ultimately this just replaces the syscall wrapper implementation we currently have at three places in our codebase by a single one. With this change this means that all our syscall() invocations are neatly separated out in static inline system call wrappers in our header functions.	2018-02-22 10:42:06 +01:00
Lennart Poettering	0b1f3c768c	tree-wide: reopen log when we need to log in FORK_CLOSE_ALL_FDS children In a number of occasions we use FORK_CLOSE_ALL_FDS when forking off a child, since we don't want to pass fds to the processes spawned (either because we later want to execve() some other process there, or because our child might hang around for longer than expected, in which case it shouldn't keep our fd pinned). This also closes any logging fds, and thus means logging is turned off in the child. If we want to do proper logging, explicitly reopen the logs hence in the child at the right time. This is particularly crucial in the umount/remount children we fork off the shutdown binary, as otherwise the children can't log, which is why #8155 is harder to debug than necessary: the log messages we generate about failing mount() system calls aren't actually visible on screen, as they are done in the child processes where the log fds are closed.	2018-02-22 00:35:00 +01:00
Lennart Poettering	e18805fbd0	shutdown: explicitly set a log target in shutdown.c We used to set this, but this was dropped when shutdown got taught to get the target passed in from the regular PID 1. Let's readd this to make things more explanatory, and cover all grounds, since after all the target passed is in theory an optional part of the protocol between the regular PID 1 and the shutdown PID 1.	2018-02-22 00:33:12 +01:00
Lennart Poettering	d405394c5c	shutdown: always pass errno to logging functions We have them, let's propagate them.	2018-02-22 00:32:31 +01:00
Lennart Poettering	00adeed99f	umount: beef up logging when umount/remount child processes fail Let's extend what we log if umount/remount doesn't work correctly as we expect. See #8155	2018-02-21 23:57:21 +01:00
Lennart Poettering	5128346127	bpf: reset "extra" IP accounting counters when turning off IP accounting for a unit We maintain an "extra" set of IP accounting counters that are used when systemd is reloaded to carry over the counters from the previous run. Let's reset these to zero whenever IP accounting is turned off. If we don't do this then turning off IP accounting and back on later wouldn't reset the counters, which is quite surprising and different from how our CPU time counting works.	2018-02-21 16:43:36 +01:00
Lennart Poettering	aa2b6f1d2b	bpf: rework how we keep track and attach cgroup bpf programs So, the kernel's management of cgroup/BPF programs is a bit misdesigned: if you attach a BPF program to a cgroup and close the fd for it, it will stay pinned to the cgroup with no chance of ever removing it again (or otherwise getting ahold of it again), because the fd is used for selecting which BPF program to detach. The only way to get rid of the program again is to destroy the cgroup itself. This is particularly bad for the root cgroup (and in fact any other cgroup that we cannot realistically remove during runtime, such as /system.slice, /init.scope or /system.slice/dbus.service) as getting rid of the program only works by rebooting the system. To counter this let's closely keep track to which cgroup a BPF program is attached and let's implicitly detach the BPF program when we are about to close the BPF fd. This hence changes the bpf_program_cgroup_attach() function to track where we attached the program and changes bpf_program_cgroup_detach() to use this information. Moreover bpf_program_unref() will now implicitly call bpf_program_cgroup_detach(). In order to simplify things, bpf_program_cgroup_attach() will now implicitly invoke bpf_program_load_kernel() when necessary, simplifying the caller's side. Finally, this adds proper reference counting to BPF programs. This is useful for working with two BPF programs in parallel: the BPF program we are preparing for installation and the BPF program we so far installed, shortening the window when we detach the old one and reattach the new one.	2018-02-21 16:43:36 +01:00
Lennart Poettering	13a141f046	namespace: protect bpf file system as part of ProtectKernelTunables= It also exposes kernel objects, let's better include this in ProtectKernelTunables=.	2018-02-21 16:43:36 +01:00
Lennart Poettering	6590080851	mount-setup: always use the same source as fstype for the API VFS we mount So far, for all our API VFS mounts we used the fstype also as mount source, let's do that for the cgroupsv2 mounts too. The kernel doesn't really care about the source for API VFS, but it's visible to the user, hence let's clean this up and follow the rule we otherwise follow.	2018-02-21 16:43:36 +01:00
Lennart Poettering	acf7f253de	bpf: use BPF_F_ALLOW_MULTI flag if it is available This new kernel 4.15 flag permits that multiple BPF programs can be executed for each packet processed: multiple per cgroup plus all programs defined up the tree on all parent cgroups. We can use this for two features: 1. Finally provide per-slice IP accounting (which was previously unavailable) 2. Permit delegation of BPF programs to services (i.e. leaf nodes). This patch beefs up PID1's handling of BPF to enable both. Note two special items to keep in mind: a. Our inner-node BPF programs (i.e. the ones we attach to slices) do not enforce IP access lists, that's done exclusively in the leaf-node BPF programs. That's a good thing, since that way rules in leaf nodes can cancel out rules further up (i.e. for example to implement a logic of "disallow everything except httpd.service"). Inner node BPF programs do accounting however if that's requested. This is beneficial for performance reasons: it means in order to provide per-slice IP accounting we don't have to add up all child unit's data. b. When this code is run on pre-4.15 kernel (i.e. where BPF_F_ALLOW_MULTI is not available) we'll make IP accounting on slice units unavailable (i.e. revert to behaviour from before this commit). For leaf nodes we'll fall back to non-ALLOW_MULTI mode however, which means that BPF delegation is not available there at all, if IP fw/acct is turned on for the unit. This is a change from earlier behaviour, where we use the BPF_F_ALLOW_OVERRIDE flag, so that our fw/acct would lose its effect as soon as delegation was turned on and some client made use of that. I think the new behaviour is the safer choice in this case, as silent bypassing of our fw rules is not possible anymore. And if people want proper delegation then the way out is a more modern kernel or turning off IP firewalling/acct for the unit altogether.	2018-02-21 16:43:36 +01:00
Lennart Poettering	43b7f24b5e	bpf: mount bpffs by default on boot We make heavy use of BPF functionality these days, hence expose the BPF file system too by default now. (Note however, that we don't actually make use of bpf file system objects yet, but we might later on too.)	2018-02-21 16:43:36 +01:00
Lennart Poettering	9b3c189786	bpf-program: optionally take fd of program to detach This is useful for BPF_F_ALLOW_MULTI programs, where the kernel requires us to specify the fd.	2018-02-21 16:43:36 +01:00
Lennart Poettering	2ae7ee58fa	bpf: beef up bpf detection, check if BPF_F_ALLOW_MULTI is supported This improves the BPF/cgroup detection logic, and looks whether BPF_F_ALLOW_MULTI is supported. This flag allows execution of multiple BPF filters in a recursive fashion for a whole cgroup tree. It enables us to properly report IP accounting for slice units, as well as delegation of BPF support to units without breaking our own IP accounting.	2018-02-21 16:43:36 +01:00
Alan Jenkins	59e00b2a16	Merge pull request #7908 from yuwata/rfe-7895 core: add TemporaryFileSystem= setting and 'tmpfs' option to ProtectHome=	2018-02-21 08:57:11 +00:00
Yu Watanabe	784ad252ea	core: add DBus API for TemporaryFileSystem=	2018-02-21 09:18:20 +09:00
Yu Watanabe	e4da7d8c79	core: add new option 'tmpfs' to ProtectHome= This makes the ProtectHome= setting accept 'tmpfs'. This is mostly equivalent to `TemporaryFileSystem=/home /run/user /root`.	2018-02-21 09:18:17 +09:00
Yu Watanabe	2abd4e388a	core: add new setting TemporaryFileSystem= This introduces a new setting TemporaryFileSystem=. This is useful to hide files not relevant to the processes invoked by unit, while necessary files or directories can be still accessed by combining with Bind{,ReadOnly}Paths=.	2018-02-21 09:17:52 +09:00
Yu Watanabe	4ca763a902	core/namespace: make '-' prefix in Bind{,ReadOnly}Paths= work Each path in `Bind{,ReadOnly}Paths=` accepts a '-' prefix. However, the prefix is completely ignored. This makes it work as expected.	2018-02-21 09:07:56 +09:00
Yu Watanabe	4ff4c98a39	core: simplify DBus API for BindPaths=	2018-02-21 09:06:32 +09:00
Yu Watanabe	280921f29e	core: fix DBus API for AppArmorProfile= and SmackProcessLabel=	2018-02-21 09:05:40 +09:00
Yu Watanabe	8e06d57ccb	core/execute: clear bind_mounts	2018-02-21 09:05:37 +09:00
Yu Watanabe	a635a7aec6	core/execute: simplify compile_bind_mounts() It is not necessary to re-assign error code.	2018-02-21 09:05:35 +09:00
Yu Watanabe	f5c52a7724	core/namespace: remove unused argument	2018-02-21 09:05:30 +09:00
Yu Watanabe	e282f51f57	core/namespace: use free_and_replace()	2018-02-21 09:05:21 +09:00
Yu Watanabe	55fe743273	core/namespace: fix comment	2018-02-21 09:05:18 +09:00

3822 commits