Systemd

Author	SHA1	Message	Date
Chris Down	4e1dfa45e9	cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI Nitpicky, but we've used a lot of random spacings and names in the past, but we're trying to be completely consistent on "cgroup vN" now. Generated by `fd -0 \| xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`. I manually ignored places where it's not appropriate to replace (eg. "cgroup2" fstype and in src/shared/linux).	2019-01-03 11:32:40 +09:00
Zbigniew Jędrzejewski-Szmek	303ee60151	Mark data and userdata params to specifier_printf() as const It would be very wrong if any of the specfier printf calls modified any of the objects or data being printed. Let's mark all arguments as const (primarily to make it easier for the reader to see where modifications cannot occur).	2018-12-12 16:45:33 +01:00
Lennart Poettering	a95c0505ad	core: extend comments regarding coldplug() vs. catchup()	2018-12-12 11:20:53 +01:00
Lennart Poettering	7af67e9a8b	core: allow to set exit status when using SuccessAction=/FailureAction=exit in units This adds SuccessActionExitStatus= and FailureActionExitStatus= that may be used to configure the exit status to propagate in when SuccessAction=exit or FailureAction=exit is used. When not specified let's also propagate the exit status of the main process we fork off for the unit.	2018-11-27 09:44:40 +01:00
Lennart Poettering	5af8805872	cgroup: drastically simplify caching of cgroups members mask Previously we tried to be smart: when a new unit appeared and it only added controllers to the cgroup mask we'd update the cached members mask in all parents by ORing in the controller flags in their cached values. Unfortunately this was quite broken, as we missed some conditions when this cache had to be reset (for example, when a unit got unloaded), moreover the optimization doesn't work when a controller is removed anyway (as in that case there's no other way for the parent to iterate though all children if any other, remaining child unit still needs it). Hence, let's simplify the logic substantially: instead of updating the cache on the right events (which we didn't get right), let's simply invalidate the cache, and generate it lazily when we encounter it later. This should actually result in better behaviour as we don't have to calculate the new members mask for a whole subtree whever we have the suspicion something changed, but can delay it to the point where we actually need the members mask. This allows us to simplify things quite a bit, which is good, since validating this cache for correctness is hard enough. Fixes: #9512	2018-11-23 13:41:37 +01:00
Lennart Poettering	5a62e5e2ac	cgroup: document what the various masks variables are used for	2018-11-23 13:41:37 +01:00
Lennart Poettering	27da878e7e	unit: drop an unused fields from Unit struct	2018-11-23 00:37:00 +01:00
Lennart Poettering	66fa4bdd70	core: add two minor comments (#10890 )	2018-11-23 06:25:27 +09:00
Zbigniew Jędrzejewski-Szmek	aac99f303a	core: introduce a helper function to wrap unit_log_{success,failure} It's inline so that the compiler can easily optimize away the call to get status string.	2018-11-16 19:47:07 +01:00
Lennart Poettering	523ee2d414	core: log a recognizable message when a unit succeeds, too We already are doing it on failure, let's do it on success, too. Fixes: #10265	2018-11-16 15:22:48 +01:00
Lennart Poettering	91bbd9b796	core: make log messages about unit processes exiting recognizable	2018-11-16 15:22:48 +01:00
Lennart Poettering	7c047d7443	core: make log messages about units entering a 'failed' state recognizable Let's make this recognizable, and carry result information in a structure fashion.	2018-11-16 15:22:48 +01:00
Lennart Poettering	33a3fdd978	core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c This call is only used by job.c and very specific to job handling. Moreover the very similar logic of job_emit_status_message() is already in job.c. Hence, let's clean this up, and move both sets of functions to job.c, and rename them a bit so that they express precisely what they do: 1. unit_status_emit_starting_stopping_reloading() → job_emit_begin_status_message() 2. job_emit_status_message() → job_emit_done_status_message() The first call is after all what we call when we begin with the execution of a job, and the second call what we call when we are done wiht it. Just some moving and renaming, not other changes, and hence no change in behaviour.	2018-11-16 15:22:48 +01:00
Lennart Poettering	6529ccfa20	unit: replace three non-type-safe macros by type-safe inline functions Behaviour is prett ymuch the same, but there's some additional type checking done on the input parameters. (In the case of UNIT_WRITE_FLAGS_NOOP() the C compiler won't actually do the type checking necessarily, but static chckers at least could)	2018-11-08 13:55:25 +01:00
Lennart Poettering	bbf1120623	unit: make UNIT() cast function deal with NULL pointers Fixes: #10681	2018-11-08 10:47:08 +01:00
Lennart Poettering	1ad6e8b302	core: split environment block mantained by PID 1's Manager object in two This splits the "environment" field of Manager into two: transient_environment and client_environment. The former is generated from configuration file, kernel cmdline, environment generators. The latter is the one the user can control with "systemctl set-environment" and similar. Both sets are merged transparently whenever needed. Separating the two sets has the benefit that we can safely flush out the former while keeping the latter during daemon reload cycles, so that env var settings from env generators or configuration files do not accumulate, but dynamic API changes are kept around. Note that this change is not entirely transparent to users: if the user first uses "set-environment" to override a transient variable, and then uses "unset-environment" to unset it again things will revert to the original transient variable now, while previously the variable was fully removed. This change in behaviour should not matter too much though I figure. Fixes: #9972	2018-10-31 18:00:53 +01:00
Lennart Poettering	d68c645bd3	core: rework serialization Let's be more careful with what we serialize: let's ensure we never serialize strings that are longer than LONG_LINE_MAX, so that we know we can read them back with read_line(…, LONG_LINE_MAX, …) safely. In order to implement this all serialization functions are move to serialize.[ch], and internally will do line size checks. We'd rather skip a serialization line (with a loud warning) than write an overly long line out. Of course, this is just a second level protection, after all the data we serialize shouldn't be this long in the first place. While we are at it also clean up logging: while serializing make sure to always log about errors immediately. Also, (void)ify all calls we don't expect errors in (or catch errors as part of the general fflush_and_check() at the end.	2018-10-26 10:52:41 +02:00
Lennart Poettering	8948b3415d	core: when deserializing state always use read_line(…, LONG_LINE_MAX, …) This should be much better than fgets(), as we can read substantially longer lines and overly long lines result in proper errors. Fixes a vulnerability discovered by Jann Horn at Google. CVE-2018-15686 LP: #1796402 https://bugzilla.redhat.com/show_bug.cgi?id=1639071	2018-10-26 10:40:01 +02:00
Anita Zhang	90fc172e19	core: implement per unit journal rate limiting Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for services. If provided, these values get passed to the journald client context, and those values are used in the rate limiting function in the journal over the the journald.conf values. Part of #10230	2018-10-18 09:56:20 +02:00
Roman Gushchin	084c700780	core: support cgroup v2 device controller Cgroup v2 provides the eBPF-based device controller, which isn't currently supported by systemd. This commit aims to provide such support. There are no user-visible changes, just the device policy and whitelist start working if cgroup v2 is used.	2018-10-09 09:47:51 -07:00
Roman Gushchin	17f149556a	core: refactor bpf firewall support into a pseudo-controller The idea is to introduce a concept of bpf-based pseudo-controllers to make adding new bpf-based features easier.	2018-10-09 09:46:08 -07:00
Lennart Poettering	334415b16e	Merge pull request #10094 from keszybz/wants-loading Fix bogus fragment paths in units in .wants/.requires	2018-10-05 17:36:31 +02:00
Anita Zhang	c87700a133	Make Watchdog Signal Configurable Allows configuring the watchdog signal (with a default of SIGABRT). This allows an alternative to SIGABRT when coredumps are not desirable. Appropriate references to SIGABRT or aborting were renamed to reflect more liberal watchdog signals. Closes #8658	2018-09-26 16:14:29 +02:00
Zbigniew Jędrzejewski-Szmek	5a72417084	pid1: drop unused path parameter to add_two_dependencies_by_name()	2018-09-15 20:02:00 +02:00
Zbigniew Jędrzejewski-Szmek	35d8c19ace	pid1: drop now-unused path parameter to add_dependency_by_name()	2018-09-15 19:57:52 +02:00
Zbigniew Jędrzejewski-Szmek	fda09318e3	core: rename function to better reflect semantics	2018-08-20 10:43:31 +02:00
Lennart Poettering	a3c1168ac2	core: rework StopWhenUnneeded= logic Previously, we'd act immediately on StopWhenUnneeded= when a unit state changes. With this rework we'll maintain a queue instead: whenever there's the chance that StopWhenUneeded= might have an effect we enqueue the unit, and process it later when we have nothing better to do. This should make the implementation a bit more reliable, as the unit notify event cannot immediately enqueue tons of side-effect jobs that might contradict each other, but we do so only in a strictly ordered fashion, from the main event loop. This slightly changes the check when to consider a unit "unneeded". Previously, we'd assume that a unit in "deactivating" state could also be cleaned up. With this new logic we'll only consider units unneeded that are fully up and have no job queued. This means that whenever there's something pending for a unit we won't clean it up.	2018-08-10 16:19:01 +02:00
Yu Watanabe	845d247a3d	tree-wide: use 'signed int' instead of 'int' for bit field variables Suggested by LGTM: https://lgtm.com/rules/1506024027114/	2018-06-28 10:16:51 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Lennart Poettering	ef31828d06	tree-wide: unify how we define bit mak enums Let's always write "1 << 0", "1 << 1" and so on, except where we need more than 31 flag bits, where we write "UINT64(1) << 0", and so on to force 64bit values.	2018-06-12 21:44:00 +02:00
Lennart Poettering	04eb582acc	core: enumerate perpetual units in a separate per-unit-type method Previously the enumerate() callback defined for each unit type would do two things: 1. It would create perpetual units (i.e. -.slice, system.slice, -.mount and init.scope) 2. It would enumerate units from /proc/self/mountinfo, /proc/swaps and the udev database With this change these two parts are split into two seperate methods: enumerate() now only does #2, while enumerate_perpetual() is responsible for #1. Why make this change? Well, perpetual units should have a slightly different effect that those found through enumeration: as perpetual units should be up unconditionally, perpetually and thus never change state, they should also not pull in deps by their state changing, not even when the state is first set to active. Thus, their state is generally initialized through the per-device coldplug() method in similar fashion to the deserialized state from a previous run would be put into place. OTOH units found through regular enumeration should result in state changes (and thus pull in deps due to state changes), hence their state should be put in effect in the catchup() method instead. Hence, given this difference, let's also separate the functions, so that the rule is: 1. What is created in enumerate_perpetual() should be started in coldplug() 2. What is created in enumerate() should be started in catchup().	2018-06-07 15:29:17 +02:00
Lennart Poettering	f0831ed2a0	core: add a new unit method "catchup()" This is very similar to the existing unit method coldplug() but is called a bit later. The idea is that that coldplug() restores the unit state from before any prior reload/restart, i.e. puts the deserialized state in effect. The catchup() call is then called a bit later, to catch up with the system state for which we missed notifications while we were reloading. This is only really useful for mount, swap and device mount points were we should be careful to generate all missing unit state change events (i.e. call unit_notify() appropriately) for everything that happened while we were reloading.	2018-06-07 15:28:50 +02:00
Lennart Poettering	bbf5fd8e41	core: subscribe to /etc/localtime timezone changes and update timer elapsation accordingly Fixes: #8233 This is our first real-life usecase for the new sd_event_add_inotify() calls we just added.	2018-06-06 10:53:56 +02:00
Lennart Poettering	50be4f4a46	core: rework how we track service and scope PIDs This reworks how systemd tracks processes on cgroupv1 systems where cgroup notification is not reliable. Previously, whenever we had reason to believe that new processes showed up or got removed we'd scan the cgroup of the scope or service unit for new processes, and would tidy up the list of PIDs previously watched. This scanning is relatively slow, and does not scale well. With this change behaviour is changed: instead of scanning for new/removed processes right away we do this work in a per-unit deferred event loop job. This event source is scheduled at a very low priority, so that it is executed when we have time but does not starve other event sources. This has two benefits: this expensive work is coalesced, if events happen in quick succession, and we won't delay SIGCHLD handling for too long. This patch basically replaces all direct invocation of unit_watch_all_pids() in scope.c and service.c with invocations of the new unit_enqueue_rewatch_pids() call which just enqueues a request of watching/tidying up the PID sets (with one exception: in scope_enter_signal() and service_enter_signal() we'll still do unit_watch_all_pids() synchronously first, since we really want to know all processes we are about to kill so that we can track them properly. Moreover, all direct invocations of unit_tidy_watch_pids() and unit_synthesize_cgroup_empty_event() are removed too, when the unit_enqueue_rewatch_pids() call is invoked, as the queued job will run those operations too. All of this is done on cgroupsv1 systems only, and is disabled on cgroupsv2 systems as cgroup-empty notifications are reliable there, and we do not need SIGCHLD events to track processes there. Fixes: #9138	2018-06-05 22:06:48 +02:00
Lennart Poettering	2ad2e41a72	core: don't trigger OnFailure= deps when a unit is going to restart This adds a flags parameter to unit_notify() which can be used to pass additional notification information to the function. We the make the old reload_failure boolean parameter one of these flags, and then add a new flag that let's unit_notify() if we are configured to restart the service. Note that this adjusts behaviour of systemd to match what the docs say. Fixes: #8398	2018-06-01 19:08:30 +02:00
Felipe Sateler	57b7a260c2	core: undo the dependency inversion between unit.h and all unit types	2018-05-15 14:24:34 -04:00
Lennart Poettering	d4fd1cf208	core: enforce that scope units can be started only once Scope units are populated from PIDs specified by the bus client. We do that when a scope is started. We really shouldn't allow scopes to be started multiple times, as the PIDs then might be heavily out of date. Moreover, clients should have the guarantee that any scope they allocate has a clear runtime cycle which is not repetitive.	2018-04-27 21:52:45 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Michal Sekletar	19496554e2	core: delay adding target dependencies until all units are loaded and aliases resolved (#8381 ) Currently we add target dependencies while we are loading units. This can create ordering loops even if configuration doesn't contain any loop. Take for example following configuration, $ systemctl get-default multi-user.target $ cat /etc/systemd/system/test.service [Unit] After=default.target [Service] ExecStart=/bin/true [Install] WantedBy=multi-user.target If we encounter such unit file early during manager start-up (e.g. load queue is dispatched while enumerating devices due to SYSTEMD_WANTS in udev rules) we would add stub unit default.target and we order it Before test.service. At the same time we add implicit Before to multi-user.target. Later we merge two units and we create ordering cycle in the process. To fix the issue we will now never add any target dependencies until we loaded all the unit files and resolved all the aliases.	2018-03-23 15:28:06 +01:00
Zbigniew Jędrzejewski-Szmek	dc409696cf	Introduce _cleanup_(unit_freep)	2018-03-11 16:33:58 +01:00
Lennart Poettering	aa2b6f1d2b	bpf: rework how we keep track and attach cgroup bpf programs So, the kernel's management of cgroup/BPF programs is a bit misdesigned: if you attach a BPF program to a cgroup and close the fd for it it will stay pinned to the cgroup with no chance of ever removing it again (or otherwise getting ahold of it again), because the fd is used for selecting which BPF program to detach. The only way to get rid of the program again is to destroy the cgroup itself. This is particularly bad for root the cgroup (and in fact any other cgroup that we cannot realistically remove during runtime, such as /system.slice, /init.scope or /system.slice/dbus.service) as getting rid of the program only works by rebooting the system. To counter this let's closely keep track to which cgroup a BPF program is attached and let's implicitly detach the BPF program when we are about to close the BPF fd. This hence changes the bpf_program_cgroup_attach() function to track where we attached the program and changes bpf_program_cgroup_detach() to use this information. Moreover bpf_program_unref() will now implicitly call bpf_program_cgroup_detach(). In order to simplify things, bpf_program_cgroup_attach() will now implicitly invoke bpf_program_load_kernel() when necessary, simplifying the caller's side. Finally, this adds proper reference counting to BPF programs. This is useful for working with two BPF programs in parallel: the BPF program we are preparing for installation and the BPF program we so far installed, shortening the window when we detach the old one and reattach the new one.	2018-02-21 16:43:36 +01:00
Lennart Poettering	a94ab7acfd	Merge pull request #8175 from keszybz/gc-cleanup Garbage collection cleanup	2018-02-15 17:47:37 +01:00
Zbigniew Jędrzejewski-Szmek	7f7d01ed58	pid1: include the source unit in UnitRef No functional change. The source unit manages the reference. It allocates the UnitRef structure and registers it in the target unit, and then the reference must be destroyed before the source unit is destroyed. Thus, is should be OK to include the pointer to the source unit, it should be live as long as the reference exists. v2: - rename refs to refs_by_target	2018-02-15 13:27:06 +01:00
Zbigniew Jędrzejewski-Szmek	f2f725e5cc	pid1: rename unit_check_gc to unit_may_gc "check" is unclear: what is true, what is false? Let's rename to "can_gc" and revert the return value ("positive" values are easier to grok). v2: - rename from unit_can_gc to unit_may_gc	2018-02-15 13:04:12 +01:00
Lennart Poettering	6592b9759c	core: add new new bus call for migrating foreign processes to scope/service units This adds a new bus call to service and scope units called AttachProcesses() that moves arbitrary processes into the cgroup of the unit. The primary user for this new API is systemd itself: the systemd --user instance uses this call of the systemd --system instance to migrate processes if itself gets the request to migrate processes and the kernel refuses this due to access restrictions. The primary use-case of this is to make "systemd-run --scope --user …" invoked from user session scopes work correctly on pure cgroupsv2 environments. There, the kernel refuses to migrate processes between two unprivileged-owned cgroups unless the requestor as well as the ownership of the closest parent cgroup all match. This however is not the case between the session-XYZ.scope unit of a login session and the user@ABC.service of the systemd --user instance. The new logic always tries to move the processes on its own, but if that doesn't work when being the user manager, then the system manager is asked to do it instead. The new operation is relatively restrictive: it will only allow to move the processes like this if the caller is root, or the UID of the target unit, caller and process all match. Note that this means that unprivileged users cannot attach processes to scope units, as those do not have "owning" users (i.e. they have now User= field). Fixes: #3388	2018-02-12 11:34:00 +01:00
Lennart Poettering	1d9cc8768f	cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only Currently we allowed delegation for alluntis with cgroup backing except for slices. Let's make this a bit more strict for now, and only allow this in service and scope units. Let's also add a generic accessor unit_cgroup_delegate() for checking whether a unit has delegation turned on that checks the new bool first. Also, when doing transient units, let's explcitly refuse turning on delegation for unit types that don#t support it. This is mostly cosmetical as we wouldn't act on the delegation request anyway, but certainly helpful for debugging.	2018-02-12 11:34:00 +01:00
Lennart Poettering	81e9871e87	selinux: make sure we never use /dev/null for making unit selinux access decisions	2018-01-31 19:54:25 +01:00
Lennart Poettering	adefcf2821	core: rework how we count the n_on_console counter Let's add a per-unit boolean that tells us whether our unit is currently counted or not. This way it's unlikely we get out of sync again and things are generally more robust. This also allows us to remove the counting logic specific to service units (which was in fact mostly a copy from the generic implementation), in favour of fully generic code. Replaces: #7824	2018-01-24 20:14:51 +01:00
Lennart Poettering	bb2c768545	core: add a new unit_needs_console() call This call determines whether a specific unit currently needs access to the console. It's a fancy wrapper around exec_context_may_touch_console() ultimately, however for service units we'll explicitly exclude the SERVICE_EXITED state from when we report true.	2018-01-24 19:54:26 +01:00

1 2 3 4 5

227 commits