Systemd

Author	SHA1	Message	Date
Yu Watanabe	d85ff94477	core: use SYNTHETIC_ERRNO() macro	2020-11-27 14:35:20 +09:00
Yu Watanabe	db9ecf0501	license: LGPL-2.1+ -> LGPL-2.1-or-later	2020-11-09 13:23:58 +09:00
Anita Zhang	4d824a4e0b	core: add ManagedOOM*= properties to configure systemd-oomd on the unit This adds the hook ups so it can be read with the usual systemd utilities. Used in later commits by systemd-oomd.	2020-10-07 16:17:23 -07:00
Yu Watanabe	93c5b90459	core/slice: explicitly specify return value	2020-09-09 02:34:38 +09:00
Zbigniew Jędrzejewski-Szmek	90e74a66e6	tree-wide: define iterator inside of the macro	2020-09-08 12:14:05 +02:00
Michal Sekletár	2884836e3c	core: fix the return value in order to make sure we don't dispatch method return too early Actually, it is the same kind of problem as in `d910f4c` . Basically, we need to return 1 on success code path in slice_freezer_action(). Otherwise we dispatch DBus return message too soon. Fixes: #16050	2020-06-05 16:10:40 +02:00
Zbigniew Jędrzejewski-Szmek	f6e9aa9e45	pid1: convert to the new scheme In all the other cases, I think the code was clearer with the static table. Here, not so much. And because of the existing dump code, the vtables cannot be made static and need to remain exported. I still think it's worth to do the change to have the cmdline introspection, but I'm disappointed with how this came out.	2020-05-05 22:40:37 +02:00
Michal Sekletár	d9e45bc3ab	core: introduce support for cgroup freezer With cgroup v2 the cgroup freezer is implemented as a cgroup attribute called cgroup.freeze. cgroup can be frozen by writing "1" to the file and kernel will send us a notification through "cgroup.events" after the operation is finished and processes in the cgroup entered quiescent state, i.e. they are not scheduled to run. Writing "0" to the attribute file does the inverse and process execution is resumed. This commit exposes above low-level functionality through systemd's DBus API. Each unit type must provide specialized implementation for these methods, otherwise, we return an error. So far only service, scope, and slice unit types provide the support. It is possible to check if a given unit has the support using CanFreeze() DBus property. Note that DBus API has a synchronous behavior and we dispatch the reply to freeze/thaw requests only after the kernel has notified us that requested operation was completed.	2020-04-30 19:02:51 +02:00
Zbigniew Jędrzejewski-Szmek	75193d4128	core: adjust load functions for other unit types to be more like service No functional change, just adjusting code to follow the same pattern everywhere. In particular, never call _verify() on an already loaded unit, but return early from the caller instead. This makes the code a bit easier to follow.	2019-10-11 13:46:05 +02:00
Zbigniew Jędrzejewski-Szmek	c362077087	core: turn unit_load_fragment_and_dropin_optional() into a flag unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional() are really the same, with one minor difference in behaviour. Let's drop the second function. "_optional" in the name suggests that it's the "dropin" part that is optional. (Which it is, but in this case, we mean the fragment to be optional.) I think the new version with a flag is easier to understand.	2019-10-11 10:45:33 +02:00
Chris Down	bc0623df16	cgroup: analyze: Report memory configurations that deviate from systemd This is the most basic consumer of the new systemd-vs-kernel checker, both acting as a reasonable standalone exerciser of the code, and also as a way for easy inspection of deviations from systemd internal state.	2019-10-03 15:06:25 +01:00
Lennart Poettering	9b2559a13e	core: add new call unit_reset_accounting() It's a simple wrapper for resetting both IP and CPU accounting in one go. This will become particularly useful when we also need this to reset IO accounting (to be added in a later commit).	2019-04-12 14:25:44 +02:00
Lennart Poettering	6fcbec6f9b	core: whenever we change state of a unit, force out PropertiesChanged bus signal This allows clients to follow our internal state changes safely. Previously, quick state changes (for example, when we restart a unit due to Restart= after it quickly transitioned through DEAD/FAILED states) would be coalesced into one bus signal event, with this change there's the guarantee that all state changes after the unit was announced once are reflected on the bus. Note we do this kind of guaranteed flushing only for unit state changes, not for other unit property changes, where clients still have to expect coalescing. This is because the unit state is a very important, high-level concept. Fixes: #10185	2018-12-01 12:53:26 +01:00
Lennart Poettering	611c4f8afb	cgroup: rename {manager_owns\|unit_has}_root_cgroup() → .._host_root_cgroup() Let's emphasize that this function checks for the host root cgroup, i.e. returns false for the root cgroup when we run in a container where CLONE_NEWCGROUP is used. There has been some confusion around this already, for example cgroup_context_apply() uses the function incorrectly (which we'll fix in a later commit). Just some refactoring, no change in behaviour.	2018-11-23 12:24:37 +01:00
Lennart Poettering	bea1a01310	strv: wrap strv_new() in a macro so that NULL sentinel is implicit	2018-10-31 18:00:52 +01:00
Lennart Poettering	d68c645bd3	core: rework serialization Let's be more careful with what we serialize: let's ensure we never serialize strings that are longer than LONG_LINE_MAX, so that we know we can read them back with read_line(…, LONG_LINE_MAX, …) safely. In order to implement this all serialization functions are moved to serialize.[ch], and internally will do line size checks. We'd rather skip a serialization line (with a loud warning) than write an overly long line out. Of course, this is just a second level protection, after all the data we serialize shouldn't be this long in the first place. While we are at it also clean up logging: while serializing make sure to always log about errors immediately. Also, (void)ify all calls we don't expect errors in (or catch errors as part of the general fflush_and_check() at the end).	2018-10-26 10:52:41 +02:00
Zbigniew Jędrzejewski-Szmek	5a72417084	pid1: drop unused path parameter to add_two_dependencies_by_name()	2018-09-15 20:02:00 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Lennart Poettering	6f40aa4547	core: add a couple of more error cases that should result in "bad-setting" This changes a number of EINVAL cases to ENOEXEC, so that we enter "bad-setting" state if they fail.	2018-06-11 12:53:12 +02:00
Lennart Poettering	04eb582acc	core: enumerate perpetual units in a separate per-unit-type method Previously the enumerate() callback defined for each unit type would do two things: 1. It would create perpetual units (i.e. -.slice, system.slice, -.mount and init.scope) 2. It would enumerate units from /proc/self/mountinfo, /proc/swaps and the udev database With this change these two parts are split into two separate methods: enumerate() now only does #2, while enumerate_perpetual() is responsible for #1. Why make this change? Well, perpetual units should have a slightly different effect than those found through enumeration: as perpetual units should be up unconditionally, perpetually and thus never change state, they should also not pull in deps by their state changing, not even when the state is first set to active. Thus, their state is generally initialized through the per-device coldplug() method in similar fashion to the deserialized state from a previous run would be put into place. OTOH units found through regular enumeration should result in state changes (and thus pull in deps due to state changes), hence their state should be put in effect in the catchup() method instead. Hence, given this difference, let's also separate the functions, so that the rule is: 1. What is created in enumerate_perpetual() should be started in coldplug() 2. What is created in enumerate() should be started in catchup().	2018-06-07 15:29:17 +02:00
Lennart Poettering	2ad2e41a72	core: don't trigger OnFailure= deps when a unit is going to restart This adds a flags parameter to unit_notify() which can be used to pass additional notification information to the function. We then make the old reload_failure boolean parameter one of these flags, and then add a new flag that lets unit_notify() know if we are configured to restart the service. Note that this adjusts behaviour of systemd to match what the docs say. Fixes: #8398	2018-06-01 19:08:30 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Lennart Poettering	902c8502ad	Merge pull request #8149 from poettering/fake-root-cgroup Properly synthesize CPU+memory accounting data for the root cgroup	2018-03-01 11:10:24 +01:00
Zbigniew Jędrzejewski-Szmek	7f7d01ed58	pid1: include the source unit in UnitRef No functional change. The source unit manages the reference. It allocates the UnitRef structure and registers it in the target unit, and then the reference must be destroyed before the source unit is destroyed. Thus, it should be OK to include the pointer to the source unit, it should be live as long as the reference exists. v2: - rename refs to refs_by_target	2018-02-15 13:27:06 +01:00
Lennart Poettering	cc6271f17d	core: turn on memory/cpu/tasks accounting by default for the root slice The kernel exposes the necessary data in /proc anyway, let's expose it hence by default. With this in place "systemctl status -- -.slice" will show accounting data out-of-the-box now.	2018-02-09 19:07:39 +01:00
Alan Jenkins	d8e5a93382	slice: system.slice should be perpetual like -.mount `-.mount` is placed in `system.slice`, and hence depends on it. `-.mount` is always active and can never be stopped. Therefore the same should be true of `system.slice`. Synthesize it as perpetual (unless systemd is running as a user manager). Notice we also drop `Before=slices.target` as unnecessary. AFAICS the justification for `perpetual` is to provide extra protection against unintentionally stopping every single service. So adding system.slice to the perpetual units is perfectly consistent. I don't expect this will (or can) fix any other problem. And the `perpetual` protection probably isn't formal enough to spend much time thinking about. I've just noticed this a couple of times, as something that looks strange. Might be a bit surprising that we have user.slice on-disk but not system.slice, but I think it's ok. `systemctl status system.slice` will still point you towards `man systemd.special`. The only detail is that the system slice disables `DefaultDependencies`. If you're worrying about how system shutdown works when you read `man systemd.slice`, I think it is not too hard to guess that system.slice might do this: > Only slice units involved with early boot > or late system shutdown should disable this option (Docs are great. I really appreciate the systemd ones).	2018-02-04 22:51:34 +00:00
Alan Jenkins	0c79456781	slice, scope: IgnoreOnIsolate=yes is already the default `IgnoreOnIsolate=yes` is the default for slices and scopes. So it's not essential to set it on root.slice or init.scope. We don't need to worry about a bad unit file configuration. Any attempt to stop these units should fail, since we mark them as `perpetual`. Also since init.scope cannot be stopped, there is no point setting `KillSignal=SIGRTMIN+14`. According to both documentation and testing, KillSignal= does not affect the behaviour of `systemctl kill`.	2018-02-04 22:51:34 +00:00
Zbigniew Jędrzejewski-Szmek	a789420775	core: reuse slice_build_parent_slice	2017-12-15 14:57:07 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	eef85c4a3f	core: track why unit dependencies came to be This replaces the dependencies Set* objects by Hashmap* objects, where the key is the depending Unit, and the value is a bitmask encoding why the specific dependency was created. The bitmask contains a number of different, defined bits, that indicate why dependencies exist, for example whether they are created due to explicitly configured deps in files, by udev rules or implicitly. Note that memory usage is not increased by this change, even though we store more information, as we manage to encode the bit mask inside the value pointer each Hashmap entry contains. Why this all? When we know how a dependency came to be, we can update dependencies correctly when a configuration source changes but others are left unaltered. Specifically: 1. We can fix UDEV_WANTS dependency generation: so far we kept adding dependencies configured that way, but if a device lost such a dependency we couldn't remove them again as there was no scheme for removing of dependencies in place. 2. We can implement "pin-pointed" reload of unit files. If we know what dependencies were created as result of configuration in a unit file, then we know what to flush out when we want to reload it. 3. It's useful for debugging: "systemd-analyze dump" now shows this information, helping substantially with understanding how systemd's dependency tree came to be the way it came to be.	2017-11-10 19:45:29 +01:00
Daniel Mack	906c06f64a	cgroup, unit, fragment parser: make use of new firewall functions	2017-09-22 15:24:55 +02:00
Lennart Poettering	a581e45ae8	unit: unify some code with new unit_new_for_name() call	2016-11-02 11:29:59 -06:00
Lennart Poettering	f5869324e3	core: rework the "no_gc" unit flag to become a more generic "perpetual" flag So far "no_gc" was set on -.slice and init.scope, two units that are always running, cannot be stopped and never exist in an "inactive" state. Since these units are the only users of this flag, let's remodel it and rename it "perpetual" and let's derive more functionality off it. Specifically, refuse enqueuing stop jobs for these units, and report that they are "unstoppable" in the CanStop bus property.	2016-11-02 11:29:59 -06:00
Lennart Poettering	8e4e851f1d	core: move initialization of -.slice and init.scope into the unit_load() callbacks Previously, we'd synthesize the root slice unit and the init scope unit in the enumerator callbacks for the unit type. This is problematic if either of them is already referenced from a unit that is loaded as result of another unit type's enumerator logic. Let's clean this up and simply create the two objects from the enumerator callbacks, if they are not around yet. Do the actual filling in of the settings from the unit_load() callbacks, to match how other units are loaded. Fixes: #4322	2016-10-24 20:46:30 +02:00
Lennart Poettering	4b58153dd2	core: add "invocation ID" concept to service manager This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from an inactive to an activating or active state. The primary usecase for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as environment variable. It is additionally stored as extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID.
The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation race-freely from the messages.	2016-10-07 20:14:38 +02:00
Zbigniew Jędrzejewski-Szmek	ce99c68a33	Move no_instances information to shared/ This way it can be used in install.c in subsequent commit.	2016-05-01 19:58:59 -04:00
Zbigniew Jędrzejewski-Szmek	8a993b61d1	Move no_alias information to shared/ This way it can be used in install.c in subsequent commit.	2016-05-01 19:40:51 -04:00
Lennart Poettering	4f4afc88ec	core: rework how transient unit files and property drop-ins work With this change the logic for placing transient unit files and drop-ins generated via "systemctl set-property" is reworked. The latter are now placed in the newly introduced "control" unit file directory. The former are now placed in the "transient" unit file directory. Note that the properties originally set when a transient unit was created will be written to and stay in the transient unit file directory, while later changes are done via drop-ins. This is preparation for a later "systemctl revert" addition, where existing drop-ins are flushed out, but the original transient definition is restored.	2016-04-12 13:43:32 +02:00
Lennart Poettering	1b4cd0cf11	core: exclude .slice units from "systemctl isolate" Fixes: #1969	2016-02-20 22:42:29 +01:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Thomas Hindoe Paaboel Andersen	cf0fbc49e6	tree-wide: sort includes Sort the includes according to the new coding style.	2015-11-16 22:09:36 +01:00
Lennart Poettering	17f62e9bd0	core: enable transient unit support for slice units	2015-11-13 19:50:52 +01:00
Lennart Poettering	4c9ea260ae	core: simplify things a bit by checking default_dependencies boolean in callee, not caller It's nicer to hide the check away in the various xyz_add_default_dependencies() calls, rather than making it explicit in the caller, and thus require deeper nesting.	2015-11-11 20:42:39 +01:00
Lennart Poettering	ba64af90ec	core: change return value of the unit's enumerate() call to void We cannot handle enumeration failures in a sensible way, hence let's try hard to continue without making such failures fatal, and log about it with precise error messages.	2015-11-10 21:03:49 +01:00
Lennart Poettering	b5efdb8af4	util-lib: split out allocation calls into alloc-util.[ch]	2015-10-27 13:45:53 +01:00
Lennart Poettering	07630cea1f	util-lib: split our string related calls from util.[ch] into its own file string-util.[ch] There are more than enough calls doing string manipulations to deserve its own files, hence do something about it. This patch also sorts the #include blocks of all files that needed to be updated, according to the sorting suggestions from CODING_STYLE. Since pretty much every file needs our string manipulation functions this effectively means that most files have sorted #include blocks now. Also touches a few unrelated include files.	2015-10-24 23:05:02 +02:00
Lennart Poettering	417800228f	core: ignore -.slice and init.scope when isolating Otherwise, we might end up trying to isolate it away when starting user instances. While we are at it, also prohibit manual start/stop of these two units. Fixes: #1507	2015-10-09 17:20:32 +02:00
Zbigniew Jędrzejewski-Szmek	7e55de3b96	Move all unit states to basic/ and extend systemctl --state=help	2015-09-28 15:09:34 -04:00
Lennart Poettering	efdb02375b	core: unified cgroup hierarchy support This patch set adds full support for the new unified cgroup hierarchy logic of modern kernels. A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is added. If specified the unified hierarchy is mounted to /sys/fs/cgroup instead of a tmpfs. No further hierarchies are mounted. The kernel command line option defaults to off. We can turn it on by default as soon as the kernel's APIs regarding this are stabilized (but even then downstream distros might want to turn this off, as this will break any tools that access cgroupfs directly). It is possible to choose for each boot individually whether the unified or the legacy hierarchy is used. nspawn will by default provide the legacy hierarchy to containers if the host is using it, and the unified otherwise. However it is possible to run containers with the unified hierarchy on a legacy host and vice versa, by setting the $UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0, respectively. The unified hierarchy provides reliable cgroup empty notifications for the first time, via inotify. To make use of this we maintain one manager-wide inotify fd, and add each cgroup to it. This patch also removes cg_delete() which is unused now. On kernel 4.2 only the "memory" controller is compatible with the unified hierarchy, hence that's the only controller systemd exposes when booted in unified hierarchy mode. This introduces a new enum for enumerating supported controllers, plus a related enum for the mask bits mapping to it. The core is changed to make use of this everywhere. This moves PID 1 into a new "init.scope" implicit scope unit in the root slice. This is necessary since on the unified hierarchy cgroups may either contain subgroups or processes but not both.
PID 1 hence has to move out of the root cgroup (strictly speaking the root cgroup is the only one where processes and subgroups are still allowed, but in order to support containers nicely, we move PID 1 into the new scope in all cases.) This new unit is also used on legacy hierarchy setups. It's actually pretty useful on all systems, as it can then be used to filter journal messages coming from PID 1, and so on. The root slice ("-.slice") is now implicitly created and started (and does not require a unit file on disk anymore), since that's where "init.scope" is located and the slice needs to be started before the scope can. To check whether we are in unified or legacy hierarchy mode we use statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in legacy mode, if it reports cgroupfs we are in unified mode. This patch set carefully makes sure that cgls and cgtop continue to work as desired. When invoking nspawn as a service it will implicitly create two subcgroups in the cgroup it is using, one to move the nspawn process into, the other to move the actual container processes into. This is done because of the requirement that cgroups may either contain processes or other subgroups.	2015-09-01 23:52:27 +02:00
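
The statfs() check described in the last commit above (efdb02375b) can be sketched roughly as follows. This is an illustration only, not the code from that commit: the names CgMode, classify_cgroup_fs() and detect_cgroup_mode() are hypothetical, and the sketch assumes that current kernels report CGROUP2_SUPER_MAGIC for a unified hierarchy, which may differ from the magic value the 2015 commit tested for.

```c
#include <sys/vfs.h>      /* statfs() */
#include <linux/magic.h>  /* TMPFS_MAGIC, CGROUP2_SUPER_MAGIC */

/* Fallbacks in case the installed kernel headers are too old. */
#ifndef TMPFS_MAGIC
#define TMPFS_MAGIC 0x01021994
#endif
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/* Hypothetical names, used only for this sketch. */
typedef enum { CG_MODE_UNKNOWN, CG_MODE_LEGACY, CG_MODE_UNIFIED } CgMode;

/* Map a filesystem type to a hierarchy mode: a tmpfs at the mount point
 * means legacy (per-controller hierarchies are mounted below it), while
 * a cgroup2 filesystem mounted there directly means unified. */
static CgMode classify_cgroup_fs(long f_type) {
        if (f_type == TMPFS_MAGIC)
                return CG_MODE_LEGACY;
        if (f_type == CGROUP2_SUPER_MAGIC)
                return CG_MODE_UNIFIED;
        return CG_MODE_UNKNOWN;
}

/* Query the filesystem mounted at path (normally "/sys/fs/cgroup"). */
static CgMode detect_cgroup_mode(const char *path) {
        struct statfs fs;

        if (statfs(path, &fs) < 0)
                return CG_MODE_UNKNOWN; /* not mounted or not accessible */

        return classify_cgroup_fs((long) fs.f_type);
}
```

A caller would pass "/sys/fs/cgroup" to detect_cgroup_mode() and treat CG_MODE_UNKNOWN as "could not determine" rather than as an error in its own right.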


67 commits