Systemd

Author	SHA1	Message	Date
Lennart Poettering	a4634b214c	core: warn about left-over processes in cgroup on unit start Now that we don't kill control processes anymore, let's at least warn about any processes left-over in the unit cgroup at the moment of starting the unit.	2017-11-25 17:08:21 +01:00
Lennart Poettering	3c7416b6ca	core: unify common code for preparing for forking off unit processes This introduces a new function unit_prepare_exec() that encapsulates a number of calls we do in preparation for spawning off some processes in all our unit types that do so. This allows us to neatly unify a bit of code between unit types and shorten our code.	2017-11-21 11:54:08 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	d3070fbdf6	core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald And let's make use of it to implement two new unit settings with it: 1. LogLevelMax= is a new per-unit setting that may be used to configure log priority filtering: set it to LogLevelMax=notice and only messages of level "notice" and lower (i.e. more important) will be processed, all others are dropped. 2. LogExtraFields= is a new per-unit setting for configuring per-unit journal fields, that are implicitly included in every log record generated by the unit's processes. It takes field/value pairs in the form of FOO=BAR. Also, related to this, one existing unit setting is ported to this new facility: 3. The invocation ID is now pulled from /run/systemd/units/ instead of cgroupfs xattrs. This substantially relaxes requirements of systemd on the kernel version and the privileges it runs with (specifically, cgroupfs xattrs are not available in containers, since they are stored in kernel memory, and hence are unsafe to permit to lesser privileged code). /run/systemd/units/ is a new directory, which contains a number of files and symlinks encoding the above information. PID 1 creates and manages these files, and journald reads them from there. Note that this is supposed to be a direct path between PID 1 and the journal only, due to the special runtime environment the journal runs in. Normally, today we shouldn't introduce new interfaces that (mis-)use a file system as IPC framework, and instead just use an IPC system, but this is very hard to do between the journal and PID 1, as long as the IPC system is a subject PID 1 manages, and itself a client to the journal. This patch cleans up a couple of types used in journal code: specifically we switch to size_t for a couple of memory-sizing values, as size_t is the right choice for everything that is memory. Fixes: #4089 Fixes: #3041 Fixes: #4441	2017-11-16 12:40:17 +01:00
Yu Watanabe	74b1731c75	core/mount: fstype may be NULL	2017-11-12 14:27:25 +01:00
Lennart Poettering	3e3852b3c6	core: make "tmpfs" dependencies on swapfs a "default" dep, not an "implicit" There should be a way to turn this logic off, and DefaultDependencies= appears to be the right option for that, hence let's downgrade this dependency type from "implicit" to "default", and thus honour DefaultDependencies=. This also drops mount_get_fstype() as we only have a single user needing this now. A follow-up for #7076.	2017-11-10 19:52:41 +01:00
Lennart Poettering	eef85c4a3f	core: track why unit dependencies came to be This replaces the dependencies Set* objects by Hashmap* objects, where the key is the depending Unit, and the value is a bitmask encoding why the specific dependency was created. The bitmask contains a number of different, defined bits, that indicate why dependencies exist, for example whether they are created due to explicitly configured deps in files, by udev rules or implicitly. Note that memory usage is not increased by this change, even though we store more information, as we manage to encode the bit mask inside the value pointer each Hashmap entry contains. Why this all? When we know how a dependency came to be, we can update dependencies correctly when a configuration source changes but others are left unaltered. Specifically: 1. We can fix UDEV_WANTS dependency generation: so far we kept adding dependencies configured that way, but if a device lost such a dependency we couldn't remove them again as there was no scheme for removing dependencies in place. 2. We can implement "pin-pointed" reload of unit files. If we know what dependencies were created as result of configuration in a unit file, then we know what to flush out when we want to reload it. 3. It's useful for debugging: "systemd-analyze dump" now shows this information, helping substantially with understanding how systemd's dependency tree came to be the way it came to be.	2017-11-10 19:45:29 +01:00
Alan Jenkins	79aafbd122	core: distinguish "Killing"/"Terminating"/"Stopping" for mount unit timeout Update the timeout warnings for remount and unmount. For consistency with mount, for accuracy, and for consistency with their equivalents in service.c.	2017-11-01 15:28:50 +00:00
Michal Sekletar	fab35afabf	mount: make sure we unmount tmpfs mounts before we deactivate swaps (#7076) In the past we introduced this property just for tmp.mount. However on today's systems usually there are many more tmpfs mounts. Most notably mounts backing XDG_RUNTIME_DIR for each user. Let's generalize what we already have for tmp.mount and implement the ordering After=swap.target for all tmpfs based mounts.	2017-10-16 16:15:05 +02:00
Lennart Poettering	ed77d407d3	core: log unit failure with type-specific result code This slightly changes how we log about failures. Previously, service_enter_dead() would log that a service unit failed along with its result code, and unit_notify() would do this again but without the result code. For other unit types only the latter would take effect. This cleans this up: we keep the message in unit_notify() only for debug purposes, and add type-specific log lines to all our unit types that can fail, and always place them before unit_notify() is invoked. Or in other words: the duplicate log message for service units is removed, and all other unit types get a more useful line with the precise result code.	2017-09-27 18:26:18 +02:00
Lennart Poettering	c634f3d2fc	mount: rename mount_state_active() → MOUNT_STATE_WITH_PROCESS() The function returns true for all states that have a control process running, and each time we call it that's what we want to know, hence let's rename it accordingly. Moreover, the more generic unit states have an ACTIVE state, and it is defined quite differently from the set of states this function returns true for, hence let's avoid confusion and not reuse the word "ACTIVE" here in a different context. Finally, let's uppercase this, since in most ways it's pretty much identical to a macro	2017-09-26 16:17:22 +02:00
Lennart Poettering	22af0e5873	mount: rework mount state engine This changes the mount unit state engine in the following ways: 1. The MOUNT_MOUNTING_SIGTERM and MOUNT_MOUNTING_SIGKILL are removed. They have been pretty much equivalent to MOUNT_UNMOUNTING_SIGTERM and MOUNT_UNMOUNTING_SIGKILL in what they do, and the outcome has been the same as well: the unit is stopped. Hence, let's simplify things a bit, and merge them. Note that we keep MOUNT_REMOUNTING_{SIGTERM|SIGKILL} however, as those states have a different outcome: the unit remains started. 2. mount_enter_signal() will now honour the SendSIGKILL= option of the mount unit if it was set. This was previously done already when we entered the signal states through a timeout, and was simply missing here. 3. A new helper function mount_enter_dead_or_mounted() is added that places the mount unit in either MOUNT_DEAD or MOUNT_MOUNTED, depending on what the kernel thinks about the mount's state. This function is called at various places now, wherever we finished an operation, and want to make sure our own state reflects again what the kernel thinks. Previously we had very similar code in a number of places and in other places didn't recheck the kernel state. Let's do that with the same logic and function at all relevant places now. 4. Rework mount_stop(): never forget about running control processes. Instead: when we have a start (i.e. a /bin/mount) process running, and are asked to stop, then enter the kill states for it, so that it gets cleaned up. This fixes #6048. Moreover, when we have a reload process running convert the possible states into the relevant unmounting states, so that we can properly execute the requested operation. Fixes #6048	2017-09-26 16:17:22 +02:00
Lennart Poettering	850b741084	mount: clean up reload_result management a bit Let's only collect the first failure in the load result, and let's clear it explicitly when we are about to enter a new reload operation. This makes it more alike the handling of the main result value (which also only stores the first failure), and also the handling of service.c's reload state.	2017-09-26 16:17:22 +02:00
Daniel Mack	906c06f64a	cgroup, unit, fragment parser: make use of new firewall functions	2017-09-22 15:24:55 +02:00
Lennart Poettering	18f573aaf9	core: make sure to dump cgroup context when unit_dump() is called for all unit types For some reason we didn't dump the cgroup context for a number of unit types, including service units. Not sure how this wasn't noticed before... Add this in.	2017-09-22 15:24:54 +02:00
Lennart Poettering	1703fa41a7	core: rename EXEC_APPLY_PERMISSIONS → EXEC_APPLY_SANDBOXING "Permissions" was a bit of a misnomer, as it suggests that UNIX file permission bits are adjusted, which aren't really changed here. Instead, this is about UNIX credentials such as users or groups, as well as namespacing, hence let's use a more generic term here, without any misleading reference to UNIX file permissions: "sandboxing", which shall refer to all kinds of sandboxing technologies, including UID/GID dropping, selinux relabelling, namespacing, seccomp, and so on.	2017-08-10 15:02:50 +02:00
Lennart Poettering	f0d477979e	core: introduce unit_set_exec_params() The new unit_set_exec_params() call is to units what manager_set_exec_params() is to the manager object: it initializes the various fields from the relevant generic properties set.	2017-08-10 15:02:50 +02:00
Lennart Poettering	19bbdd985e	core: manager_set_exec_params() cannot fail, hence make it void Let's simplify things a bit.	2017-08-10 15:02:50 +02:00
Lennart Poettering	584b8688d1	execute: also fold the cgroup delegate bit into ExecFlags	2017-08-10 15:02:50 +02:00
Abdó Roig-Maranges	1df96fcb31	core: Do not fail perpetual mount units without fragment (#6459) mount_load does not require fragment files to be present in order to load mount units which are perpetual, or come from /proc/self/mountinfo. mount_verify should do the same, otherwise a synthesized '-.mount' would be marked as failed with "No such file or directory", as it is perpetual but not marked to come from /proc/self/mountinfo at this point. This happens for the user instance, and I suspect it was the cause of #5375 for the system instance, without gpt-generator.	2017-07-31 12:32:09 +02:00
Yu Watanabe	3536f49e8f	core: add {State,Cache,Log,Configuration}Directory= (#6384) This introduces {State,Cache,Log,Configuration}Directory=, which are similar to RuntimeDirectory=. They create the directories under /var/lib, /var/cache, /var/log, or /etc, respectively, with the mode specified in {State,Cache,Log,Configuration}DirectoryMode=. This also fixes #6391.	2017-07-18 14:34:52 +02:00
NeilBrown	83897d5470	core/mount: pass "-c" flag to /bin/umount (#6093) "-c", which is short for "--no-canonicalize", tells /bin/umount that the path name is canonical (no .. or symlinks etc). systemd always uses a canonical name, so this flag is appropriate for systemd to use. Knowing that the path is canonical allows umount to avoid some calls to lstat() on the path. From v2.30 "-c" goes further and causes umount to avoid all attempts to 'lstat()' (or similar) the path. This is important when automatically unmounting a filesystem, as lstat() can hang indefinitely in some cases such as when an NFS server is not accessible. "-c" has been supported since util-linux 2.17 which is before the earliest version supported by systemd. So "-c" is safe to use now, and once util-linux v2.30 is in use, it will allow mounts from non-responsive NFS servers to be unmounted.	2017-06-07 15:28:23 +03:00
Zbigniew Jędrzejewski-Szmek	ce954c0319	core/mount: remove repeated word	2017-02-02 11:18:34 -05:00
Yu Watanabe	cfcd431890	core: add missing unit_add_to_load_queue() to mount_setup_new_unit() unit_add_to_load_queue was present in the code before `03b8cfede9`, and was inadvertently dropped. Fixes #5105	2017-01-23 14:06:43 +09:00
Yu Watanabe	a51ee72d2e	core: minor error handling fix in mount_setup_new_unit() The function mount_setup_new_unit() should return -ENOMEM if at least one of the `strdup` calls fails.	2017-01-23 13:59:21 +09:00
Franck Bui	03b8cfede9	core: make sure to init mount params before calling mount_is_extrinsic() (#5087) When a new entry appears in /proc/self/mountinfo, mount_setup_unit() allocates a new mount unit for it and starts initializing it. mount_setup_unit() is also used to update a mount unit when a change happens in /proc/self/mountinfo, for example a mountpoint can be remounted with additional mount options. This patch introduces 2 separate functions to deal with those 2 cases instead of mount_setup_unit() dealing with both of them. The common code is small and doing the split makes the code easier to read and less error prone if extended later. It also makes sure to initialize in both functions the mount parameters of the mount unit before calling mount_is_extrinsic() since this function relies on them. Fixes: #4902	2017-01-16 15:19:13 -05:00
Franck Bui	ebc8968bc0	core: make mount units from /proc/self/mountinfo possibly bind to a device (#4515) Since commit `9d06297`, mount units from mountinfo are not bound to their devices anymore (they use the "Requires" dependency instead). This has the following drawback: if a media is mounted and the eject button is pressed then the media is unconditionally ejected leaving some inconsistent states. Since udev is the component that is reacting (no matter if the device is used or not) to the eject button, users expect that udev at least try to unmount the media properly. This patch introduces a new property "SYSTEMD_MOUNT_DEVICE_BOUND". When set on a block device, all units that require this device will see their "Requires" dependency upgraded to a "BindTo" one. This is currently only used by cdrom devices. This patch also gives the possibility to the user to restore the previous behavior, that is, to bind a mount unit to a device. This is achieved by passing the "x-systemd.device-bound" option to mount(8). Please note that currently this is not working because libmount treats the x-* options as comments, therefore they're not available in utab for later application retrievals.	2016-12-16 17:13:58 +01:00
Lennart Poettering	ad2706db7c	core: rework logic to determine when we decide to add automatic deps for mounts This adds a concept of "extrinsic" mounts. If mounts are extrinsic we consider them managed by something else and do not add automatic ordering against umount.target, local-fs.target, remote-fs.target. Extrinsic mounts are considered: - All mounts if we are running in --user mode - API mounts such as everything below /proc, /sys, /dev, which exist from earliest boot to latest shutdown. - All mounts marked as initrd mounts, if we run on the host - The initrd's private directory /run/initramfs that should survive until last reboot. This primarily merges a couple of different exclusion lists into a single concept.	2016-12-14 10:13:52 +01:00
Lennart Poettering	c9d5c9c0e1	core: make unit_free() accept NULL pointers We generally try to make our destructors robust regarding NULL pointers, much in the same way as glibc's free(). Do this also for unit_free(). Follow-up for #4748.	2016-12-01 00:25:51 +01:00
Franck Bui	7d5ceb6416	core: allow to redirect confirmation messages to a different console It's rather hard to parse the confirmation messages (enabled with systemd.confirm_spawn=true) amongst the status messages and the kernel ones (if enabled). This patch gives the possibility to the user to redirect the confirmation message to a different virtual console, either by giving its name or its path, so those messages are separated from the other ones and easier to read.	2016-11-17 18:16:16 +01:00
Zbigniew Jędrzejewski-Szmek	f97b34a629	Rename formats-util.h to format-util.h We don't have plural in the name of any other -util files and this inconsistency trips me up every time I try to type this file name from memory. "formats-util" is even hard to pronounce.	2016-11-07 10:15:08 -05:00
Lennart Poettering	1201cae704	core: change mount_synthesize_root() return to int Let's propagate the error here, instead of eating it up early. In a later change we should probably also change mount_enumerate() to propagate errors up, but that would mean we'd have to change the unit vtable, and thus change all unit types, hence is quite an invasive change.	2016-11-02 11:39:49 -06:00
Lennart Poettering	a581e45ae8	unit: unify some code with new unit_new_for_name() call	2016-11-02 11:29:59 -06:00
Lennart Poettering	11222d0fe0	core: make the root mount perpetual too Now that we have a proper concept of "perpetual" units, let's make the root mount one too, since it also cannot go away.	2016-11-02 11:29:59 -06:00
Zbigniew Jędrzejewski-Szmek	3b319885c4	tree-wide: introduce free_and_replace helper It's a common pattern, so add a helper for it. A macro is necessary because a function that takes a pointer to a pointer would be type specific, similarly to cleanup functions. Seems better to use a macro.	2016-10-16 23:35:39 -04:00
Zbigniew Jędrzejewski-Szmek	b744e8937c	Merge pull request #4067 from poettering/invocation-id Add an "invocation ID" concept to the service manager	2016-10-11 13:40:50 -04:00
Lennart Poettering	052364d41f	core: simplify if branches a bit We do the same thing in two branches, let's merge them. Let's also add an explanatory comment, while we are at it.	2016-10-10 22:57:02 +02:00
Lennart Poettering	f2aed3070d	core: make use of IN_SET() in various places in mount.c	2016-10-10 22:57:02 +02:00
Lennart Poettering	1f0958f640	core: when determining whether a process exit status is clean, consider whether it is a command or a daemon SIGTERM should be considered a clean exit code for daemons (i.e. long-running processes, as a daemon without SIGTERM handler may be shut down without issues via SIGTERM still) while it should not be considered a clean exit code for commands (i.e. short-running processes). Let's add two different clean checking modes for this, and use the right one at the appropriate places. Fixes: #4275	2016-10-10 22:57:01 +02:00
Lennart Poettering	4b58153dd2	core: add "invocation ID" concept to service manager This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from an inactive to an activating or active state. The primary usecase for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as environment variable. It is additionally stored as extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation race-freely from the messages.	2016-10-07 20:14:38 +02:00
Barron Rulon	49915de245	mount: add SloppyOptions= to mount_dump()	2016-08-27 10:47:46 -04:00
Barron Rulon	4f8d40a9dc	mount: add new ForceUnmount= setting for mount units, mapping to umount(8)'s "-f" switch	2016-08-27 10:46:52 -04:00
brulon	e520950a03	mount: add new LazyUnmount= setting for mount units, mapping to umount(8)'s "-l" switch (#3827)	2016-08-26 17:57:22 +02:00
Lennart Poettering	00d9ef8560	core: add RemoveIPC= setting This adds the boolean RemoveIPC= setting to service, socket, mount and swap units (i.e. all unit types that may invoke processes). If turned on, and the unit's user/group is not root, all IPC objects of the user/group are removed when the service is shut down. The life-cycle of the IPC objects is hence bound to the unit life-cycle. This is particularly relevant for units with dynamic users, as it is essential that no objects owned by the dynamic users survive the service exiting. In fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set. In order to communicate the UID/GID of an executed process back to PID 1 this adds a new "user lookup" socket pair, that is inherited into the forked processes, and closed before the exec(). This is needed since we cannot do NSS from PID 1 due to deadlock risks, but we need to know the used UID/GID in order to clean up IPC owned by it if the unit shuts down.	2016-08-19 00:37:25 +02:00
Lennart Poettering	a0fef983ab	core: remember first unit failure, not last unit failure Previously, the result value of a unit was overridden with each failure that took place, so that the result always reported the last failure that took place. With this commit this is changed, so that the first failure taking place is stored instead. This should normally not matter much as multiple failures are sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT to a service due to a watchdog timeout, then this currently would be reported as "coredump" failure, rather than the "watchdog" failure it really is. Hence, in order to report information about the type of the failure, and not about the effect of it, let's change this for all unit types to store the first, not the last failure. This addresses the issue pointed out here: https://github.com/systemd/systemd/pull/3818#discussion_r73433520	2016-08-04 23:08:05 +02:00
Lennart Poettering	c39f1ce24d	core: turn various execution flags into a proper flags parameter The ExecParameters structure contains a number of bit-flags, that were so far exposed as bool:1, change this to a proper, single binary bit flag field. This makes things a bit more expressive, and is helpful as we add more flags, since these booleans are passed around in various callers, for example service_spawn(), whose signature can be made much shorter now. Not all bit booleans from ExecParameters are moved into the flags field for now, but this can be added later.	2016-08-04 16:27:07 +02:00
Lennart Poettering	eb18df724b	Merge pull request #2471 from michaelolbrich/transient-mounts allow transient mounts and automounts	2016-08-04 16:16:04 +02:00
Lennart Poettering	29206d4619	core: add a concept of "dynamic" user ids, that are allocated as long as a service is running This adds a new boolean setting DynamicUser= to service files. If set, a new user will be allocated dynamically when the unit is started, and released when it is stopped. The user ID is allocated from the range 61184..65519. The user will not be added to /etc/passwd (but an NSS module to be added later should make it show up in getent passwd). For now, care should be taken that the service writes no files to disk, since this might result in files owned by UIDs that might get assigned dynamically to a different service later on. Later patches will tighten sandboxing in order to ensure that this cannot happen, except for a few selected directories. A simple way to test this is: systemd-run -p DynamicUser=1 /bin/sleep 99999	2016-07-22 15:53:45 +02:00
Lennart Poettering	cf6f7f66a4	core: add minor comment Let's explain #3444 briefly in the sources, too.	2016-06-06 22:03:31 +02:00
michaelolbrich	53203e5f8f	mount: make sure we get into MOUNT_DEAD state after a successful umount (#3444) Without this code the following can happen: 1. Open a file to keep a mount busy 2. Try to stop the corresponding mount unit with systemctl -> umount fails and the failure is remembered in mount->result 3. Close the file and umount the filesystem manually -> mount_dispatch_io() calls "mount_enter_dead(mount, MOUNT_SUCCESS)" -> Old error in mount->result is reused and the mount unit enters a failed state Clear the old error result when 'mountinfo' reports a successful umount to fix this.	2016-06-06 21:59:51 +02:00
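
Commit eef85c4a3f in the log above describes storing, per dependency, a bitmask of reasons inside the value pointer of each Hashmap entry, so that no extra memory is needed. A minimal sketch of that idea follows. It is not the code from the commit: the enum, its bit names, and the three helper functions are hypothetical and only illustrate packing a small mask into a pointer-sized value and dropping the bits that belong to one configuration source.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical origin bits; the names used in systemd itself may differ. */
enum {
    DEP_ORIGIN_FILE     = 1u << 0, /* configured explicitly in a unit file */
    DEP_ORIGIN_UDEV     = 1u << 1, /* created because of a udev rule */
    DEP_ORIGIN_IMPLICIT = 1u << 2  /* added implicitly by the manager */
};

/* Pack a small bitmask into a pointer-sized hashmap value. */
static inline void *dep_mask_to_ptr(unsigned mask) {
    /* The mask must fit into a pointer-sized integer. */
    assert(sizeof(unsigned) <= sizeof(uintptr_t) || mask <= UINTPTR_MAX);
    return (void *) (uintptr_t) mask;
}

/* Recover the bitmask from the stored value. */
static inline unsigned dep_ptr_to_mask(const void *p) {
    return (unsigned) (uintptr_t) p;
}

/* Clear the bits contributed by one origin. Returns 1 if the dependency
 * still has another reason to exist, 0 if the entry can be removed. */
static inline int dep_drop_origin(void **value, unsigned origin) {
    unsigned mask = dep_ptr_to_mask(*value) & ~origin;

    *value = dep_mask_to_ptr(mask);
    return mask != 0;
}
```

Under these assumptions, reloading one source (say, a udev rule) would call dep_drop_origin() with that source's bit on each entry and remove only the entries for which it returns 0, leaving dependencies that also come from another source in place.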


252 commits