Commit graph

258 commits

Author SHA1 Message Date
Alan Jenkins 5701836121 mount: clarify that umount retries do not (anymore) allow multiple timeouts
It _looks_ as if, back when we used to retry unsuccessful calls to umount,
this would have inflated the effective timeout.  Multiplying it by
RETRY_UMOUNT_MAX.  Which is set to 32.

I'm surprised if it's true: I would have expected it to be noticed during
the work on NFS timeouts.  But I can't see what would have stopped it.

Clarify that I do not expect this to happen anymore.  I think each
individual umount call is allowed up to the full timeout, but if umount
ever exited with a signal status, we would stop retrying.

To be extra clear, make sure that we do not retry in the event that umount
perversely returned EXIT_SUCCESS after receiving SIGTERM.
2018-01-23 11:09:18 +00:00
Alan Jenkins 006aabbd05 mount: mountinfo event is supposed to always arrive before SIGCHLD
"Due to the io event priority logic we can be sure the new mountinfo is
loaded before we process the SIGCHLD for the mount command."

I think this is a reasonable expectation.  But if it works, then the
other comment must be false:

"Note that mount(8) returning and the kernel sending us a mount table
change event might happen out-of-order."

Therefore we can clean up the code for the latter.

If this is working as advertised, then we can make sure that mount units
fail if the mount we thought we were creating did not actually appear,
due to races or trickery (or because /sbin/mount did something unexpected
despite returning EXIT_SUCCESS).

Include a specific warning message for this failure.

If we give up when the mount point is still mounted after 32 successful
calls to /sbin/umount, that seems a fairly similar case.  So make that
message a LOG_WARN as well (not LOG_DEBUG). Also, this was recently changed to only
retry while umount is returning EXIT_SUCCESS; in that case in particular
there would be no other messages in the log to suggest what had happened.
2018-01-23 11:09:06 +00:00
Alan Jenkins 25cd49647c mount: forbid mount on path with symlinks
It was forbidden to create mount units for a symlink.  But the reason is
that the mount unit needs to know the real path that will appear in
/proc/self/mountinfo.  The kernel dereferences *all* the symlinks in the
path at mount time (I checked this with `mount -c` running under `strace`).

This will have no effect on most systems.  As recommended by docs, most
systems use /etc/fstab, as opposed to native mount unit files.
fstab-generator dereferences symlinks for backwards compatibility.

A relatively minor issue regarding Time Of Check / Time Of Use also exists
here.  I can't see how to get rid of it entirely.  If we pass an absolute
path to mount, the racing process can replace it with a symlink.  If we
chdir() to the mount point and pass ".", the racing process can move the
directory.  The latter might potentially be nicer, except that it breaks
WorkingDirectory=.

I'm not saying the race is relevant to security - I just want to consider
how bad the effect is.  Currently, it can make the mount unit active (and
hence the job return success), despite there never being a matching entry
in /proc/self/mountinfo.  This wart will be removed in the next commit;
i.e. it will make the mount unit fail instead.
2018-01-20 22:06:34 +00:00
Alan Jenkins 3cc9685649 core: prevent spurious retries of umount
Testing the previous commit with `systemctl stop tmp.mount` logged the
reason for failure as expected, but unexpectedly the message was repeated
32 times.

The retry is a special case for umount; it is only supposed to cover the
case where the umount command was _successful_, but there was still some
remaining mount(s) underneath.  Fix it by making sure to test the first
condition :).

Re-tested with and without a preceding `mount --bind /mnt /tmp`,
and using `findmnt` to check the end result.
2018-01-13 17:22:46 +00:00
Alan Jenkins 5804e1b6ff core: fix output (logging) for mount units (#7603)
Documentation - systemd.exec - strongly implies mount units get logging.

It is safe for mounts to depend on systemd-journald.socket.  There is no
cyclic dependency generated.  This is because the root, -.mount, was
already deliberately set to EXEC_OUTPUT_NULL.  See comment in
mount_load_root_mount().  And /run is excluded from being a mount unit.

Nor does systemd-journald depend on /var.  It starts earlier, initially
logging to /run.

Tested before/after using `systemctl stop tmp.mount`.
2018-01-13 13:03:13 +00:00
rkolchmeyer 65d36b4950 core: Fix edge case when processing /proc/self/mountinfo (#7811)
Currently, if there are two /proc/self/mountinfo entries with the same
mount point path, the mount setup flags computed for the second of
these two entries will overwrite the mount setup flags computed for
the first of these two entries. This is the root cause of issue #7798.
This patch changes mount_setup_existing_unit to prevent the
just_mounted mount setup flag from being overwritten if it is set to
true. This will allow all mount units created from /proc/self/mountinfo
entries to be initialized properly.

Fixes: #7798
2018-01-05 19:28:23 +01:00
Lennart Poettering a4634b214c core: warn about left-over processes in cgroup on unit start
Now that we don't kill control processes anymore, let's at least warn
about any processes left-over in the unit cgroup at the moment of
starting the unit.
2017-11-25 17:08:21 +01:00
Lennart Poettering 3c7416b6ca core: unify common code for preparing for forking off unit processes
This introduces a new function unit_prepare_exec() that encapsulates a
number of calls we do in preparation for spawning off some processes in
all our unit types that do so.

This allows us to neatly unify a bit of code between unit types and
shorten our code.
2017-11-21 11:54:08 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering d3070fbdf6 core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald
And let's make use of it to implement two new unit settings with it:

1. LogLevelMax= is a new per-unit setting that may be used to configure
   log priority filtering: set it to LogLevelMax=notice and only
   messages of level "notice" and lower (i.e. more important) will be
   processed, all others are dropped.

2. LogExtraFields= is a new per-unit setting for configuring per-unit
   journal fields, that are implicitly included in every log record
   generated by the unit's processes. It takes field/value pairs in the
   form of FOO=BAR.

Also, related to this, one exisiting unit setting is ported to this new
facility:

3. The invocation ID is now pulled from /run/systemd/units/ instead of
   cgroupfs xattrs. This substantially relaxes requirements of systemd
   on the kernel version and the privileges it runs with (specifically,
   cgroupfs xattrs are not available in containers, since they are
   stored in kernel memory, and hence are unsafe to permit to lesser
   privileged code).

/run/systemd/units/ is a new directory, which contains a number of files
and symlinks encoding the above information. PID 1 creates and manages
these files, and journald reads them from there.

Note that this is supposed to be a direct path between PID 1 and the
journal only, due to the special runtime environment the journal runs
in. Normally, today we shouldn't introduce new interfaces that (mis-)use
a file system as IPC framework, and instead just an IPC system, but this
is very hard to do between the journal and PID 1, as long as the IPC
system is a subject PID 1 manages, and itself a client to the journal.

This patch cleans up a couple of types used in journal code:
specifically we switch to size_t for a couple of memory-sizing values,
as size_t is the right choice for everything that is memory.

Fixes: #4089
Fixes: #3041
Fixes: #4441
2017-11-16 12:40:17 +01:00
Yu Watanabe 74b1731c75 core/mount: fstype may be NULL 2017-11-12 14:27:25 +01:00
Lennart Poettering 3e3852b3c6 core: make "tmpfs" dependencies on swapfs a "default" dep, not an "implicit"
There should be a way to turn this logic of, and DefaultDependencies=
appears to be the right option for that, hence let's downgrade this
dependency type from "implicit" to "default, and thus honour
DefaultDependencies=.

This also drops mount_get_fstype() as we only have a single user needing
this now.

A follow-up for #7076.
2017-11-10 19:52:41 +01:00
Lennart Poettering eef85c4a3f core: track why unit dependencies came to be
This replaces the dependencies Set* objects by Hashmap* objects, where
the key is the depending Unit, and the value is a bitmask encoding why
the specific dependency was created.

The bitmask contains a number of different, defined bits, that indicate
why dependencies exist, for example whether they are created due to
explicitly configured deps in files, by udev rules or implicitly.

Note that memory usage is not increased by this change, even though we
store more information, as we manage to encode the bit mask inside the
value pointer each Hashmap entry contains.

Why this all? When we know how a dependency came to be, we can update
dependencies correctly when a configuration source changes but others
are left unaltered. Specifically:

1. We can fix UDEV_WANTS dependency generation: so far we kept adding
   dependencies configured that way, but if a device lost such a
   dependency we couldn't them again as there was no scheme for removing
   of dependencies in place.

2. We can implement "pin-pointed" reload of unit files. If we know what
   dependencies were created as result of configuration in a unit file,
   then we know what to flush out when we want to reload it.

3. It's useful for debugging: "systemd-analyze dump" now shows
   this information, helping substantially with understanding how
   systemd's dependency tree came to be the way it came to be.
2017-11-10 19:45:29 +01:00
Alan Jenkins 79aafbd122 core: distinguish "Killing"/"Terminating"/"Stopping" for mount unit timeout
Update the timeout warnings for remount and unmount.  For consistency with
mount, for accuracy, and for consistency with their equivalents in
service.c.
2017-11-01 15:28:50 +00:00
Michal Sekletar fab35afabf mount: make sure we unmount tmpfs mounts before we deactivate swaps (#7076)
In the past we introduced this property just for tmp.mount. However on
todays systems usually there are many more tmpfs mounts. Most notably
mounts backing XDG_RUNTIME_DIR for each user.

Let's generalize what we already have for tmp.mount and implement the
ordering After=swap.target for all tmpfs based mounts.
2017-10-16 16:15:05 +02:00
Lennart Poettering ed77d407d3 core: log unit failure with type-specific result code
This slightly changes how we log about failures. Previously,
service_enter_dead() would log that a service unit failed along with its
result code, and unit_notify() would do this again but without the
result code. For other unit types only the latter would take effect.

This cleans this up: we keep the message in unit_notify() only for debug
purposes, and add type-specific log lines to all our unit types that can
fail, and always place them before unit_notify() is invoked.

Or in other words: the duplicate log message for service units is
removed, and all other unit types get a more useful line with the
precise result code.
2017-09-27 18:26:18 +02:00
Lennart Poettering c634f3d2fc mount: rename mount_state_active() → MOUNT_STATE_WITH_PROCESS()
The function returns true for all states that have a control process
running, and each time we call it that's what we want to know, hence
let's rename it accordingly. Moreover, the more generic unit states have
an ACTIVE state, and it is defined quite differently from the set of
states this function returns true for, hence let's avoid confusion and
not reuse the word "ACTIVE" here in a different context.

Finally, let's uppercase this, since in most ways it's pretty much
identical to a macro
2017-09-26 16:17:22 +02:00
Lennart Poettering 22af0e5873 mount: rework mount state engine
This changes the mount unit state engine in the following ways:

1. The MOUNT_MOUNTING_SIGTERM and MOUNT_MOUNTING_SIGKILL are removed.
   They have been pretty much equivalent to MOUNT_UNMOUNTING_SIGTERM and
   MOUNT_UNMOUNTING_SIGKILL in what they do, and the outcome has been
   the same as well: the unit is stopped. Hence, let's simplify things a
   bit, and merge them. Note that we keep
   MOUNT_REMOUNTING_{SIGTERM|SIGKILL} however, as those states have a
   different outcome: the unit remains started.

2. mount_enter_signal() will now honour the SendSIGKILL= option of the
   mount unit if it was set. This was previously done already when we
   entered the signal states through a timeout, and was simply missing
   here.

3. A new helper function mount_enter_dead_or_mounted() is added that
   places the mount unit in either MOUNT_DEAD or MOUNT_MOUNTED,
   depending on what the kernel thinks about the mount's state. This
   function is called at various places now, wherever we finished an
   operation, and want to make sure our own state reflects again what
   the kernel thinks. Previously we had very similar code in a number of
   places and in other places didn't recheck the kernel state. Let's do
   that with the same logic and function at all relevant places now.

4. Rework mount_stop(): never forget about running control processes.
   Instead: when we have a start (i.e. a /bin/mount) process running,
   and are asked to stop, then enter the kill states for it, so that it
   gets cleaned up. This fixes #6048. Moreover, when we have a reload
   process running convert the possible states into the relevant
   unmounting states, so that we can properly execute the requested
   operation.

Fixes #6048
2017-09-26 16:17:22 +02:00
Lennart Poettering 850b741084 mount: clean up reload_result management a bit
Let's only collect the first failure in the load result, and let's clear
it explicitly when we are about to enter a new reload operation. This
makes it more alike the handling of the main result value (which also
only stores the first failure), and also the handling of service.c's
reload state.
2017-09-26 16:17:22 +02:00
Daniel Mack 906c06f64a cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
Lennart Poettering 18f573aaf9 core: make sure to dump cgroup context when unit_dump() is called for all unit types
For some reason we didn't dump the cgroup context for a number of unit
types, including service units. Not sure how this wasn't noticed
before... Add this in.
2017-09-22 15:24:54 +02:00
Lennart Poettering 1703fa41a7 core: rename EXEC_APPLY_PERMISSIONS → EXEC_APPLY_SANDBOXING
"Permissions" was a bit of a misnomer, as it suggests that UNIX file
permission bits are adjusted, which aren't really changed here. Instead,
this is about UNIX credentials such as users or groups, as well as
namespacing, hence let's use a more generic term here, without any
misleading reference to UNIX file permissions: "sandboxing", which shall
refer to all kinds of sandboxing technologies, including UID/GID
dropping, selinux relabelling, namespacing, seccomp, and so on.
2017-08-10 15:02:50 +02:00
Lennart Poettering f0d477979e core: introduce unit_set_exec_params()
The new unit_set_exec_params() call is to units what
manager_set_exec_params() is to the manager object: it initializes the
various fields from the relevant generic properties set.
2017-08-10 15:02:50 +02:00
Lennart Poettering 19bbdd985e core: manager_set_exec_params() cannot fail, hence make it void
Let's simplify things a bit.
2017-08-10 15:02:50 +02:00
Lennart Poettering 584b8688d1 execute: also fold the cgroup delegate bit into ExecFlags 2017-08-10 15:02:50 +02:00
Abdó Roig-Maranges 1df96fcb31 core: Do not fail perpetual mount units without fragment (#6459)
mount_load does not require fragment files to be present in order to
load mount units which are perpetual, or come from /proc/self/mountinfo.

mount_verify should do the same, otherwise a synthesized '-.mount' would
be marked as failed with "No such file or directory", as it is perpetual
but not marked to come from /proc/self/mountinfo at this point.

This happens for the user instance, and I suspect it was the cause of #5375
for the system instance, without gpt-generator.
2017-07-31 12:32:09 +02:00
Yu Watanabe 3536f49e8f core: add {State,Cache,Log,Configuration}Directory= (#6384)
This introduces {State,Cache,Log,Configuration}Directory= those are
similar to RuntimeDirectory=. They create the directories under
/var/lib, /var/cache/, /var/log, or /etc, respectively, with the mode
specified in {State,Cache,Log,Configuration}DirectoryMode=.

This also fixes #6391.
2017-07-18 14:34:52 +02:00
NeilBrown 83897d5470 core/mount: pass "-c" flag to /bin/umount (#6093)
"-c", which is short for "--no-canonicalize", tells /bin/umount
that the path name is canonical (no .. or symlinks etc).

systemd always uses a canonical name, so this flag is appropriate
for systemd to use.
Knowing that the path is canonical allows umount to avoid
some calls to lstat() on the path.

From v2.30 "-c" goes further and causes umount to avoid all
attempts to 'lstat()' (or similar) the path.  This is important
when automatically unmounting a filesystem, as lstat() can
hang indefinitely in some cases such as when an NFS server
is not accessible.

"-c" has been supported since util-linux 2.17 which is before the
earliest version supported by systemd.
So "-c" is safe to use now, and once util-linux v2.30 is in use,
it will allow mounts from non-responsive NFS servers to be
unmounted.
2017-06-07 15:28:23 +03:00
Zbigniew Jędrzejewski-Szmek ce954c0319 core/mount: remove repeated word 2017-02-02 11:18:34 -05:00
Yu Watanabe cfcd431890 core: add missing unit_add_to_load_queue() to mount_setup_new_unit()
unit_add_to_load_queue was present in the code before 03b8cfede9,
and was inadvertently dropped.

Fixes #5105
2017-01-23 14:06:43 +09:00
Yu Watanabe a51ee72d2e core: minor error handling fix in mount_setup_new_unit()
The function mount_setup_new_unit() should return -ENOMEM
if at least one of `strdup` calls are failed.
2017-01-23 13:59:21 +09:00
Franck Bui 03b8cfede9 core: make sure to init mount params before calling mount_is_extrinsic() (#5087)
When a new entry appears in /proc/self/mountinfo, mount_setup_unit()
allocated a new mount unit for it and starts initializing it.

mount_setup_unit() is also used to update a mount unit when a change happens in
/proc/self/mountinfo, for example a mountpoint can be remounted with additional
mount options.

This patch introduces 2 separate functions to deal with those 2 cases instead
of mount_setup_unit() dealing with both of them. The common code is small and
doing the split makes the code easier to read and less error prone if extended
later.

It also makes sure to initialize in both functions the mount parameters of the
mount unit before calling mount_is_extrinsic() since this function relies on
them.

Fixes: #4902
2017-01-16 15:19:13 -05:00
Franck Bui ebc8968bc0 core: make mount units from /proc/self/mountinfo possibly bind to a device (#4515)
Since commit 9d06297, mount units from mountinfo are not bound to their devices
anymore (they use the "Requires" dependency instead).

This has the following drawback: if a media is mounted and the eject button is
pressed then the media is unconditionally ejected leaving some inconsistent
states.

Since udev is the component that is reacting (no matter if the device is used
or not) to the eject button, users expect that udev at least try to unmount the
media properly.

This patch introduces a new property "SYSTEMD_MOUNT_DEVICE_BOUND". When set on
a block device, all units that requires this device will see their "Requires"
dependency upgraded to a "BindTo" one. This is currently only used by cdrom
devices.

This patch also gives the possibility to the user to restore the previous
behavior that is bind a mount unit to a device. This is achieved by passing the
"x-systemd.device-bound" option to mount(8). Please note that currently this is
not working because libmount treats the x-* options has comments therefore
they're not available in utab for later application retrievals.
2016-12-16 17:13:58 +01:00
Lennart Poettering ad2706db7c core: rework logic to determine when we decide to add automatic deps for mounts
This adds a concept of "extrinsic" mounts. If mounts are extrinsic we consider
them managed by something else and do not add automatic ordering against
umount.target, local-fs.target, remote-fs.target.

Extrinsic mounts are considered:

- All mounts if we are running in --user mode

- API mounts such as everything below /proc, /sys, /dev, which exist from
  earliest boot to latest shutdown.

- All mounts marked as initrd mounts, if we run on the host

- The initrd's private directory /run/initrams that should survive until last
  reboot.

This primarily merges a couple of different exclusion lists into a single
concept.
2016-12-14 10:13:52 +01:00
Lennart Poettering c9d5c9c0e1 core: make unit_free() accept NULL pointers
We generally try to make our destructors robust regarding NULL pointers, much
in the same way as glibc's free(). Do this also for unit_free().

Follow-up for #4748.
2016-12-01 00:25:51 +01:00
Franck Bui 7d5ceb6416 core: allow to redirect confirmation messages to a different console
It's rather hard to parse the confirmation messages (enabled with
systemd.confirm_spawn=true) amongst the status messages and the kernel
ones (if enabled).

This patch gives the possibility to the user to redirect the confirmation
message to a different virtual console, either by giving its name or its path,
so those messages are separated from the other ones and easier to read.
2016-11-17 18:16:16 +01:00
Zbigniew Jędrzejewski-Szmek f97b34a629 Rename formats-util.h to format-util.h
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
2016-11-07 10:15:08 -05:00
Lennart Poettering 1201cae704 core: change mount_synthesize_root() return to int
Let's propagate the error here, instead of eating it up early.

In a later change we should probably also change mount_enumerate() to propagate
errors up, but that would mean we'd have to change the unit vtable, and thus
change all unit types, hence is quite an invasive change.
2016-11-02 11:39:49 -06:00
Lennart Poettering a581e45ae8 unit: unify some code with new unit_new_for_name() call 2016-11-02 11:29:59 -06:00
Lennart Poettering 11222d0fe0 core: make the root mount perpetual too
Now that have a proper concept of "perpetual" units, let's make the root mount
one too, since it also cannot go away.
2016-11-02 11:29:59 -06:00
Zbigniew Jędrzejewski-Szmek 3b319885c4 tree-wide: introduce free_and_replace helper
It's a common pattern, so add a helper for it. A macro is necessary
because a function that takes a pointer to a pointer would be type specific,
similarly to cleanup functions. Seems better to use a macro.
2016-10-16 23:35:39 -04:00
Zbigniew Jędrzejewski-Szmek b744e8937c Merge pull request #4067 from poettering/invocation-id
Add an "invocation ID" concept to the service manager
2016-10-11 13:40:50 -04:00
Lennart Poettering 052364d41f core: simplify if branches a bit
We do the same thing in two branches, let's merge them. Let's also add an
explanatory comment, while we are at it.
2016-10-10 22:57:02 +02:00
Lennart Poettering f2aed3070d core: make use of IN_SET() in various places in mount.c 2016-10-10 22:57:02 +02:00
Lennart Poettering 1f0958f640 core: when determining whether a process exit status is clean, consider whether it is a command or a daemon
SIGTERM should be considered a clean exit code for daemons (i.e. long-running
processes, as a daemon without SIGTERM handler may be shut down without issues
via SIGTERM still) while it should not be considered a clean exit code for
commands (i.e. short-running processes).

Let's add two different clean checking modes for this, and use the right one at
the appropriate places.

Fixes: #4275
2016-10-10 22:57:01 +02:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Barron Rulon 49915de245 mount: add SloppyOptions= to mount_dump() 2016-08-27 10:47:46 -04:00
Barron Rulon 4f8d40a9dc mount: add new ForceUnmount= setting for mount units, mapping to umount(8)'s "-f" switch 2016-08-27 10:46:52 -04:00
brulon e520950a03 mount: add new LazyUnmount= setting for mount units, mapping to umount(8)'s "-l" switch (#3827) 2016-08-26 17:57:22 +02:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00