Systemd/src/core
Lennart Poettering 6c47cd7d3b execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage=
Let's clean up the interaction of StateDirectory= (and friends) with
DynamicUser=1: instead of creating these directories directly below
/var/lib, place them in /var/lib/private instead if DynamicUser=1 is
set, making that directory 0700 and owned by root:root. This way, if a
dynamic UID is later reused, access to the old run's state directory is
prohibited for that user. Then, use file system namespacing inside the
service to make /var/lib/private a readable tmpfs, hiding all state
directories that are not listed in StateDirectory=, and making access to
the actual state directory possible. Mount all directories listed in
StateDirectory= to the same places inside the service (which means
they'll now be mounted into the tmpfs instance). Finally, add a symlink
from the state directory name in /var/lib/ to the one in
/var/lib/private, so that both the host and the service can access the
path under the same location.
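
The path choice described above can be sketched in C. This is only an illustration, not the code from execute.c: the helper name state_dir_real_path() and its signature are hypothetical, and it covers only the /var/lib prefix used by StateDirectory=.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not systemd's implementation): compute where a
 * StateDirectory= entry is physically created on the host. With
 * DynamicUser=1 the directory lives below /var/lib/private, otherwise
 * directly below /var/lib. Returns 0 on success, -1 on error. */
static int state_dir_real_path(const char *name, bool dynamic_user,
                               char *out, size_t size) {
        int n;

        assert(name);
        assert(out);

        if (strlen(name) == 0)
                return -1;

        n = snprintf(out, size,
                     dynamic_user ? "/var/lib/private/%s" : "/var/lib/%s",
                     name);

        /* Treat truncation as an error rather than returning a partial path */
        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For DynamicUser=1, /var/lib/<name> would additionally become a symlink to the returned location, so both path spellings resolve to the same directory.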

Here's an example: let's say a service runs with StateDirectory=foo.
When DynamicUser=0 is set, it will get the following setup, and no
difference between what the unit and what the host sees:

        /var/lib/foo (created as directory)

Now, if DynamicUser=1 is set, we'll instead get this on the host:

        /var/lib/private (created as directory with mode 0700, root:root)
        /var/lib/private/foo (created as directory)
        /var/lib/foo → private/foo (created as symlink)

And from inside the unit:

        /var/lib/private (a tmpfs mount with mode 0755, root:root)
        /var/lib/private/foo (bind mounted from the host)
        /var/lib/foo → private/foo (the same symlink as above)
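
The host-side layout for DynamicUser=1 can be reproduced with a few system calls. The following is a minimal sketch under stated assumptions, not the actual implementation: setup_dynamic_state_dir() and mkdir_ok() are hypothetical names, the lib_root parameter stands in for /var/lib so the sketch can be tried in a scratch directory, and the root:root ownership and the tmpfs/bind mounts seen from inside the unit are left out because they need privileges.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* mkdir() that tolerates an already existing path */
static int mkdir_ok(const char *path, mode_t mode) {
        if (mkdir(path, mode) < 0 && errno != EEXIST)
                return -1;
        return 0;
}

/* Hypothetical sketch: create <lib_root>/private (mode 0700),
 * <lib_root>/private/<name>, and the symlink
 * <lib_root>/<name> -> private/<name>. Returns 0 on success, -1 on error. */
static int setup_dynamic_state_dir(const char *lib_root, const char *name) {
        char private_dir[PATH_MAX], real_dir[PATH_MAX];
        char link_path[PATH_MAX], target[PATH_MAX];
        int n;

        assert(lib_root);
        assert(name);

        n = snprintf(private_dir, sizeof private_dir, "%s/private", lib_root);
        if (n < 0 || (size_t) n >= sizeof private_dir)
                return -1;
        n = snprintf(real_dir, sizeof real_dir, "%s/private/%s", lib_root, name);
        if (n < 0 || (size_t) n >= sizeof real_dir)
                return -1;
        n = snprintf(link_path, sizeof link_path, "%s/%s", lib_root, name);
        if (n < 0 || (size_t) n >= sizeof link_path)
                return -1;
        n = snprintf(target, sizeof target, "private/%s", name);
        if (n < 0 || (size_t) n >= sizeof target)
                return -1;

        /* The access barrier: only the owner may traverse this directory */
        if (mkdir_ok(private_dir, 0700) < 0)
                return -1;
        /* Enforce the mode even if the directory existed or umask interfered */
        if (chmod(private_dir, 0700) < 0)
                return -1;

        /* The actual state directory, hidden behind the barrier */
        if (mkdir_ok(real_dir, 0755) < 0)
                return -1;

        /* Relative symlink, so it resolves the same way inside and outside */
        if (symlink(target, link_path) < 0 && errno != EEXIST)
                return -1;

        return 0;
}
```

The symlink is relative on purpose: because it points to private/foo rather than an absolute path, it keeps working when the same tree is viewed from inside the unit's mount namespace.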

This takes inspiration from how container trees are protected below
/var/lib/machines: they generally reuse UIDs/GIDs of the host, but
because /var/lib/machines itself is set to 0700 host users cannot access
files in the container tree even if the UIDs/GIDs are reused. However,
for this commit we add one further trick: inside and outside of the unit
/var/lib/private is a different thing: outside it is a plain,
inaccessible directory, and inside it is a world-readable tmpfs mount
with only the whitelisted subdirs below it, bind mounted in. This
means, from the outside the dir acts as an access barrier, but from the
inside it does not. And the symlink created in /var/lib/foo itself
points across the barrier in both cases, so that root and the unit's
user always have access to these dirs without knowing the details of
this mounting magic.

This logic resolves a major shortcoming of DynamicUser=1 units:
previously they couldn't safely store persistent data. With this change
they can have their own private state, log and data directories, which
they can write to, but which are protected from UID recycling.

With this change, if RootDirectory= or RootImage= are used it is ensured
that the specified state/log/cache directories are always mounted in
from the host. This change of semantics I think is much preferable since
this means the root directory/image logic can be used easily for
read-only resource bundling (as all writable data resides outside of the
image). Note that this is a change of behaviour, but given that we
haven't released any systemd version with StateDirectory= and friends
implemented this should be a safe change to make (in particular as
previously it wasn't clear what would actually happen when used in
combination). Moreover, by making this change we can later add a "+"
modifier to these settings too, working similarly to the same modifier in
ReadOnlyPaths= and friends, making specified paths relative to the
container itself.
2017-10-02 17:41:44 +02:00
..
audit-fd.c tree-wide: use !IN_SET(..) for a != b && a != c && … 2017-10-02 13:09:56 +02:00
audit-fd.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
automount.c tree-wide: use !IN_SET(..) for a != b && a != c && … 2017-10-02 13:09:56 +02:00
automount.h automount: rework propagation between automount and mount units 2016-05-02 16:51:45 +02:00
bpf-firewall.c bpf: set BPF_F_ALLOW_OVERRIDE when attaching a cgroup program if Delegate=yes is set 2017-09-22 15:28:05 +02:00
bpf-firewall.h Add firewall eBPF compiler 2017-09-22 15:24:55 +02:00
cgroup.c cgroup: IN_SET() FTW! 2017-09-27 18:26:18 +02:00
cgroup.h cgroup: after determining that a cgroup is empty, asynchronously dispatch this 2017-09-27 18:26:18 +02:00
chown-recursive.c core: chown() StateDirectory= and friends recursively when starting a service 2017-10-02 17:41:43 +02:00
chown-recursive.h core: chown() StateDirectory= and friends recursively when starting a service 2017-10-02 17:41:43 +02:00
dbus-automount.c automount: implement transient automounts 2016-06-06 07:33:54 +02:00
dbus-automount.h automount: implement transient automounts 2016-06-06 07:33:54 +02:00
dbus-cgroup.c core: improve dbus-cgroup error message 2017-09-26 23:49:40 +02:00
dbus-cgroup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-device.c tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-device.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-execute.c core: usually our enum's _INVALID and _MAX special values are named after the full type 2017-10-02 17:41:43 +02:00
dbus-execute.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-job.c core/dbus-job, systemctl: shorten some code 2016-11-16 21:01:11 -05:00
dbus-job.h core: add bus calls for determining jobs waiting for other jobs 2016-11-16 17:01:46 +01:00
dbus-kill.c tree-wide: remove newlines from unit_write_drop_in 2016-05-28 16:29:42 -04:00
dbus-kill.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-manager.c tree-wide: use pid_is_valid() at more places 2017-08-31 15:45:04 +02:00
dbus-manager.h manager: refuse reloading/reexecing when /run is overly full 2017-02-06 16:58:06 +01:00
dbus-mount.c mount: add new ForceUnmount= setting for mount units, mapping to umount(8)'s "-f" switch 2016-08-27 10:46:52 -04:00
dbus-mount.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-path.c tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-path.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-scope.c core: make sure RequestStop signal is send directed 2016-07-20 14:35:15 +02:00
dbus-scope.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-service.c execute: replace command flag bools by a flags field 2017-08-10 14:44:58 +02:00
dbus-service.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-slice.c tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-slice.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-socket.c core: add RemoveIPC= setting 2016-08-19 00:37:25 +02:00
dbus-socket.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-swap.c core: add RemoveIPC= setting 2016-08-19 00:37:25 +02:00
dbus-swap.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-target.c tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-target.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-timer.c timer: convert property_get_next_elapse_monotonic() to use usec_shift_clock() 2017-06-22 20:59:48 +02:00
dbus-timer.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
dbus-unit.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
dbus-unit.h core: add Ref()/Unref() bus calls for units 2016-08-22 16:14:21 +02:00
dbus.c core: downgrade log message about inability to propagate cgroup release message 2017-09-08 17:24:57 +02:00
dbus.h core: add Ref()/Unref() bus calls for units 2016-08-22 16:14:21 +02:00
device.c Revert "device : reload when udev generates a "changed" event" (#6836) 2017-09-15 09:21:49 +02:00
device.h core: make mount units from /proc/self/mountinfo possibly bind to a device (#4515) 2016-12-16 17:13:58 +01:00
dynamic-user.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
dynamic-user.h core: add a concept of "dynamic" user ids, that are allocated as long as a service is running 2016-07-22 15:53:45 +02:00
emergency-action.c failure-action: generalize failure action to emergency action 2016-10-21 15:13:50 +02:00
emergency-action.h failure-action: generalize failure action to emergency action 2016-10-21 15:13:50 +02:00
execute.c execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage= 2017-10-02 17:41:44 +02:00
execute.h core: usually our enum's _INVALID and _MAX special values are named after the full type 2017-10-02 17:41:43 +02:00
hostname-setup.c core: when booting up, initialize hostname to compile-time fallback hostname 2017-02-17 10:19:26 +01:00
hostname-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
ima-setup.c ima: Ensure policy exists before asking the kernel to load it (#5777) 2017-04-21 10:53:40 +02:00
ima-setup.h tree-wide: use mdash instead of a two minuses 2016-04-21 23:00:13 -04:00
ip-address-access.c core: warn loudly if IP firewalling is configured but not in effect 2017-09-22 15:24:55 +02:00
ip-address-access.h ip-address-access: minimize IP address lists 2017-09-22 15:24:55 +02:00
job.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
job.h job: Ensure JobRunningTimeoutSec= survives serialization (#6128) 2017-06-18 11:51:17 -04:00
kill.c tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
kill.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
killall.c tree-wide: use !IN_SET(..) for a != b && a != c && … 2017-10-02 13:09:56 +02:00
killall.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
kmod-setup.c kmod-setup: define has_virtio_rng() only in a case when HAVE_KMOD is enabled (#6784) 2017-09-08 23:51:04 +02:00
kmod-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
load-dropin.c load-dropin: propagate errors properly from unit_name_compatible() (#6002) 2017-05-22 09:18:00 -04:00
load-dropin.h core: when loading .wants and .requires, follow the same logic as .d conf dropins 2017-02-07 21:31:22 -05:00
load-fragment-gperf-nulstr.awk build-sys: fix the script used for generating load_fragment_gperf_nulstr (#6646) 2017-08-26 23:21:23 +09:00
load-fragment-gperf.gperf.m4 cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
load-fragment.c core: add new per-unit setting KeyringMode= for controlling kernel keyring setup 2017-09-15 16:53:35 +02:00
load-fragment.h core: add new per-unit setting KeyringMode= for controlling kernel keyring setup 2017-09-15 16:53:35 +02:00
locale-setup.c tree-wide: drop NULL sentinel from strjoin 2016-10-23 11:43:27 -04:00
locale-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
loopback-setup.c core/loopback-setup: make log messages nicer 2017-07-02 12:03:56 -04:00
loopback-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
machine-id-setup.c core: machine_id_setup overwrites broken machine-id 2016-12-13 12:58:42 +00:00
machine-id-setup.h core: rework machine-id-setup.c to use the calls from id128-util.[ch] 2016-07-22 12:59:36 +02:00
macros.systemd.in Add rpm macro %_userpresetdir (#6571) 2017-08-09 09:45:02 +02:00
main.c main: bump RLIMIT_NOFILE for the root user substantially 2017-09-22 15:28:05 +02:00
manager.c core: usually our enum's _INVALID and _MAX special values are named after the full type 2017-10-02 17:41:43 +02:00
manager.h core: usually our enum's _INVALID and _MAX special values are named after the full type 2017-10-02 17:41:43 +02:00
meson.build core: chown() StateDirectory= and friends recursively when starting a service 2017-10-02 17:41:43 +02:00
mount-setup.c Revert "mount-setup: mount xenfs filesystem (#6491)" (#6662) 2017-08-28 18:46:01 +02:00
mount-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
mount.c core: log unit failure with type-specific result code 2017-09-27 18:26:18 +02:00
mount.h cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
namespace.c execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage= 2017-10-02 17:41:44 +02:00
namespace.h execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage= 2017-10-02 17:41:44 +02:00
org.freedesktop.systemd1.conf core: add bus calls for determining jobs waiting for other jobs 2016-11-16 17:01:46 +01:00
org.freedesktop.systemd1.policy.in.in
org.freedesktop.systemd1.service
path.c tree-wide: use !IN_SET(..) for a != b && a != c && … 2017-10-02 13:09:56 +02:00
path.h core: move enforcement of the start limit into per-unit-type code again 2016-05-02 13:08:00 +02:00
scope.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
scope.h Merge pull request #3762 from poettering/sigkill-log 2016-07-22 09:18:30 +02:00
selinux-access.c basic/log: fix _printf_ annotation on log_object_internalv 2017-04-20 14:42:43 -04:00
selinux-access.h core,network: Use const qualifiers for block-local variables in macro functions (#4019) 2016-08-23 12:29:30 +03:00
selinux-setup.c tree-wide: get rid of selinux_context_t (#3732) 2016-07-15 18:44:02 +02:00
selinux-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
service.c service: better detect when a Type=notify service cannot become active anymore (#6959) 2017-10-02 16:35:27 +02:00
service.h cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
show-status.c io-util: add new IOVEC_INIT/IOVEC_MAKE macros 2017-09-22 15:28:04 +02:00
show-status.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
shutdown.c tree-wide: make use of getpid_cached() wherever we can 2017-07-20 20:27:24 +02:00
slice.c cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
slice.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
smack-setup.c tree-wide: fput[cs]() → fput[cs]_unlocked() wherever that makes sense (#6396) 2017-07-21 10:35:45 +02:00
smack-setup.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
socket.c tree-wide: use !IN_SET(..) for a != b && a != c && … 2017-10-02 13:09:56 +02:00
socket.h cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
swap.c core: log unit failure with type-specific result code 2017-09-27 18:26:18 +02:00
swap.h cgroup, unit, fragment parser: make use of new firewall functions 2017-09-22 15:24:55 +02:00
system.conf manager: hook up IP accounting defaults 2017-09-22 15:24:55 +02:00
systemd.pc.in build-sys: move systemd.pc from pkgconfiglibdir back into pkgconfigdatadir 2015-04-21 20:35:17 +02:00
target.c core: move checking default_dependencies into target_add_default_dependencies. (#5762) 2017-04-21 11:00:47 +02:00
target.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
timer.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
timer.h timer: convert property_get_next_elapse_monotonic() to use usec_shift_clock() 2017-06-22 20:59:48 +02:00
transaction.c tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
transaction.h core: propagate reload from RELOADING=1 notification (#6550) 2017-08-07 11:27:24 +02:00
triggers.systemd.in rpm triggers: do nothing if systemd is not running (#5065) 2017-01-12 10:16:20 +01:00
umount.c shutdown: don't be fooled when detaching DM devices with BTRFS 2017-09-13 11:59:46 +02:00
umount.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
unit-printf.c core: add {State,Cache,Log,Configuration}Directory= (#6384) 2017-07-18 14:34:52 +02:00
unit-printf.h tree-wide: remove Emacs lines from all files 2016-02-10 13:41:57 +01:00
unit.c core: usually our enum's _INVALID and _MAX special values are named after the full type 2017-10-02 17:41:43 +02:00
unit.h tree-wide: use IN_SET where possible 2017-10-02 13:09:54 +02:00
user.conf core: rename StartLimitInterval= to StartLimitIntervalSec= 2016-04-29 16:27:48 +02:00