Commit graph

3329 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 0c2826c60c core: in --user mode, report READY=1 as soon as basic.target is reached (#7102)
When a user logs in, systemd-pam will wait for the user manager instance to
report readiness. We don't need to wait for all the jobs to finish, it
is enough if the basic startup is done and the user manager is responsive.

systemd --user will now send out a READY=1 notification when either of two
conditions becomes true:
- basic.target/start job is gone,
- the initial transaction is done.

Also fixes #2863.
2017-10-24 14:48:54 +02:00
Zbigniew Jędrzejewski-Szmek c2983a7fdd core/dynamic-user: use gid from pwnam if a static user was found
Fixes #7133.

v2:
- update based on review
2017-10-23 16:09:20 +02:00
Zbigniew Jędrzejewski-Szmek 362d90b7f2 core/dynamic-user: use _cleanup_ in dynamic user locking
This makes the code a bit easier to read.
2017-10-20 13:39:07 +02:00
Lubomir Rintel 19a44dfe45 core: fragments of masked units ought not be considered for NeedDaemonReload (#7060)
The units that are not loaded don't have dropin_paths set. This
currently results in units that have fragments to always have
NeedDaemonReload=true when masked:

  $ find {/usr/lib,/run/user/8086}/systemd/user/meh.service* |xargs ls -ld
  lrwxrwxrwx. 1 lkundrak lkundrak    9 Oct 11 11:19 /run/user/8086/systemd/user/meh.service -> /dev/null
  -rw-rw-r--. 1 root     root       49 Oct 11 10:16 /usr/lib/systemd/user/meh.service
  drwxrwxr-x. 2 root     root     4096 Oct 11 10:50 /usr/lib/systemd/user/meh.service.d
  -rw-rw-r--. 1 root     root      666 Oct 11 10:50 /usr/lib/systemd/user/meh.service.d/override.conf
  $ systemctl --user daemon-reload
  $ busctl --user get-property org.freedesktop.systemd1 \
        /org/freedesktop/systemd1/unit/meh_2eservice \
        org.freedesktop.systemd1.Unit NeedDaemonReload
  b true
2017-10-18 08:38:50 +02:00
Zbigniew Jędrzejewski-Szmek 895265ad7d Merge pull request #7059 from yuwata/dynamic-user-7013
dynamic-user: permit the case static uid and gid are different
2017-10-18 08:37:12 +02:00
Yu Watanabe e2b0cc3415 core: fix invalid error message
The error message corresponds to EILSEQ is "Invalid or incomplete
multibyte or wide character", and is not suitable in this case.
So, let's show a custom error message when the function
dynamic_creds_realize() returns -EILSEQ.
2017-10-18 08:57:59 +09:00
Michal Sekletar fab35afabf mount: make sure we unmount tmpfs mounts before we deactivate swaps (#7076)
In the past we introduced this property just for tmp.mount. However on
todays systems usually there are many more tmpfs mounts. Most notably
mounts backing XDG_RUNTIME_DIR for each user.

Let's generalize what we already have for tmp.mount and implement the
ordering After=swap.target for all tmpfs based mounts.
2017-10-16 16:15:05 +02:00
Yu Watanabe 709dbeac14 core: cleanup for enforce_groups() (#7064)
SupplementaryGroups= is preprocessed in get_supplementary_groups().
So, it is not necessary to input ExecContext to enforce_groups().
2017-10-12 08:10:25 +02:00
Yu Watanabe 9ec655cbbd dynamic-user: permit the case static uid and gid are different
This makes systemd supports the case that DynamicUser=yes and
static user and group exist such that uid and gid of them are different.
We only refuse the operation when user does not exist but the group
with the same name exists.

Fixes #7013.
2017-10-11 14:41:19 +09:00
Yu Watanabe 9da440b1b3 dynamic-user: label functions not necessary to export as static 2017-10-11 12:46:27 +09:00
Yu Watanabe a8cabc612b core: fix segfault in compile_bind_mounts() when BindPaths= or BindReadOnlyPaths= is set
This fixes a bug introduced by 6c47cd7d3b.

Fixes #7055.
2017-10-11 12:28:22 +09:00
Zbigniew Jędrzejewski-Szmek 7081228acd Merge pull request #7045 from poettering/namespace-casing
some super-trivial fixes to namespace.c
2017-10-10 21:50:17 +02:00
Lennart Poettering b74023db06 Merge pull request #7003 from yuwata/enable-dynamic-user
timesyncd, journal-upload: Enable DynamicUser=
2017-10-10 10:05:43 +02:00
Lennart Poettering 0fa5b8312a namespace: make ns_type_supported() a tiny bit shorter
namespace_type_to_string() already validates the type paramater, we can
use that, and shorten the function a bit.
2017-10-10 09:52:08 +02:00
Lennart Poettering bb0ff3fb1b namespace: change NameSpace → Namespace
We generally use the casing "Namespace" for the word, and that's visible
in a number of user-facing interfaces, including "RestrictNamespace=" or
"JoinsNamespaceOf=". Let's make sure to use the same casing internally
too.

As discussed in #7024
2017-10-10 09:51:58 +02:00
Michal Sekletar 6e2d7c4f13 namespace: fall back gracefully when kernel doesn't support network namespaces (#7024) 2017-10-10 09:46:13 +02:00
Zbigniew Jędrzejewski-Szmek 232ac0d681 util-lib: introdude _cleanup_ macros for kmod objects 2017-10-08 22:04:07 +02:00
Yu Watanabe c31ad02403 mkdir: introduce follow_symlink flag to mkdir_safe{,_label}() 2017-10-06 16:03:33 +09:00
Zbigniew Jędrzejewski-Szmek 892a035c2e core: make gc_marker unsigned (#7004)
This matches the definition in unit.h.
2017-10-05 15:04:19 +02:00
Zbigniew Jędrzejewski-Szmek c05f3c8ff8 Merge pull request #6931 from poettering/job-timeout-sec 2017-10-05 14:43:13 +02:00
Lennart Poettering eae51da36e unit: when JobTimeoutSec= is turned off, implicitly turn off JobRunningTimeoutSec= too
We added JobRunningTimeoutSec= late, and Dracut configured only
JobTimeoutSec= to turn of root device timeouts before. With this change
we'll propagate a reset of JobTimeoutSec= into JobRunningTimeoutSec=,
but only if the latter wasn't set explicitly.

This should restore compatibility with older systemd versions.

Fixes: #6402
2017-10-05 13:06:44 +02:00
Lennart Poettering 98e4fcec36 dynamic-user: don't use a UID that currently owns IPC objects (#6962)
This fixes a mostly theoretical potential security hole: if for some
reason we failed to remove IPC objects created for a dynamic user (maybe
because a MAC/SElinux erronously prohibited), then we should not hand
out the same UID again until they are successfully removed.

With this commit we'll enumerate the IPC objects currently existing, and
step away from using a UID for the dynamic UID logic if there are any
matching it.
2017-10-04 21:40:01 +02:00
Zbigniew Jędrzejewski-Szmek 03d4358277 Merge pull request #6975 from sourcejedi/logind_pid_0_v2
Selectively revert "tree-wide: use pid_is_valid() at more places"
2017-10-04 21:33:52 +02:00
Lennart Poettering 4aa1d31c89 Merge pull request #6974 from keszybz/clean-up-defines
Clean up define definitions
2017-10-04 19:25:30 +02:00
Lennart Poettering 5ad90fe376 Merge pull request #6985 from yuwata/empty
load-fragment: do not create empty array
2017-10-04 17:54:35 +02:00
Yu Watanabe 4c70109600 tree-wide: use IN_SET macro (#6977) 2017-10-04 16:01:32 +02:00
Zbigniew Jędrzejewski-Szmek f9fa32f09c build-sys: s/HAVE_SMACK/ENABLE_SMACK/
Same justification as for HAVE_UTMP.
2017-10-04 12:09:50 +02:00
Zbigniew Jędrzejewski-Szmek 392fd235fd build-sys: s/HAVE_IMA/ENABLE_IMA/
Same justification as for HAVE_UTMP.
2017-10-04 12:09:50 +02:00
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that is the name is mispellt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
Zbigniew Jędrzejewski-Szmek ac6e8be66e core: use strv_isempty to check if supplementary_groups is empty
With the previous commit, we know that it will be NULL if empty, but
it's safe to always use strv_isempty() in case the code changes
in the future.
2017-10-04 11:33:30 +02:00
Yu Watanabe 9f2d41a65f load-fragment: do not create empty array
Originally added in 4589f5bb0a.
C.f. 8249bb728d and #6746.
2017-10-04 11:23:42 +02:00
Alan Jenkins 07b38ba51e Revert "tree-wide: use pid_is_valid() at more places"
This reverts commit ee043777be.

It broke almost everywhere it touched.  The places that
handn't been converted, were mostly followed by special
handling for the invalid PID `0`.  That explains why they
tested for `pid < 0` instead of `pid <= 0`.

I think that one was the first commit I reviewed, heh.
2017-10-03 12:43:24 +01:00
Lennart Poettering c621849539 core: fix special directories for user services
The system paths were listed where the user paths should have been
listed. Correct that.
2017-10-02 17:41:44 +02:00
Lennart Poettering 091e9efed3 core: fix StateDirectory= (and friends) safety checks when decoding transient unit properties
Let's make sure relative directories such as "foo/bar" are accepted, by
using the same validation checks as in unit file parsing.
2017-10-02 17:41:44 +02:00
Lennart Poettering e53c42ca0a core: pass the correct error to the caller 2017-10-02 17:41:44 +02:00
Lennart Poettering da50b85af7 core: when looking for a UID to use for a dynamic UID start with the current owner of the StateDirectory= and friends
Let's optimize dynamic UID allocation a bit: if a StateDirectory= (or
suchlike) is configured, we start our allocation loop from that UID and
use it if it currently isn't used otherwise. This is beneficial as it
saves us from having to expensively recursively chown() these
directories in the typical case (which StateDirectory= does when it
notices that the owner of the directory doesn't match the UID picked).

With this in place we now have the a three-phase logic for allocating a
dynamic UID:

a) first, we try to use the owning UID of StateDirectory=,
   CacheDirectory=, LogDirectory= if that exists and is currently
   otherwise unused.

b) if that didn't work out, we hash the UID from the service name

c) if that didn't yield an unused UID either, randomly pick new ones
   until we find a free one.
2017-10-02 17:41:44 +02:00
Lennart Poettering 6c47cd7d3b execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage=
Let's clean up the interaction of StateDirectory= (and friends) to
DynamicUser=1: instead of creating these directories directly below
/var/lib, place them in /var/lib/private instead if DynamicUser=1 is
set, making that directory 0700 and owned by root:root. This way, if a
dynamic UID is later reused, access to the old run's state directory is
prohibited for that user. Then, use file system namespacing inside the
service to make /var/lib/private a readable tmpfs, hiding all state
directories that are not listed in StateDirectory=, and making access to
the actual state directory possible. Mount all directories listed in
StateDirectory= to the same places inside the service (which means
they'll now be mounted into the tmpfs instance). Finally, add a symlink
from the state directory name in /var/lib/ to the one in
/var/lib/private, so that both the host and the service can access the
path under the same location.

Here's an example: let's say a service runs with StateDirectory=foo.
When DynamicUser=0 is set, it will get the following setup, and no
difference between what the unit and what the host sees:

        /var/lib/foo (created as directory)

Now, if DynamicUser=1 is set, we'll instead get this on the host:

        /var/lib/private (created as directory with mode 0700, root:root)
        /var/lib/private/foo (created as directory)
        /var/lib/foo → private/foo (created as symlink)

And from inside the unit:

        /var/lib/private (a tmpfs mount with mode 0755, root:root)
        /var/lib/private/foo (bind mounted from the host)
        /var/lib/foo → private/foo (the same symlink as above)

This takes inspiration from how container trees are protected below
/var/lib/machines: they generally reuse UIDs/GIDs of the host, but
because /var/lib/machines itself is set to 0700 host users cannot access
files in the container tree even if the UIDs/GIDs are reused. However,
for this commit we add one further trick: inside and outside of the unit
/var/lib/private is a different thing: outside it is a plain,
inaccessible directory, and inside it is a world-readable tmpfs mount
with only the whitelisted subdirs below it, bind mounte din.  This
means, from the outside the dir acts as an access barrier, but from the
inside it does not. And the symlink created in /var/lib/foo itself
points across the barrier in both cases, so that root and the unit's
user always have access to these dirs without knowing the details of
this mounting magic.

This logic resolves a major shortcoming of DynamicUser=1 units:
previously they couldn't safely store persistant data. With this change
they can have their own private state, log and data directories, which
they can write to, but which are protected from UID recycling.

With this change, if RootDirectory= or RootImage= are used it is ensured
that the specified state/log/cache directories are always mounted in
from the host. This change of semantics I think is much preferable since
this means the root directory/image logic can be used easily for
read-only resource bundling (as all writable data resides outside of the
image). Note that this is a change of behaviour, but given that we
haven't released any systemd version with StateDirectory= and friends
implemented this should be a safe change to make (in particular as
previously it wasn't clear what would actually happen when used in
combination). Moreover, by making this change we can later add a "+"
modifier to these setings too working similar to the same modifier in
ReadOnlyPaths= and friends, making specified paths relative to the
container itself.
2017-10-02 17:41:44 +02:00
Lennart Poettering a227a4be48 namespace: if we can create the destination of bind and PrivateTmp= mounts
When putting together the namespace, always create the file or directory
we are supposed to bind mount on, the same way we do it for most other
stuff, for example mount units or systemd-nspawn's --bind= option.

This has the big benefit that we can use namespace bind mounts on dirs
in /tmp or /var/tmp even in conjunction with PrivateTmp=.
2017-10-02 17:41:43 +02:00
Lennart Poettering e908468b5b namespace: properly handle bind mounts from the host
Before this patch we had an ordering problem: if we have no namespacing
enabled except for two bind mounts that intend to swap /a and /b via
bind mounts, then we'd execute the bind mount binding /b to /a, followed
by thebind mount from /a to /b, thus having the effect that /b is now
visible in both /a and /b, which was not intended.

With this change, as soon as any bind mount is configured we'll put
together the service mount namespace in a temporary directory instead of
operating directly in the root. This solves the problem in a
straightforward fashion: the source of bind mounts will always refer to
the host, and thus be unaffected from the bind mounts we already
created.
2017-10-02 17:41:43 +02:00
Lennart Poettering 645767d6b5 namespace: create /dev, /proc, /sys when needed
We already create /dev implicitly if PrivateTmp=yes is on, if it is
missing. Do so too for the other two API VFS, as well as for /dev if
PrivateTmp=yes is off but MountAPIVFS=yes is on (i.e. when /dev is bind
mounted from the host).
2017-10-02 17:41:43 +02:00
Lennart Poettering 72fd17682d core: usually our enum's _INVALID and _MAX special values are named after the full type
In most cases we followed the rule that the special _INVALID and _MAX
values we use in our enums use the full type name as prefix (in contrast
to regular values that we often make shorter), do so for
ExecDirectoryType as well.

No functional changes, just a little bit of renaming to make this code
more like the rest.
2017-10-02 17:41:43 +02:00
Lennart Poettering a1164ae380 core: chown() StateDirectory= and friends recursively when starting a service
This is particularly useful when used in conjunction with DynamicUser=1,
where the UID might change for every invocation, but is useful in other
cases too, for example, when these directories are shared between
systems where the UID assignments differ slightly.
2017-10-02 17:41:43 +02:00
Jouke Witteveen df66b93fe2 service: better detect when a Type=notify service cannot become active anymore (#6959)
No need to wait for a timeout when we know things are not going to work out.
When the main process goes away and only notifications from the main process are
accepted, then we will not receive any notifications anymore.
2017-10-02 16:35:27 +02:00
Zbigniew Jędrzejewski-Szmek b139c95bc4 Merge pull request #6941 from andir/use-in_set
use IN_SET where possible
2017-10-02 15:08:10 +02:00
Andreas Rammhold ec2ce0c5d7
tree-wide: use !IN_SET(..) for a != b && a != c && …
The included cocci was used to generate the changes.

Thanks to @flo-wer for pointing this case out.
2017-10-02 13:09:56 +02:00
Andreas Rammhold 3742095b27
tree-wide: use IN_SET where possible
In addition to the changes from #6933 this handles cases that could be
matched with the included cocci file.
2017-10-02 13:09:54 +02:00
Lennart Poettering b13ddbbcf3 service: accept the fact that the three xyz_good() functions return ints
Currently, all three of cgroup_good(), main_pid_good(),
control_pid_good() all return an "int" (two of them propagate errors).
It's a good thing to keep the three functions similar, so let's leave it
at that, but then let's clean up the invocation of the three functions
so that they always clearly acknowledge that the return value is not a
bool, but potentially negative.
2017-10-02 12:58:42 +02:00
Lennart Poettering 019be28676 service: drop _pure_ decorator on static function
The compiler should be good enough to figure this out on its own if this
is a static function, and it makes control_pid_good() an outlier anyway,
and decorators like this tend to bitrot. Hence, to keep things simple
and automatic, let's just drop the decorator.
2017-10-02 12:58:42 +02:00
Lennart Poettering 3c751b1bfa service: a cgroup empty notification isn't reason enough to go down
The processes associated with a service are not just the ones in its
cgroup, but also the control and main processes, which might possibly
live outside of it, for example if they transitioned into their own
cgroups because they registered a PAM session of their own. Hence, if we
get a cgroup empty notification always check if the main PID is still
around before taking action too eagerly.

Fixes: #6045
2017-10-02 12:58:42 +02:00
Lennart Poettering 07697d7ec5 service: add explanatory comments to control_pid_good() and cgroup_good()
Let's add a similar comment to each as we already have for
main_pid_good(), emphasizing that these functions are supposed to be
have very similar.
2017-10-02 12:58:42 +02:00