Commit graph

5796 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 2aed63f427 tree-wide: fix spelling of "fallback"
Similarly to "setup" vs. "set up", "fallback" is a noun, and "fall back"
is the verb. (This is pretty clear when we construct a sentence in the
present continous: "we are falling back" not "we are fallbacking").
2020-08-20 17:45:32 +02:00
Lennart Poettering 5b14956385
Merge pull request #16543 from poettering/nspawn-run-host
nspawn: /run/host/ tweaks
2020-08-20 16:20:05 +02:00
Luca Boccassi 427353f668 core: add mount options support for MountImages
Follow the same model established for RootImage and RootImageOptions,
and allow to either append a single list of options or tuples of
partition_number:options.
2020-08-20 14:45:40 +01:00
Luca Boccassi 9ece644435 core: change RootImageOptions to use names instead of partition numbers
Follow the designations from the Discoverable Partitions Specification
2020-08-20 13:58:02 +01:00
Luca Boccassi bc8d56d305 core: use strv_split_colon_pairs when parsing RootImageOptions 2020-08-20 13:24:32 +01:00
Luca Boccassi c20acbb2bd core: cleanup unused variables
Leftovers from previous implementation of MountImages feature, unused now
2020-08-20 13:24:32 +01:00
Lennart Poettering 476cfe626d core: remove support for ConditionNull=
The concept is flawed, and mostly useless. Let's finally remove it.

It has been deprecated since 90a2ec10f2 (6
years ago) and we started to warn since
55dadc5c57 (1.5 years ago).

Let's get rid of it altogether.
2020-08-20 14:01:25 +02:00
Lennart Poettering 4f55a5b0bf core: add missing conditions/asserts to unit file parsing 2020-08-20 13:56:14 +02:00
Zbigniew Jędrzejewski-Szmek ec673ad4ab
Merge pull request #16559 from benzea/benzea/memory-recursiveprot
mount-setup: Enable memory_recursiveprot for cgroup2
2020-08-20 13:05:07 +02:00
Lennart Poettering 3242980582 core: create per-user inaccessible node from the service manager
Previously, we'd create them from user-runtime-dir@.service. That has
one benefit: since this service runs privileged, we can create the full
set of device nodes. It has one major drawback though: it security-wise
problematic to create files/directories in directories as privileged
user in directories owned by unprivileged users, since they can use
symlinks to redirect what we want to do. As a general rule we hence
avoid this logic: only unpriv code should populate unpriv directories.

Hence, let's move this code to an appropriate place in the service
manager. This means we lose the inaccessible block device node, but
since there's already a fallback in place, this shouldn't be too bad.
2020-08-20 10:18:02 +02:00
Lennart Poettering 9fac502920 nspawn,pid1: pass "inaccessible" nodes from cntr mgr to pid1 payload via /run/host
Let's make /run/host the sole place we pass stuff from host to container
in and place the "inaccessible" nodes in /run/host too.

In contrast to the previous two commits this is a minor compat break, but
not a relevant one I think. Previously the container manager would place
these nodes in /run/systemd/inaccessible/ and that's where PID 1 in the
container would try to add them too when missing. Container manager and
PID 1 in the container would thus manage the same dir together.

With this change the container manager now passes an immutable directory
to the container and leaves /run/systemd entirely untouched, and managed
exclusively by PID 1 inside the container, which is nice to have clear
separation on who manages what.

In order to make sure systemd then usses the /run/host/inaccesible/
nodes this commit changes PID 1 to look for that dir and if it exists
will symlink it to /run/systemd/inaccessible.

Now, this will work fine if new nspawn and new pid 1 in the container
work together. as then the symlink is created and the difference between
the two dirs won't matter.

For the case where an old nspawn invokes a new PID 1: in this case
things work as they always worked: the dir is managed together.

For the case where different container manager invokes a new PID 1: in
this case the nodes aren't typically passed in, and PID 1 in the
container will try to create them and will likely fail partially (though
gracefully) when trying to create char/block device nodes. THis is fine
though as there are fallbacks in place for that case.

For the case where a new nspawn invokes an old PID1: this is were the
(minor) incompatibily happens: in this case new nspawn will place the
nodes in the /run/host/inaccessible/ subdir, but the PID 1 in the
container won't look for them there. Since the nodes are also not
pre-created in /run/systed/inaccessible/ PID 1 will try to create them
there as if a different container manager sets them up. This is of
course not sexy, but is not a total loss, since as mentioned fallbacks
are in place anyway. Hence I think it's OK to accept this minor
incompatibility.
2020-08-20 10:17:52 +02:00
Zbigniew Jędrzejewski-Szmek 2eecdd1d69
Merge pull request #16790 from poettering/core-if-block-merge
core: merge a few if blocks
2020-08-20 10:15:01 +02:00
Lennart Poettering 1f894e682c machine-id-setup: don't use KVM or container manager supplied uuid if in chroot env
Fixes: #16758
2020-08-19 18:23:11 +02:00
Lennart Poettering 4428c49db9 mount-setup: drop pointless zero initialization 2020-08-19 18:11:00 +02:00
Lennart Poettering 3196e42393 core: merge a few if blocks
arg_system == true and getpid() == 1 hold under the very same condition
this early in the main() function (this only changes later when we start
parsing command lines, where arg_system = true is set if users invoke us
in test mode even when getpid() != 1.

Hence, let's simplify things, and merge a couple of if branches and not
pretend they were orthogonal.
2020-08-19 18:06:12 +02:00
Michal Koutný d9ef594454 cgroup: Cleanup function usage
Some masks shouldn't be needed externally, so keep their functions in
the module (others would fit there too but they're used in tests) to
think twice if something would depend on them.

Drop unused function cg_attach_many_everywhere.

Use cgroup_realized instead of cgroup_path when we actually ask for
realized.

This should not cause any functional changes.
2020-08-19 11:41:53 +02:00
Michal Koutný 12b975e065 cgroup: Reduce unit_get_ancestor_disable_mask use
The usage in unit_get_own_mask is redundant, we only need apply
disable_mask at the end befor application, i.e. calculating enable or
target mask.

(IOW, we allow all configurations, but disabling affects effective
controls.)

Modify tests accordingly and add testing of enable mask.

This is intended as cleanup, with no effect but changing unit_dump
output.
2020-08-19 11:41:53 +02:00
Michal Koutný 4c591f3996 cgroup: Introduce family queueing instead of siblings
The unit_add_siblings_to_cgroup_realize_queue does more than mere
siblings queueing, hence define a family of a unit as (immediate)
children of the unit and immediate children of all ancestors.

Working with this abstraction simplifies the queuing calls and it
shouldn't change the functionality.
2020-08-19 11:41:53 +02:00
Michal Koutný f23ba94db3 cgroup: Implicit unit_invalidate_cgroup_members_masks
Merge members mask invalidation into
unit_add_siblings_to_cgroup_realize_queue, this way unit_realize_cgroup
needn't be called with members mask invalidation.

We have to retain the members mask invalidation in unit_load -- although
active units would have cgroups (re)realized (unit_load queues for
realization), the realization would happen with potentially stale mask.
2020-08-19 11:41:53 +02:00
Michal Koutný fb46fca7e0 cgroup: Eager realization in unit_free
unit_free(u) realizes direct parent and invalidates members mask of all
ancestors. This isn't sufficient in v1 controller hierarchies since
siblings of the freed unit may have existed only because of the removed
unit.

We cannot be lazy about the siblings because if parent(u) is also
removed, it'd migrate and rmdir cgroups for siblings(u). However,
realized masks of siblings(u) won't reflect this change.

This was a non-issue earlier, because we weren't removing cgroup
directories properly (effectively matching the stale realized mask),
removal failed because of tasks left by missing migration (see previous
commit).

Therefore, ensure realization of all units necessary to clean up after
the free'd unit.

Fixes: #14149
2020-08-19 11:41:53 +02:00
Michal Koutný 7b63961415 cgroup: Swap cgroup v1 deletion and migration
When we are about to derealize a controller on v1 cgroup, we first
attempt to delete the controller cgroup and migrate afterwards. This
doesn't work in practice because populated cgroup cannot be deleted.
Furthermore, we leave out slices from migration completely, so
(un)setting a control value on them won't realize their controller
cgroup.

Rework actual realization, unit_create_cgroup() becomes
unit_update_cgroup() and make sure that controller hierarchies are
reduced when given controller cgroup ceased to be needed.

Note that with this we introduce slight deviation between v1 and v2 code
-- when a descendant unit turns off a delegated controller, we attempt
to disable it in ancestor slices. On v2 this may fail (kernel enforced,
because of child cgroups using the controller), on v1 we'll migrate
whole subtree and trim the subhierachy. (Previously, we wouldn't take
away delegated controller, however, derealization was broken anyway.)

Fixes: #14149
2020-08-19 11:41:53 +02:00
Benjamin Berg 56f47800d8 mount-setup: Enable memory_recursiveprot for cgroup2
When available, enable memory_recursiveprot. Realistically it always
makes sense to delegate MemoryLow= and MemoryMin= to all children of a
slice/unit.

The kernel option is not enabled by default as it might cause
regressions in some setups. However, it is the better default in
general, and it results in a more flexible and obvious behaviour.

The alternative to using this option would be for user's to also set
DefaultMemoryLow= on slices when assigning MemoryLow=. However, this
makes the effect of MemoryLow= on some children less obvious, as it
could result in a lower protection rather than increasing it.

From the kernel documentation:

  memory_recursiveprot

        Recursively apply memory.min and memory.low protection to
        entire subtrees, without requiring explicit downward
        propagation into leaf cgroups.  This allows protecting entire
        subtrees from one another, while retaining free competition
        within those subtrees.  This should have been the default
        behavior but is a mount-option to avoid regressing setups
        relying on the original semantics (e.g. specifying bogusly
        high 'bypass' protection values at higher tree levels).

This was added in kernel commit 8a931f801340c (mm: memcontrol:
recursive memory.low protection), which became available in 5.7 and was
subsequently fixed in kernel 5.7.7 (mm: memcontrol: handle div0 crash
race condition in memory.low).
2020-08-19 11:17:01 +02:00
Alyssa Ross 556a7bbed6
load-fragment: fix grammar in error messages 2020-08-18 20:56:59 +00:00
Lennart Poettering 3f181262f4 namespace: fix minor memory leak 2020-08-14 15:33:04 +02:00
Lennart Poettering 0a388dfcc5 core,home,machined: generate description fields for all groups we synthesize 2020-08-07 08:39:52 +02:00
Luca Boccassi b3d133148e core: new feature MountImages
Follows the same pattern and features as RootImage, but allows an
arbitrary mount point under / to be specified by the user, and
multiple values - like BindPaths.

Original implementation by @topimiettinen at:
https://github.com/systemd/systemd/pull/14451
Reworked to use dissect's logic instead of bare libmount() calls
and other review comments.
Thanks Topi for the initial work to come up with and implement
this useful feature.
2020-08-05 21:34:55 +01:00
Axel Rasmussen a119185c02 selinux: improve comment about getcon_raw semantics
This code was changed in this pull request:
https://github.com/systemd/systemd/pull/16571

After some discussion and more investigation, we better understand
what's going on. So, update the comment, so things are more clear
to future readers.
2020-08-05 20:20:45 +02:00
Zbigniew Jędrzejewski-Szmek d06bd2e785 Merge pull request #16596 from poettering/event-time-rel
Conflict in src/libsystemd-network/test-ndisc-rs.c fixed manually.
2020-08-04 16:07:03 +02:00
Zbigniew Jędrzejewski-Szmek 94efaa3181 core: reset bus error before reuse
From a report in https://bugzilla.redhat.com/show_bug.cgi?id=1861463:
usb-gadget.target: Failed to load configuration: No such file or directory
usb-gadget.target: Failed to load configuration: No such file or directory
usb-gadget.target: Trying to enqueue job usb-gadget.target/start/fail
usb-gadget.target: Failed to load configuration: No such file or directory
Assertion '!bus_error_is_dirty(e)' failed at src/libsystemd/sd-bus/bus-error.c:239, function bus_error_setfv(). Ignoring.
sys-devices-platform-soc-2100000.bus-2184000.usb-ci_hdrc.0-udc-ci_hdrc.0.device: Failed to enqueue SYSTEMD_WANTS= job, ignoring: Unit usb-gadget.target not found.

I *think* this is the place where the reuse occurs: we call
bus_unit_validate_load_state(unit, e) twice in a row.
2020-08-03 17:54:32 +02:00
Zbigniew Jędrzejewski-Szmek 7e62257219
Merge pull request #16308 from bluca/root_image_options
service: add new RootImageOptions feature
2020-08-03 10:04:36 +02:00
Zbigniew Jędrzejewski-Szmek b67ec8e5b2 pid1: stop limiting size of /dev/shm
The explicit limit is dropped, which means that we return to the kernel default
of 50% of RAM. See 362a55fc14 for a discussion why that is not as much as it
seems. It turns out various applications need more space in /dev/shm and we
would break them by imposing a low limit.

While at it, rename the define and use a single macro for various tmpfs mounts.
We don't really care what the purpose of the given tmpfs is, so it seems
reasonable to use a single macro.

This effectively reverts part of 7d85383edb. Fixes #16617.
2020-07-30 18:48:35 +02:00
Luca Boccassi 18d7370587 service: add new RootImageOptions feature
Allows to specify mount options for RootImage.
In case of multi-partition images, the partition number can be prefixed
followed by colon. Eg:

RootImageOptions=1:ro,dev 2:nosuid nodev

In absence of a partition number, 0 is assumed.
2020-07-29 17:17:32 +01:00
Michal Koutný 30ad3ca086 cgroup: Add root slice to cgroup realization queue
When we're disabling controller on a direct child of root cgroup, we
forgot to add root slice into cgroup realization queue, which prevented
proper disabling of the controller (on unified hierarchy).

The mechanism relying on "bounce from bottom and propagate up" in
unit_create_cgroup doesn't work on unified hierarchy (leaves needn't be
enabled). Drop it as we rely on the ancestors to be queued -- that's now
intentional but was artifact of combining the two patches:

cb5e3bc37d ("cgroup: Don't explicitly check for member in UNIT_BEFORE") v240~78
65f6b6bdcb ("core: fix re-realization of cgroup siblings") v245-rc1~153^2

Fixes: #14917
2020-07-28 15:49:24 +02:00
Michal Koutný a479c21ed2 cgroup: Make realize_queue behave FIFO
The current implementation is LIFO, which is a) confusing b) prevents
some ordered operations on the cgroup tree (e.g. removing children
before parents).

Fix it quickly. Current list implementation turns this from O(1) to O(n)
operation. Rework the lists later.
2020-07-28 15:49:24 +02:00
Lennart Poettering 39cf0351c5 tree-wide: make use of new relative time events in sd-event.h 2020-07-28 11:24:55 +02:00
Axel Rasmussen 199a892218 selinux: handle getcon_raw producing a NULL pointer, despite returning 0
Previously, we assumed that success meant we definitely got a valid
pointer. There is at least one edge case where this is not true (i.e.,
we can get both a 0 return value, and *also* a NULL pointer):
4246bb550d/libselinux/src/procattr.c (L175)

When this case occurrs, if we don't check the pointer we SIGSEGV in
early initialization.
2020-07-24 13:34:27 +09:00
Lennart Poettering 8047ac8fdc core: clean more env vars from env block pid1 receives
We generally clean all env vars we use ourselves to communicate with out
childrens. We forgot some more recent additions however. Let's correct
that.
2020-07-23 18:30:15 +02:00
Lennart Poettering 00b868e857
Merge pull request #16542 from keszybz/make-targets-fail-again
Make targets fail again
2020-07-23 08:37:47 +02:00
Lennart Poettering c3f8a065e9 execute: take ownership of more fields in ExecParameters
Let's simplify things a bit, and take ownership of more fields in
ExecParameters, so that they are automatically freed when the structure
is released.
2020-07-23 08:37:21 +02:00
Zbigniew Jędrzejewski-Szmek 94d1ddbd7c pid1: target units can fail through dependencies
Fixes #16401.

c80a9a33d0 introduced the .can_fail field,
but didn't set it on .targets. Targets can fail through dependencies.
This leaves .slice and .device units as the types that cannot fail.

$ systemctl cat bad.service bad.target bad-fallback.service
[Service]
Type=oneshot
ExecStart=false

[Unit]
OnFailure=bad-fallback.service

[Service]
Type=oneshot
ExecStart=echo Fixing everythign!

$ sudo systemctl start bad.target
systemd[1]: Starting bad.service...
systemd[1]: bad.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: bad.service: Failed with result 'exit-code'.
systemd[1]: Failed to start bad.service.
systemd[1]: Dependency failed for bad.target.
systemd[1]: bad.target: Job bad.target/start failed with result 'dependency'.
systemd[1]: bad.target: Triggering OnFailure= dependencies.
systemd[1]: Starting bad-fallback.service...
echo[46901]: Fixing everythign!
systemd[1]: bad-fallback.service: Succeeded.
systemd[1]: Finished bad-fallback.service.
2020-07-22 17:58:12 +02:00
Zbigniew Jędrzejewski-Szmek 771b52427a core/job: adjust whitespace and comment 2020-07-22 17:58:12 +02:00
Lennart Poettering 58afc4f8e4 core: don't acquire dual timestamp needlessly if we don't need it in .timer handling
Follow-up for: 26698337f3
2020-07-21 17:33:47 +02:00
Yu Watanabe 9e54462cd5
Merge pull request #16482 from poettering/coverity-246
two coverity fixes
2020-07-16 20:23:23 +09:00
Lennart Poettering 3cd4459003 Revert "selinux: cache enforced status and treat retrieve failure as enforced mode"
This reverts commit 257188f80c.
2020-07-16 08:49:35 +02:00
Lennart Poettering f63ef93703 execute: fix if check
Fixes: coverity 1430459
2020-07-16 08:35:18 +02:00
Lennart Poettering 330f899079 load-fragment: downgrade log messages we ignore to LOG_WARNING
We typically don't log above LOG_WARNING about issues we then go on to
ignore. Do so here, too
2020-07-16 14:58:05 +09:00
Lennart Poettering 8d5bb13d78 core: fix invalid assertion
We miscounted here, and would hit an assert once too early.
2020-07-16 09:13:04 +09:00
Filipe Brandenburger 26698337f3 timer: Adjust calendar timers based on monotonic timer instead of realtime
When the RTC time at boot is off in the future by a few days, OnCalendar=
timers will be scheduled based on the time at boot. But if the time has been
adjusted since boot, the timers will end up scheduled way in the future, which
may cause them not to fire as shortly or often as expected.

Update the logic so that the time will be adjusted based on monotonic time.
We do that by calculating the adjusted manager startup realtime from the
monotonic time stored at that time, by comparing that time with the realtime
and monotonic time of the current time.

Added a test case to validate this works as expected. The test case creates a
QEMU virtual machine with the clock 3 days in the future. Then we adjust the
clock back 3 days, and test creating a timer with an OnCalendar= for every 15
minutes. We also check the manager startup timestamp from both `systemd-analyze
dump` and from D-Bus.

Test output without the corresponding code changes that fix the issue:

  Timer elapse outside of the expected 20 minute window.
    next_elapsed=1594686119
    now=1594426921
    time_delta=259198

With the code changes in, the test passes as expected.
2020-07-15 09:23:09 +02:00
Zbigniew Jędrzejewski-Szmek 76830e2500
Merge pull request #16462 from keszybz/rpm-macro-warnings
Emit better errors for rpm macro misuse
2020-07-15 08:56:28 +02:00
Zbigniew Jędrzejewski-Szmek 6cdc429454
Merge pull request #16340 from keszybz/var-tmp-readonly
Create ro private /var/tmp dir when /var/tmp is read-only
2020-07-14 19:59:48 +02:00
Zbigniew Jędrzejewski-Szmek 56a13a495c pid1: create ro private tmp dirs when /tmp or /var/tmp is read-only
Read-only /var/tmp is more likely, because it's backed by a real device. /tmp
is (by default) backed by tmpfs, but it doesn't have to be. In both cases the
same consideration applies.

If we boot with read-only /var/tmp, any unit with PrivateTmp=yes would fail
because we cannot create the subdir under /var/tmp to mount the private directory.
But many services actually don't require /var/tmp (either because they only use
it occasionally, or because they only use /tmp, or even because they don't use the
temporary directories at all, and PrivateTmp=yes is used to isolate them from
the rest of the system).

To handle both cases let's create a read-only directory under /run/systemd and
mount it as the private /tmp or /var/tmp. (Read-only to not fool the service into
dumping too much data in /run.)

$ sudo systemd-run -t -p PrivateTmp=yes bash
Running as unit: run-u14.service
Press ^] three times within 1s to disconnect TTY.
[root@workstation /]# ls -l /tmp/
total 0
[root@workstation /]# ls -l /var/tmp/
total 0
[root@workstation /]# touch /tmp/f
[root@workstation /]# touch /var/tmp/f
touch: cannot touch '/var/tmp/f': Read-only file system

This commit has more changes than I like to put in one commit, but it's touching all
the same paths so it's hard to split.
exec_runtime_make() was using the wrong cleanup function, so the directory would be
left behind on error.
2020-07-14 19:47:15 +02:00
Zbigniew Jędrzejewski-Szmek 1061fc1c17 rpm: include macro name in errors for two args macros too 2020-07-14 19:22:42 +02:00
Zbigniew Jędrzejewski-Szmek 281014b73e rpm: adjust various macros to print macro name in the error message
Based on initial patch by Jan Engelhardt <jengelh@inai.de>.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1856122.
2020-07-14 19:21:12 +02:00
Zbigniew Jędrzejewski-Szmek 8800df5f71
Merge pull request #16430 from mikhailnov/fix-rpm-create-package-macros
Fix RPM *_create_package macros
2020-07-14 19:02:09 +02:00
Christian Göttsche f2df56bfea namespace: unify logging in mount_tmpfs
Fixes: abad72be4d
Follow up: #16426
2020-07-11 21:25:39 +02:00
Mikhail Novosyolov 3e6e0856cd rpm: avoid hiding errors and output in *_create_package macros
Commit b0ca726585 "rpm: avoid hiding errors from systemd commands" remove hiding errors and output
for other macros, but did not do that for %sysusers_create_package and %tmpfiles_create_package.

This change syncs their behaviour with %sysusers_create and %tmpfiles_create

Signed-off-by: Mikhail Novosyolov <m.novosyolov@rosalinux.ru>
2020-07-11 17:20:23 +03:00
Mikhail Novosyolov 93406fd379 rpm: avoid odd symbols in EOF indicator
The last line in this macros was actually "SYSTEMD_INLINE_EOF " with a space at the end,
but the shell was instructed to look for a line without space.

Macros %sysusers_create_inline and %tmpfiles_create_inline did not have this mistake.

An example:
[root@rosa-2019 bind-server]# cat /etc/passwd | grep named
[root@rosa-2019 bind-server]# cat /tmp/bs
systemd-sysusers --replace=/usr/lib/sysusers.d/named.conf - <<SYSTEMD_INLINE_EOF >/dev/null 2>&1 || :
u named - "BIND DNS Server" /var/lib/named
g named - -
m named named
SYSTEMD_INLINE_EOF
[root@rosa-2019 bind-server]# sh /tmp/bs
/tmp/bs: line 5: warning: here-document at line 1 delimited by end-of-file (wanted `SYSTEMD_INLINE_EOF')
[root@rosa-2019 bind-server]# bash /tmp/bs
/tmp/bs: line 5: warning: here-document at line 1 delimited by end-of-file (wanted `SYSTEMD_INLINE_EOF')
[root@rosa-2019 bind-server]# bash --version
GNU bash, version 5.0.17(1)-release (x86_64-openmandriva-linux-gnu)

The user and group named were NOT created!

Now I remove the trailing space after "SYSTEMD_INLINE_EOF" and rerun:
[root@rosa-2019 bind-server]# sh /tmp/bs
[root@rosa-2019 bind-server]# tail -n 1 /etc/group
named485:named
[root@rosa-2019 bind-server]#

The user and group have been created correctly.

Signed-off-by: Mikhail Novosyolov <m.novosyolov@rosalinux.ru>
2020-07-11 17:20:16 +03:00
Christian Göttsche abad72be4d namespace: fix MAC labels of TemporaryFileSystem=
Reproducible with:
  systemd-run -p TemporaryFileSystem=/root -t /bin/bash
    ls -dZ /root

Prior:
  root:object_r:tmpfs_t:s0 /root
Past:
  root:object_r:user_home_dir_t:s0 /root
2020-07-11 00:09:05 +02:00
Zbigniew Jędrzejewski-Szmek 02b0109af5
Merge pull request #15955 from anitazha/nullorempty
core: check null_or_empty_path for masked units instead of /dev/null
2020-07-08 22:18:17 +02:00
Zbigniew Jędrzejewski-Szmek cbc056c819 core: wrap some long lines and other formatting changes 2020-07-08 16:37:23 +02:00
Alan Perry 5dc60faae5 add error message when bind mount src missing 2020-07-07 20:04:19 +02:00
Zbigniew Jędrzejewski-Szmek 2b0bf3ccf8
Merge pull request #16301 from poettering/firstboot-image
Add --image= switch to firstboot, similar to --root= but with support for operating on disk image
2020-07-07 19:44:12 +02:00
Zbigniew Jędrzejewski-Szmek cd990847b9 tree-wide: more repeated words 2020-07-07 12:08:22 +02:00
Lennart Poettering e2ec9c4d3a namespace-util: introduce helper for combining unshare() + MS_SLAVE remount
We have multiple places we do these two non-trivial operations together,
let's introduce a unified helper for doing both at once.
2020-07-07 11:20:42 +02:00
Luca Boccassi cda667722c core: refresh unit cache when building a transaction if UNIT_NOT_FOUND
When a command asks to load a unit directly and it is in state
UNIT_NOT_FOUND, and the cache is outdated, we refresh it and
attempto to load again.
Use the same logic when building up a transaction and a dependency in
UNIT_NOT_FOUND state is encountered.
Update the unit test to exercise this code path.
2020-07-07 10:09:24 +02:00
Dan Callaghan 2fadbb4535 core: set private section name for automount units
Because this was left unset, the unit_write_setting() function was
refusing to write out the automount-specific TimeoutIdleSec= and
DirectoryMode= settings when creating transient automount units.
Set it to the proper value in line with other unit types.
2020-07-04 18:48:36 +02:00
gzjsgdsb 33d943d168 initialize arg_clock_usec 2020-07-03 14:52:20 +02:00
Anita Zhang 640f3b143d core: check null_or_empty for masked units instead of /dev/null
There's some inconsistency in the what is considered a masked unit:
some places (i.e. load-fragment.c) use `null_or_empty()` while others
check if the file path is symlinked to "/dev/null". Since the latter
doesn't account for things like non-absolute symlinks to "/dev/null",
this commit switches the check for "/dev/null" to use `null_or_empty_path()`
2020-07-03 02:33:50 -07:00
Zbigniew Jędrzejewski-Szmek 15e6a6e87b tree-wide: spell "lifecycle" without hyphen everywhere
We had 2 more instances of unhyphentated spelling.
2020-07-02 09:55:44 +02:00
Zbigniew Jędrzejewski-Szmek 37b22b3b47 tree: wide "the the" and other trivial grammar fixes 2020-07-02 09:51:38 +02:00
Yu Watanabe 9457b6bb21
Merge pull request #16303 from poettering/dbus-util-split
shared: split src/shared/bus-util.c into multiple files
2020-07-01 14:15:40 +09:00
Luca Boccassi 7233e91af0 core: store timestamps of unit load attempts
When the system is under heavy load, it can happen that the unit cache
is refreshed for an unrelated reason (in the test I simulate this by
attempting to start a non-existing unit). The new unit is found and
accounted for in the cache, but it's ignored since we are loading
something else.
When we actually look for it, by attempting to start it, the cache is
up to date so no refresh happens, and starting fails although we have
it loaded in the cache.

When the unit state is set to UNIT_NOT_FOUND, mark the timestamp in
u->fragment_loadtime. Then when attempting to load again we can check
both if the cache itself needs a refresh, OR if it was refreshed AFTER
the last failed attempt that resulted in the state being
UNIT_NOT_FOUND.

Update the test so that this issue reproduces more often.
2020-06-30 16:50:00 +02:00
Lennart Poettering 40af3d020f shared: split out property get helpers
No code changes, just some refactoring.
2020-06-30 15:10:17 +02:00
Lennart Poettering c664cf5607 shared: split out BusObjectImplementor APIs
Just some refactoring, no code changes
2020-06-30 15:08:35 +02:00
Zbigniew Jędrzejewski-Szmek 0e31a6c2ad
Merge pull request #16142 from poettering/random-seed-cmdline
pid1: add support for allowing to pass in random seed via kernel cmdline
2020-06-26 22:42:51 +02:00
Lennart Poettering bed0b7dfc0
pid1: warn if people use User=nobody (#16293) 2020-06-26 22:36:39 +02:00
Luca Boccassi 0cffae953a core: add device mapper to allow-list with DevicePolicy=closed and RootImage
To set up a verity/cryptsetup RootImage the forked child needs to
ioctl /dev/mapper/control and create a new mapper.
If PrivateDevices=yes and/or DevicePolicy=closed are used, this is
blocked by the cgroup setting, so add an exception like it's done
for loop devices (and also add a dependency on the kernel modules
implementing them).
2020-06-26 18:39:45 +02:00
Zbigniew Jędrzejewski-Szmek 98506a41fe
Merge pull request #15697 from OhNoMoreGit/fix-path-units
Recheck PathExists=, PathExistsGlob=, DirectoryNotEmpty= when triggered unit terminates
2020-06-25 18:23:47 +02:00
Luca Boccassi d4d55b0d13 core: add RootHashSignature service parameter
Allow to explicitly pass root hash signature as a unit option. Takes precedence
over implicit checks.
2020-06-25 08:45:21 +01:00
Luca Boccassi c2923fdcd7 dissect/nspawn: add support for dm-verity root hash signature
Since cryptsetup 2.3.0 a new API to verify dm-verity volumes by a
pkcs7 signature, with the public key in the kernel keyring,
is available. Use it if libcryptsetup supports it.
2020-06-25 08:45:21 +01:00
Zbigniew Jędrzejewski-Szmek e60d3b13df
Merge pull request #16265 from Werkov/fix-16248
cgroup: Parse infinity properly for memory protections
2020-06-25 09:25:18 +02:00
Lennart Poettering 6b000af4f2 tree-wide: avoid some loaded terms
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/

This gets rid of most but not occasions of these loaded terms:

1. scsi_id and friends are something that is supposed to be removed from
   our tree (see #7594)

2. The test suite defines an API used by the ubuntu CI. We can remove
   this too later, but this needs to be done in sync with the ubuntu CI.

3. In some cases the terms are part of APIs we call or where we expose
   concepts the kernel names the way it names them. (In particular all
   remaining uses of the word "slave" in our codebase are like this,
   it's used by the POSIX PTY layer, by the network subsystem, the mount
   API and the block device subsystem). Getting rid of the term in these
   contexts would mean doing some major fixes of the kernel ABI first.

Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
2020-06-25 09:00:19 +02:00
Michal Koutný 67e2baff6b cgroup: Parse infinity properly for memory protections
This fixes commit db2b8d2e28 that
rectified parsing empty values but broke parsing explicit infinity.
Intended parsing semantics will be captured in a testcase in a follow up
commit.

Ref: #16248
2020-06-24 23:16:06 +02:00
Zbigniew Jędrzejewski-Szmek f83803a649
Merge pull request #16238 from keszybz/set-handling-more
Fix handling of cases where a duplicate item is added to a set and related cleanups
2020-06-24 17:42:13 +02:00
Lennart Poettering d247f232a8 core: add new systemd.random_seed= kernel command line option for seeding entropy pool
This is useful in test environments with entropy starved VMs.
2020-06-24 15:33:44 +02:00
Lennart Poettering 4dd055f907 random-util: add common helper random_write_entropy() for crediting entropy to the kernel's pool 2020-06-24 15:33:27 +02:00
Lennart Poettering 45250e66cc pid1: don't apply "systemd.clock_usec" kernel cmdline parameter outside of PID 1 2020-06-24 15:33:22 +02:00
Zbigniew Jędrzejewski-Szmek d02fd8b1c6 core/bpf-firewall: use the correct cleanup function
On error, we'd just free the object, and not close the fd.

While at it, let's use set_ensure_consume() to make sure we don't leak
the object if it was already in the set. I'm not sure if that condition
can be achieved.
2020-06-24 10:38:15 +02:00
Jay Burger a1ba8c5b71 feature to honor first shutdown request to completion
Create unit tests per established norm at position 52

check in_set first before getting unit
2020-06-24 09:42:01 +02:00
Christian Göttsche a9ba0e328f Make failures of mac_selinux_init() fatal 2020-06-23 19:10:07 +02:00
Christian Göttsche a11bfc17dc Initialize SELinux in user instances
Call mac_selinux_init() to setup the label cache, so objects can be
created with default SELinux context.

Fixes: #8004
2020-06-23 19:10:03 +02:00
Zbigniew Jędrzejewski-Szmek 311a0e2ee6 Revert "cgroup: Allow empty assignments of Memory{Low,Min}="
This reverts commit 53aa85af24.
The reason is that that patch changes the dbus api to be different than
the types declared by introspection api.

Replaces #16122.
2020-06-23 16:54:23 +02:00
Dave Reisner cc479760b4 Revert "job: Don't mark as redundant if deps are relevant"
This reverts commit 097537f07a.

At least Fedora and Debian have already reverted this at the distro
level because it causes more problems than it solves. Arch is debating
reverting it as well [0] but would strongly prefer that this happens
upstream first. Fixes #15188.

[0] https://bugs.archlinux.org/task/66458
2020-06-23 11:42:45 +02:00
Luca Boccassi 0389f4fa81 core: add RootHash and RootVerity service parameters
Allow to explicitly pass root hash (explicitly or as a file) and verity
device/file as unit options. Take precedence over implicit checks.
2020-06-23 10:50:09 +02:00
Zbigniew Jędrzejewski-Szmek de7fef4b6e tree-wide: use set_ensure_put()
Patch contains a coccinelle script, but it only works in some cases. Many
parts were converted by hand.

Note: I did not fix errors in return value handing. This will be done separate
to keep the patch comprehensible. No functional change is intended in this
patch.
2020-06-22 16:32:37 +02:00
Lennart Poettering b7e8286b4a
Merge pull request #15926 from fbuihuu/rework-show-status
Rework show status
2020-06-22 13:55:06 +02:00
Filipe Brandenburger 209b2592ed efi: Cache contents of EFI variable SystemdOptions
Cache it early in startup of the system manager, right after `/run/systemd` is
created, so that further access to it can be done without accessing the EFI
filesystem at all.
2020-06-13 14:46:57 +02:00
Franck Bui 43bba15ac8 pid1: rename manager_set_{show_status,watchdog}_overridden() into manager_override_(show_status,watchdog}
No functional change.
2020-06-11 12:00:32 +02:00
Franck Bui 0bb007f7a2 pid1: add a new SetShowStatus() bus call to override/restore show status mode
The only way to control "ShowStatus" property programmatically was to use the
signal API and wait until the property "ShowStatus" switched to the new value.

This interface is rather cumbersome to use and doesn't allow to temporarily
override the current setting and later restore the overridden value in
race-free manner.

The new method also accepts the empty string as argument which allows to
restore the initial value of ShowStatus, ie the value before it was overridden
by this method.

Fixes: #11447.
2020-06-11 12:00:32 +02:00
Franck Bui 3ceb347130 pid1: introduce an helper to handle the show-status marker
No functional change.
2020-06-11 12:00:16 +02:00
Lennart Poettering 24bd74ae03
Merge pull request #15940 from keszybz/names-set-optimization
Try to optimize away Unit.names set
2020-06-10 18:52:08 +02:00
Anita Zhang bb9244781c core: don't consider SERVICE_SKIP_CONDITION for abnormal or failure restarts
Fixes: #16115
2020-06-10 17:12:55 +02:00
Zbigniew Jędrzejewski-Szmek 9664be199a
Merge pull request #16118 from poettering/inaccessible-fixlets
move $XDG_RUNTIME_DIR/inaccessible/ to $XDG_RUNTIME_DIR/systemd/inaccessible
2020-06-10 10:23:13 +02:00
Zbigniew Jędrzejewski-Szmek e2ea005681 core: do not touch instance from unit_choose_id()
unit_choose_id() is about marking one of the aliases of the unit as the main
name. With the preparatory work in previous patches, all aliases of the unit
must have the same instance, so the operation to update the instance is a noop.
2020-06-10 09:45:58 +02:00
Zbigniew Jędrzejewski-Szmek 934ef6a522 core: create socket service instances with the correct name from the start
Upon an incoming connection for an accepting socket, we'd create a unit like
foo@0.service, then figure out that the instance name should be e.g. "0-41-0",
and then add the name foo@0-41-0.service to the unit. This obviously violates
the rule that any service needs to have a constance instance part.

So let's reverse the order: we first determine the instance name and then
create the unit with the correct name from the start.

There are two cases where we don't know the instance name:
- analyze-verify: we just do a quick check that the instance unit can be
  created. So let's use a bogus instance string.
- selinux: the code wants to load the service unit to extract the ExecStart path
  and query it for the selinux label. Do the same as above.

Note that in both cases it is possible that the real unit that is loaded could
be different than the one with the bogus instance value, for example if there
is a dropin for a specific instance name. We can't do much about this, since we
can't figure out the instance name in advance. The old code had the same
shortcoming.
2020-06-10 09:45:55 +02:00
Zbigniew Jędrzejewski-Szmek ada4b34ec7 core: rework error messages in unit_add_name()
They were added recently in acd1987a18. We can
make them more informative by using unit_type_to_string() and not repeating
unit names as much. Also, %m should not be used together with SYNTHETIC_ERRNO().
2020-06-10 09:42:20 +02:00
Zbigniew Jędrzejewski-Szmek d383acad25 core: when adding names to unit, require matching instance strings
We would check that the instance is present in both units (or missing in both).
But when it is defined, it should be the same in both. The comment in the code
was explicitly saying that differing instance strings are allowed, but this
mostly seems to be a left-over from old times. The man page is pretty clear:

> the instance (if any) is always uniquely defined for a given unit and all its
> aliases.
2020-06-10 09:42:20 +02:00
Zbigniew Jędrzejewski-Szmek 4562c35527 core: store unit aliases in a separate set
We allocated the names set for each unit, but in the majority of cases, we'd
put only one name in the set:

$ systemctl show --value -p Names '*'|grep .|grep -v ' '|wc -l
564
$ systemctl show --value -p Names '*'|grep .|grep ' '|wc -l
16

So let's add a separate .id field, and only store aliases in the set, and only
create the set if there's at least one alias. This requires a bit of gymnastics
in the code, but I think this optimization is worth the trouble, because we
save one object for many loaded units.

In particular set_complete_move() wasn't very useful because the target
unit would always have at least one name defined, i.e. the optimization to
move the whole set over would never fire.
2020-06-10 09:36:58 +02:00
Zbigniew Jędrzejewski-Szmek 02939ee001
Merge pull request #16087 from mrc0mmand/travis-build-check
travis: check build with various compiler options
2020-06-10 09:06:14 +02:00
Frantisek Sumsal dfa64b64a7 tree-wide: mark assert()-only variables as unused
to make a compilation with -Db_ndebug=true and --werror pass once again.
2020-06-09 21:31:10 +02:00
YmrDtnJu 2ffadd3cee AppArmor: Support for loading a set of pre-compiled profiles at startup time
Let systemd load a set of pre-compiled AppArmor profile files from a policy
cache at /etc/apparmor/earlypolicy. Maintenance of that policy cache must be
done outside of systemd.

After successfully loading the profiles systemd will attempt to change to a
profile named systemd.

If systemd is already confined in a profile, it will not load any profile files
and will not attempt to change it's profile.

If anything goes wrong, systemd will only log failures. It will not fail to
start.
2020-06-09 20:27:47 +02:00
Lennart Poettering d689f0f20a
Merge pull request #16058 from Werkov/fix-memory-protection-default
Fix memory protection default setting
2020-06-09 20:02:53 +02:00
Lennart Poettering 4f4f37b20d
Merge pull request #16046 from bluca/dissect_squashfs_verity
dissect: single-filesystem verity images with external hashdevice
2020-06-09 19:52:21 +02:00
Lennart Poettering 48b747fa03 inaccessible: move inaccessible file nodes to /systemd/ subdir in runtime dir always
Let's make sure $XDG_RUNTIME_DIR for the user instance and /run for the
system instance is always organized the same way: the "inaccessible"
device nodes should be placed in a subdir of either called "systemd" and
a subdir of that called "inaccessible".

This way we can emphasize the common behaviour, and only differ where
really necessary.

Follow-up for #13823
2020-06-09 16:23:56 +02:00
Lennart Poettering 57b611a5bf
Merge pull request #16074 from msekletar/freezer-test-flakes
Freezer test flakes
2020-06-09 14:38:40 +02:00
Luca Boccassi e7cbe5cb9e dissect: support single-filesystem verity images with external verity hash
dm-verity support in dissect-image at the moment is restricted to GPT
volumes.
If the image a single-filesystem type without a partition table (eg: squashfs)
and a roothash/verity file are passed, set the verity flag and mark as
read-only.
2020-06-09 12:19:21 +01:00
Jan Klötzke bf76080180 core: let user define start-/stop-timeout behaviour
The usual behaviour when a timeout expires is to terminate/kill the
service. This is what user usually want in production systems. To debug
services that fail to start/stop (especially sporadic failures) it
might be necessary to trigger the watchdog machinery and write core
dumps, though. Likewise, it is usually just a waste of time to
gracefully stop a stuck service. Instead it might save time to go
directly into kill mode.

This commit adds two new options to services: TimeoutStartFailureMode=
and TimeoutStopFailureMode=. Both take the same values and tweak the
behavior of systemd when a start/stop timeout expires:

 * 'terminate': is the default behaviour as it has always been,
 * 'abort': triggers the watchdog machinery and will send SIGABRT
   (unless WatchdogSignal was changed) and
 * 'kill' will directly send SIGKILL.

To handle the stop failure mode in stop-post state too a new
final-watchdog state needs to be introduced.
2020-06-09 10:04:57 +02:00
Franck Bui 44a419540e pid1: rework handling of m->show_status
The fact that m->show_status was serialized/deserialized made impossible any
further customisation of this setting via system.conf. IOW the value was
basically always locked unless it was changed via signals.

This patch reworks the handling of m->show_status but also makes sure that if a
new value was changed via the signal API then this value is kept and preserved
accross PID1 reexecuting or reloading.

Note: this effectively means that once the value is set via the signal
interface, it can be changed again only through the signal API.
2020-06-09 09:16:54 +02:00
Franck Bui 0d6d3cf055 pid1: rename manager_get_show_status() to manager_should_show_status()
The name 'manager_get_show_status()' suggests that the function simply reads
the property 'show_status' of the manager and hence returns a 'StatusType'
value.

However it was doing more than that since it contained the logic (based on
'show_status' but also on the state of the manager) to figure out if status
message could be emitted to the console.

Hence this patch renames the function to 'manager_should_show_status()'. The
previous name will be reused in a later patch to effectively return the value
of 'show_status' property.

No functional change.
2020-06-09 09:16:54 +02:00
Franck Bui b309078ab9 pid1: make more use of show_status_on()
No functional change.
2020-06-09 09:16:54 +02:00
Michal Sekletár 2884836e3c core: fix the return value in order to make sure we don't dipatch method return too early
Actually, it is the same kind of problem as in d910f4c . Basically, we
need to return 1 on success code path in slice_freezer_action().
Otherwise we dispatch DBus return message too soon.

Fixes: #16050
2020-06-05 16:10:40 +02:00
Lennart Poettering 7ade8982ca core: remove support for ".include" stanza
Six years ago we declared it obsolete and removed it from the docs
(c073a0c4a5) and added a note about it in
NEWS. Two years ago we add warning messages about it, indicating the
feature will be removed (41b283d0f1) and
mentioned it in NEWS again.

Let's now kill it for good.
2020-06-03 18:26:12 +02:00
Yu Watanabe 6161b35d5e
Merge pull request #16048 from poettering/conf-parser-mtime
conf-parser: automatically pick up newest mtime when parsing configuration files
2020-06-03 08:25:28 +09:00
Zbigniew Jędrzejewski-Szmek c9e0695675 core: set source_mtime after load dropins
Dropins may specify SourcePath= too, but we would do the stat only
after loading the main fragment, before loading of the drop-ins.

Fixes #13634.
2020-06-02 22:53:55 +02:00
Lennart Poettering 4f9ff96a55 conf-parser: return mtime in config_parse() and friends
This is a follow-up for 9f83091e3c.

Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.

This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
2020-06-02 19:32:20 +02:00
Michal Koutný 53aa85af24 cgroup: Allow empty assignments of Memory{Low,Min}=
Currently, an empty assignment of Memory{Low,Min}= directives would be
interpretted as setting it to global default, i.e. zero. However, if we
set a runtime protection value on a unit that inherits parent's
DefaultMemory{Low,Min}=, it is not possible to revert it back to the
state where the DefaultMemory{Low,Min}= is propagated from parent
slice(s).

This patch changes the semantics of the empty assignments to explicitly
nullify any value set by the user previously. Since DBus API uses
uint64_t where 0 is a valid configuration, the patch modifies DBus API
by exploiting the variant type of property value to pass the NULL value.
2020-06-02 18:59:47 +02:00
Michal Koutný db2b8d2e28 cgroup: Make empty assignments reset to default
When MemoryLow= or MemoryMin= is set, it is interpretted as setting the
values to infinity. This is inconsistent with the default initialization
to 0.
It'd be nice to interpret the empty assignment as fallback to
DefaultMemory* of parent slice, however, current DBus API cannot convey
such a NULL value, so stick to simply interpretting that as hard-wired
default.
2020-06-02 18:59:47 +02:00
Luca Boccassi d904afc730 core: reload cache if it's dirty when starting a UNIT_NOT_FOUND unit
The time-based cache allows starting a new unit without an expensive
daemon-reload, unless there was already a reference to it because of
a dependency or ordering from another unit.
If the cache is out of date, check again if we can load the
fragment.
2020-05-30 16:50:05 +02:00
Lennart Poettering 5fc20ede0f
Merge pull request #15954 from keszybz/unit-file-leak
Fix leak in unit path cache and another small optimization
2020-05-29 16:02:53 +02:00
Evgeny Vereshchagin ceae629564 README: add a Fossies codespell badge
I keep forgetting where the report is. Hopefully the badge will
make it easier to find it. I also fixed several typos codespell
found along the way.
2020-05-29 15:06:16 +02:00
Yu Watanabe fac729f811
Merge pull request #15911 from poettering/unit-name-tighten
pid1: improve logging when we encounter a path that is too long to be converted into a mount unit name
2020-05-29 15:55:38 +09:00
Yu Watanabe 8abbd9a4d9
Merge pull request #15944 from poettering/sd-path-fixups
sd-path: some fixups
2020-05-29 15:23:37 +09:00
Yu Watanabe 27cd09442f
Merge pull request #15942 from poettering/pass-pktinfo
disable PassCredentials= on networkd netlink socket, enable NETLINK_PKTINFO instead
2020-05-29 15:23:15 +09:00
Lennart Poettering 4a56315a99 path: use ROOTPREFIX properly
ROOTPREFIX doesn't include the trailing /, hence add it in where needed.

Also, given that sysctl.d/, binfmt.d/, sysusers.d/ are generally
accessed before /var/ is up they should use ROOTPREFIX rather than
PREFIX. Fix that.
2020-05-28 23:52:34 +02:00
Lennart Poettering 08b8f29a25 tree-wide: remove spurious whitespace 2020-05-28 19:23:47 +02:00
Zbigniew Jędrzejewski-Szmek a4ac27c1af manager: free the jobs hashmap after we have no jobs
After a larger transaction, e.g. after bootup, we're left with an empty hashmap
with hundreds of buckets. Long-term, it'd be better to size hashmaps down when
they are less than 1/4 full, but even if we implement that, jobs hashmap is
likely to be empty almost always, so it seems useful to deallocate it once the
jobs count reaches 0.
2020-05-28 18:54:20 +02:00
Zbigniew Jędrzejewski-Szmek f6173cb955 core: define UnitDependency iterators in loops
Reduced scope of variables is always nice.
2020-05-28 18:53:35 +02:00
Zbigniew Jędrzejewski-Szmek 3fb2326f3e shared/unit-file: make sure the old hashmaps and sets are freed upon replacement
Possibly fixes #15220. (There might be another leak. I'm still investigating.)

The leak would occur when the path cache was rebuilt. So in normal circumstances
it wouldn't be too bad, since usually the path cache is not rebuilt too often. But
the case in #15220, where new unit files are created in a loop and started, the leak
occurs once for each unit file:

$ for i in {1..300}; do cp ~/.config/systemd/user/test0001.service ~/.config/systemd/user/test$(printf %04d $i).service; systemctl --user start test$(printf %04d $i).service;done
2020-05-28 18:51:52 +02:00
Zbigniew Jędrzejewski-Szmek db868d45f9 core: make unit_set_invocation_id static
No functional change.
2020-05-28 18:47:01 +02:00
Lennart Poettering a3d19f5d99 core: add new PassPacketInfo= socket unit property 2020-05-27 22:40:38 +02:00
Lennart Poettering 49d5aae503
Merge pull request #15934 from keszybz/docs-and-test-fix
A few sundry documentation and code fixes
2020-05-27 15:16:24 +02:00
Lennart Poettering 6d02412d51
Merge pull request #15928 from poettering/kill-mode-warnings
warn on KillMode=none, inform about left-over processes on stop and warn about sysv services
2020-05-27 15:05:41 +02:00
Chris Down 4793c31083 service: Display updated WatchdogUSec from sd_notify
Suppose a service has WatchdogSec set to 2 seconds in its unit file. I
then start the service and WatchdogUSec is set correctly:

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=2s

Now I call `sd_notify(0, "WATCHDOG_USEC=10000000")`. The new timer seems
to have taken effect, since I only send `WATCHDOG=1` every 4 seconds,
and systemd isn't triggering the watchdog handler. However, `systemctl
show` still shows WatchdogUSec as 2s:

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=2s

This seems surprising, since this "original" watchdog timer isn't the
one taking effect any more. This patch makes it so that we instead
display the new watchdog timer after sd_notify(WATCHDOG_USEC):

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=10s

Fixes #15726.
2020-05-27 09:09:40 +02:00
Zbigniew Jędrzejewski-Szmek 24b4597064 core: minor simplification 2020-05-27 09:02:53 +02:00
Zbigniew Jędrzejewski-Szmek 1414090854 core: add forgotten return in error path
If we get an error here, this is most likely oom, and we should not continue.
2020-05-27 09:02:53 +02:00
Zbigniew Jędrzejewski-Szmek 3035a7c577 core: update comment about sysv compat
We don't want people to consider that part deprecated.

Follow-up for 6589a56972.
2020-05-27 09:02:09 +02:00
Lennart Poettering c8aa4b5b86 core: voidify one function return 2020-05-26 23:52:22 +02:00
Lennart Poettering b0cea477d4 core: some more structured initialization 2020-05-26 23:52:18 +02:00
Lennart Poettering 4c42543429 core: also log about left-over processes during unit stop
Only log at LOG_INFO level, i.e. make this informational. During start
let's leave it at LOG_WARNING though.

Of course, it's ugly leaving processes around like that either in start
or in stop, but at start its more dangerous than on stop, so be tougher
there.
2020-05-26 23:52:13 +02:00
Lennart Poettering d068765b68 core: warn if KillMode=none is used 2020-05-26 23:50:54 +02:00
Lennart Poettering cf3317f63a
Merge pull request #15906 from keszybz/busctl-stdout-stderr
Make busctl stdout/stderr split consistent
2020-05-26 15:38:14 +02:00
Lennart Poettering 2c905207db core: improve log messages when we cannot process a mount point
Whenever we pick up a new line in /proc/self/mountinfo and want to
synthesize a new mount unit from it, let's say which one it is.
Moreover, downgrade the log message when we encounter a mount point with
an overly long name to LOG_WARNING, since it's generally fine to ignore
such mount points.

Also, attach a catalog entry to explain the situation further.

Prompted-By: #15221
2020-05-26 09:17:48 +02:00
Lennart Poettering 6028d766d1 macro: introduce DISABLE_WARNING_TYPE_LIMITS and make use of it everywhere 2020-05-25 18:25:09 +02:00
Lennart Poettering 56e577c62f tree-wide: use DISABLE_WARNING_FORMAT_NONLITERAL where appropriate 2020-05-25 18:25:05 +02:00
Zbigniew Jędrzejewski-Szmek 5453a4b1a8 tree-wide: use public sd-bus functions in more places 2020-05-25 11:09:21 +02:00
Zbigniew Jędrzejewski-Szmek bb94ded693
Merge pull request #15661 from hundeboll/mount-read-write-only
Mount read write only
2020-05-20 15:48:04 +02:00
Zbigniew Jędrzejewski-Szmek 3d45f6b2d0
Merge pull request #15535 from fbuihuu/update-runtime-watchdog-on-reload
pid1: update RuntimeWatchogSec setting on daemon-reload as well
2020-05-20 15:40:34 +02:00
ExtinctFire 929fed02df load-fragment: fix a typo
Correct the typo of “priority” in log message
2020-05-20 15:22:37 +02:00
Franck Bui b406c6d128 pid1: make manager_deserialize_{uid,gid}_refs() static
No functional change.
2020-05-19 15:48:54 +02:00
Franck Bui 80f605c807 pid1: make manager_serialize_{uid,gid}_refs() static
No functional change.
2020-05-19 15:48:54 +02:00
Franck Bui 06a4eb0737 pid1: make manager_vacuum_{uid,gid}_refs() static
No functional change.
2020-05-19 15:48:54 +02:00
Franck Bui 1addc46c8c pid1: make manager_flip_auto_status() static
No functional change.
2020-05-19 15:48:54 +02:00
Franck Bui ffe5c01eaa pid1: fold load_configuration() into main()
That's reduce the number of functions dealing with configuration
parsing/loading and should make the code simpler especially since this function
was used only once.

No functional change.
2020-05-19 15:48:54 +02:00
Franck Bui 986935cf6a pid1: update manager settings on reload too
Most complexity of this patch is due to the fact that some manager settings
(basically the watchdog properties) can be set at runtime and in this case the
runtime values must be retained over daemon-reload or daemon-reexec.

For consistency sake, all watchdog properties behaves now the same way, that
is:

  - Values defined by config files can be overridden by writing the new value
    through their respective D-BUS properties. In this case, these values are
    preserved over reload/reexec until the special value '0' or USEC_INFINITY
    is written, which will then restore the last values loaded from the config
    files. If the restored value is '0' or 'USEC_INFINITY', the watchdogs will
    be disabled and the corresponding device will be closed.

  - Reading the properties from a user instance will return the USEC_INFINITY
    value as these properties are only meaningful for PID1.

  - Writing to one of the watchdog properties of a user instance's will be a
    NOP.

Fixes: #15453
2020-05-19 15:31:55 +02:00
Lennart Poettering 34293dfafd core: allow overriding the system hostname with systemd.hostname= on the kernel command line 2020-05-18 20:20:50 +02:00
Lennart Poettering 3753325bef main: add a kernel command line option for setting the system clock early during boot 2020-05-18 20:20:50 +02:00
Lennart Poettering a0b191b705 condition: add ConditionEnvironment=
Prompted by the discussions in #15180.

This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
2020-05-15 16:05:33 +02:00
Lennart Poettering e1e214c56b
Merge pull request #15265 from fbuihuu/mount-fixes
Mount fixes
2020-05-15 11:13:45 +02:00
Lennart Poettering f3dc6af20f core: automatically update StandardOuput=syslog to =journal (and similar for StandardError=)
Let's go one step further and upgrade implicitly. Usually =syslog
assignments are historic artifacts only. Let's upgrade the lines
automatically, and politely suggest people update their unit
files/configuration (and drop the lines altogether, without
replacement).

Fixes: #15807
2020-05-15 00:05:46 +02:00
Lennart Poettering 3250501865
Merge pull request #15660 from benjarobin/perf_barrier_fd
Faster manager_process_barrier_fd and drop message if BARRIER=1 found
2020-05-14 09:07:41 +02:00
Christian Göttsche 9bf4984a2a selinux: add parenthesis to function names in log messages 2020-05-14 09:03:51 +02:00
Benjamin Robin 5151b4ccd2 core: Parse the tags list sooner, and use it for multiple function
- Parse the tags list using strv_split_newlines() which remove any
   unnecessary empty string at the end of the strv.
 - Use this parsed list for manager_process_barrier_fd() and every call
   to manager_invoke_notify_message().
 - This also allow to simplify the manager_process_barrier_fd() function.
2020-05-13 22:44:12 +02:00
Topi Miettinen 7d85383edb tree-wide: add size limits for tmpfs mounts
Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var
to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and
/sys/fs/cgroup since no (or very few) regular files are expected to be used.

In addition, since directories, symbolic links, device specials and xattrs are
not counted towards the size= limit, number of inodes is also limited
correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of
RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes.

Because nr_inodes option can't use ratios like size option, there's an
unfortunate side effect that with small memory systems the limit may be on the
too large side. Also, on an extremely small device with only 256MB of RAM, 10%
of RAM for /run may not be enough for re-exec of PID1 because 16MB of free
space is required.
2020-05-13 00:37:18 +02:00
Lennart Poettering 08d50deaf8 core: don't bind varlink socket if running in test mode
Fixes: #15748
2020-05-12 22:10:49 +02:00
Zbigniew Jędrzejewski-Szmek 9b107000ab
Merge pull request #15762 from keszybz/gcc-10-build
Fix build with -O3 with gcc 10
2020-05-11 19:43:40 +02:00
Benjamin Robin fcee2755ec core: Update prototype of notify_message, tags list is read only
Indicates that the tags list cannot be modified by notify_message function.
Since the tags list is created only once for multiple call to
notify_message functions.
2020-05-10 18:58:03 +02:00
Benjamin Robin 20c3acfaad tree-wide: Replace assert() by assert_se() when there is side effect 2020-05-10 09:23:12 +02:00
Zbigniew Jędrzejewski-Szmek 23450c897d core: fix compilation with gcc -O3
../src/core/path.c: In function ‘path_serialize’:
../src/core/path.c:616:24: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
  616 |                 (void) serialize_item_format(f, "path-spec", "%s %%i %%s",
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  617 |                                              path_type_to_string(s->type) //,
      |                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  618 |                                              //                                             s->previous_exists,
      |                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  619 |                                              //                                             s->path
      |                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  620 |                 );
      |                 ~
In function ‘path_spec_dump’,
    inlined from ‘path_dump’ at ../src/core/path.c:392:17:
../src/core/path.c:226:9: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
  226 |         fprintf(f,
      |         ^~~~~~~~~~
  227 |                 "%s%s: %s\n",
      |                 ~~~~~~~~~~~~~
  228 |                 prefix,
      |                 ~~~~~~~
  229 |                 path_type_to_string(s->type),
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  230 |                 s->path);
      |                 ~~~~~~~~

s->type should be valid here, so let's just add an assert.

For https://github.com/systemd/systemd/issues/6119#issuecomment-626073743.
2020-05-09 09:10:25 +02:00
Zbigniew Jędrzejewski-Szmek 8acb7780df
Merge pull request #15623 from poettering/cmsg-cleanup
various CMSG_xyz clean-ups, split out of #15571
2020-05-08 11:05:06 +02:00
Michal Sekletar d910f4c2b2 core/cgroup: fix return value of unit_cgorup_freezer_action()
We should return 0 only if current freezer state, as reported by the
kernel, is already the desired state. Otherwise, we would dispatch
return dbus message prematurely in bus_unit_method_freezer_generic().

Thanks to Frantisek Sumsal for reporting the issue.
2020-05-07 22:19:19 +02:00
Lennart Poettering c60bc8d4fb
Merge pull request #15745 from keszybz/one-more-specifier
Add %l as specifier for short hostname
2020-05-07 22:18:59 +02:00
Zbigniew Jędrzejewski-Szmek e97708fa3e Add %l as specifier for the hostname without any domain component
As described in #15603, it is a fairly common setup to use a fqdn as the
configured hostname. But it is often convenient to use just the actual
hostname, i.e. until the first dot. This adds support in tmpfiles, sysusers,
and unit files for %l which expands to that.

Fixes #15603.
2020-05-07 17:36:44 +02:00
Zbigniew Jędrzejewski-Szmek 5cea17a177
Merge pull request #15635 from keszybz/set-put-strdup
Let set_put_strdup() allocate the set and related changes
2020-05-07 17:01:59 +02:00
Lennart Poettering fb29cdbef2 tree-wide: make sure our control buffers are properly aligned
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.

Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.

Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
2020-05-07 14:39:44 +02:00
Zbigniew Jędrzejewski-Szmek 354722dde4
Merge pull request #15733 from benjarobin/fix_warn_ndebug
Allow to build without any warning with NDEBUG defined + Bugfix
2020-05-07 09:47:19 +02:00
Benjamin Robin 08f468567d tree-wide: Workaround -Wnonnull GCC bug
See issue #6119
2020-05-07 09:43:28 +02:00
Benjamin Robin 0a0e594a26 tree-wide: Mark as _unused_ variables that are only used in assert()
Allow to build without any warning with NDEBUG defined
2020-05-06 23:03:25 +02:00
Zbigniew Jędrzejewski-Szmek e57ac1b017 tree-wide: use _cleanup_set_free_ where appropriate
If we already have the helper defined, let's use it instead of open-coding.
2020-05-06 17:08:17 +02:00
Zbigniew Jędrzejewski-Szmek be32732168 basic/set: let set_put_strdup() create the set with string hash ops
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.

hashmap_put_strdup() was already doing something similar.

(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)

No functional change intended.
2020-05-06 16:54:06 +02:00
Zbigniew Jędrzejewski-Szmek ba5a389d24 Merge pull request #15473 from keszybz/bus-introspection 2020-05-06 15:00:07 +02:00
Lennart Poettering c92391f52f
Merge pull request #15692 from keszybz/preset-cleanup
Make systemctl list-unit-files output more useful
2020-05-06 08:19:37 +02:00
Zbigniew Jędrzejewski-Szmek 5c08257b70 pid1: add --bus-introspect 2020-05-05 22:40:44 +02:00
Zbigniew Jędrzejewski-Szmek f6e9aa9e45 pid1: convert to the new scheme
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
2020-05-05 22:40:37 +02:00
Zbigniew Jędrzejewski-Szmek 8f7b256665 shared/install: optionally cache the preset list
When doing list-unit-files with --root, we would re-read the preset
list for every unit. This uses a cache to only do it once. The time
for list-unit-files goes down by about ~30%.

unit_file_query_preset() is also called from src/core/. This patch does not
touch that path, since the saving there are smaller, since preset status is
only read on demand over dbus, and caching would be more complicated.
2020-05-05 21:50:31 +02:00
Michael Chapman 708961c701 core/path: recheck path specs when triggered unit changes state
As documented in systemd.path(5):

    When a service unit triggered by a path unit terminates (regardless
    whether it exited successfully or failed), monitored paths are
    checked immediately again, and the service accordingly restarted
    instantly.

This commit implements this behaviour for PathExists=, PathExistsGlob=,
and DirectoryNotEmpty=. These predicates are essentially
"level-triggered": the service should be activated whenever the
predicate is true. PathChanged= and PathModified=, on the other hand,
are "edge-triggered": the service should only be activated when the
predicate *becomes* true.

The behaviour has been broken since at least as far back as commit
8fca6944c2 ("path: stop watching path specs once we triggered the target
unit"). This commit had systemd stop monitoring inotify whenever the
triggered unit was activated. Unfortunately this meant it never updated
the ->inotify_triggered flag, so it never rechecked the path specs when
the triggered unit deactivated.

With this commit, systemd rechecks all paths specs whenever the
triggered unit deactivates. If any PathExists=, PathExistsGlob= or
DirectoryNotEmpty= predicate passes, the triggered unit is reactivated.

If the target unit is activated by something outside of the path unit,
the path unit immediately transitions to a running state. This ensures
the path unit stops monitoring inotify in this situation.

With this change in place, commit d7cf8c24d4 ("core/path: fix spurious
triggering of PathExists= on restart/reload") is no longer necessary.
The path unit (and its triggered unit) is now always active whenever
the PathExists= predicate passes, so there is no spurious restart when
systemd is reloaded or restarted.
2020-05-05 13:56:02 +10:00
Michael Chapman f285f07752 core/path: use escaped path in serialization 2020-05-05 13:56:02 +10:00
Zbigniew Jędrzejewski-Szmek 7ae473268c manager: add highlighting to --help
Also change the summary line to a bit more concise.
2020-05-03 10:55:51 +02:00
Frantisek Sumsal 2d4f8cf467 man: fix few spelling errors
Reported by Fossies.org.
2020-05-03 10:23:25 +02:00
Martin Hundebøll c600357ba6 mount: add ReadWriteOnly property to fail on read-only mounts
Systems where a mount point is expected to be read-write needs a way to
fail mount units that fallback as read-only.

Add a property to allow setting the -w option when calling mount(8).
2020-05-01 13:23:30 +02:00
Lennart Poettering 6eb35fd695
Merge pull request #15547 from kkdwivedi/notify-barrier
Introduce sd_notify_barrier
2020-05-01 08:48:42 +02:00
Zbigniew Jędrzejewski-Szmek b76ef59756
Merge pull request #13512 from msekletar/freezer
core: introduce support for cgroup freezer
2020-05-01 07:52:29 +02:00
Kumar Kartikeya Dwivedi 4f07ddfa9b
Introduce sd_notify_barrier
This adds the sd_notify_barrier function, to allow users to synchronize against
the reception of sd_notify(3) status messages. It acts as a synchronization
point, and a successful return gurantees that all previous messages have been
consumed by the manager. This can be used to eliminate race conditions where
the sending process exits too early for systemd to associate its PID to a
cgroup and attribute the status message to a unit correctly.

systemd-notify now uses this function for proper notification delivery and be
useful for NotifyAccess=all units again in user mode, or in cases where it
doesn't have a control process as parent.

Fixes: #2739
2020-05-01 03:22:47 +05:30
Kenny Levinsen 3052049260 core: (De-)Serialize poll flag for fds in fdstore
This replaces manual string splitting and unescaping with
extract_first_word.
2020-04-30 19:42:53 +02:00
Kenny Levinsen cb5a46b845 core: Add optional FDPOLL=0 argument to fdstore
A service can specify FDSTORE=1 FDPOLL=0 to request that PID1 does not
poll the fd to remove them on error. If set, fds will only be removed on
FDSTOREREMOVE=1 or when the service is done.

Fixes: #12086
2020-04-30 19:42:26 +02:00
Michal Sekletár d9e45bc3ab core: introduce support for cgroup freezer
With cgroup v2 the cgroup freezer is implemented as a cgroup
attribute called cgroup.freeze. cgroup can be frozen by writing "1"
to the file and kernel will send us a notification through
"cgroup.events" after the operation is finished and processes in the
cgroup entered quiescent state, i.e. they are not scheduled to
run. Writing "0" to the attribute file does the inverse and process
execution is resumed.

This commit exposes above low-level functionality through systemd's DBus
API. Each unit type must provide specialized implementation for these
methods, otherwise, we return an error. So far only service, scope, and
slice unit types provide the support. It is possible to check if a
given unit has the support using CanFreeze() DBus property.

Note that DBus API has a synchronous behavior and we dispatch the reply
to freeze/thaw requests only after the kernel has notified us that
requested operation was completed.
2020-04-30 19:02:51 +02:00
Michal Sekletár 08deac6e3e selinux: do preprocessor check only in selinux-access.c
This has the advantage that mac_selinux_access_check() can be used as a
function in all contexts. For example, parameters passed to it won't be
reported as unused if the "function" call is replaced with 0 on SELinux
disabled builds.
2020-04-29 13:56:40 +02:00
Lennart Poettering 268f5a5463 tree-wide: support a bunch of additional specifiers 2020-04-28 22:47:21 +02:00
Lennart Poettering 04d1ee0f7e main: bump RLIMIT_MEMLOCK by physical RAM size
Let's allow more memory to be locked on beefy machines than on small
ones. The previous limit of 64M is the lower bound still. This
effectively means on a 4GB machine we can lock 512M, which should be
more than enough, but still not lock up the machine entirely under
pressure.

Fixes: #15053
2020-04-28 19:54:21 +02:00
Lennart Poettering 035e3cb9f9
Merge pull request #15563 from keszybz/wait-callback-tweak
Add NULL callback check in one more place
2020-04-23 19:50:36 +02:00
Lennart Poettering 0f4a141744
Merge pull request #15504 from poettering/cmsg-find-pure
just the recvmsg_safe() stuff from #15457
2020-04-23 17:28:19 +02:00
Lennart Poettering a9ab5cdb50
Merge pull request #15472 from keszybz/dbus-api-docs
A few more dbus api documentation updates
2020-04-23 17:01:11 +02:00
Zbigniew Jędrzejewski-Szmek 8f3e342fa9 core: fix unused variable warning when !HAVE_SECCOMP 2020-04-23 14:42:09 +02:00
Lennart Poettering 3691bcf3c5 tree-wide: use recvmsg_safe() at various places
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.

This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.

(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
2020-04-23 09:41:47 +02:00
Lennart Poettering 0cd41757d0 sd-bus: work around ubsan warning
ubsan complains that we add an offset to a NULL ptr here in some cases.
Which isn't really a bug though, since we only use it as the end
condition for a for loop, but we can still fix it...

Fixes: #15522
2020-04-23 08:54:30 +02:00
Zbigniew Jędrzejewski-Szmek 9786b27cbb
Merge pull request #15546 from poettering/pid1-serialize-comand-fix
core: make sure to restore the control command id, too
2020-04-23 08:51:58 +02:00
Zbigniew Jędrzejewski-Szmek 1943d50e4e
Merge pull request #15507 from poettering/bus-log-api
add generic bus interface for setting log level that can be implemented by any daemon
2020-04-22 23:30:09 +02:00
Lennart Poettering e8cf09b2a2 core: make sure we don't get confused when setting TERM for a tty fd
Fixes: #15344
2020-04-22 22:59:41 +02:00
Lennart Poettering e9da62b18a core: make sure to restore the control command id, too
Fixes: #15356
2020-04-22 20:34:02 +02:00
Lennart Poettering 5b99bd5fd4 core: some minor clean-ups/modernizations 2020-04-22 20:33:57 +02:00
Zbigniew Jędrzejewski-Szmek eecf4f7e04
Merge pull request #15526 from bluca/start_stop_wait_debug
Add debug log when a job in the activation queue is not runnable
2020-04-22 14:24:07 +02:00
Luca Boccassi c03fbd37d6 core: add debug log when a job in the activation queue is not runnable
When a job is skipped due its dependencies not being ready, log
a debug message saying what is holding it back.
This was very useful with transient units timing out to figure
out where the problem was.
2020-04-22 09:58:12 +01:00
Lennart Poettering df3d3bdfe8 core: minor error code handling fixes 2020-04-22 08:56:05 +02:00
Frantisek Sumsal 86b52a3958 tree-wide: fix spelling errors
Based on a report from Fossies.org using Codespell.

Followup to #15436
2020-04-21 23:21:08 +02:00
Luca Boccassi c5bc2c01ee core: add log_get_max_level check optimization in log_unit_full
Just as log_full already does, check if the log level would result in
logging immediately in the macro in order to avoid doing
unnecessary work that adds up in hot spots.
2020-04-21 18:05:24 +01:00
Lennart Poettering 893f801d67 core: implement generic log control API in PID1 too
It has slightly different setters in place, so it needs some special
love, which is easy enough though.
2020-04-21 17:08:23 +02:00
Lennart Poettering 25141692e9 core: use generic implementations of log level/target bus propertier getters
The setters are slightly different, hence keep them as they are for now.
2020-04-21 17:08:23 +02:00
Zbigniew Jędrzejewski-Szmek 38b38500c6 tree-wide: use "hostname" spelling everywhere
It's not that I think that "hostname" is vastly superior to "host name". Quite
the opposite — the difference is small, and in some context the two-word version
does fit better. But in the tree, there are ~200 occurrences of the first, and
>1600 of the other, and consistent spelling is more important than any particular
spelling choice.
2020-04-21 16:58:04 +02:00
Lennart Poettering 33b58dfb41 core: automatically add udev dependency for units using RootImage=
We use udev to wait for /dev/loopX devices to be fully proped hence we
need an implicit ordering dependency on it, for RootImage= to work
reliably in early boot, too.

Fixes: #14972
2020-04-21 16:31:06 +02:00
Lennart Poettering 09f8722801
Merge pull request #15396 from keszybz/dbus-api-docs
D-bus API docs
2020-04-17 23:40:50 +02:00
Lennart Poettering 4444e8533f
Merge pull request #15454 from keszybz/codespell-fixes
Spelling fixes based on codespell suggestions
2020-04-17 15:53:01 +02:00
Zbigniew Jędrzejewski-Szmek dad97f0425 manager: add dbus parameter names 2020-04-16 19:46:40 +02:00
Zbigniew Jędrzejewski-Szmek 162392b75a tree-wide: spellcheck using codespell
Fixes #15436.
2020-04-16 18:00:40 +02:00
Alin Popa c5f8a179a2 watchdog: reduce watchdog pings in timeout interval
The watchdog ping is performed for every iteration of manager event
loop. This results in a lot of ioctls on watchdog device driver
especially during boot or if services are aggressively using sd_notify.
Depending on the watchdog device driver this may have performance
impact on embedded systems.
The patch skips sending the watchdog to device driver if the ping is
requested before half of the watchdog timeout.
2020-04-16 16:32:05 +02:00
Lennart Poettering daf8f72b4e core: make sure ProtectHostname= is handled gracefully in containers lacking seccomp
Fixes: #15408
2020-04-13 17:32:27 +02:00
Zbigniew Jędrzejewski-Szmek 38cd55b007 Remove unneded {}s
$ perl -i -0pe 's|\s+{\n([^\n]*;)\n\s+}\n|\n\1\n|gms' **/*.c

Inspired by ea7cbf5bdd.
2020-04-13 09:31:49 +02:00
Zbigniew Jędrzejewski-Szmek a1a11d5610
Merge pull request #15365 from poettering/remount-fs-pstore-fix
pid1: automatically generate systemd-remount-fs.service deps, plus enable systemd-pstore from sysinit.target
2020-04-12 17:16:07 +02:00
Lennart Poettering 9b3c65ed36
Merge pull request #15352 from poettering/user-group-name-valdity-rework
user/group name validity rework
2020-04-09 18:49:22 +02:00
Lennart Poettering 611cb82612
Merge pull request #15318 from fbuihuu/inherit-umask-for-user-units
pid1: by default make user units inherit their umask from the user ma…
2020-04-09 17:15:55 +02:00
Lennart Poettering f58921bde3
Merge pull request #15332 from keszybz/coredump-filter
CoredumpFilter=
2020-04-09 17:15:26 +02:00
Franck Bui 5a7c4f4f3b mount: introduce mount_is_nofail() helper 2020-04-09 15:26:21 +02:00
Franck Bui 2ec15c4f8a mount: default startup dependencies and default network ones are orthogonal
Regardless of whether a mount is setup in initrd or int the main system,
the network default dependencies _netdev should still be honored.

IOW if a mount unit use the following options "x-initrd.mount,_netdev", it
should be ordered against initrd-fs.target, network.target,
network-online.target.

/dev/vdb1 /mnt ext4 x-initrd.mount,_netdev defaults 0 0

Before this patch:

Before=umount.target initrd-fs.target
After=system.slice sysroot.mount dev-vdb1.device -.mount systemd-journald.socket blockdev@dev-vdb1.target

After this patch:

Before=initrd-fs.target umount.target
After=network-online.target -.mount blockdev@dev-vdb1.target dev-vdb1.device sysroot.mount system.slice network.target systemd-journald.socket
2020-04-09 15:26:14 +02:00
Franck Bui 61154cf953 mount: introduce mount_add_default_ordering_dependencies()
Move the handling of the usual startup/shutdown dependencies in a dedicated
funtion.

No functional change.
2020-04-09 15:25:14 +02:00
Franck Bui b3d7aef525 automount: fix handling of default dependencies for automount units
First After=local-fs-pre.target wasn't described in the man page although it's
part of the default dependencies automatically set by pid1.

Secondly, Before=local-fs.target was only set if the automount unit was
generated from the fstab-generator because the dep was explicitly
generated. It was also not documented as a default dependency.

Fix it by managing the dep from pid1 instead.
2020-04-09 15:17:17 +02:00
Franck Bui 83cdc87094 mount: let pid1 alone handle the default dependencies for mount units
fstab-generator was also handling the default ordering dependencies for mount
units setup in initrd. To do that it was turning the defaults dependencies off
completely and ordered the mount unit against either local-fs.target or
initrd-fs.target or initrd-root-fs.target itself.

But it had the bad side effect to also remove all other default dependencies as
well. Thus if an initrd mount was using _netdev, the network dependencies were
missing.

In general fstab-generator shouldn't use DefaultDependecies=no because it can
handle only a small set of the default dependencies the rest are dealt by pid1.

So this patch makes pid1 handle all default dependencies.
2020-04-09 15:17:09 +02:00
Franck Bui 5e37d1930b pid1: by default make user units inherit their umask from the user manager
This patch changes the way user managers set the default umask for the units it
manages.

Indeed one can expect that if user manager's umask is redefined through PAM
(via /etc/login.defs or pam_umask), all its children including the units it
spawns have their umask set to the new value.

Hence make user units inherit their umask value from their parent instead of
the hard coded value 0022 but allow them to override this value via their unit
file.

Note that reexecuting managers with 'systemctl daemon-reexec' after changing
UMask= has no effect. To take effect managers need to be restarted with
'systemct restart' instead. This behavior was already present before this
patch.

Fixes #6077.
2020-04-09 14:17:07 +02:00
Zbigniew Jędrzejewski-Szmek b94fb74bae manager: also log at debug level failure to read oom_score_adj 2020-04-09 14:08:48 +02:00
Zbigniew Jędrzejewski-Szmek ad21e542b2 manager: add CoredumpFilter= setting
Fixes #6685.
2020-04-09 14:08:48 +02:00
Wen Yang f74349d88b mount-setup: change the system mount propagation to shared by default only at bootup
The commit b3ac5f8cb9 has changed the system mount propagation to
shared by default, and according to the following patch:
https://github.com/opencontainers/runc/pull/208
When starting the container, the pouch daemon will call runc to execute
make-private.

However, if the systemctl daemon-reexec is executed after the container
has been started, the system mount propagation will be changed to share
again by default, and the make-private operation above will have no chance
to execute.
2020-04-09 10:14:20 +02:00
Lennart Poettering 7a8867abfa user-util: rework how we validate user names
This reworks the user validation infrastructure. There are now two
modes. In regular mode we are strict and test against a strict set of
valid chars. And in "relaxed" mode we just filter out some really
obvious, dangerous stuff. i.e. strict is whitelisting what is OK, but
"relaxed" is blacklisting what is really not OK.

The idea is that we use strict mode whenver we allocate a new user
(i.e. in sysusers.d or homed), while "relaxed" mode is when we process
users registered elsewhere, (i.e. userdb, logind, …)

The requirements on user name validity vary wildly. SSSD thinks its fine
to embedd "@" for example, while the suggested NAME_REGEX field on
Debian does not even allow uppercase chars…

This effectively liberaralizes a lot what we expect from usernames.

The code that warns about questionnable user names is now optional and
only used at places such as unit file parsing, so that it doesn't show
up on every userdb query, but only when processing configuration files
that know better.

Fixes: #15149 #15090
2020-04-08 17:11:20 +02:00
Lennart Poettering f3b7a79b97 core: automatically add dependency on systemd-remount-fs.service if StateDirectory= is used
And similar for other settings that require a writable /var/.

Rationale: if these options are used for early-boot services (such as
systemd-pstore.service) we need /var/ writable. And if /var/ is on the
root fs, then systemd-remount-fs.service is the service that ensures
that /var/ is writable.

This allows us to remove explicit deps in services such as
systemd-pstore.service.
2020-04-08 16:29:25 +02:00