Let's generalize this, so that we can use this in nspawn later on, which
is pretty useful as we need to be able to mask files from the inner
child of nspawn too, where the host's /run/systemd/inaccessible
directory is not visible anymore. Moreover, if nspawn can create these
nodes on its own before the payload this means the payload can run with
fewer privileges.
add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch
Previously, we'd link the unit file into /etc in this case, but that
should only be done if the unit file is not in the search path anyway,
and this is already done implicitly anyway for all enabled unit files,
hence no reason to duplicate this here.
Fixes: #10253
Quite often when we generate objects some fields should only be
generated in some conditions. Let's add high-level support for that.
Matching the existing JSON_BUILD_PAIR() this adds
JSON_BUILD_PAIR_CONDITIONAL() which is very similar, but takes an
additional parameter: a boolean condition. If "true" this acts like
JSON_BUILD_PAIR(), but if false then the whole pair is suppressed.
This sounds simply, but requires a tiny bit of complexity: when complex
sub-variants are used in fields, then we also need to suppress them.
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate in when
SuccessAction=exit or FailureAction=exit is used.
When not specified let's also propagate the exit status of the main
process we fork off for the unit.
Let's simplify things and drop the logic that /var/lib/machines is setup
as auto-growing btrfs loopback file /var/lib/machines.raw.
THis was done in order to make quota available for machine management,
but quite frankly never really worked properly, as we couldn't grow the
file system in sync with its use properly. Moreover philosophically it's
problematic overriding the admin's choice of file system like this.
Let's hence drop this, and simplify things. Deleting code is a good
feeling.
Now that regular file systems provide project quota we could probably
add per-machine quota support based on that, hence the btrfs quota
argument is not that interesting anymore (though btrfs quota is a bit
more powerful as it allows recursive quota, i.e. that the machine pool
gets an overall quota in addition to per-machine quota).
We invoke this usually on a temporary path before renaming it into
place. This means the log message is quite suprising as it mentions a
weird path with random characters in it. Hence, let's downgrade the
message in order not to confuse the user.
Otherwise a "reboot" or "poweroff" in the initramfs will have to wait
until systemd-fsck-root.service has completed, which might never happen
if the root device never shows up.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
This reverts 5fdf2d51c2, except for one improved
log message.
Fixes#10613.
Checking if resume= is configured is a good idea, but it turns out we cannot do
it reliably:
- the code only supported boot options with sd-boot, and it's not very widely
used. This means that for most systemd we could only check the current
commandline, not the next one.
- Various systems resume without, e.g. Debian has
/etc/initramfs-tools/conf.d/resume in the initramfs.
Making those checks better would be possible with enough effort, but there'll
be always new systems that boot in a slightly different way and we would need
to keep adding new cases. Longer term, we want to rely on autodetecting the
resume partition, and then checks like this will not be necessary at all. It is
quite clear from the number of bug reports that the number of poeple impacted
by this is quite high now, so let's just drop this.
The agent is closed after the static destuctors but before the pager.
No users of DEFINE_MAIN_FUNCTION* were using a polkit agent, so this makes no
functional difference.
This doesn't have much effect on the final build, because we link libbasic.a
into libsystemd-shared.so, so in the end, all the object built from basic/
end up in libsystemd-shared. And when the static library is linked into binaries,
any objects that are included in it but are not used are trimmed. Hence, the
size of output artifacts doesn't change:
$ du -sb /var/tmp/inst*
54181861 /var/tmp/inst1 (old)
54207441 /var/tmp/inst1s (old split-usr)
54182477 /var/tmp/inst2 (new)
54208041 /var/tmp/inst2s (new split-usr)
(The negligible change in size is because libsystemd-shared.so is bigger
by a few hundred bytes. I guess it's because symbols are named differently
or something like that.)
The effect is on the build process, in particular partial builds. This change
effectively moves the requirements on some build steps toward the leaves of the
dependency tree. Two effects:
- when building items that do not depend on libsystemd-shared, we
build less stuff for libbasic.a (which wouldn't be used anyway,
so it's a net win).
- when building items that do depend on libshared, we reduce libbasic.a as a
synchronization point, possibly allowing better parallelism.
Method:
1. copy list of .h files from src/basic/meson.build to /tmp/basic
2. $ for i in $(grep '.h$' /tmp/basic); do echo $i; git --no-pager grep "include \"$i\"" src/basic/ 'src/lib*' 'src/nss-*' 'src/journal/sd-journal.c' |grep -v "${i%.h}.c";echo ;done | less
This removes the ability to configure which cgroup controllers to mount
together. Instead, we'll now hardcode that "cpu" and "cpuacct" are
mounted together as well as "net_cls" and "net_prio".
The concept of mounting controllers together has no future as it does
not exist to cgroupsv2. Moreover, the current logic is systematically
broken, as revealed by the discussions in #10507. Also, we surveyed Red
Hat customers and couldn't find a single user of the concept (which
isn't particularly surprising, as it is broken...)
This reduced the (already way too complex) cgroup handling for us, since
we now know whenever we make a change to a cgroup for one controller to
which other controllers it applies.