tmp.conf was dealing with 2 different kind of paths: one dealing with general
temporary paths such as /var/tmp and /tmp and the other one dealing with
temporary directories owned by systemd.
If for example a user wants to adjust the age argument of the general paths
only, he had to overload the whole file which is cumbersome and error prone
since any future changes in tmp.conf shipped by systemd will be lost.
So this patch splits out tmp.conf so the systemd directories are dealt
separately in a dedicated conf file. It's named "systemd-tmp.conf" based on the
naming recommendation made in tmpfiles.d man page.
In practice it shouldn't cause any regression since it's very unlikely that
users override paths owned by systemd.
When shooting down a service with SIGABRT the user might want to have a
much longer stop timeout than on regular stops/shutdowns. Especially in
the face of short stop timeouts the time might not be sufficient to
write huge core dumps before the service is killed.
This commit adds a dedicated (Default)TimeoutAbortSec= timer that is
used when stopping a service via SIGABRT. In all other cases the
existing TimeoutStopSec= is used. The timer value is unset by default
to skip the special handling and use TimeoutStopSec= for state
'stop-watchdog' to keep the old behaviour.
If the service is in state 'stop-watchdog' and the service should be
stopped explicitly we still go to 'stop-sigterm' and re-apply the usual
TimeoutStopSec= timeout.
[zj: this is a subset of changes generated by clang-format, just the ones
I think improve readability or consistency.]
This is a part of https://github.com/systemd/systemd/pull/11811.
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).
This makes sense from an low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.
Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).
Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.
After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
Let's exit the loop early in case the variant is not actually an object
or array. This is safer since otherwise we might end up iterating
through these variants and access fields that aren't of the type we
expect them to be and then bad things happen.
Of course, this doesn't absolve uses of these macros to check the type
of the variant explicitly beforehand, but it makes it less bad if they
forget to do so.
There's no point in returning the "key" within each loop iteration as
JsonVariant object. Let's simplify things and return it as string. That
simplifies usage (since the caller doesn't have to convert the object to
the string anymore) and is safe since we already validate that keys are
strings when an object JsonVariant is allocated.
Unlocked operations are used in all three places. I don't see why just one was
special.
This also improves logging, since we don't just log the final component of the
path, but the full name.
This is partially a refactoring, but also makes many more places use
unlocked operations implicitly, i.e. all users of fopen_temporary().
AFAICT, the uses are always for short-lived files which are not shared
externally, and are just used within the same context. Locking is not
necessary.
We noticed in our tests that occasionally SystemCallFilter= would
fail to set and the service would run with no syscall filtering.
Most of the time the same tests would apply the filter and fail
the service as expected. While it's not totally clear why this happens,
we noticed seccomp_load() in the systemd code base would fail open for
all errors except EPERM and EACCES.
ENOMEM, EINVAL, and EFAULT seem like reasonable values to add to the
error set based on what I gather from libseccomp code and man pages:
-ENOMEM: out of memory, failed to allocate space for a libseccomp structure, or would exceed a defined constant
-EINVAL: kernel isn't configured to support the operations, args are invalid (to seccomp_load(), seccomp(), or prctl())
-EFAULT: addresses passed as args are invalid
The comment explains that $PATH might not be set in certain circumstances and
takes steps to handle this case. If we do that, let's assume that $PATH indeed
might be unset and not call setenv("PATH", NULL, 1). It is not clear from the
man page if that is allowed.
CID #1400497.
Coverity was unhappy, because it doesn't know that $PATH is pretty much always
set. But let's not assume that in the test. CID #1400496.
$ (unset PATH; build/test-env-util)
[1] 31658 segmentation fault (core dumped) ( unset PATH; build/test-env-util; )