Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.
Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
Originally, `systemctl cat` would match only active units, for example:
$ systemctl cat sshd.service
would cat the sshd.service unit file even if the service was inactive.
However:
$ systemctl cat ssh*
would show it only if it was active.
Let's unify the behavior and cat all unit files regardless of a state,
if no state was given explicitly to filter.
(before)
$ build/systemctl list-machines
Need to be root.
$ sudo build/systemctl list-machines
NAME STATE FAILED JOBS
krowka (host) running 0 0
rawhide running 0 0
2 machines listed.
(after)
$ build/systemctl list-machines
NAME STATE FAILED JOBS
krowka (host) running 0 0
rawhide n/a 0 0
2 machines listed.
The output for non-root is missing some bits of information, but we display
what information is missing nicely, and e.g. in the case when no machines are
running at all, or only VMs are running, the unprivileged output would be the
same as the privileged one.
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).
This makes sense from an low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.
Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).
Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.
After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
Some chattrs only work sensible if you set them right after opening a
file for create (think: FS_NOCOW_FL). Others only work when they are
applied when the file is fully written (think: FS_IMMUTABLE_FL). Let's
take that into account when copying files and applying a chattr to them.
Commit d85515edcf changed logic how reboot is
executed. That commit changed behavior to use emergency action reboot code path
to perform the reboot.
This inadvertently broke rebooting with argument:
$ systemctl reboot custom-reason
Restore original behavior so that if reboot service unit similar to
systemd-reboot.service is executed it is possible to override reboot reason
with "systemctl reboot ARG".
When "systemctl reboot ARG" is executed ARG is placed in file
/run/systemd/reboot-param and reboot is issued using logind's Reboot
dbus-service.
If RebootArgument is specified in systemd-reboot.service it takes precedence
over what systemctl sets.
Fixes: #11828