Commit graph

615 commits

Author SHA1 Message Date
Frantisek Sumsal 2d4f8cf467 man: fix few spelling errors
Reported by Fossies.org.
2020-05-03 10:23:25 +02:00
Lennart Poettering 04d1ee0f7e main: bump RLIMIT_MEMLOCK by physical RAM size
Let's allow more memory to be locked on beefy machines than on small
ones. The previous limit of 64M is the lower bound still. This
effectively means on a 4GB machine we can lock 512M, which should be
more than enough, but still not lock up the machine entirely under
pressure.

Fixes: #15053
2020-04-28 19:54:21 +02:00
Zbigniew Jędrzejewski-Szmek 162392b75a tree-wide: spellcheck using codespell
Fixes #15436.
2020-04-16 18:00:40 +02:00
Wen Yang f74349d88b mount-setup: change the system mount propagation to shared by default only at bootup
The commit b3ac5f8cb9 has changed the system mount propagation to
shared by default, and according to the following patch:
https://github.com/opencontainers/runc/pull/208
When starting the container, the pouch daemon will call runc to execute
make-private.

However, if the systemctl daemon-reexec is executed after the container
has been started, the system mount propagation will be changed to share
again by default, and the make-private operation above will have no chance
to execute.
2020-04-09 10:14:20 +02:00
Lennart Poettering 01262d0d9e
Merge pull request #13084 from ddstreet/log_time
log: add support for prefixing console log messages with current timestamp
2020-03-30 17:57:34 +02:00
Zbigniew Jędrzejewski-Szmek cd57038a30 core: consistently emit oom warning when parsing args
In practice we are very unlikely to fail at this point, but for
consistency, we should always warn when allocation fails, and
we have free_and_strdup_warn() for this.
2020-03-13 15:54:27 +01:00
Zbigniew Jędrzejewski-Szmek c86c31d9ca core: log an info line about the default target
If the default target is not specified on the kernel command line, it
may not be obvious what we are booting into. See for example
https://github.com/systemd/systemd/issues/15062#issuecomment-597577464.
2020-03-12 15:09:22 +01:00
Zbigniew Jędrzejewski-Szmek f1d075dcb7 core: shorten local variable names in do_queue_default_job()
There is really no risk of confusion here: it's a simple function
that queues one job.
2020-03-11 13:43:19 +01:00
Chris Down dfb3303b6c
Merge pull request #15010 from cgzones/selinux_reload_cache_enforce
SELinux: add trigger for policy reload and cache enforced status
2020-03-06 16:12:48 +00:00
Yu Watanabe df883de98a pid1, nspawn: voidify loopback_setup() 2020-03-04 14:18:55 +01:00
Christian Göttsche 61f3e897f1 selinux: add trigger for policy reload to refresh internal selabel cache
Reload the internal selabel cache automatically on SELinux policy reloads so non pid-1 daemons are participating.

Run the reload function `mac_selinux_reload()` not manually on daemon-reload, but rather pass it as callback to libselinux.
Trigger the callback prior usage of the systemd internal selabel cache by depleting the selinux netlink socket via `avc_netlink_check_nb()`.

Improves: a9dfac21ec ("core: reload SELinux label cache on daemon-reload")
Improves: #13363
2020-03-03 20:25:49 +01:00
Zbigniew Jędrzejewski-Szmek 0d066dd1a4 pid1: add new mode systemd.show-status=error and use it when 'quiet' is passed
systemd.show-status=error is useful for the case where people care about errors
only.

If people want to have a quiet boot, they most likely don't want to see all
status output even if there is a delay in boot, so make "quiet" imply
systemd.show-status=error instead of systemd.show-status=auto.

Fixes #14976.
2020-03-01 11:48:23 +01:00
Zbigniew Jędrzejewski-Szmek 7365a29670 pid1: when printing status message status, give reason 2020-03-01 11:42:19 +01:00
Zbigniew Jędrzejewski-Szmek 5ca02bfc39 core: fix message about show status state
We would say "Enabling" also for SHOW_STATUS_AUTO, which is actually
"soft off". So just print the exact state to make things easier to understand.
Also add a helper function to avoid repeating the enum value list.

For #14814.
2020-03-01 11:42:12 +01:00
Dan Streetman c5673ed0de log: add support for prefixing console log messages with current timestamp 2020-02-10 07:01:30 -05:00
Leonid Evdokimov 6e3c443b56 Fix typo 2019-12-26 12:07:08 +09:00
Lennart Poettering b51d61fec6
Merge pull request #14177 from keszybz/use-initrd.target
Use initrd.target in the initramfs
2019-12-04 10:30:32 +01:00
Christian Göttsche a9dfac21ec core: reload SELinux label cache on daemon-reload
Reloading the SELinux label cache here enables a light-wight follow-up of a SELinux policy change, e.g. adding a label for a RuntimeDirectory.

Closes: #13363
2019-12-04 10:29:46 +01:00
Lennart Poettering 68d58f3869 pid1: add new kernel cmdline arg systemd.cpu_affinity=
Let's allow configuration of the CPU affinity via the kernel cmdline,
overriding CPUAffinity= in /etc/systemd/system.conf

Prompted by:

https://lists.freedesktop.org/archives/systemd-devel/2019-November/043754.html
2019-12-04 10:28:43 +01:00
Zbigniew Jędrzejewski-Szmek 8755dbad5b pid1: use initrd.target in the initramfs by default
This makes the code do what the documentation says. The code had no inkling
about initrd.target, so I think this change is fairly risky. As a fallback,
default.target will be loaded, so initramfses which relied on current behaviour
will still work, as along as they don't have a different initrd.target.

In an initramfs created with recent dracut:
$ ls -l usr/lib/systemd/system/{default.target,initrd.target}
lrwxrwxrwx. usr/lib/systemd/system/default.target -> initrd.target
-rw-r--r--. usr/lib/systemd/system/initrd.target
So at least for dracut, there should be no difference.

Also avoid a pointless allocation.
2019-11-28 19:59:33 +01:00
Zbigniew Jędrzejewski-Szmek a61d68748a pid1: fix setting of DefaultTimeoutAbortSec
This partially reverts a07a7324ad.
We have two pieces of information: the value and a boolean.
config_parse_timeout_abort() added in the reverted commit would write
the boolean to the usec_t value, making a mess.

The code is reworked to have just one implementation and two wrappers
which pass two pointers.
2019-11-27 13:56:28 +01:00
Zbigniew Jędrzejewski-Szmek 3a0f06c41a core: make TasksMax a partially dynamic property
TasksMax= and DefaultTasksMax= can be specified as percentages. We don't
actually document of what the percentage is relative to, but the implementation
uses the smallest of /proc/sys/kernel/pid_max, /proc/sys/kernel/threads-max,
and /sys/fs/cgroup/pids.max (when present). When the value is a percentage,
we immediately convert it to an absolute value. If the limit later changes
(which can happen e.g. when systemd-sysctl runs), the absolute value becomes
outdated.

So let's store either the percentage or absolute value, whatever was specified,
and only convert to an absolute value when the value is used. For example, when
starting a unit, the absolute value will be calculated when the cgroup for
the unit is created.

Fixes #13419.
2019-11-14 18:41:54 +01:00
Yu Watanabe 1405cb653a tree-wide: drop stdio.h when stdio-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe e30e8b5073 tree-wide: drop stat.h or statfs.h when stat-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 455fa9610c tree-wide: drop string.h when string-util.h or friends are included 2019-11-04 00:30:32 +09:00
Lennart Poettering 0e06a03165 pid1: rework environment block copy logic
This reworks the logic introduced in
a5cede8c24 (#13693).

First of all, let's move this out of util.c, since only PID 1 really
needs this, and there's no real need to have it in util.c.

Then, fix freeing of the variable. It previously relied on
STATIC_DESTRUCTOR_REGISTER() which however relies on static_destruct()
to be called explicitly. Currently only the main-func.h macros do that,
and PID 1 does not. (It might be worth investigating whether to do that,
but it's not trivial.) Hence the freeing wasn't applied.

Finally, an OOM check was missing, add it in.
2019-11-01 11:30:59 +01:00
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Franck Bui a5cede8c24 pid1: restore the original environment passed by the kernel when switching to a new system manager
PID1 may modified the environment passed by the kernel when it starts
running. Commit 9d48671c62 unset $HOME for
example.

In case PID1 is going to switch to a new root and execute a new system manager
which is not systemd, we should restore the original environment as the new
manager might expect some variables to be set by default (more specifically
$HOME).
2019-10-03 22:08:13 +02:00
Franck Bui c0000de87d pid1: fix DefaultTasksMax initialization
Otherwise DefaultTasksMax is always set to "inifinity".

This was broken by fb39af4ce4.
2019-10-03 11:24:27 +02:00
Zbigniew Jędrzejewski-Szmek 6123dfaa72 pid1: disable printk ratelimit in early boot
We have the problem that many early boot or late shutdown issues are harder
to solve than they could be because we have no logs. When journald is not
running, messages are redirected to /dev/kmsg. It is also the time when many
things happen in a rapid succession, so we tend to hit the kernel printk
ratelimit fairly reliably. The end result is that we get no logs from the time
where they would be most useful. Thus let's disable the kernels ratelimit.

Once the system is up and running, the ratelimit is not a problem. But during
normal runtime, things also log to journald, and not to /dev/kmsg, so the
ratelimit is not useful. Hence, there doesn't seem to be much point in trying
to restore the ratelimit after boot is finished and journald is up and running.

See kernel's commit 750afe7babd117daabebf4855da18e4418ea845e for the
description of the kenrel interface. Our setting has lower precedence than
explicit configuration on the kenrel command line.
2019-09-20 16:05:53 +02:00
Maciej Stanczew 6327aa9f6c core: Fix setting StatusUnitFormat from config files 2019-09-17 15:21:21 +09:00
Lennart Poettering e04ed6db6b exit-status: rename EXIT_STATUS_GLIBC → EXIT_STATUS_LIBC
After all these two exit codes are defined by ISO C as part of the C
library, and it's not the GNU implementation defines them.
2019-07-29 19:05:25 +02:00
Zbigniew Jędrzejewski-Szmek e1714f0250 shared/exit-status: turn status level into a bitmask, add "test"
The "test" doesn't really test much automatically, but it is still useful
to look at the mappings.
2019-07-29 15:54:45 +02:00
Lennart Poettering c18ecf0375 core: take random seed from boot loader and credit it to kernel entropy pool 2019-07-25 18:16:46 +02:00
Lennart Poettering 0a2eef1ee1 core: try to reopen /dev/kmsg again right after mounting /dev
I was debugging stuff during early boot, and was confused that I never
found the logs for it in kmsg. The reason for that was that /proc is
generally not mounted the first time we do log_open() and hence
log_set_target(LOG_TARGET_KMSG) we do when running as PID 1 had not
effect. A lot later during start-up we call log_open() again where this
is fixed (after the point where we close all remaining fds still open),
but in the meantime no logs every got written to kmsg. This patch fixes
that.
2019-07-24 19:56:51 +02:00
Luca Boccassi 65224c1d0e core: rename ShutdownWatchdogSec to RebootWatchdogSec
This option is only used on reboot, not on other types of shutdown
modes, so it is misleading.
Keep the old name working for backward compatibility, but remove it
from the documentation.
2019-07-23 20:29:03 +01:00
Luca Boccassi acafd7d8a6 core: add KExecWatchdogSec option
Rather than always enabling the shutdown WD on kexec, which might be
dangerous in case the kernel driver and/or the hardware implementation
does not reset the wd on kexec, add a new timer, disabled by default,
to let users optionally enable the shutdown WD on kexec separately
from the runtime and reboot ones. Advise in the documentation to
also use the runtime WD in conjunction with it.

Fixes: a637d0f9ec ("core: set shutdown watchdog on kexec too")
2019-07-23 20:29:03 +01:00
Luca Boccassi a637d0f9ec core: set shutdown watchdog on kexec too
At the moment the shutdown watchdog is set only when rebooting.
The set of "things that can go wrong" is not too far off when kexec'ing
and in fact we have a use case where it would be useful - moving to a
new kernel image.
2019-07-18 22:31:43 +02:00
Franck Bui a9fd4cd120 pid1: make sure to restore correct default values for some rlimits
Commit fb39af4ce4 forgot to restore the default
rlimit values (RLIMIT_NOFILE and RLIMIT_MEMLOCK) while PID1 is reloading.

This patch extracts the code in charge of initializing the default values for
those rlimits in order to create dedicated functions, which take care of their
initialization.

These functions are then called in parse_configuration() so we make sure that
the default values for these rlimits get restored every time PID1 is reloading
its configuration.
2019-07-17 06:24:27 +09:00
Lennart Poettering 56e8419aa8 main: use sysctl_writef() where appropriate 2019-07-13 11:05:07 +02:00
Zbigniew Jędrzejewski-Szmek 36cf45078c Add config and kernel commandline option to use short identifiers
No functional change, just docs and configuration and parsing.

v2:
- change ShortIdentifiers=yes|no to StatusUnitFormat=name|description.
2019-07-10 13:35:26 +02:00
Zbigniew Jędrzejewski-Szmek 1f65fd4926 basic/time-util: add helper function to check if timestamp is set
No functional change.
2019-07-04 19:12:47 +02:00
Frantisek Sumsal a07a7324ad core: move config_parse_* functions to a shared module
Apart from making the code a little bit more clean, it should allow us
to write a fuzzer around the config-parsing functions in the future
2019-06-25 22:35:02 +09:00
Michal Sekletar b070c7c0e1 core: introduce NUMAPolicy and NUMAMask options
Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.

Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
2019-06-24 16:58:54 +02:00
Lennart Poettering cd69e88ba3 doc: make clear that --system and --user only make sense with --test
Fixes: #12843
2019-06-24 14:51:52 +02:00
Yu Watanabe 657ee2d82b tree-wide: replace strjoin() with path_join() 2019-06-21 03:26:16 +09:00
Lennart Poettering 6e2f789484 core: set fs.file-max sysctl to LONG_MAX rather than ULONG_MAX
Since kernel 5.2 the kernel thankfully returns proper errors when we
write a value out of range to the sysctl. Which however breaks writing
ULONG_MAX to request the maximum value. Hence let's write the new
maximum value instead, LONG_MAX.

/cc @brauner

Fixes: #12803
2019-06-17 15:48:11 +02:00
Michal Sekletar 3f09629c22
Merge pull request #12628 from keszybz/dbus-execute
Rework cpu affinity parsing
2019-05-30 12:32:53 +02:00
Zbigniew Jędrzejewski-Szmek fb39af4ce4 pid1: when reloading configuration, forget old settings
If we had a configuration setting from a configuration file, and it was
removed, we'd still remember the old value, because there's was no mechanism to
"reset" everything, just to assign new values.

Note that the effect of this is limited. For settings that have an "ongoing" effect,
like systemd.confirm_spawn, the new value is simply used. But some settings can only
be set at start.

In particular, CPUAffinity= will be updated if set to a new value, but if
CPUAffinity= is fully removed, it will not be reset, simply because we don't
know what to reset it to. We might have inherited a setting, or we might have
set it ourselves. In principle we could remember the "original" value that was
set when we were executed, but propagate this over reloads and reexecs, but
that would be a lot of work for little gain. So this corner case of removal of
CPUAffinity= is not handled fully, and a reboot is needed to execute the
change. As a work-around, a full mask of CPUAffinity=0-8191 can be specified.
2019-05-29 10:29:28 +02:00