$ build/systemctl --version
systemd 239-3555-g6178cbb5b5
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
$ git tag v240 -m 'v240'
$ ninja -C build
ninja: Entering directory `build'
[76/76] Linking target fuzz-unit-file.
$ build/systemctl --version
systemd 240
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
This is very useful during development, because a precise version string is
embedded in the build product and displayed during boot, so we don't have to
guess answers for questions like "did I just boot the latest version or the one
from before?".
This change creates an overhead for "noop" builds. On my laptop, 'ninja -C
build' that does nothing goes from 0.1 to 0.5 s. It would be nice to avoid
this, but I think that <1 s is still acceptable.
Fixes#7183.
PACKAGE_VERSION is renamed to GIT_VERSION, to make it obvious that this is the
more dynamically changing version string.
Why save to a file? It would be easy to generate the version tag using
run_command(), but we want to go through a file so that stuff gets rebuilt when
this file changes. If we just defined an variable in meson, ninja wouldn't know
it needs to rebuild things.
PACKAGE_VERSION is more explicit, and also, we don't pretend that changing the
project name in meson.build has any real effect. "systemd" is embedded in a
thousand different places, so let's just use the hardcoded string consistently.
This is mostly in preparation for future changes.
Whenever we invoke external, foreign code from code that has
RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024
first. This is a safety precaution for compatibility with programs using
select() which cannot operate with fds > 1024.
This commit adds the call to rlimit_nofile_safe() to all invocations of
exec{v,ve,l}() and friends that either are in code that we know runs
with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for
starters) or that is part of shared code that might end up there.
The calls are placed as early as we can in processes invoking a flavour
of execve(), but after the last time we do fd manipulations, so that we
can still take benefit of the high fd limits for that.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
Otherwise we keep collecting stuff from env generators, and we really
shouldn't.
This was working properly on reexec but not on reload, as for reexec we
would always start fresh, but for reload would reuse the Manager object
and hence its default environment set.
Fixes: #10671
We now don't enable the CPU controller just for CPU accounting if we are
on 4.15+ and using pure unified hierarchy, as this is provided
externally to the CPU controller. This makes CPUAccounting=yes
essentially free, so enabling it by default when it's cheap seems like a
good idea.
This removes the ability to configure which cgroup controllers to mount
together. Instead, we'll now hardcode that "cpu" and "cpuacct" are
mounted together as well as "net_cls" and "net_prio".
The concept of mounting controllers together has no future as it does
not exist to cgroupsv2. Moreover, the current logic is systematically
broken, as revealed by the discussions in #10507. Also, we surveyed Red
Hat customers and couldn't find a single user of the concept (which
isn't particularly surprising, as it is broken...)
This reduced the (already way too complex) cgroup handling for us, since
we now know whenever we make a change to a cgroup for one controller to
which other controllers it applies.
Pretty much everything uses just the first argument, and this doesn't make this
common pattern more complicated, but makes it simpler to pass multiple options.
sysctl is disabled for /proc mounted from an user namespace thus entries like
/proc/sys/net/unix/max_dgram_qlen do not exist. In this case, skip the error
and do not try to change the default for the AF_UNIX datagram queue length.
This splits the "environment" field of Manager into two:
transient_environment and client_environment. The former is generated
from configuration file, kernel cmdline, environment generators. The
latter is the one the user can control with "systemctl set-environment"
and similar.
Both sets are merged transparently whenever needed. Separating the two
sets has the benefit that we can safely flush out the former while
keeping the latter during daemon reload cycles, so that env var settings
from env generators or configuration files do not accumulate, but
dynamic API changes are kept around.
Note that this change is not entirely transparent to users: if the user
first uses "set-environment" to override a transient variable, and then
uses "unset-environment" to unset it again things will revert to the
original transient variable now, while previously the variable was fully
removed. This change in behaviour should not matter too much though I
figure.
Fixes: #9972
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.
In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.
While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
After discussions with kernel folks, a system with memcg really
shouldn't need extra hard limits on file descriptors anymore, as they
are properly accounted for by memcg anyway. Hence, let's bump these
values to their maximums.
This also adds a build time option to turn thiss off, to cover those
users who do not want to use memcg.
Following the discussions with the kernel folks, let's substantially
increase the hard limit (but not the soft limit) of RLIMIT_NOFILE to
256K for all services we start.
Note that PID 1 itself bumps the limit even further, to the max the
kernel allows. We can deal with that after all.
let's clean up error handling and logging in manager_reload() a bit.
Specifically: make sure we log about every error we might encounter at
least and at most once.
When we encounter an error before the "point of no return" then log at
LOG_ERR about it and propagate it. Otherwise, eat it up, but warn about
it and proceed, it's the best we can do.
"ExitCode" is a bit of a misnomer in two ways: it suggests this was
about the "exit code" concept that exit()/waitid() deal with, but really
isn't. Moreover, it's not event just about exiting either, but more
often about reloading/reexecing or rebooting. Let's hence pick a new
name for this that is a bit more correct.
I initially thought about naming this the "state", but that'd be a
misnomer too, as the value really encodes a "goal" more than a current
state. Also we already have the externally visible ManagerState.
No actual changes in behaviour, just the rename.
Bpf programs are charged against memlock ulimit, and the default value
can be too tight on machines with many cgroups and attached bpf programs.
Let's bump it to 64Mb.
Until a core dump handler is installed by systemd-sysctl, the generation of
core dump for services is turned OFF which can make the debugging of the early
boot process harder especially since there's no easy way to restore the core
dump generation.
This patch introduces a new kernel command line option which specifies an
absolute path where the kernel should write the core dump file when an early
process crashes.
This will take effect until systemd-coredump (or any other handlers) takes
over.
This is a bit like the info link in most of GNU's --help texts, but we
don't do info but man pages, and we make them properly clickable on
terminal supporting that, because awesome.
I think it's generally advisable to link up our (brief) --help texts and
our (more comprehensive) man pages a bit, so this should be an easy and
straight-forward way to do it.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
To make debugging easier, this patches allows one to change the log target and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.
Indeed if one changed the log target at runtime (via the bus or via signals),
the change was lost on the next reload/reexecution.
In order to restore back the default value (set via system.conf, environment
variables or any other means ), the empty string in the "LogTarget" property is
now supported as well as sending SIGTRMIN+26 signal.
To make debugging easier, this patches allows one to change the log level and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.
Indeed if one changed the log max level at runtime (via the bus or via
signals), the change was lost on the next daemon reload/reexecution.
In order to restore the original value back (set via system.conf, environment
variables or any other means), the empty string in the "LogLevel" property is
now supported as well as sending SIGRTMIN+23 signal.