We would accept e.g. FailureAction=reboot-force in user units and then do an
exit in the user manager. Let's be stricter, and define "exit"/"exit-force" as
the only supported actions in user units.
v2:
- rename 'exit' to 'exit-force' and add new 'exit'
- add test for the parsing function
This makes it so that tests no longer need to know the absolute paths to the
source and build dirs, instead using the systemd-runtest.env file to get these
paths when running from the build tree.
Confirmed that test-catalog works on `ninja test`, when called standalone and
also when the environment file is not present, in which case it will use the
installed location under /usr/lib/systemd/catalog.
The location can now also be overridden for this test by setting the
$SYSTEMD_CATALOG_DIR environment variable.
This adds -Dnss-resolve= and -Dnss-mymachines= meson options.
By using this option, e.g., resolved can be built without nss-resolve.
When no nss modules are built, then test-nss is neither built.
Also, This changes the option name -Dmyhostname= to -Dnss-myhostname=
for consistency to other nss related options.
Closes#9596.
This shows a minor memleak:
==1883== 24 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1883== at 0x4C2DBAB: malloc (vg_replace_malloc.c:299)
==1883== by 0x4E9D385: malloc_multiply (alloc-util.h:69)
==1883== by 0x4EA2959: bus_request_name_async_may_reload_dbus (bus-util.c:1841)
==1883== by ...
The exchange of messages is truncated at two different points: once right
after the first callback is requested, and the second time after the full
sequence has run (usually resulting in an error because of policy).
We have plenty of code in our codebase that outputs tables to the
console, and all is homegrown and awful. Let's replace it with a generic
implementation that can do automatically what the old implementations
did manually.
Features:
1. Ellipsation (for fields overly long) and alignment (for
fields overly short)
2. Sorting of rows
3. automatically copies formatting from the same cell in the row above
4. Heavy use of varargs to make putting together tables easy
5. can expand and compress tables, with weights
6. Has a minimal understanding of unicode wide characters in order to
match unicode strings to character cell terminals.
7. Columns can be reordered and individually turned off.
8. pretty printing for various data types
And more.
We go through the whole file system, so this test can take arbitrary time. But
this test is still quite useful, so let's at least try to make it more efficent
by not descending at all into the directories we would filter out later on
anyway.
Also increase the timeout, in case the previous step doesn't help enough.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
Case sensitive or case insensitive matching can be requested using
--case-sensitive[=yes|no].
Unless specified, matching is case sensitive if the pattern contains any
uppercase letters, and case insensitive otherwise. This matches what
forward-search does in emacs, and recently also --ignore-case in less. This
works surprisingly well, because usually when one is wants to do case-sensitive
matching, the pattern is usually camel-cased. In the less frequent case when
case-sensitive matching is required with an all-lowercase pattern,
--case-sensitive can be used to override the automatic logic.
Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit
interested in SIGCHLD events for them. This scheme allowed a specific
PID to be watched by exactly 0, 1 or 2 units.
With this rework this is replaced by a single hashmap which is primarily
keyed by the PID and points to a Unit interested in it. However, it
optionally also keyed by the negated PID, in which case it points to a
NULL terminated array of additional Unit objects also interested. This
scheme means arbitrary numbers of Units may now watch the same PID.
Runtime and memory behaviour should not be impact by this change, as for
the common case (i.e. each PID only watched by a single unit) behaviour
stays the same, but for the uncommon case (a PID watched by more than
one unit) we only pay with a single additional memory allocation for the
array.
Why this all? Primarily, because allowing exactly two units to watch a
specific PID is not sufficient for some niche cases, as processes can
belong to more than one unit these days:
1. sd_notify() with MAINPID= can be used to attach a process from a
different cgroup to multiple units.
2. Similar, the PIDFile= setting in unit files can be used for similar
setups,
3. By creating a scope unit a main process of a service may join a
different unit, too.
4. On cgroupsv1 we frequently end up watching all processes remaining in
a scope, and if a process opens lots of scopes one after the other it
might thus end up being watch by many of them.
This patch hence removes the 2-unit-per-PID limit. It also makes a
couple of other changes, some of them quite relevant:
- manager_get_unit_by_pid() (and the bus call wrapping it) when there's
ambiguity will prefer returning the Unit the process belongs to based on
cgroup membership, and only check the watch-pids hashmap if that
fails. This change in logic is probably more in line with what people
expect and makes things more stable as each process can belong to
exactly one cgroup only.
- Every SIGCHLD event is now dispatched to all units interested in its
PID. Previously, there was some magic conditionalization: the SIGCHLD
would only be dispatched to the unit if it was only interested in a
single PID only, or the PID belonged to the control or main PID or we
didn't dispatch a signle SIGCHLD to the unit in the current event loop
iteration yet. These rules were quite arbitrary and also redundant as
the the per-unit handlers would filter the PIDs anyway a second time.
With this change we'll hence relax the rules: all we do now is
dispatch every SIGCHLD event exactly once to each unit interested in
it, and it's up to the unit to then use or ignore this. We use a
generation counter in the unit to ensure that we only invoke the unit
handler once for each event, protecting us from confusion if a unit is
both associated with a specific PID through cgroup membership and
through the "watch_pids" logic. It also protects us from being
confused if the "watch_pids" hashmap is altered while we are
dispatching to it (which is a very likely case).
- sd_notify() message dispatching has been reworked to be very similar
to SIGCHLD handling now. A generation counter is used for dispatching
as well.
This also adds a new test that validates that "watch_pid" registration
and unregstration works correctly.
This adds a "watch-bind" feature to sd-bus connections. If set and the
AF_UNIX socket we are connecting to doesn't exist yet, we'll establish
an inotify watch instead, and wait for the socket to appear. In other
words, a missing AF_UNIX just makes connecting slower.
This is useful for daemons such as networkd or resolved that shall be
able to run during early-boot, before dbus-daemon is up, and want to
connect to dbus-daemon as soon as it becomes ready.
This reduces the meson man=false target count to 1281.
v2:
- link test-engine with libshared instead of libsystemd_static
Previous version built fine on F27, but fails on F26 with the following error:
/usr/bin/ld: /tmp/ccr8HRGw.ltrans6.ltrans.o: undefined reference to symbol '__start_BUS_ERROR_MAP@@SD_SHARED'
/home/zbyszek/fedora/systemd/systemd-9d5aae75c64f5583a110f03b94816aacc03bbf4d/x86_64-redhat-linux-gnu/src/shared/libsystemd-shared-236.so: error adding symbols: DSO missing from command line
v3:
- add libudev_basic
gcrypt_util_sources had to be moved because otherwise they appeared twice
in libshared.so halfproducts, causing an error.
-fvisibility=default is added to libbasic, libshared_static so that the symbols
appear properly in the exported symbol list in libshared.
The advantage is that files are not compiled twice. When configured with -Dman=false,
the ninja target list is reduced from 1588 to 1347 targets. The difference in compilation
time is small (<10%). I think this is because of -O0 and ccache and multiple cores, and
in different settings the compilation time could be reduced. The main advantage is that
errors and warnings are not reported twice.
We already use the "_static" suffix for libshared_static ("shared" is the name
of the library, "static" is the format) and other libs, so let's rename for
consistency.
Also change libsystemd_static_sources to libsystemd_sources, since the same
list is used for both and shorter is better.
In this way, individual errors in files can be treated differently than a
failure of the whole service.
A test is added to check that the expected value is returned.
Some parts are commented out, because it is not. This will be fixed in
a subsequent commit.
So far I avoided adding license headers to meson files, but they are pretty
big and important and should carry license headers like everything else.
I added my own copyright, even though other people modified those files too.
But this is mostly symbolic, so I hope that's OK.
Without this "meson test" will end up running all tests in the same
cgroup root, and they all will try to manage it. Which usually isn't too
bad, except when they end up clearing up each other's cgroups. This race
is hard to trigger but has caused various CI runs to fail spuriously.
With this change we simply move every test that runs a manager object
into their own private cgroup. Note that we don't clean up the cgroup at
the end, we leave that to the cgroup manager around it.
This fixes races that become visible by test runs throwing out errors
like this:
```
exec-systemcallfilter-failing.service: Passing 0 fds to service
exec-systemcallfilter-failing.service: About to execute: /bin/echo 'This should not be seen'
exec-systemcallfilter-failing.service: Forked /bin/echo as 5693
exec-systemcallfilter-failing.service: Changed dead -> start
exec-systemcallfilter-failing.service: Failed to attach to cgroup /exec-systemcallfilter-failing.service: No such file or directory
Received SIGCHLD from PID 5693 ((echo)).
Child 5693 ((echo)) died (code=exited, status=219/CGROUP)
exec-systemcallfilter-failing.service: Child 5693 belongs to exec-systemcallfilter-failing.service
exec-systemcallfilter-failing.service: Main process exited, code=exited, status=219/CGROUP
exec-systemcallfilter-failing.service: Changed start -> failed
exec-systemcallfilter-failing.service: Unit entered failed state.
exec-systemcallfilter-failing.service: Failed with result 'exit-code'.
exec-systemcallfilter-failing.service: cgroup is empty
Assertion 'service->main_exec_status.status == status_expected' failed at ../src/src/test/test-execute.c:71, function check(). Aborting.
```
BTW, I tracked this race down by using perf:
```
# perf record -e cgroup:cgroup_mkdir,cgroup_rmdir
…
# perf script
```
Thanks a lot @iaguis, @alban for helping me how to use perf for this.
Fixes#5895.
Some kdbus_flag and memfd related parts are left behind, because they
are entangled with the "legacy" dbus support.
test-bus-benchmark is switched to "manual". It was already broken before
(in the non-kdbus mode) but apparently nobody noticed. Hopefully it can
be fixed later.