The return value of job_compare() is undefined when the jobs have a loop
between them, so let's add a test to make sure transaction cycle
detection catches it.
The test-engine Test2 tests the cycle detection when units a, b and d
all start at once:
       ,-------------------after-----------------,
       v                                         |
    a/start ---after---> d/start ---after---> b/start
Extend the test with Test11, which adds i.service; this causes a and d to
stop (by unordered Conflicts=) while starting b. Because stops precede
starts, we effectively eliminate the job cycle and all transaction jobs
should be applicable.
     ,-------------------after-----------------,
     v                                         |
    a/stop <---after--- d/stop <---after--- b/start
      .                   .                    ^
      .                   .                    |
      '. . . . . . . . . i/start ---after------'
Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be
specified multiple times. An empty assignment resets all previous filters.
Closes https://github.com/systemd/systemd/issues/10227
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.
There are subtle differences though: prefix_root() will try to shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functions should
always be identical in terms of path_equal() if not streq().
This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. Or maybe we
can? (If so, that sounds like something for a later PR.)
Also add in a few missing OOM checks
This does the following:
- rename enum udev_builtin_cmd -> UdevBuiltinCmd
- rename struct udev_builtin -> UdevBuiltin
- move type definitions to udev-rules.h
- move prototypes of functions defined in udev-rules.c to udev-rules.h
- stop using strbuf
- propagate critical errors in applying rules
- drop the limit on the number of tokens per line
Should finally fix oss-fuzz-14688.
8688c29b5a wasn't enough.
The buffer retrieved from the memstream has the same size as the written
data. When we do write(f, s, strlen(s)), no terminating NUL is written,
and the buffer is not (necessarily) a proper C string.
The idea is to have at least one range to make the new format clearly
distinguishable from the old. But it is enough to just do it once.
In particular, in case the affinity would be specified like 0, 2, 4, 6…,
this gives much shorter output.
cpu_set_malloc() was the last user. It doesn't seem useful to keep
it just to save the allocation of a few hundred bytes in a test, so
it is dropped and a fixed maximum is allocated (1024 bytes).
The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size
of the array, but operations which take two (CPU_EQUAL_S) or even three arrays
(CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must
be of the same size, or buffer overruns will occur. This is exactly what our
code would do, if it received an array of unexpected size over the network.
("Unexpected" here means anything different from what cpu_set_malloc() detects
as the "right" size.)
Let's rework this, and store the size in bytes of the allocated storage area.
The code will now parse any number up to 8191, independently of what the current
kernel supports. This matches the kernel maximum setting for any architecture,
to make things more portable.
Fixes #12605.
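
The idea of carrying the allocated size along with the bitmap can be sketched
in plain C as follows. This is only an illustration under assumptions: struct
cpu_mask, cpu_mask_add() and cpu_mask_equal() are hypothetical names, and a
plain byte array is used here instead of cpu_set_t and the CPU_*_S macros.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CPU_MASK_MAX 8192u      /* accept CPU indices 0..8191, as described above */

/* Hypothetical container: the bitmap plus the number of bytes allocated for it. */
struct cpu_mask {
        uint8_t *bits;
        size_t allocated;       /* size of 'bits' in bytes */
};

/* Set bit 'cpu', growing the bitmap as needed. Returns 0 or a negative errno. */
static int cpu_mask_add(struct cpu_mask *m, unsigned cpu) {
        assert(m);

        if (cpu >= CPU_MASK_MAX)
                return -ERANGE;

        size_t need = cpu / 8 + 1;
        if (need > m->allocated) {
                uint8_t *p = realloc(m->bits, need);
                if (!p)
                        return -ENOMEM;

                /* Zero only the newly added tail. */
                memset(p + m->allocated, 0, need - m->allocated);
                m->bits = p;
                m->allocated = need;
        }

        m->bits[cpu / 8] |= (uint8_t) (1u << (cpu % 8));
        return 0;
}

/* Compare two masks of possibly different sizes without reading past either. */
static bool cpu_mask_equal(const struct cpu_mask *a, const struct cpu_mask *b) {
        assert(a);
        assert(b);

        size_t n = a->allocated > b->allocated ? a->allocated : b->allocated;

        for (size_t i = 0; i < n; i++) {
                uint8_t x = i < a->allocated ? a->bits[i] : 0;
                uint8_t y = i < b->allocated ? b->bits[i] : 0;

                if (x != y)
                        return false;
        }

        return true;
}
```

Because each mask knows its own size, a comparison between a 1-byte mask and a
1024-byte mask treats the missing bytes as zero rather than overrunning the
shorter buffer.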
Depending on system configuration and whether SCMP_ACT_KILL_PROCESS or
SCMP_ACT_KILL_THREAD is available/used, processes might either dump core
or be just plain killed. For our test case the
difference doesn't really matter, hence let's hide it away.
I covered the most obvious paths: those where there's a clear problem
with a path specified by the user.
Prints something like this (at error level):
May 21 20:00:01.040418 systemd[125871]: bad-workdir.service: Failed to set up mount namespacing: /run/systemd/unit-root/etc/tomcat9/Catalina: No such file or directory
May 21 20:00:01.040456 systemd[125871]: bad-workdir.service: Failed at step NAMESPACE spawning /bin/true: No such file or directory
Fixes #10972.
It turns out that the kernel allows comm names longer than our expected limit
of 16.
$ wc -c /proc/*/comm|sort -g|tail -n3
35 /proc/1292317/comm
35 /proc/1293610/comm
36 /proc/1287112/comm
$ cat /proc/1287112/comm
kworker/u9:3-kcryptd/253:0
The functions to retrieve and print process cmdlines were based on the
assumption that they contain printable ASCII, and everything else
should be filtered out. That assumption doesn't hold in today's world,
where people are free to use unicode everywhere.
This replaces the custom cmdline reading code with a more generic approach
using utf8_escape_non_printable_full().
For kernel threads, truncation is done on the parenthesized name, so we'll
get "[worker]", "[worker…]", …, "[w…]", "[…", "…" as we reduce the number of
available columns.
This implementation is most likely slower for very long cmdlines, but I don't
think this is very important. The common case is to have short command lines,
and we should print those properly. Absurdly long cmdlines are the exception,
which needs to be handled correctly and safely, but speed is not too important.
Fixes #12532.
v2:
- use size_t for the number of columns. This change propagates into various
other functions that call get_process_cmdline(), increasing the size of the
patch, but the changes are rather trivial.
The test reliably fails inside LXC and Docker when run on a new enough
kernel. It's unclear whether this is a kernel, LXC/Docker or systemd
issue and apparently there is no real interest in getting this fixed, so
let's skip this test.
As this also covers Travis CI, there is no need for this additional
check anymore.
See https://github.com/systemd/systemd/issues/9666
Lookup of a non-existent user using getpwnam() is not considered
an error, thus `errno` is not set appropriately, causing
unexpected failures on systems where the 'nobody' user doesn't exist by
default.
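
A minimal sketch of a lookup that copes with this, assuming a hypothetical
helper lookup_uid() (not a function from the tree) and -ESRCH as the error
chosen for "no such user":

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Look up 'name' and store its UID in *ret. Returns 0 on success, or a
 * negative errno; a missing user that left errno at 0 is reported as -ESRCH. */
static int lookup_uid(const char *name, uid_t *ret) {
        struct passwd *pw;

        assert(name);
        assert(ret);

        errno = 0;      /* getpwnam() may leave errno untouched when the user is simply missing */
        pw = getpwnam(name);
        if (!pw)
                return errno > 0 ? -errno : -ESRCH;

        *ret = pw->pw_uid;
        return 0;
}
```

Resetting errno before the call is what makes the NULL return unambiguous: a
stale errno from an earlier call can no longer be mistaken for the reason of
the failed lookup.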
This wraps a few common steps. It is defined as inline function instead of in a
.c file to avoid having a .c file. With a .c file, we would have three choices:
- either link it into libshared, but then libshared would have to be
linked to libmount.
- or compile the .c file into each target separately. This has the disadvantage
that configuration of every target has to be updated and stuff will be compiled
multiple times anyway, which is not too different from keeping this in the
header file.
- or create a new convenience library just for this. This also has the disadvantage
that every target would have to be updated, and a separate library for a
10 line function seems overkill.
By keeping everything in a header file, we compile this a few times, but
otherwise it's the least painful option. The compiler can optimize most of the
function away, because it knows if 'source' is set or not.
test-bus-introspect is also applied to the tables from test-bus-vtable.c.
test-bus-vtable.c is also used as C++ sources to produce test-bus-vtable-cc,
and our hashmap headers are not C++ compatible. So let's do the introspection
part only in the C version.
In 856ad2a86b sd_bus_add_object_vtable() and
sd_bus_add_fallback_vtable() were changed to take an updated sd_bus_vtable[]
array with additional 'features' and 'names' fields in the union.
The commit tried to check whether the old or the new table format is used, by
looking at the vtable[0].x.start.element_size field, on the assumption that the
added fields caused the structure size to grow. Unfortunately, this assumption
was false, and on arm32 (at least), the structure size is unchanged.
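
Why a size check can miss such a change is easy to reproduce with a reduced
example. The two unions below are hypothetical stand-ins, not the real
sd_bus_vtable layout: fields are added only to a member that is smaller than
the largest member, so on common ABIs the size of the union stays the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old" element: the 'start' member is far smaller than 'method'. */
union old_payload {
        struct { size_t element_size; } start;
        struct {
                uint64_t flags;
                const char *member, *signature, *result;
                void *handler;
                size_t offset;
        } method;
};

/* Hypothetical "new" element: two fields were added, but only to 'start'. */
union new_payload {
        struct { size_t element_size; uint64_t features; const unsigned *marker; } start;
        struct {
                uint64_t flags;
                const char *member, *signature, *result;
                void *handler;
                size_t offset;
        } method;
};

/* Report both sizes so that a caller can see whether they differ. */
static void payload_sizes(size_t *ret_old, size_t *ret_new) {
        assert(ret_old);
        assert(ret_new);

        *ret_old = sizeof(union old_payload);
        *ret_new = sizeof(union new_payload);
}
```

The size of a union is determined by its largest member (plus padding), so a
check based on element_size only notices the new fields when they happen to
push 'start' past that member.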
In libsystemd we use symbol versioning and a major.minor.patch semantic
versioning of the library name (major equals the number in the so-name). When
systemd-242 was released, the minor number was (correctly) bumped, but this is
not enough, because no new symbols were added or symbol versions changed. This
means that programs compiled with the new systemd headers and library could be
successfully linked to older versions of the library. For example rpm only
looks at the so-name and the list of versioned symbols, completely ignoring the
major.minor numbers in the library name. But the older library does not
understand the new vtable format, and would return -EINVAL after failing the
size check (on those architectures where the structure size did change, i.e.
all 64 bit architectures).
To force new libsystemd (with the functions that take the updated
sd_bus_vtable[] format) to be used, let's pull in a dummy symbol from the table
definition. This is a bit wasteful, because a dummy pointer has to be stored,
but the effect is negligible. In particular, the pointer doesn't even change
the size of the structure because it fits in an unused area in the union.
The number stored in the new unsigned integer is not checked anywhere. If the
symbol exists, we already know we have the new version of the library, so an
additional check would not tell us anything.
An alternative would be to make sd_bus_add_{object,fallback}_vtable() versioned
symbols, using .symver linker annotations. We would provide
sd_bus_add_{object,fallback}_vtable@LIBSYSTEMD_221 (for backwards
compatibility) and e.g. sd_bus_add_{object,fallback}_vtable@@LIBSYSTEMD_242
(the default) with the new implementation. This would work too, but is more
work. We would have to version at least those two functions. And it turns out
that the .symver linker instructions have to be located in the same compilation
unit as the function being annotated. We first compile libsystemd.a, and then
link it into libsystemd.so and various other targets, including
libsystemd-shared.so, and the nss modules. If the .symver annotations were
placed next to the function definitions (in bus-object.c), they would influence
all targets that link libsystemd.a, and cause problems, because those functions
should not be exported there. To export them only in libsystemd.so, compilation
would have to be rearranged, so that the functions exported in libsystemd.so
would not be present in libsystemd.a, but a separate compilation unit containing
them and the .symver annotations would be linked solely into libsystemd.so.
This is certainly possible, but more work than the approach in this patch.
856ad2a86b has one more issue: it relies on the
undefined fields in sd_bus_vtable[] array to be zeros. But the structure
contains a union, and fields of the union do not have to be zero-initialized by
the compiler. This means that potentially, we could have garbage values there,
for example when reading the old vtable format definition from the new function
implementation. In practice this should not be an issue at all, because vtable
definitions are static data and are placed in the ro-data section, which is
fully initialized, so we know that those undefined areas will be zero. Things
would be different if somebody defined the vtable array on the heap or on the
stack. Let's just document that they should zero-initialize the unused areas
in this case.
The symbol checking code had to be updated because otherwise gcc warns about a
cast from unsigned to a pointer.
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).
This makes sense from a low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.
Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).
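
The lookup rules described above can be sketched as follows. This models only
the semantics stated in this message; struct unit, enum default_state and
effective_memory_low() are hypothetical and much simpler than the real unit
and cgroup context code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State of a unit's DefaultMemoryLow= setting in this sketch. */
enum default_state {
        DEFAULT_UNSET,  /* not configured: keep looking further up */
        DEFAULT_EMPTY,  /* "DefaultMemoryLow=" with no argument: stop propagation */
        DEFAULT_VALUE,  /* "DefaultMemoryLow=N" */
};

/* Hypothetical, heavily reduced unit with just the fields needed here. */
struct unit {
        const struct unit *parent;
        bool memory_low_set;            /* MemoryLow= was given explicitly */
        uint64_t memory_low;
        enum default_state default_state;
        uint64_t default_memory_low;
};

/* Effective memory.low for 'u': its own MemoryLow= if set, otherwise the
 * nearest ancestor's DefaultMemoryLow=, otherwise 0 (no protection). */
static uint64_t effective_memory_low(const struct unit *u) {
        assert(u);

        if (u->memory_low_set)
                return u->memory_low;

        /* Start at the parent: a unit's own DefaultMemoryLow= does not apply to itself. */
        for (const struct unit *p = u->parent; p; p = p->parent) {
                if (p->default_state == DEFAULT_VALUE)
                        return p->default_memory_low;
                if (p->default_state == DEFAULT_EMPTY)
                        return 0;
        }

        return 0;
}
```

With a slice that sets DefaultMemoryLow=100, a child that sets
"DefaultMemoryLow=" still gets 100 itself, while that child's own descendants
get 0 unless they set MemoryLow= explicitly.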
Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.
After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
The comment explains that $PATH might not be set in certain circumstances and
takes steps to handle this case. If we do that, let's assume that $PATH indeed
might be unset and not call setenv("PATH", NULL, 1). It is not clear from the
man page if that is allowed.
CID #1400497.
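
A small sketch of restoring a variable without ever handing NULL to setenv().
The helper name restore_env() is hypothetical; 'saved' is expected to be a
private copy of the earlier value, or NULL if the variable was not set.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Put 'name' back to the value captured earlier. A NULL 'saved' means the
 * variable was unset before, so remove it instead of calling setenv() with NULL.
 * Returns 0 on success, -1 with errno set on failure (as setenv()/unsetenv() do). */
static int restore_env(const char *name, const char *saved) {
        assert(name);

        if (saved)
                return setenv(name, saved, 1);

        return unsetenv(name);
}
```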
Coverity was unhappy, because it doesn't know that $PATH is pretty much always
set. But let's not assume that in the test. CID #1400496.
$ (unset PATH; build/test-env-util)
[1] 31658 segmentation fault (core dumped) ( unset PATH; build/test-env-util; )
Some chattrs only work sensibly if you set them right after opening a
file for create (think: FS_NOCOW_FL). Others only work when they are
applied when the file is fully written (think: FS_IMMUTABLE_FL). Let's
take that into account when copying files and applying a chattr to them.
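
The split can be sketched like this. The grouping is an assumption made for
illustration, based only on the two examples named above: FS_NOCOW_FL is
treated as an "early" flag and everything else is applied once the file is
complete. EARLY_FLAGS and split_chattr() are hypothetical names.

```c
#include <assert.h>
#include <linux/fs.h>   /* FS_NOCOW_FL, FS_IMMUTABLE_FL */

/* Assumed for this sketch: only FS_NOCOW_FL has to be set before any data is written. */
#define EARLY_FLAGS ((unsigned) FS_NOCOW_FL)

/* Split a requested flag mask into the part to apply right after creating
 * the file and the part to apply after the last write. */
static void split_chattr(unsigned mask, unsigned *ret_early, unsigned *ret_late) {
        assert(ret_early);
        assert(ret_late);

        *ret_early = mask & EARLY_FLAGS;
        *ret_late = mask & ~EARLY_FLAGS;
}
```

A copy routine would then apply the early part immediately after opening the
destination, write the data, and apply the late part last, so that an
immutable flag cannot get in the way of the copy itself.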
I couldn't figure out what is going on here, because LTO inlines everything and
then the backtrace reported a different spot. But when compiled with NDEBUG but
no LTO, it's fairly obvious ;)
Cf. #12008.
Coverity is unhappy because we use "line" in the assert that checks
the return value. It doesn't matter much, but let's clean this up.
Also, let's not assume that /proc/cmdline contains anything.
CID #1400219.
When running in Fedora "mock", / is a tmpfs and /home is not mounted. The test
assumes that /home will be a tmpfs if and only if we can unshare. Obviously,
this does not hold in this case, because unsharing is not possible, but /home
is still a tmpfs. Let's just skip the test, since it's fully legitimate to
mount either or both of / and /home as tmpfs.
test_exec_ambientcapabilities: exec-ambientcapabilities-nobody.service: exit status 0, expected 1
Sometimes we get just the last line, for example from the failure summary,
so make it as useful as possible.
This adds some extra paranoia: when we recursively chown a directory for
use with DynamicUser=1 services we'll now drop suid/sgid from all files
we chown().
Of course, such files should not exist in the first place, and no one
should get access to those dirs who isn't root anyway, but let's better
be safe than sorry, and drop everything we come across.
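
For a single file the step can be sketched as below. chown_drop_suid() is a
hypothetical helper for illustration, not the function changed by this
commit, and it leaves out the recursion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Change the owner of 'path' and make sure no set-user-ID or set-group-ID
 * bit survives. Returns 0 on success or a negative errno. */
static int chown_drop_suid(const char *path, uid_t uid, gid_t gid) {
        struct stat st;
        int fd, r = 0;

        assert(path);

        /* Work on a file descriptor so that all three steps hit the same inode. */
        fd = open(path, O_RDONLY | O_CLOEXEC | O_NOFOLLOW);
        if (fd < 0)
                return -errno;

        if (fstat(fd, &st) < 0 || fchown(fd, uid, gid) < 0)
                r = -errno;
        else if ((st.st_mode & (S_ISUID | S_ISGID)) != 0 &&
                 fchmod(fd, st.st_mode & 07777 & ~(mode_t) (S_ISUID | S_ISGID)) < 0)
                /* The kernel may already have cleared the bits during fchown();
                 * the explicit fchmod() makes the result independent of that. */
                r = -errno;

        close(fd);
        return r;
}
```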
On arm64 with gcc-8.2.1-5.fc29.aarch64:
../src/test/test-fileio.c:645:29: warning: comparison is always false due to limited range of data type [-Wtype-limits]
assert_se(c == EOF || safe_fgetc(f, &c) == 1);
^~
Casting c to int is not enough, gcc is able to figure out that the original
type was unsigned and still warns. So let's just silence the warning like
in test-sizeof.c.
Some PIDs can remain in the watched list even though their processes
exited a long time ago. This can easily happen if the main process of a forking
service manages to spawn a child before the control process exits for example.
However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stale PIDs from
the watched process list.
If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
When debugging failure in one of the cases, it's annoying to have to wade
through the output from all the other cases. Let's allow picking select
cases.
This avoids double compilation. Those files are tiny, so it doesn't save time,
but we avoid repeated warnings and errors, and it's generally cleaner to do it
this way.
The number of commands in 'ninja -C build clean && ninja -C build' drops from
1462 to 1455 for me.
Let's move the shutdown binary into its own subdirectory in
src/shutdown, after all it is relatively isolated from the normal PID 1
sources, being a different binary and all.
Unfortunately, some of the code I wish we could move has to stay where
it is, since it is shared with PID 1, but I still think it's worth it.