The function is similar to path_kill_slashes() but also removes
initial './', trailing '/.', and '/./' in the path.
When the second argument of path_simplify() is false, then it
behaves the same as path_kill_slashes(). Hence, this also
replaces path_kill_slashes() with path_simplify().
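A minimal sketch of the simplification described above, not the actual implementation. The name sketch_path_simplify() is hypothetical, and turning an emptied relative path into "." is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: simplifies `path` in place and returns it.
 * With kill_dots == false only duplicate and trailing slashes go away. */
char *sketch_path_simplify(char *path, bool kill_dots) {
        char *in = path, *out = path;
        bool first = true;

        assert(path);

        if (*path == '\0')
                return path;

        /* Keep a single leading slash for absolute paths. */
        if (*in == '/')
                *out++ = '/';

        for (;;) {
                size_t n;

                in += strspn(in, "/");          /* skip any run of slashes */
                if (*in == '\0')
                        break;

                n = strcspn(in, "/");           /* length of this component */
                if (kill_dots && n == 1 && in[0] == '.') {
                        in += n;                /* drop "." components */
                        continue;
                }

                if (!first)
                        *out++ = '/';
                memmove(out, in, n);            /* ranges may overlap */
                out += n;
                in += n;
                first = false;
        }

        /* Assumption: a relative path that collapses to nothing becomes ".". */
        if (out == path)
                *out++ = '.';

        *out = '\0';
        return path;
}
```

The output is never longer than the input, which is why the rewrite can happen in place.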
First, ellipsize() and ellipsize_mem() should not read past the input
buffer. Those functions take an explicit length for the input data, so they
should not assume that the buffer is terminated by a nul.
Second, ellipsization was off in various cases where wide or multi-byte
characters were used.
We had some basic tests for ellipsize(), but apparently they weren't enough to
catch more serious cases.
Should fix https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=8686.
sparc sets the carry bit when a syscall fails. Use this information to
set errno and return -1 as appropriate.
The added test case calls raw_clone() with flags known to be invalid
according to the clone(2) manpage.
We already do that in get_process_cmdline(), which is very similar in
behaviour otherwise. Hence, let's be safe and also filter them in
get_process_comm(). Let's try to retain as much information as we can
though and escape rather than suppress unprintable characters. Let's not
increase comm names beyond the kernel limit on such names however.
Also see discussion about this here:
https://marc.info/?l=linux-api&m=152649570404881&w=2
For short buffer sizes cellescape() was a bit wasteful, as it might
suffice to drop a single character to find enough space for the full
four byte ellipsis, if that one character was a four character escape.
With this rework we'll guarantee to drop the minimum number of
characters from the end to fit in the ellipsis.
If the buffers we write to are large this doesn't matter much. However,
if they are short (as they are when talking about the process comm
field) then it starts to matter that we put as much information as we
can in the space we get.
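A rough sketch of the "drop the minimum number of characters" idea, under simplifying assumptions: every unprintable byte becomes a four character "\xNN" escape, everything else is one character, and the ellipsis is the ASCII "...". The names sketch_cell_len() and sketch_cellescape() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of output characters a byte needs: 4 for "\xNN", else 1. */
static size_t sketch_cell_len(unsigned char c) {
        return (c < 0x20 || c >= 0x7f) ? 4 : 1;
}

/* Escapes `s` into `buf` (of size `len`, including the trailing NUL). */
char *sketch_cellescape(char *buf, size_t len, const char *s) {
        size_t n, k, total = 0, pos = 0;

        assert(buf);
        assert(len > 0);
        assert(s);

        n = strlen(s);
        for (size_t i = 0; i < n; i++)
                total += sketch_cell_len((unsigned char) s[i]);

        k = n;
        if (total > len - 1)
                /* Drop input characters from the end, one at a time, and
                 * stop as soon as the remainder plus "..." fits. */
                while (k > 0 && total + 3 > len - 1) {
                        k--;
                        total -= sketch_cell_len((unsigned char) s[k]);
                }

        for (size_t i = 0; i < k; i++) {
                unsigned char c = (unsigned char) s[i];

                if (sketch_cell_len(c) == 4) {
                        snprintf(buf + pos, 5, "\\x%02x", (unsigned) c);
                        pos += 4;
                } else
                        buf[pos++] = (char) c;
        }

        if (k < n)
                for (size_t i = 0; i < 3 && pos < len - 1; i++)
                        buf[pos++] = '.';

        buf[pos] = '\0';
        return buf;
}
```

With an 8 byte buffer, the input "abcd" followed by one unprintable byte loses only that last byte: the four characters its escape would have taken are enough for the ellipsis and the trailing NUL.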
If the timestamp is above 9999-12-30 (or 2038-something-something on 32 bit),
use XXXX-XX-XX XX:XX:XX as the replacement.
The problem with refusing to print timestamps is that our code accepts such
timestamps, so we can't really just refuse to process them afterwards. Also, it
makes journal files non-portable, because suddenly we might completely refuse
to print entries which are totally OK on a different machine.
It's not supposed to be the most efficient, but instead fast and simple to use.
I kept the logic in ellipsize_mem() to use unicode ellipsis even in non-unicode
locales. I'm not quite convinced things should be this way, especially that with
this patch it'd actually be simpler to always use "…" in unicode locale and "..."
otherwise, but Lennart wanted it this way for some reason.
This adds four new flags:
- If CONF_FILES_DIRECTORY is specified conf_file_list() and friends
will look for directories only.
- Similarly, CONF_FILES_REGULAR means we'll look only for regular files.
- If CONF_FILES_BASENAME is specified the resulting list will contain
only the basenames of all discovered files or directories, not the
full paths.
- If CONF_FILES_FILTER_MASKED is specified the resulting list will have
masked entries removed (i.e. those symlinked to /dev/null and
suchlike)
These four flags are useful for discovering portable service profile
information.
While we are at it, also improve a couple of other things:
- More debug logging
- use path_hash_ops instead of string_hash_ops when putting together the
path lists
Most of our other parsing functions do this, let's do this here too,
internally we accept that anyway. Also, the closely related
load_env_file() and load_env_file_pairs() also do this, so let's be
systematic.
This call creates an fd from another fd containing the same data.
Specifically, repeated read() on the returned fd should return the same
data as the original fd. This call is useful when we want to copy data
out of disk images and suchlike, and want to pass fds with the data
around without having to keep the disk image continuously mounted.
The implementation tries to be somewhat smart and tries to prefer
memfds/pipes over files in /tmp or /var/tmp based on the size of the
data, but has appropriate fallbacks in place.
On both 32 and 64 bits, the result is:
enum Enum → 32 bits, unsigned
enum BigEnum → 32 bits, unsigned
enum BigEnum2 → 64 bits, unsigned
big_enum2_pos → 4
big_enum2_neg → 8
The last two lines show that even if the enum is 64 bit, and the field of an
enum is defined with UINT64_C(), the field can still be smaller.
This simplifies the use of tempfiles in tests and fixes "leaked"
temporary files in test-fileio, test-catalog, test-conf-parser.
Not the whole tree is converted.
That way we can use it in nspawn.
Also, while we are at it, let's rename the call config_parse_rlimit(),
i.e. insert the "r", to clarify what kind of limit this is about.
Initializing just some of the values doesn't make sense. We should handle them
all in the same way. In those tests there are no jumps, all steps are covered
with assert_se(), so we know everything will be set if we get to the end of the
function. And _not_ initializing those variables has the advantage that it
allows valgrind to catch potential initialization errors in the function being
tested.
We were inconsistently using them in some cases, but in the majority not.
Using assignment in assert_se is very common, not an exception like in
'if', so let's drop the extra parens everywhere.
When I see "test", I have to think three times what the return value
means. With "below" this is immediately clear. ratelimit_below(&limit)
sounds almost like English and is imho immediately obvious.
(I also considered ratelimit_ok, but this strongly implies that being under the
limit is somehow better. Most of the time this is true, but then we use the
ratelimit to detect triple-c-a-d, and "ok" doesn't fit so well there.)
Cf. a1bcaa07.
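To illustrate why "below" reads naturally at call sites, here is a minimal sketch of such a check. The struct layout, the name sketch_ratelimit_below() and the explicit time parameter are assumptions made for this example, not the real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct SketchRateLimit {
        uint64_t interval_usec;  /* length of one window */
        unsigned burst;          /* events allowed per window */
        unsigned num;            /* events counted in the current window */
        uint64_t begin_usec;     /* start of the current window */
        bool started;            /* whether a window has begun yet */
} SketchRateLimit;

/* Returns true while the event rate is still below the limit. */
bool sketch_ratelimit_below(SketchRateLimit *r, uint64_t now_usec) {
        assert(r);

        /* A zero interval or burst disables rate limiting. */
        if (r->interval_usec == 0 || r->burst == 0)
                return true;

        /* First event, or the previous window has expired: start anew. */
        if (!r->started || now_usec - r->begin_usec > r->interval_usec) {
                r->started = true;
                r->begin_usec = now_usec;
                r->num = 1;
                return true;
        }

        if (r->num < r->burst) {
                r->num++;
                return true;
        }

        return false;
}
```

A caller can then write `if (!sketch_ratelimit_below(&limit, now)) return;` and the meaning of the return value needs no second thought.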
This makes it behave the same whether there is a blank line or not at
the end of the file. This is also consistent with the behavior of the
shell on a shell script that ends on a trailing backslash at the last
line.
Added tests to test_config_parse(), which only pass if the corresponding
change to config_parse() is included.
This is based on @jsynacek's patch from #8837, but adds the new URL in
two flavours instead of replacing the old, also making @keszybz happy.
Replaces: #8837
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.
Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.
So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.
This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:
1. strv_length()'s return type becomes size_t
2. the unit file changes array size becomes size_t
3. DNS answer and query array sizes become size_t
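With sizes kept in size_t throughout, an overflow check before allocating an array is straightforward. A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True if size * need would not fit into a size_t. */
static bool sketch_size_multiply_overflow(size_t size, size_t need) {
        return need != 0 && size > SIZE_MAX / need;
}

/* Allocates an array of `need` elements of `size` bytes each,
 * refusing the request instead of silently wrapping around. */
void *sketch_malloc_multiply(size_t size, size_t need) {
        if (sketch_size_multiply_overflow(size, need))
                return NULL;

        return malloc(size * need);
}
```

No conversion between unsigned and size_t happens anywhere on this path, so there is no place where a value could be truncated before the check.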
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
This is used as 'systemd-analyze show-config systemd/logind.conf', which
will dump
/etc/systemd/system/user@.service
/etc/systemd/system/user@.service.d/*.conf
/run/systemd/system/user@.service.d/*.conf
/usr/local/lib/systemd/system/user@.service.d/*.conf
/usr/lib/systemd/system/user@.service.d/*.conf
The idea is to make it easy to dump the configuration using the same locations
and order that systemd programs use themselves (including masking, in the right
order, etc.). This is the generic variant that works with any configuration
scheme that follows the same general rules:
$ systemd-analyze cat-config systemd/system.conf
$ systemd-analyze cat-config systemd/user.conf
$ systemd-analyze cat-config systemd/logind.conf
$ systemd-analyze cat-config systemd/sleep.conf
$ systemd-analyze cat-config systemd/journald.conf
$ systemd-analyze cat-config systemd/journal-remote.conf
$ systemd-analyze cat-config systemd/journal-upload.conf
$ systemd-analyze cat-config systemd/coredump.conf
$ systemd-analyze cat-config systemd/resolved.conf
$ systemd-analyze cat-config systemd/timesyncd.conf
$ systemd-analyze cat-config udev/udev.conf
We use MTUs all over the place, let's add a unified, strict parser for
it, that takes MTU ranges into account.
We already have parse_ifindex() close-by, hence this appears to be a
natural addition, in particular as the range checking is not entirely
trivial to do, as it depends on the protocol used.
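A simplified sketch of a protocol-aware MTU parser. It assumes plain decimal input and the well-known minimums of 68 bytes for IPv4 and 1280 bytes for IPv6; the name sketch_parse_mtu() is hypothetical and the real function may accept more input forms.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Parses `s` as an MTU for the given address family.
 * Returns 0 on success, a negative errno-style value on failure. */
int sketch_parse_mtu(int family, const char *s, uint32_t *ret) {
        unsigned long long v;
        uint32_t min;
        char *end;

        assert(s);
        assert(ret);

        /* strtoull() would accept leading whitespace and signs; we do not. */
        if (*s < '0' || *s > '9')
                return -EINVAL;

        errno = 0;
        v = strtoull(s, &end, 10);
        if (errno != 0)
                return -errno;
        if (*end != '\0')
                return -EINVAL;
        if (v > UINT32_MAX)
                return -ERANGE;

        /* The lower bound depends on the protocol. */
        if (family == AF_INET6)
                min = 1280;
        else if (family == AF_INET)
                min = 68;
        else
                min = 0;

        if (v < min)
                return -ERANGE;

        *ret = (uint32_t) v;
        return 0;
}
```

The same string can thus be valid for one family and out of range for another, for example "1000" is fine for IPv4 but too small for IPv6.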
This drops a good number of type-specific _cleanup_ macros, and patches
all users to just use the generic ones.
In most recent code we abstained from defining type-specific macros, and
this basically removes all those added already, with the exception of
the really low-level ones.
Having explicit macros for this is not too useful, as the expression
without the extra macro is generally just 2ch wider. We should generally
emphasize generic code, unless there are really good reasons for
specific code, hence let's follow this in this case too.
Note that _cleanup_free_ and similar really low-level, libc'ish, Linux
API'ish macros continue to be defined, only the really high-level OO
ones are dropped. From now on this should really be the rule: for really
low-level stuff, such as memory allocation, fd handling and so on, go
ahead and define explicit per-type macros, but for high-level, specific
program code, just use the generic _cleanup_() macro directly, in order
to keep things simple and as readable as possible for the uninitiated.
Note that before this patch some of the APIs (notably libudev ones) were
already used with the high-level macros at some places and with the
generic _cleanup_ macro at others. With this patch we hence unify on the
latter.
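For the uninitiated, a small sketch of what using the generic macro directly looks like. It relies on the GCC/Clang cleanup attribute; the Widget type and its functions are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Generic macro: run f(&variable) when the variable goes out of scope. */
#define _cleanup_(f) __attribute__((cleanup(f)))

/* A made-up high-level object with an instance counter for demonstration. */
typedef struct Widget {
        int value;
} Widget;

static int widgets_alive = 0;

static Widget *widget_new(void) {
        Widget *w = calloc(1, sizeof(Widget));
        if (w)
                widgets_alive++;
        return w;
}

static Widget *widget_free(Widget *w) {
        if (w) {
                free(w);
                widgets_alive--;
        }
        return NULL;
}

/* The cleanup attribute passes a pointer to the variable. */
static void widget_freep(Widget **w) {
        widget_free(*w);
}

int sketch_use_widget(void) {
        /* No per-type _cleanup_widget_free_ macro, just the generic one. */
        _cleanup_(widget_freep) Widget *w = widget_new();

        if (!w)
                return -1;

        w->value = 42;
        return w->value;  /* evaluated first, then widget_freep(&w) runs */
}
```

The declaration is only a few characters longer than it would be with a dedicated per-type macro, and the reader sees directly which function does the cleanup.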
With the recent terminal_urlify() APIs we'll now sometimes generate
clickable link CSO sequences. Hence we should also be able to remove
them again from strings. This beefs up the logic to do so.
Follow-up for: 23b27b39d2
Those are quite similar to %i/%I, but refer to the last dash-separated
component of the name prefix.
The new functionality of dash-dropins could largely supersede the template
functionality, so it would be tempting to overload %i/%I. But that would
not be backwards compatible. So let's add the two new letters instead.
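What "the last dash-separated component of the name prefix" means can be sketched in a few lines. The function name is hypothetical, and the sketch ignores any unescaping the real specifiers may perform.

```c
#include <assert.h>
#include <string.h>

/* Returns the part of `prefix` after the last dash, or all of `prefix`
 * if it contains no dash. The result points into `prefix`. */
const char *sketch_last_dash_component(const char *prefix) {
        const char *dash;

        assert(prefix);

        dash = strrchr(prefix, '-');
        return dash ? dash + 1 : prefix;
}
```

For a name prefix of "foo-bar-baz" this yields "baz".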
Newer terminals (in particular gnome-terminal) understand special escape
sequences for formatting clickable links. Let's support that to make our
tool output more clickable where that's appropriate.
For details see this:
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
The one big issue is that 'less' currently doesn't grok this, and
doesn't ignore such sequences like regular terminal implementations do if they
don't support it. Hence for now, let's disable URL output if a pager is
used. We should revisit that though as soon as less has added support for it
and enough time has passed for it to enter various distributions.
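A sketch of how such a link can be put together, following the OSC 8 format from the gist above with BEL as the terminator. The function name and the use_links switch (standing in for the "is a pager in use" decision) are assumptions of this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* OSC 8 hyperlink: ESC ] 8 ; ; URL BEL text ESC ] 8 ; ; BEL */
#define SKETCH_LINK_FMT "\x1B]8;;%s\a%s\x1B]8;;\a"

/* Returns a newly allocated string, or NULL on allocation failure.
 * If `text` is NULL the URL itself is shown. */
char *sketch_urlify(const char *url, const char *text, int use_links) {
        char *r;
        int n;

        assert(url);

        if (!text)
                text = url;

        if (!use_links) {
                /* For example when output goes through a pager. */
                size_t len = strlen(text) + 1;

                r = malloc(len);
                if (r)
                        memcpy(r, text, len);
                return r;
        }

        n = snprintf(NULL, 0, SKETCH_LINK_FMT, url, text);
        if (n < 0)
                return NULL;

        r = malloc((size_t) n + 1);
        if (!r)
                return NULL;

        snprintf(r, (size_t) n + 1, SKETCH_LINK_FMT, url, text);
        return r;
}
```

Terminals that understand the sequence show only the text and make it clickable; the caller frees the returned string.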
Double newlines (i.e. one empty line) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tighter.
We check the same condition at various places. Let's add a trivial,
common helper for this, and use it everywhere.
It's not going to make things much faster or much shorter, but I think a
lot more readable.
If the flag is set only a single step of the normalization is executed,
and the resulting path is returned.
This allows callers to normalize piecemeal, taking into account every
single intermediary path of the normalization.
We have plenty of code in our codebase that outputs tables to the
console, and all is homegrown and awful. Let's replace it with a generic
implementation that can do automatically what the old implementations
did manually.
Features:
1. Ellipsation (for fields overly long) and alignment (for
fields overly short)
2. Sorting of rows
3. automatically copies formatting from the same cell in the row above
4. Heavy use of varargs to make putting together tables easy
5. can expand and compress tables, with weights
6. Has a minimal understanding of unicode wide characters in order to
match unicode strings to character cell terminals.
7. Columns can be reordered and individually turned off.
8. pretty printing for various data types
And more.
This primarily changes two things:
1. Ellipsation to 0, 1 or 2 characters is now supported. Previously we'd
hit an assert if the new length was < 3, this is now permitted. The
result strings won't show too much info still of course, but the code
becomes a bit more generic and robust to use.
2. If UTF-8 mode is disabled and the input string is pure ASCII, then
"..." is used for ellipsation, otherwise (as before) "…". This means
on a pure-ASCII system we should remain pure-ASCII, matching
behaviour otherwise exposed with special_glyph() and friends. Note
that we'll use "…" for ellipsation as soon as either the locale
settings indicate an UTF-8 mode or the input string already contains
non-ASCII unicode characters.
Testing for these special cases is improved.
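A sketch of the pure-ASCII path only. It takes an explicit input length and never looks for a terminating NUL. What exactly is returned for new lengths below 3 is an assumption of this sketch (as many dots as fit), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shortens the `old_length` bytes at `s` to at most `new_length` bytes,
 * placing "..." at `percent` percent of the kept text.
 * Returns a newly allocated, NUL-terminated string. */
char *sketch_ascii_ellipsize(const char *s, size_t old_length,
                             size_t new_length, unsigned percent) {
        size_t keep, head, tail;
        char *r;

        assert(s);
        assert(percent <= 100);

        /* Short enough already: copy exactly old_length bytes. */
        if (old_length <= new_length) {
                r = malloc(old_length + 1);
                if (!r)
                        return NULL;
                memcpy(r, s, old_length);
                r[old_length] = '\0';
                return r;
        }

        r = malloc(new_length + 1);
        if (!r)
                return NULL;

        /* Assumption: below 3 characters, emit as many dots as fit. */
        if (new_length < 3) {
                memset(r, '.', new_length);
                r[new_length] = '\0';
                return r;
        }

        keep = new_length - 3;
        head = keep * percent / 100;
        tail = keep - head;

        memcpy(r, s, head);
        memcpy(r + head, "...", 3);
        memcpy(r + head + 3, s + old_length - tail, tail);
        r[new_length] = '\0';
        return r;
}
```

Every read stays within s[0] to s[old_length - 1], so the input may be a slice of a larger buffer.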
We go through the whole file system, so this test can take arbitrary time. But
this test is still quite useful, so let's at least try to make it more efficient
by not descending at all into the directories we would filter out later on
anyway.
Also increase the timeout, in case the previous step doesn't help enough.
Absolute paths make everything simple and quick, but sometimes this requirement
can be annoying. A good example is calling 'test', which will be located in
/usr/bin/ or /bin depending on the distro. The need to provide the full path
makes it harder to write a portable unit file in such cases.
This patch uses a fixed search path (DEFAULT_PATH which was already used as the
default value of $PATH), and if a non-absolute file name is found, it is
immediately resolved to a full path using this search path when the unit is
loaded. After that, everything behaves as if an absolute path was specified. In
particular, the executable must exist when the unit is loaded.
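The resolution step can be sketched roughly as follows. The function name is hypothetical, and the search path is passed in explicitly here rather than taken from DEFAULT_PATH.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resolves `name` against a colon-separated `search_path`.
 * Returns a newly allocated absolute path, or NULL if nothing matched. */
char *sketch_find_executable(const char *name, const char *search_path) {
        const char *p;

        assert(name);
        assert(search_path);

        if (name[0] == '/') {
                /* Already absolute: return a copy unchanged. */
                size_t len = strlen(name) + 1;
                char *copy = malloc(len);

                if (copy)
                        memcpy(copy, name, len);
                return copy;
        }

        for (p = search_path; *p != '\0';) {
                size_t n = strcspn(p, ":");  /* length of this directory */

                if (n > 0) {
                        size_t len = n + 1 + strlen(name) + 1;
                        char *candidate = malloc(len);

                        if (!candidate)
                                return NULL;

                        snprintf(candidate, len, "%.*s/%s", (int) n, p, name);

                        /* Note: this sketch does not rule out directories. */
                        if (access(candidate, X_OK) == 0)
                                return candidate;

                        free(candidate);
                }

                p += n;
                if (*p == ':')
                        p++;
        }

        return NULL;
}
```

Because the lookup happens once, at load time, a NULL result corresponds to the "executable must exist when the unit is loaded" rule.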
Doing manager_load_unit() followed by UNIT_VTABLE(unit)->start(unit) would
result in an assertion failure in ->start() if the unit failed to load
properly. Something like this is okay-ish in tests, since the test units are
not expected to fail to load, but the reason for failure is clearer if we
fail immediately.
$ sudo swapoff -av
swapoff /dev/vda4
$ sudo systemctl hibernate
Failed to hibernate system via logind: Not enough swap space for hibernation
Fixes #6729.
The Linux kernel is adding support for configuring the offset
into a disk. This allows swapfiles to be more usable as users
will no longer need to set the offset on their kernel command
line.
Use this API in systemd when hibernating as well.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd be nice to
obtain explicit acks from all involved authors before doing that.
Running `test-path` under an umask such as 027 fails with:
Assertion '(s.st_mode & S_IRWXO) == 0004' failed at ../src/test/test-path.c:247, function test_path_makedirectory_directorymode(). Aborting.
Looking at directory /tmp/test-path_makedirectory, it was indeed created
with mode 0740, applying the umask to the requested 0744.
Set an explicit umask for this test, to ensure reproducible results.
We have the same code for this in place at various locations, let's
unify that. Also, let's repurpose test-fs-util.c as a test for this new
helper call.
Not that it matters much, but it seems cleaner to also count those
inputs, even if they do not consume extra storage space.
The test is extended to include an empty input and counts in the test are
adjusted to include it.
When we are attempting to create a directory somewhere in the bowels of /var/lib
and get an error that it already exists, it can be quite hard to diagnose what
is wrong (especially for a user who is not aware that the directory must have
the specified owner, and permissions not looser than what was requested). Let's
print a warning in most cases. A warning is appropriate, because such state is
usually a sign of a borked installation and needs to be resolved by the administrator.
$ build/test-fs-util
Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing.
Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting.
No functional change except for the new log lines.
The warning is not emitted for absolute paths like /dev/sda or /home, which are
converted to .device and .mount unit names without any fuss.
Most of the time it's unlikely that users use invalid unit names on purpose,
so let's warn them. Warnings are silenced when --quiet is used.
$ build/systemctl show -p Id hello@foo-bar/baz
Invalid unit name "hello@foo-bar/baz" was escaped as "hello@foo-bar-baz" (maybe you should use systemd-escape?)
Id=hello@foo-bar-baz.service
$ build/systemd-run --user --slice foo-bar/baz --unit foo-bar/foo true
Invalid unit name "foo-bar/foo" was escaped as "foo-bar-foo" (maybe you should use systemd-escape?)
Invalid unit name "foo-bar/baz" was escaped as "foo-bar-baz" (maybe you should use systemd-escape?)
Running as unit: foo-bar-foo.service
Fixes #8302.
We have only three bits of space, i.e. 8 possible classes. Immediately reject
anything outside of that range. Add the fuzzer test case and an additional
unit test.
oss-fuzz #6908.
Also fix one case where the presence of a newline was used to generate
an invalid environment assignment.
Tested: with mkosi, which builds the local tree and runs ninja tests.
example.swaps with "(deleted)" does not cause bogus entries in the list now,
but a memleak in libmount instead. The memleak is not very important since
this code is run just once.
Reported as https://github.com/karelzak/util-linux/issues/596.
$ build/test-umount
...
/* test_swap_list("/proc/swaps") */
path=/var/tmp/swap o= f=0x0 try-ro=no dev=0:0
path=/dev/dm-2 o= f=0x0 try-ro=no dev=0:0
/* test_swap_list("/home/zbyszek/src/systemd/test/test-umount/example.swaps") */
path=/some/swapfile o= f=0x0 try-ro=no dev=0:0
path=/dev/dm-2 o= f=0x0 try-ro=no dev=0:0
==26912==
==26912== HEAP SUMMARY:
==26912== in use at exit: 16 bytes in 1 blocks
==26912== total heap usage: 1,546 allocs, 1,545 frees, 149,008 bytes allocated
==26912==
==26912== 16 bytes in 1 blocks are definitely lost in loss record 1 of 1
==26912== at 0x4C31C15: realloc (vg_replace_malloc.c:785)
==26912== by 0x55C5D8C: _IO_vfscanf (in /usr/lib64/libc-2.26.so)
==26912== by 0x55D8AEC: vsscanf (in /usr/lib64/libc-2.26.so)
==26912== by 0x55D25C3: sscanf (in /usr/lib64/libc-2.26.so)
==26912== by 0x53236D0: mnt_table_parse_stream (in /usr/lib64/libmount.so.1.1.0)
==26912== by 0x53249B6: mnt_table_parse_file (in /usr/lib64/libmount.so.1.1.0)
==26912== by 0x10D157: swap_list_get (umount.c:194)
==26912== by 0x10B06E: test_swap_list (test-umount.c:34)
==26912== by 0x10B24B: main (test-umount.c:56)
==26912==
==26912== LEAK SUMMARY:
==26912== definitely lost: 16 bytes in 1 blocks
==26912== indirectly lost: 0 bytes in 0 blocks
==26912== possibly lost: 0 bytes in 0 blocks
==26912== still reachable: 0 bytes in 0 blocks
==26912== suppressed: 0 bytes in 0 blocks
This seems to be a false positive in msan:
https://github.com/google/sanitizers/issues/767.
I don't see anything wrong with the code either, and valgrind does not see the
issue. Anyway, let's add the test case.
We don't have msan hooked up yet, but hopefully we will in the future.
oss-fuzz #6884.
When running tests like test-unit-name, there is no point in setting
up the cgroup and signals and interacting with the environment. Similarly
when running fuzz testing of the parser.
Add new MANAGER_TEST_RUN_BASIC which takes the role of MANAGER_TEST_RUN_MINIMAL,
and redefine MANAGER_TEST_RUN_MINIMAL to just create the basic data structures.
This mainly gets around a kernel bug making it possible to
have non-existent paths in /proc/self/mountinfo, but it should also
prevent flaky failures that can happen if something changes immediately
after or during reading /proc/self/mountinfo.
Closes https://github.com/systemd/systemd/issues/8286.
Suspend to Hibernate is a new sleep method that invokes suspend
for a predefined period of time before automatically waking up
and hibernating the system.
It's similar to HybridSleep however there isn't a performance
impact on every suspend cycle.
It's intended for use with systems that may have a higher power
drain in their supported suspend states to prevent battery and
data loss over an extended suspend cycle.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
This should make it a bit easier to search for real file descriptor leaks.
```
$ valgrind --leak-check=full --track-fds=yes ./build/test-fileio
...
==29457==
==29457== FILE DESCRIPTORS: 4 open at exit.
==29457== Open file descriptor 3: /tmp/test-systemd_writing_tmpfile.lyV5Rc
==29457== at 0x4B9AD9E: open (open.c:43)
==29457== by 0x4B19B24: __gen_tempname (tempname.c:261)
==29457== by 0x4BA5CC3: mkostemp64 (mkostemp64.c:32)
==29457== by 0x48F739B: mkostemp_safe (fileio.c:1206)
==29457== by 0x10D968: test_writing_tmpfile (test-fileio.c:620)
==29457== by 0x10E930: main (test-fileio.c:767)
==29457==
```
This helps get around a bug confusing `glibc` and making the test bail
out with the following error under `asan` on `x86`:
Fatal error: glibc detected an invalid stdio handle
Aborted (core dumped)
The bug has been reported in https://github.com/google/sanitizers/issues/778,
but it is unlikely to be fixed anytime soon.
The unit files for test-execute are named like
`exec-(setting-name-in-lower-character)-(optional-text).service`.
However, test units for AmbientCapabilities= are not following this.
So, let's rename them for consistency.
This does not change anything in the functionality of the test.
Quite often we need to set up a number of fds as stdin/stdout/stderr of
a process we are about to start. Add a generic implementation for a
routine doing that that takes care to do so properly:
1. Can handle the case where stdin/stdout/stderr were previously
closed, and the fds to set as stdin/stdout/stderr are hence likely in the
0..2 range. Handling this properly is nasty, since we need to first
move the fds out of this range in order to later move them back in, to
make things fully robust.
2. Can optionally open /dev/null for one or more of the fds, in
a smart way, sharing the open file if possible between multiple of
the fds.
3. Guarantees that O_CLOEXEC is not set on the three fds, even if the fds
already were in the 0..2 range and hence possibly weren't moved.
The nobody user/group may not be synthesized by systemd.
To test the functionality in such a situation, this adds tests
using the daemon user/group, as it is expected to exist in all environments.
"-m" is specified as a short form of "--monitor" in the option struct,
but not included in getopt_long's optstring. Update the optstring
to be consistent with the option struct.
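The kind of mismatch fixed here is easy to reproduce. In the sketch below the option table and the optstring agree on 'm'; if "m" were missing from the optstring, "--monitor" would still work but "-m" would be rejected. The function name and the "help" entry are illustrative only.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if -m or --monitor was given. */
bool sketch_parse_monitor(int argc, char *argv[]) {
        static const struct option options[] = {
                { "help",    no_argument, NULL, 'h' },
                { "monitor", no_argument, NULL, 'm' },
                { NULL,      0,           NULL, 0   }
        };
        bool monitor = false;
        int c;

        optind = 1;  /* restart scanning, so the function can be called again */

        /* Every short value used in the table also appears in "hm". */
        while ((c = getopt_long(argc, argv, "hm", options, NULL)) >= 0)
                if (c == 'm')
                        monitor = true;

        return monitor;
}
```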
Several tests request the nobody user or group. If they are badly
configured, then tests may fail.
This makes test-execute check that the nobody user and group are configured
correctly before running such tests.
Fixes #8276.
gcc warns about uninitialized memory access because it notices that ssize_t which
is < 0 could be cast to positive int value. We know that this can't really
happen because only -1 can be returned, but OTOH, in principle a large
*positive* value cannot be cast properly. This is unlikely too, since xattrs
cannot be too large, but it seems cleaner to just use a size_t to return the
value and avoid the cast altogether. This makes the code simpler and gcc is
happy too.
The following warning goes away:
[113/1502] Compiling C object 'src/basic/basic@sta/xattr-util.c.o'.
In file included from ../src/basic/alloc-util.h:28:0,
from ../src/basic/xattr-util.c:30:
../src/basic/xattr-util.c: In function ‘fd_getcrtime_at’:
../src/basic/macro.h:207:60: warning: ‘b’ may be used uninitialized in this function [-Wmaybe-uninitialized]
UNIQ_T(A,aq) < UNIQ_T(B,bq) ? UNIQ_T(A,aq) : UNIQ_T(B,bq); \
^
../src/basic/xattr-util.c:155:19: note: ‘b’ was declared here
usec_t a, b;
^
If log_do_header() was called with overly long parameters, it'd generate
improper output. Essentially, it'd be truncated at a random point, in particular
missing a newline at the end, so it'd run into the next field, usually MESSAGE=.
log_do_header is called with parameters from compiled code (file name, line
number, etc), so in practice this was unlikely to ever be a problem, but it is
possible. In particular, if systemd was compiled from sources in some deeply
nested directory (which happens for example in mock and other build roots), the
filename could be very long.
As a safety measure, let's truncate all parameters to 256 bytes. So we have
5 fields which are 256 bytes (plus the field name prefix), and a few other
fields with fixed width. This must always fit in the 2048 byte buffer.
I don't think there's much gain in calculating the required length precisely,
since it's a lot of fields and a few bytes allocated on the stack don't matter.
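The truncation can be done with a precision in the format string. A reduced sketch with only three fields; the function name and the exact field selection are assumptions of this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a header into `header`. Each string parameter is cut off at
 * 256 bytes by the "%.256s" precision, so the total length is bounded
 * regardless of how long `file` or `func` are.
 * Returns the number of bytes written (excluding the NUL). */
int sketch_log_header(char *header, size_t size,
                      const char *file, int line, const char *func) {
        assert(header);
        assert(file);
        assert(func);

        return snprintf(header, size,
                        "CODE_FILE=%.256s\n"
                        "CODE_LINE=%d\n"
                        "CODE_FUNC=%.256s\n",
                        file, line, func);
}
```

With a 2048 byte buffer the output always ends in a newline, even for a file name of a thousand characters.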
This patch adds safe_atoux16 for parsing an unsigned hexadecimal 16bit int, and
uses that for parsing USB device and vendor IDs.
This fixes a compile error with gcc-8 because while we know that USB IDs are 2 bytes,
the compiler does not know that.
../src/udev/udev-builtin-hwdb.c:80:38: error: '%04X' directive output may be
truncated writing between 4 and 8 bytes into a region of size between 2 and 6
[-Werror=format-truncation=]
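A sketch of what such a parser might look like; the real safe_atoux16() may differ in details such as whitespace handling. Because the result has a 16 bit type, the compiler can see that "%04X" never needs more than four characters.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses `s` as an unsigned hexadecimal 16 bit integer.
 * Returns 0 on success, a negative errno-style value on failure. */
int sketch_safe_atoux16(const char *s, uint16_t *ret) {
        unsigned long v;
        char *end;

        assert(s);
        assert(ret);

        /* Reject empty strings, signs and leading whitespace, all of
         * which strtoul() would otherwise tolerate. */
        if (!isxdigit((unsigned char) *s))
                return -EINVAL;

        errno = 0;
        v = strtoul(s, &end, 16);
        if (errno != 0)
                return -errno;
        if (*end != '\0')
                return -EINVAL;
        if (v > UINT16_MAX)
                return -ERANGE;

        *ret = (uint16_t) v;
        return 0;
}
```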
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Patrick Uiterwijk <puiterwijk@redhat.com>
This improves the BPF/cgroup detection logic, and looks whether
BPF_ALLOW_MULTI is supported. This flag allows execution of multiple
BPF filters in a recursive fashion for a whole cgroup tree. It enables
us to properly report IP accounting for slice units, as well as
delegation of BPF support to units without breaking our own IP
accounting.
When synthetisation is turned off, there's just too many ways those tests can
go wrong. We are not interested in verifying that the db on disk is correct,
let's just skip all checks.
In the first version of this patch, I recorded if we detected a mismatch during
configuration and only skipped tests in that case, but actually it is possible
to change the host configuration between our configuration phase and running
of the tests. It's just more robust to skip always. (This is particularly true
if tests are installed.)
This introduces a new setting TemporaryFileSystem=. This is useful
to hide files not relevant to the processes invoked by unit, while
necessary files or directories can be still accessed by combining
with Bind{,ReadOnly}Paths=.
This makes it easier to see what is going on. Crashes may happen in a
nested test_{uid,gid}_to_name_one() function, and the default backtrace
doesn't show the actual string being tested.
The Linux kernel exposes the birth time now for files through statx()
hence make use of it where available. We keep the xattr logic in place
for this however, since only a subset of file systems on Linux currently
expose the birth time. NFS and tmpfs for example do not support it. OTOH
there are other file systems that do support the birth time but might
not support xattrs (smb…), hence make the best of the two, in particular
in order to deal with journal files copied between file system types and
to maintain compatibility with older file systems that are updated to
newer version of the file system.
config_parse_join_controllers would free the destination argument on failure,
which is contrary to our normal style, where failed parsing has no effect.
Moving it to shared also allows a test to be added.
We already have the terminal open, hence pass the fd we got to
ask_password_tty(), so that it doesn't have to reopen it a second time.
This is mostly an optimization, but it has the nice benefit of making us
independent from RLIMIT_NOFILE issues and so on, as we don't need to
allocate another fd needlessly.
This new helper not only removes a file from a directory but also
ensures its space on disk is deallocated, by either punching a hole over
the full file or truncating the file afterwards if the file's link
counter is 0. This is useful in "vacuuming" algorithms to ensure that
clients can't keep the disk space the vacuuming is supposed to recover
pinned simply by keeping an fd open to it.
This is similar to string_hash_ops but operates on file system paths
specifically. It will ensure that "/foo//bar" and "///foo/bar" are
considered to be the same path for hashmap purposes.
This makes use of the existing path_compare() API, and adds a matching
hashing function for it.
Note that relative and absolute paths will hash to different values,
however whether the path is suffixed with a slash or not is not
detected. This matches the existing path_compare() behaviour, and
follows the logic that on Linux there can't be two different objects at
path /foo/bar and /foo/bar/ either.
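A sketch of a hash function with these properties. FNV-1a is used here only as a simple stand-in for whatever hash the hashmap code really uses; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One FNV-1a round over `n` bytes (stand-in hash, see above). */
static uint64_t sketch_fnv1a_feed(uint64_t h, const void *data, size_t n) {
        const unsigned char *p = data;

        for (size_t i = 0; i < n; i++) {
                h ^= p[i];
                h *= UINT64_C(0x100000001b3);
        }
        return h;
}

/* Hashes a path component by component, so that duplicate and
 * trailing slashes do not influence the result. */
uint64_t sketch_path_hash(const char *path) {
        uint64_t h = UINT64_C(0xcbf29ce484222325);

        assert(path);

        /* Absolute and relative paths get different input streams. */
        if (path[0] == '/')
                h = sketch_fnv1a_feed(h, "/", 1);

        for (const char *p = path;;) {
                size_t n;

                p += strspn(p, "/");            /* skip any run of slashes */
                if (*p == '\0')
                        break;

                n = strcspn(p, "/");
                h = sketch_fnv1a_feed(h, "/", 1);  /* component separator */
                h = sketch_fnv1a_feed(h, p, n);
                p += n;
        }

        return h;
}
```

Feeding a separator before each component keeps "ab/c" and "a/bc" apart while still ignoring how many slashes were actually written.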
This adds some paranoia code that moves some of the fds we allocate for
longer periods of time to fds > 2 if they are allocated below this
boundary. This is a paranoid safety thing, in order to avoid that
external code might end up erroneously using our fds under the assumption
they were valid stdin/stdout/stderr. Think: some app closes
stdin/stdout/stderr and then invokes 'fprintf(stderr, …' which causes
writes on our fds.
This both adds the helper to do the moving as well as ports over a
number of users to this new logic. Since we don't want to litter all our
code with invocations of this I tried to strictly focus on fds we keep
open for long periods of time only and only in code that is frequently
loaded into foreign programs (under the assumptions that in our own
codebase we are smart enough to always keep stdin/stdout/stderr
allocated to avoid this pitfall). Specifically this means all code used
by NSS and our sd-xyz API:
1. our logging APIs
2. sd-event
3. sd-bus
4. sd-resolve
5. sd-netlink
This change was inspired by this:
https://github.com/systemd/systemd/issues/8075#issuecomment-363689755
This shows that apparently IRL there are programs that do close
stdin/stdout/stderr, and we should accommodate that.
Note that this won't fix any bugs, this just makes sure that buggy
programs are less likely to interfere with our own code.
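The moving helper can be sketched along these lines. The function name is hypothetical, and keeping the original fd when duplication fails is a choice made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* If `fd` is in the 0..2 range, duplicates it to an fd >= 3, closes the
 * original and returns the new fd. Otherwise returns `fd` unchanged. */
int sketch_fd_move_above_stdio(int fd) {
        int flags, copy;

        if (fd < 0 || fd > 2)
                return fd;

        flags = fcntl(fd, F_GETFD);
        if (flags < 0)
                return fd;

        /* Preserve the close-on-exec state of the original fd. */
        if (flags & FD_CLOEXEC)
                copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
        else
                copy = fcntl(fd, F_DUPFD, 3);
        if (copy < 0)
                return fd;  /* best effort: keep using the original */

        close(fd);
        return copy;
}
```

A typical call site would wrap the allocation, for example `fd = sketch_fd_move_above_stdio(open(path, O_RDONLY))`, so that the long-lived fd never sits where a stray write to stderr could hit it.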
This is preparation for emulating the "usage_usec" keyed attribute of
the "cpu.stat" property of the root cgroup from data in /proc. Similarly,
for emulating the "memory.current" attribute.
It's not good if the paths are in different order. With --user, we expect
more paths, but it must be a strict superset, and the order for the ones
that appear in both sets must be the same.
$ diff -u <(build/systemd-analyze --global unit-paths) <(build/systemd-analyze --user unit-paths)|colordiff
--- /proc/self/fd/14 2018-02-08 14:11:45.425353107 +0100
+++ /proc/self/fd/15 2018-02-08 14:11:45.426353116 +0100
@@ -1,6 +1,17 @@
+/home/zbyszek/.config/systemd/system.control
+/run/user/1000/systemd/system.control
+/run/user/1000/systemd/transient
+/run/user/1000/systemd/generator.early
+/home/zbyszek/.config/systemd/user
/etc/systemd/user
+/run/user/1000/systemd/user
/run/systemd/user
+/run/user/1000/systemd/generator
+/home/zbyszek/.local/share/systemd/user
+/home/zbyszek/.local/share/flatpak/exports/share/systemd/user
+/var/lib/flatpak/exports/share/systemd/user
/usr/local/share/systemd/user
/usr/share/systemd/user
/usr/local/lib/systemd/user
/usr/lib/systemd/user
+/run/user/1000/systemd/generator.late
A test is added so that we don't regress on this.
The function `strv_join_quoted()` is no longer used, and has a bug
in the buffer size calculation when the strings need to be escaped,
as reported in #8056.
So, let's remove the function.
Closes #8056.
Red is used for highlighting, the same as grep does. Except when the line is
highlighted red already, because it has high priority, in which case plain ansi
highlight is used for the matched substring.
Coloring is implemented for short and cat outputs, and not for other types.
I guess we could also add it for verbose output in the future.
Case sensitive or case insensitive matching can be requested using
--case-sensitive[=yes|no].
Unless specified, matching is case sensitive if the pattern contains any
uppercase letters, and case insensitive otherwise. This matches what
forward-search does in emacs, and recently also --ignore-case in less. This
works surprisingly well, because when one wants to do case-sensitive
matching, the pattern is usually camel-cased. In the less frequent case when
case-sensitive matching is required with an all-lowercase pattern,
--case-sensitive can be used to override the automatic logic.
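The automatic logic can be sketched as follows. The names CaseMode and
pattern_is_case_sensitive() are hypothetical, and isupper() only
recognizes ASCII uppercase in the default C locale, which is a
simplification of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical tri-state: what the user asked for on the command line. */
typedef enum {
        CASE_AUTO = -1,        /* option not specified */
        CASE_INSENSITIVE = 0,  /* --case-sensitive=no */
        CASE_SENSITIVE = 1     /* --case-sensitive or --case-sensitive=yes */
} CaseMode;

/* Decide whether matching should be case sensitive. An explicit request
 * always wins; otherwise any uppercase letter in the pattern turns on
 * case-sensitive matching. */
bool pattern_is_case_sensitive(const char *pattern, CaseMode requested) {
        assert(pattern);

        if (requested != CASE_AUTO)
                return requested == CASE_SENSITIVE;

        for (const char *c = pattern; *c; c++)
                if (isupper((unsigned char) *c))
                        return true;

        return false;
}
```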
Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit
interested in SIGCHLD events for them. This scheme allowed a specific
PID to be watched by exactly 0, 1 or 2 units.
With this rework this is replaced by a single hashmap which is primarily
keyed by the PID and points to a Unit interested in it. However, it
is optionally also keyed by the negated PID, in which case it points to a
NULL terminated array of additional Unit objects also interested. This
scheme means arbitrary numbers of Units may now watch the same PID.
Runtime and memory behaviour should not be impacted by this change, as for
the common case (i.e. each PID only watched by a single unit) behaviour
stays the same, but for the uncommon case (a PID watched by more than
one unit) we only pay with a single additional memory allocation for the
array.
Why this all? Primarily, because allowing exactly two units to watch a
specific PID is not sufficient for some niche cases, as processes can
belong to more than one unit these days:
1. sd_notify() with MAINPID= can be used to attach a process from a
different cgroup to multiple units.
2. Similarly, the PIDFile= setting in unit files can be used for such
setups.
3. By creating a scope unit a main process of a service may join a
different unit, too.
4. On cgroupsv1 we frequently end up watching all processes remaining in
a scope, and if a process opens lots of scopes one after the other it
might thus end up being watched by many of them.
This patch hence removes the 2-unit-per-PID limit. It also makes a
couple of other changes, some of them quite relevant:
- manager_get_unit_by_pid() (and the bus call wrapping it) when there's
ambiguity will prefer returning the Unit the process belongs to based on
cgroup membership, and only check the watch-pids hashmap if that
fails. This change in logic is probably more in line with what people
expect and makes things more stable as each process can belong to
exactly one cgroup only.
- Every SIGCHLD event is now dispatched to all units interested in its
PID. Previously, there was some magic conditionalization: the SIGCHLD
would only be dispatched to the unit if it was interested in a
single PID only, or the PID was the control or main PID, or we
didn't dispatch a single SIGCHLD to the unit in the current event loop
iteration yet. These rules were quite arbitrary and also redundant as
the per-unit handlers would filter the PIDs anyway a second time.
With this change we'll hence relax the rules: all we do now is
dispatch every SIGCHLD event exactly once to each unit interested in
it, and it's up to the unit to then use or ignore this. We use a
generation counter in the unit to ensure that we only invoke the unit
handler once for each event, protecting us from confusion if a unit is
both associated with a specific PID through cgroup membership and
through the "watch_pids" logic. It also protects us from being
confused if the "watch_pids" hashmap is altered while we are
dispatching to it (which is a very likely case).
- sd_notify() message dispatching has been reworked to be very similar
to SIGCHLD handling now. A generation counter is used for dispatching
as well.
This also adds a new test that validates that "watch_pid" registration
and unregistration works correctly.
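The generation counter idea described above can be illustrated with a
small sketch. All names here (UnitSketch, ManagerSketch,
dispatch_sigchld_sketch()) are hypothetical, the per-unit handler is
reduced to a call counter, and the sketch does not try to deal with the
array being modified during dispatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct UnitSketch {
        uint64_t sigchld_generation; /* last event this unit was invoked for */
        unsigned n_invoked;          /* stands in for the real per-unit handler */
} UnitSketch;

typedef struct ManagerSketch {
        uint64_t sigchld_generation; /* bumped once per SIGCHLD event */
} ManagerSketch;

/* Invoke the unit's handler unless it was already invoked for this event. */
static void invoke_once(ManagerSketch *m, UnitSketch *u) {
        if (!u)
                return;
        if (u->sigchld_generation == m->sigchld_generation)
                return;

        u->sigchld_generation = m->sigchld_generation;
        u->n_invoked++;
}

/* Dispatch one SIGCHLD event to: the unit found via cgroup membership,
 * the unit stored under the PID key, and the NULL terminated array
 * stored under the negated PID key. Any of the three may be NULL, and
 * the same unit may appear more than once. */
void dispatch_sigchld_sketch(
                ManagerSketch *m,
                UnitSketch *by_cgroup,
                UnitSketch *primary,
                UnitSketch **extra) {

        assert(m);

        m->sigchld_generation++;

        invoke_once(m, by_cgroup);
        invoke_once(m, primary);

        if (extra)
                for (; *extra; extra++)
                        invoke_once(m, *extra);
}
```

Even if a unit is reachable through all three routes, its handler runs
exactly once per event, and it is then up to the unit to use or ignore
the notification.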
The new flag returns the O_PATH fd of the final component, which may be
converted into a proper fd by open()ing it again through the
/proc/self/fd/xyz path.
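Converting such an fd might look like the sketch below. The helper name
fd_reopen_sketch() is hypothetical, and the code assumes /proc is
mounted. Note that O_NOFOLLOW must not be passed when reopening, since
the entries in /proc/self/fd/ are themselves (magic) symlinks.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: turn an O_PATH fd into a regular fd by opening
 * it again through /proc/self/fd/. Returns a new fd, or -1 with errno
 * set. The O_PATH fd stays open and remains owned by the caller. */
int fd_reopen_sketch(int path_fd, int flags) {
        /* Room for the prefix, the digits of an int, and the NUL. */
        char procfs_path[sizeof("/proc/self/fd/") + 3 * sizeof(int)];

        assert(path_fd >= 0);
        assert(!(flags & O_CREAT)); /* no mode argument is passed below */

        snprintf(procfs_path, sizeof(procfs_path), "/proc/self/fd/%i", path_fd);

        return open(procfs_path, flags | O_CLOEXEC);
}
```

The point of the two-step approach is that all checks can be done on the
O_PATH fd first, and the reopen then refers to exactly the inode that was
checked, rather than resolving the path a second time.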
Together with O_SAFE this provides us with a somewhat safe way to open()
files in directories potentially owned by unprivileged code, where we
want to refuse operation if any symlink tricks are played pointing to
privileged files.
When the flag is specified we won't transition to a privilege-owned
file or directory from an unprivileged-owned one. This is useful when
privileged code wants to load data from a file unprivileged users have
write access to, and validates the ownership, but wants to make sure
that no symlink games are played to read a root-owned system file
belonging to a different context.
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.
Let's hence drop inclusion of:
1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, ditto.
3. sys/signalfd.h which we can replace by a simple struct forward
declaration
4. process-util.h which was needed for getpid_cached() which we now hide
in a function log_emergency_level() instead, which nicely abstracts
the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
forward declaration suffices for that too.
Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.
(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
This adds a "watch-bind" feature to sd-bus connections. If set and the
AF_UNIX socket we are connecting to doesn't exist yet, we'll establish
an inotify watch instead, and wait for the socket to appear. In other
words, a missing AF_UNIX socket just makes connecting slower.
This is useful for daemons such as networkd or resolved that shall be
able to run during early-boot, before dbus-daemon is up, and want to
connect to dbus-daemon as soon as it becomes ready.
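The waiting part can be sketched with plain inotify as below. This is a
blocking illustration only, not how sd-bus integrates the watch into its
state machine: wait_for_entry_sketch() is a hypothetical name, and the
timeout is applied per wakeup rather than overall, which is a
simplification.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: wait until an entry called "name" exists in
 * directory "dir". Returns 0 once it exists, -ETIMEDOUT if nothing
 * happened within timeout_ms, or another negative errno on failure. */
int wait_for_entry_sketch(const char *dir, const char *name, int timeout_ms) {
        char buf[4096]; /* events are only drained, never parsed */
        int ifd, dfd = -1, r;

        assert(dir);
        assert(name);

        ifd = inotify_init1(IN_CLOEXEC);
        if (ifd < 0)
                return -errno;

        /* Install the watch first, then check for existence, so that a
         * creation between the two steps cannot be missed. */
        if (inotify_add_watch(ifd, dir, IN_CREATE | IN_MOVED_TO) < 0) {
                r = -errno;
                goto finish;
        }

        dfd = open(dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
        if (dfd < 0) {
                r = -errno;
                goto finish;
        }

        for (;;) {
                struct pollfd p = { .fd = ifd, .events = POLLIN };

                if (faccessat(dfd, name, F_OK, 0) == 0) {
                        r = 0;
                        break;
                }
                if (errno != ENOENT) {
                        r = -errno;
                        break;
                }

                r = poll(&p, 1, timeout_ms);
                if (r < 0) {
                        if (errno == EINTR)
                                continue;
                        r = -errno;
                        break;
                }
                if (r == 0) {
                        r = -ETIMEDOUT;
                        break;
                }

                /* Something changed in the directory: drain and recheck. */
                if (read(ifd, buf, sizeof(buf)) < 0 && errno != EINTR && errno != EAGAIN) {
                        r = -errno;
                        break;
                }
        }

finish:
        if (dfd >= 0)
                close(dfd);
        close(ifd);
        return r;
}
```

A client would call this with the socket's parent directory and file name
after connect() failed with ENOENT, and then retry the connect() once it
returns 0.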