This adds four new flags:
- If CONF_FILES_DIRECTORY is specified conf_file_list() and friends
will look for directories only.
- Similarly, CONF_FILES_REGULAR means we'll look only for regular files.
- If CONF_FILES_BASENAME is specified the resulting list will contain
only the basenames of all discovered files or directories, not the
full paths.
- If CONF_FILES_FILTER_MASKED is specified the resulting list will have
masked entries removed (i.e. those symlinked to /dev/null and
suchlike)
These four flags are useful for discovering portable service profile
information.
While we are at it, also improve a couple of other things:
- More debug logging
- use path_hash_ops instead of string_hash_ops when putting together the
path lists
Most of our other parsing functions do this, let's do this here too,
internally we accept that anyway. Also, the closely related
load_env_file() and load_env_file_pairs() also do this, so let's be
systematic.
We already have a flag for creating a new mount namespace for the child.
Let's add an extension to that: a new FORK_MOUNTNS_SLAVE flag. When
used in combination, it will mark all mounts in the child namespace as
MS_SLAVE so that the child can freely mount or unmount stuff but it
won't leak into the parent.
This call creates an fd from another fd containing the same data.
Specifically, repeated read() on the returned fd should return the same
data as the original fd. This call is useful when we want to copy data
out of disk images and suchlike, and want to be able to pass fds with the
data around without having to keep the disk image continuously mounted.
The implementation tries to be somewhat smart and tries to prefer
memfds/pipes over files in /tmp or /var/tmp based on the size of the
data, but has appropriate fallbacks in place.
If we log to the pty that is configured as stdin/stdout/stderr of the
container too early we risk filling it up in full before we start
processing the pty from the parent process, resulting in deadlocks.
Let's hence keep a copy of the original tty we were started on before
setting up stdin/stdout/stderr, so that we can log to it, and keep using
it as long as we can.
Since the kernel's pty internal buffer is pretty small this actually
triggered deadlocks when we debug logged a lot from nspawn's child
processes, see: https://github.com/systemd/systemd/pull/9024#issuecomment-390403674
With this change we won't use the pty at all, only the actual payload we
start will, and hence we won't deadlock on it, ever.
The macro is inspired by the other string table macros, and takes the
same arguments in the same order and dumps a string table to stdout.
Since it's typesafe it's nice to implement this as a macro rather than
a regular function.
This new macro is useful for implementing commands such as "systemctl -t
help" and similar, i.e. wherever we want to dump all values of an enum
to stdout.
Functions whose only purpose is to be used with _cleanup_() should not
touch errno, so that failing removals do not alter errno at unexpected
places.
This is already done in unlink_and_freep(), rmdir_and_freep(),
rm_rf_physical_and_freep(), hence do so for unlink_tempfilep(), too.
Follow-up for #9013
This simplifies the use of tempfiles in tests and fixes "leaked"
temporary files in test-fileio, test-catalog, test-conf-parser.
Not the whole tree is converted.
And port config_parse_exec_oom_score_adjust() over to use it.
While we are at it, let's also fix config_parse_exec_oom_score_adjust()
to accept an empty string for turning off OOM score adjustments set
earlier.
POSIX doesn't declare too clearly how RLIM_INFINITY is set. Let's hence
filter it out explicitly early on, just as safety precaution should it
be defined weirdly on some arch, for example negative or below the
maximum value of the rlim_t type.
Configuration through environment variables is inconvenient with meson, because
they cannot be conveniently changed and/or are not preserved during
reconfiguration (https://github.com/mesonbuild/meson/issues/1503).
This adds -Dvalgrind=true/false, which has the advantage that it can be set
at any time with meson configure -Dvalgrind=... and ninja will rebuild targets
as necessary. Additional minor advantages are better consistency with the
options for hashmap debugging, and typo avoidance with '#if' instead of '#ifdef'.
We were inconsistently using them in some cases, but in the majority not.
Using assignment in assert_se is very common, not an exception like in
'if', so let's drop the extra parens everywhere.
When I see "test", I have to think three times what the return value
means. With "below" this is immediately clear. ratelimit_below(&limit)
sounds almost like English and is imho immediately obvious.
(I also considered ratelimit_ok, but this strongly implies that being under the
limit is somehow better. Most of the times this is true, but then we use the
ratelimit to detect triple-c-a-d, and "ok" doesn't fit so well there.)
C.f. a1bcaa07.
We can jump to chase_one from two places. In the first 'todo' is set to
'buffer', which comes from path_make_absolute_cwd() and is nonnull. In the
second 'todo' is set to 'joined' which is checked to be nonnull a few lines
above the jump. So let's kill the code that deals with null todo there.
CID #1390941.
This means that when those targets are built, all the sources are built again,
instead of reusing the work done to create libbasic.a and other convenience static
libraries. It would be nice to not do this, but there seems to be no support in
our toolchain for joining multiple static libraries into one. When linking
a static library, any -l arguments are simply ignored by ar/gcc-ar, and .a
libraries given as positional arguments are copied verbatim into the archive
so the objects in them cannot be accessed.
https://stackoverflow.com/questions/2157629/linking-static-libraries-to-other-static-libraries
suggests either unzipping all the archives and putting them back together,
or using a linker script. Unzipping and zipping back together seems ugly.
The other option is not very nice. The linker script language does not
allow "+" to appear in the filenames, and filenames that meson generates
use that, so files would have to be renamed before a linker script was used.
And we would have to generate the linker script on the fly. Either way, this
doesn't seem attractive. Since those static libraries are a niche use case,
it seems reasonable to just go with the easiest and safest solution and
recompile all the source files. Thanks to ccache, this is probably almost as
cheap as actually reusing the convenience .a libraries.
test-libsystemd-sym.c and test-libudev-sym.c compile fine with the generated
static libs, so it seems that they indeed provide all the symbols they should.
if we lack privs to create device nodes that's fine, and creating
/run/systemd/inaccessible/chr or /run/systemd/inaccessible/blk won't
work then. Document this in longer comments.
Fixes: #4484
Before this, `signal_from_string()` accepts simple signal name
or RTMIN+n. This makes the function also accept RTMIN, RTMAX,
and RTMAX-n.
Note that RTMIN+0 is equivalent to RTMIN, and RTMAX-0 is to RTMAX.
This also fixes the integer overflow reported by oss-fuzz #8064.
https://oss-fuzz.com/v2/testcase-detail/5648573352902656
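The accepted realtime forms can be sketched roughly as follows. This is not
the code from the tree: rt_signal_from_string_sketch() is a hypothetical
helper that handles only RTMIN, RTMIN+n, RTMAX and RTMAX-n (no plain signal
names), and it bounds-checks the offset before doing arithmetic so that huge
inputs cannot overflow.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static int rt_signal_from_string_sketch(const char *s) {
        int base, sign;
        long offset;
        char *end;

        if (strncmp(s, "RTMIN", 5) == 0) {
                base = SIGRTMIN;
                sign = 1;
        } else if (strncmp(s, "RTMAX", 5) == 0) {
                base = SIGRTMAX;
                sign = -1;
        } else
                return -EINVAL;
        s += 5;

        if (*s == '\0')
                return base;                    /* bare RTMIN or RTMAX */

        /* RTMIN only takes "+n", RTMAX only takes "-n" */
        if (*s != (sign > 0 ? '+' : '-'))
                return -EINVAL;
        s++;

        if (*s < '0' || *s > '9')               /* no second sign, no blanks, no empty offset */
                return -EINVAL;

        errno = 0;
        offset = strtol(s, &end, 10);
        if (errno != 0 || *end != '\0')
                return -EINVAL;

        /* Check the range first, then compute the signal number */
        if (offset > SIGRTMAX - SIGRTMIN)
                return -ERANGE;

        return base + sign * (int) offset;
}
```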
Let's simplify the code a bit. Let's reduce the number of redundant if
checks a bit, (i.e. if we want to check for equality with
VIRTUALIZATION_VM_OTHER there's no need to check for non-equality with
VIRTUALIZATION_NONE first). As a very welcome side-effect this means we
lose some lines of code and our level of indentation is reduced.
No changes in behaviour.
This extends the change done in b29f6480ec to other logging functions.
This actually fixes some bugs in callers of log_struct(), for example
config_parse_alias() called 'return log_syntax(..., 0, ...)' which could result
in a bogus non-zero return value.
Calls to log_object() and log_format_iovec() — which is only used by
server_driver_message() — appear correct.
Let's optionally translate BSD exit codes to error strings too.
My first approach on adding this was to turn ExitStatusLevel into a
bitmask rather than a linear level, with one bit for the various feature
bits. However, the exit code ranges are generally not defined
independently from each other, i.e. our own ones are defined with the
LSB ones in mind, and most sets are defined with the ISO C ones.
Hence, instead I changed the existing hierarchy of MINIMAL, SYSTEMD, LSB
with an alias of FULL == LSB, only slightly by separating FULL and LSB
into two separate levels, so that there's now:
1. MINIMAL (only EXIT_SUCCESS/EXIT_FAILURE)
2. SYSTEMD (incorporating our own exit codes)
3. LSB (like SYSTEMD but adding in LSB service exit codes)
4. FULL (like LSB but adding BSD exit codes)
Note that across the codebase only FULL, SYSTEMD, and MINIMAL are used,
depending on context, how much we know about the process and whether we
are logging for debugging purposes or not. This means the LSB level
wouldn't really have to be separate, but it appeared careless to me to
fold it into FULL along with the BSD exit codes.
Note that this commit doesn't change much for regular codepaths: the
FULL exit status level is only used during debug logging, as a helper to
the user reading the debug logs.
Of course, alloca() shouldn't be used with anything that can grow
without bounds anyway, but better safe than sorry, so let's catch this
early.
Since alloca() is not supposed to return an error we trigger an
assert() instead, which is still better than heap trickery.
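A rough sketch of such a guarded allocation macro, using GNU statement
expressions. The names newa_sketch() and ALLOCA_MAX_SKETCH and the 4 MiB
bound are assumptions chosen for this example, not necessarily what the tree
uses.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound for any single stack allocation */
#define ALLOCA_MAX_SKETCH (4U * 1024U * 1024U)

/* Allocate n objects of type t on the stack, asserting on oversized requests.
 * Dividing the limit by sizeof(t) also rules out multiplication overflow. */
#define newa_sketch(t, n)                                               \
        ({                                                              \
                size_t _n_ = (n);                                       \
                assert(_n_ <= ALLOCA_MAX_SKETCH / sizeof(t));           \
                (t *) alloca(sizeof(t) * _n_);                          \
        })
```

Used as "int *p = newa_sketch(int, 4);", the memory is released when the
calling function returns, as with plain alloca().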
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.
Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.
So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.
This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:
1. strv_length()'s return type becomes size_t
2. the unit file changes array size becomes size_t
3. DNS answer and query array sizes become size_t
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
If the main config file or one of the drop-ins did not have the final newline,
there would be no separating empty line (or if this was the last file
displayed, our own output would end without the final newline, possibly running
into the subsequent prompt or such). copy_bytes() does not know anything about
lines, so let's just use a normal loop with read_line() and puts().
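The line-wise loop can be sketched with plain libc calls. This example uses
POSIX getline() and fputs() as stand-ins for read_line() and puts(), and
cat_lines_sketch() is a hypothetical name; the point is only that every line
is written with a terminating newline, whether or not the input had one.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Copy "in" to "out" line by line, so that the output always ends in a newline. */
static int cat_lines_sketch(FILE *in, FILE *out) {
        char *line = NULL;
        size_t allocated = 0;
        ssize_t n;
        int r = 0;

        while ((n = getline(&line, &allocated, in)) >= 0) {
                /* Strip the trailing newline if there is one ... */
                if (n > 0 && line[n - 1] == '\n')
                        line[n - 1] = '\0';

                /* ... and unconditionally add it back */
                if (fputs(line, out) == EOF || fputc('\n', out) == EOF) {
                        r = -EIO;
                        break;
                }
        }

        if (r == 0 && ferror(in))
                r = -EIO;

        free(line);
        return r;
}
```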
This implements similar logic as conf_files_cat(), but with slightly different
file gathering logic. I also want to add support for replacement files later on,
so it seems better to keep those two file-gathering functions separate.
This is used as 'systemd-analyze show-config systemd/logind.conf', which
will dump
/etc/systemd/system/user@.service
/etc/systemd/system/user@.service.d/*.conf
/run/systemd/system/user@.service.d/*.conf
/usr/local/lib/systemd/system/user@.service.d/*.conf
/usr/lib/systemd/system/user@.service.d/*.conf
The idea is to make it easy to dump the configuration using the same locations
and order that systemd programs use themselves (including masking, in the right
order, etc.). This is the generic variant that works with any configuration
scheme that follows the same general rules:
$ systemd-analyze cat-config systemd/system.conf
$ systemd-analyze cat-config systemd/user.conf
$ systemd-analyze cat-config systemd/logind.conf
$ systemd-analyze cat-config systemd/sleep.conf
$ systemd-analyze cat-config systemd/journald.conf
$ systemd-analyze cat-config systemd/journal-remote.conf
$ systemd-analyze cat-config systemd/journal-upload.conf
$ systemd-analyze cat-config systemd/coredump.conf
$ systemd-analyze cat-config systemd/resolved.conf
$ systemd-analyze cat-config systemd/timesyncd.conf
$ systemd-analyze cat-config udev/udev.conf
We use MTUs all over the place, let's add a unified, strict parser for
it, that takes MTU ranges into account.
We already have parse_ifindex() close-by, hence this appears to be a
natural addition, in particular as the range checking is not entirely
trivial to do, as it depends on the protocol used.
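A minimal sketch of a strict, protocol-aware MTU parser. The function name,
the constant names and the exact error codes are assumptions for this
example; the lower bounds are the well-known minimum MTUs of IPv4 (68) and
IPv6 (1280).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>

#define IPV4_MIN_MTU_SKETCH 68U      /* RFC 791 */
#define IPV6_MIN_MTU_SKETCH 1280U    /* RFC 8200 */

static int parse_mtu_sketch(int family, const char *s, uint32_t *ret) {
        unsigned long long u;
        uint32_t min;
        char *end;

        /* Be strict: no sign, no leading whitespace, no empty string */
        if (*s < '0' || *s > '9')
                return -EINVAL;

        errno = 0;
        u = strtoull(s, &end, 10);
        if (errno == ERANGE)
                return -ERANGE;
        if (errno != 0 || *end != '\0')
                return -EINVAL;
        if (u > UINT32_MAX)
                return -ERANGE;

        /* The lower bound depends on the protocol in use */
        min = family == AF_INET6 ? IPV6_MIN_MTU_SKETCH : IPV4_MIN_MTU_SKETCH;
        if (u < min)
                return -ERANGE;

        *ret = (uint32_t) u;
        return 0;
}
```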
If enabling controller for some reason fails we need to clear error
for the FILE stream. Enabling remaining controllers would otherwise
fail because write_string_stream_ts() checks for ferror(f) and returns
-EIO if there is one.
Broken by commit <77fa610b22>.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This drops a good number of type-specific _cleanup_ macros, and patches
all users to just use the generic ones.
In most recent code we abstained from defining type-specific macros, and
this basically removes all those added already, with the exception of
the really low-level ones.
Having explicit macros for this is not too useful, as the expression
without the extra macro is generally just 2ch wider. We should generally
emphasize generic code, unless there are really good reasons for
specific code, hence let's follow this in this case too.
Note that _cleanup_free_ and similar really low-level, libc'ish, Linux
API'ish macros continue to be defined, only the really high-level OO
ones are dropped. From now on this should really be the rule: for really
low-level stuff, such as memory allocation, fd handling and so on, go
ahead and define explicit per-type macros, but for high-level, specific
program code, just use the generic _cleanup_() macro directly, in order
to keep things simple and as readable as possible for the uninitiated.
Note that before this patch some of the APIs (notably libudev ones) were
already used with the high-level macros at some places and with the
generic _cleanup_ macro at others. With this patch we hence unify on the
latter.
A regression was introduced that caused the mtime of /etc/.updated
and /var/.updated to be the current time when systemd-update-done
ran instead of being copied from /usr.
This was nearly fixed, but due to fflush being called after mtime
was carefully set, it was overwritten with the current time.
Regression introduced in 872c403963
A fix was just missed in 39c38d773f
Fixes #8806
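The ordering problem can be shown with a small sketch: flush the stdio buffer
first, and only then set the timestamp, so that no later write() bumps mtime
back to the current time. write_stamp_sketch() is a hypothetical helper, not
the function from systemd-update-done.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

static int write_stamp_sketch(FILE *f, const char *text, time_t mtime_sec) {
        struct timespec ts[2] = {
                { .tv_sec = mtime_sec, .tv_nsec = 0 },   /* atime */
                { .tv_sec = mtime_sec, .tv_nsec = 0 },   /* mtime */
        };

        if (fputs(text, f) == EOF)
                return -EIO;

        /* Flush first: data written out later would update mtime to "now" */
        if (fflush(f) == EOF)
                return -errno;

        /* Only now is it safe to set the timestamps */
        if (futimens(fileno(f), ts) < 0)
                return -errno;

        return 0;
}
```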
With the recent terminal_urlify() APIs we'll now sometimes generate
clickable link CSO sequences. Hence we should also be able to remove
them again from strings. This beefs up the logic to do so.
Follow-up for: 23b27b39d2
Quoting https://github.com/systemd/systemd/pull/8760#discussion_r183321060:
> When we originally added the errno patching we went for a "best of both
> worlds" approach, i.e. that we override errno if an error is specified, but
> if no error is specified (i.e. 0 is passed as error code) then we use the
> previously set errno, similar in style how plain `printf()` would do it. In
> retrospect I think we almost never purposefully made use of the second,
> i.e. the plain `printf()` logic, but we multiple times ran into this case
> accidentally and introduced a bug. Hence yes, it probably makes sense to
> switch this over, and consistently ignore the `errno` already set and always
> override it with the error passed in. The only problem I see with that is: I
> wonder if there might be a case or two lurking somewhere where we actually
> made use of the "best of both worlds" approach, and if so, if we can detect
> where... (But then again, even if there is, and we fail to find those cases,
> maybe that's not all bad, as it's just a few new bugs against probably fixing
> many more old and future bugs, if you follow what I mean).
I scanned our codebase, and found some bugs in the value passed to log_*_errno,
but no intentional cases of error=0 being passed.
We don't know what the container payload needs, hence default to a PATH
with both bin and sbin included, as well as / and /usr.
Follow-up for #8324
Fixes: #8698
Newer terminals (in particular gnome-terminal) understand a special escape
sequence for formatting clickable links. Let's support that to make our
tool output more clickable where that's appropriate.
For details see this:
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
The one big issue is that 'less' currently doesn't grok this, and
doesn't ignore the sequence like regular terminal implementations do if they
don't support it. Hence for now, let's disable URL output if a pager is
used. We should revisit that though as soon as less has added support for it
and enough time has passed for it to enter various distributions.
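For reference, a sketch of how such a link can be put together, following the
OSC 8 format described in the linked gist (ESC ] 8 ; ; URL ST text
ESC ] 8 ; ; ST, with ST written as ESC \). urlify_sketch() is a hypothetical
helper and not the terminal_urlify() from the tree, which may differ in
terminator and signature.

```c
#include <stdio.h>
#include <stdlib.h>

/* ESC ] 8 ; ; <url> ESC \ <text> ESC ] 8 ; ; ESC \ */
#define OSC8_FORMAT_SKETCH "\x1B]8;;%s\x1B\\%s\x1B]8;;\x1B\\"

/* Returns a newly allocated string, or NULL on failure. Caller frees. */
static char *urlify_sketch(const char *url, const char *text) {
        char *s;
        int n;

        n = snprintf(NULL, 0, OSC8_FORMAT_SKETCH, url, text);
        if (n < 0)
                return NULL;

        s = malloc((size_t) n + 1);
        if (!s)
                return NULL;

        snprintf(s, (size_t) n + 1, OSC8_FORMAT_SKETCH, url, text);
        return s;
}
```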
Double newlines (i.e. one empty line) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tighter.
To indicate that there are no more entries, these wrappers return false but
did leave the passed pointer unmodified.
However EOF is not an error and is a very common case so initialize the output
argument to NULL even in this case so callers don't need to do that.
Fixes: #8721
We check the same condition at various places. Let's add a trivial,
common helper for this, and use it everywhere.
It's not going to make things much faster or much shorter, but I think a
lot more readable.
If the flag is set only a single step of the normalization is executed,
and the resulting path is returned.
This allows callers to normalize piecemeal, taking into account every
single intermediary path of the normalization.
We have plenty of code in our codebase that outputs tables to the
console, and all is homegrown and awful. Let's replace it with a generic
implementation that can do automatically what the old implementations
did manually.
Features:
1. Ellipsation (for fields overly long) and alignment (for
fields overly short)
2. Sorting of rows
3. automatically copies formatting from the same cell in the row above
4. Heavy use of varargs to make putting together tables easy
5. can expand and compress tables, with weights
6. Has a minimal understanding of unicode wide characters in order to
match unicode strings to character cell terminals.
7. Columns can be reordered and individually turned off.
8. pretty printing for various data types
And more.
pager.[ch] doesn't use any APIs from src/libsystemd/ or src/shared/
hence there's no reason for it to be in src/shared/, let's move it to
src/basic/ instead.
This enables us to use pager.[ch] APIs from other code in src/basic/,
for example pager_have() and suchlike.
This primarily changes two things:
1. Ellipsation to 0, 1 or 2 characters is now supported. Previously we'd
hit an assert if the new length was < 3, this is now permitted. The
result strings won't show too much info still of course, but the code
becomes a bit more generic and robust to use.
2. If a UTF-8 mode is disabled and the input string is pure ASCII, then
"..." is used for ellipsation, otherwise (as before) "…". This means
on a pure-ASCII system we should remain pure-ASCII, matching
behaviour otherwise exposed with special_glyph() and friends. Note
that we'll use "…" for ellipsation as soon as either the locale
settings indicate an UTF-8 mode or the input string already contains
non-ASCII unicode characters.
Testing for these special cases is improved.
The NULSTR_FOREACH iterator needs to see an empty string at the end, so
we need to insert an extra NUL explicitly.
Also update PATH0_BIN_SBIN(x) to include an extra NUL terminator, rename
it to PATH_BIN_SBIN_NULSTR(x), which is more consistent with the similar
CONF_PATHS_NULSTR(x) macro.
Fixes: 5008da1ec1
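Why the extra NUL matters can be seen in a small sketch of such an iterator.
NULSTR_FOREACH_SKETCH() and nulstr_count_sketch() are stand-ins written for
this example, not the macros from the tree; the loop stops at the first empty
string, i.e. where two NULs follow each other.

```c
#include <stddef.h>
#include <string.h>

/* Iterate over a list of NUL-separated strings that ends in an empty string */
#define NULSTR_FOREACH_SKETCH(i, l) \
        for ((i) = (l); (i) && *(i); (i) = strchr((i), '\0') + 1)

static size_t nulstr_count_sketch(const char *l) {
        const char *i;
        size_t n = 0;

        NULSTR_FOREACH_SKETCH(i, l)
                n++;

        return n;
}
```

With a literal such as "/usr/bin\0/usr/sbin\0" the explicit trailing \0 plus
the implicit terminator of the string literal form the empty string the loop
needs. Without the explicit one, the iterator would read past the end of the
literal.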
Absolute paths make everything simple and quick, but sometimes this requirement
can be annoying. A good example is calling 'test', which will be located in
/usr/bin/ or /bin depending on the distro. The need to provide the full path
makes it harder to write a portable unit file in such cases.
This patch uses a fixed search path (DEFAULT_PATH which was already used as the
default value of $PATH), and if a non-absolute file name is found, it is
immediately resolved to a full path using this search path when the unit is
loaded. After that, everything behaves as if an absolute path was specified. In
particular, the executable must exist when the unit is loaded.
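The resolution step can be sketched as a lookup in a fixed, colon-separated
search path. find_executable_sketch() and its signature are assumptions for
this example and not the helper used in the tree; absolute names are passed
through unchanged, everything else must exist and be executable in one of the
listed directories.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int find_executable_sketch(const char *name, const char *search_path,
                                  char *ret, size_t size) {
        const char *p;
        int n;

        if (name[0] == '/') {
                /* Already absolute: nothing to resolve */
                n = snprintf(ret, size, "%s", name);
                return n >= 0 && (size_t) n < size ? 0 : -ENAMETOOLONG;
        }

        if (strchr(name, '/'))
                return -EINVAL;         /* relative paths with directories are refused */

        for (p = search_path; *p;) {
                size_t len = strcspn(p, ":");

                if (len > 0) {
                        n = snprintf(ret, size, "%.*s/%s", (int) len, p, name);
                        if (n >= 0 && (size_t) n < size && access(ret, X_OK) == 0)
                                return 0;       /* first match wins */
                }

                p += len;
                if (*p == ':')
                        p++;
        }

        return -ENOENT;
}
```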
If a tool only invokes copy_bytes() a single time the _unlikely_() will always be
wrong, and is hence not useful. Let's drop it and let the compiler
figure out what to do, instead of misleading it.
Also, some coding style improvements.
splice() ignores O_NONBLOCK on pipes but not on other fds. Let's handle
that properly, and query O_NONBLOCK manually in that case, ensuring
systematic behaviour in either case.
Optionally, when we copy between fds with simple read/write, let's
return any remaining data we already read into the buffer if write
fails. This is useful to allow callers to use the read data otherwise,
perhaps implementing a different fallback for copying.
It does two things:
1. It derives the element size from the array argument type
2. It derives the right type for the function from the array argument
type
Using this macro call should make the invocations of qsort() quite a bit
safer.
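A sketch of what such a macro can look like, using GNU typeof and statement
expressions. typesafe_qsort_sketch() and cmp_int_sketch() are names made up
for this example; the cast of the comparison function to the generic qsort()
prototype is the usual trick and relies on all object pointers sharing one
representation.

```c
#include <stddef.h>
#include <stdlib.h>

#define typesafe_qsort_sketch(p, n, func)                                       \
        ({                                                                      \
                /* 2. The function type is derived from the element type,  */   \
                /*    so a mismatching comparator is diagnosed at compile  */   \
                /*    time.                                                */   \
                int (*_func_)(const __typeof__((p)[0]) *,                       \
                              const __typeof__((p)[0]) *) = (func);             \
                /* 1. The element size is derived from the array argument. */   \
                qsort((p), (n), sizeof((p)[0]),                                 \
                      (int (*)(const void *, const void *)) _func_);            \
        })

static int cmp_int_sketch(const int *a, const int *b) {
        return (*a > *b) - (*a < *b);
}
```

Callers no longer pass sizeof() by hand, and the comparison function takes
typed pointers instead of const void *.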
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd be nice to
obtain explicit acks from all involved authors before doing that.
Things can fail, and we have no control over it:
- file system issues (immutable bits, file system errors, MAC refusals, etc)
- kernel refusing certain arguments when writing to /proc/sys or /sys
Let's add a new code for the case where we parsed configuration but failed
to execute it because of external errors.
Accept definitions to other AF_ constants, not just PF_ ones,
such as:
#define AF_LINUX AF_LOCAL
It may not be necessary to impose any restriction on the
definitions of the macros extracted, but for now
keep most of that requirement but match AF_* as well.
We have the same code for this in place at various locations, let's
unify that. Also, let's repurpose test-fs-util.c as a test for this new
helper call.
This reworks the SELinux and SMACK label fixing calls in a number of
ways:
1. The two separate boolean arguments of these functions are converted
into a flags type LabelFixFlags.
2. The operations are now implemented based on O_PATH. This should
   resolve TOCTTOU races between determining the label for the file
   system object and applying it, as it allows to pin the object
while we are operating on it.
3. When changing a label fails we'll query the label previously set, and
if matches what we want to set anyway we'll suppress the error.
Also, all calls to label_fix() are now (void)ified, when we ignore the
return values.
Fixes: #8566
Not that it matters much, but it seems cleaner to also count those
inputs, even if they do not consume extra storage space.
The test is extended to include an empty input and counts in the test are
adjusted to include it.
https://clang.llvm.org/docs/MemorySanitizer.html#id5 documents this
check as the way to detect MemorySanitizer at compilation time. We
only need to skip the test if MemorySanitizer is used.
Also, use this condition in cg_slice_to_path(). There, the code that is
conditionalized is not harmful in any way (it's just unnecessary), so remove
the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION condition.
Fixes #8482.
When we are attempting to create directory somewhere in the bowels of /var/lib
and get an error that it already exists, it can be quite hard to diagnose what
is wrong (especially for a user who is not aware that the directory must have
the specified owner, and permissions not looser than what was requested). Let's
print a warning in most cases. A warning is appropriate, because such state is
usually a sign of borked installation and needs to be resolved by the administrator.
$ build/test-fs-util
Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing.
Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting.
No functional change except for the new log lines.
This is similar to TAKE_PTR() but operates on file descriptors, and thus
assigns -1 to the fd parameter after returning it.
Removes 60 lines from our codebase. Pretty good too I think.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
This rearranges chase_symlinks() a bit: if no special flags are
specified it will now revert to behaviour before
b12d25a8d6. However, if the new
CHASE_TRAIL_SLASH flag is specified it will follow the behaviour
introduced by that commit.
I wasn't sure which one to make the behaviour that requires specification
of a flag to enable. I opted to make the "append trailing slash"
behaviour the one to enable by a flag, following the thinking that the
function should primarily be used to generate a normalized path, and I
am pretty sure a path without trailing slash is the more "normalized"
one, as the trailing slash is not really a part of it, but merely a
"decorator" that tells various system calls to generate ENOTDIR if the
path doesn't refer to a directory.
Or to say this differently: if the slash was part of normalization then
we really should add it in all cases when the final path is a directory,
not just when the user originally specified it.
Fixes: #8544
Replaces: #8545
safe_atou16_full() is like safe_atou16() but also takes a base
parameter. safe_atou16() is then implemented as inline function on top
of it, passing 0 as base. Similarly, safe_atoux16() is reworked as inline
function too, with 16 as base.
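The layering can be sketched on top of strtoul(). The _sketch functions below
are written for this example and only mirror the structure described above
(one "full" variant taking a base, two thin inline wrappers); the error
handling in the tree may be stricter.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

static int safe_atou16_full_sketch(const char *s, unsigned base, uint16_t *ret) {
        unsigned long l;
        char *end = NULL;

        errno = 0;
        l = strtoul(s, &end, (int) base);
        if (errno > 0)
                return -errno;
        if (end == s || *end != '\0')
                return -EINVAL;         /* empty input or trailing garbage */
        if (l > UINT16_MAX)
                return -ERANGE;         /* also catches wrapped negative values */

        *ret = (uint16_t) l;
        return 0;
}

/* Base 0: auto-detect decimal, octal ("0" prefix) or hex ("0x" prefix) */
static inline int safe_atou16_sketch(const char *s, uint16_t *ret) {
        return safe_atou16_full_sketch(s, 0, ret);
}

/* Base 16: always hexadecimal */
static inline int safe_atoux16_sketch(const char *s, uint16_t *ret) {
        return safe_atou16_full_sketch(s, 16, ret);
}
```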
The warning is not emitted for absolute paths like /dev/sda or /home, which are
converted to .device and .mount unit names without any fuss.
Most of the time it's unlikely that users use invalid unit names on purpose,
so let's warn them. Warnings are silenced when --quiet is used.
$ build/systemctl show -p Id hello@foo-bar/baz
Invalid unit name "hello@foo-bar/baz" was escaped as "hello@foo-bar-baz" (maybe you should use systemd-escape?)
Id=hello@foo-bar-baz.service
$ build/systemd-run --user --slice foo-bar/baz --unit foo-bar/foo true
Invalid unit name "foo-bar/foo" was escaped as "foo-bar-foo" (maybe you should use systemd-escape?)
Invalid unit name "foo-bar/baz" was escaped as "foo-bar-baz" (maybe you should use systemd-escape?)
Running as unit: foo-bar-foo.service
Fixes #8302.
The original reproducer from oss-fuzz depends on the hostname (via %H and %c).
The hostname needs a dash for msan to report this, so a simpler case from
@evverx with the dash hardcoded is also added.
The issue is a false positive from msan, which does not instrument stpncpy
(https://github.com/google/sanitizers/issues/926). Let's add a work-around
until this is fixed.
We have only three bits of space, i.e. 8 possible classes. Immediately reject
anything outside of that range. Add the fuzzer test case and an additional
unit test.
oss-fuzz #6908.
We probably should not allow very deep calls of our recursive functions. Let's add
a limit to avoid resource exhaustion. 240 is 10 per hour (if somebody is using
this for time based triggers...), so it should be more than enough for most use
cases, and is conveniently below the 250 stack limit in msan.
oss-fuzz #6917.
"noreturn" is reserved and can be used in other header files we include:
[ 16s] In file included from /usr/include/gcrypt.h:30:0,
[ 16s] from ../src/journal/journal-file.h:26,
[ 16s] from ../src/journal/journal-vacuum.c:31:
[ 16s] /usr/include/gpg-error.h:1544:46: error: expected ‘,’ or ‘;’ before ‘)’ token
[ 16s] void gpgrt_log_bug (const char *fmt, ...) GPGRT_ATTR_NR_PRINTF(1,2);
Here we include gcrypt.h (which in turn includes gpg-error.h) *after*
"noreturn" was defined in macro.h.
gmtime_r() will return NULL in that case, and we would crash.
I committed the reproducer case in fuzz-regressions/, even though we don't have
ubsan hooked up yet. Let's add it anyway in case it is useful in the future. We
actually crash anyway when compiled with asserts, so this can be easily
reproduced without ubsan.
oss-fuzz #6886.
kernel >= 4.5 (with commit 32bc201e19) supports
RTA_EXPIRES netlink attribute to set router lifetime. This simply detects
the kernel version (>=4.5) and sets the lifetime properly, falling back to
expiring the route in userspace for kernels that don't support it.
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Suspend to Hibernate is a new sleep method that invokes suspend
for a predefined period of time before automatically waking up
and hibernating the system.
It's similar to HybridSleep however there isn't a performance
impact on every suspend cycle.
It's intended to use with systems that may have a higher power
drain in their supported suspend states to prevent battery and
data loss over an extended suspend cycle.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
ISO C does not allow empty statements outside of functions, and gcc
will warn about the trailing semicolons when compiling with -pedantic:
warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
But our code cannot compile with -pedantic anyway, at least because
warning: ISO C does not support ‘__PRETTY_FUNCTION__’ predefined identifier [-Wpedantic]
Without -pedantic, clang and even old gcc (3.4) generate no warnings about
those semicolons, so let's just drop __useless_struct_to_allow_trailing_semicolon__.
Quite often we need to set up a number of fds as stdin/stdout/stderr of
a process we are about to start. Add a generic implementation for a
routine doing that that takes care to do so properly:
1. Can handle the case where stdin/stdout/stderr were previously
closed, and the fds to set as stdin/stdout/stderr hence likely in the
0..2 range. Handling this properly is nasty, since we need to first
move the fds out of this range in order to later move them back in, to
make things fully robust.
2. Can optionally open /dev/null for one or more of the fds, in
a smart way, sharing the open file if possible between multiple of
the fds.
3. Guarantees that O_CLOEXEC is not set on the three fds, even if the fds
already were in the 0..2 range and hence possibly weren't moved.
At various places we only want to close fds if they are not
stdin/stdout/stderr, i.e. fds 0, 1, 2. Let's add a unified helper call
for that, and port everything over.
I figure sooner or later we'll have more of these docs, hence let's give
them a clean place to be.
This leaves NEWS and README/README.md as well as the LICENSE texts in
the root directory of the project since that appears to be customary for
Free Software projects.
There isn't much difference, but in general we prefer to use the standard
functions. glibc provides reallocarray since version 2.26.
I moved the explicit_bzero configure test to the bottom, so that the two stdlib
functions are at the bottom.
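For systems whose libc lacks reallocarray(), a replacement only needs to
add the overflow check in front of realloc(). The following is a sketch,
and the name reallocarray_fallback() is hypothetical:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fallback: like realloc(p, nmemb * size), but fails with
 * ENOMEM instead of silently wrapping around if the product overflows. */
static void *reallocarray_fallback(void *p, size_t nmemb, size_t size) {
        if (size != 0 && nmemb > SIZE_MAX / size) {
                errno = ENOMEM;
                return NULL;
        }

        return realloc(p, nmemb * size);
}
```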
gcc warns about uninitialized memory access because it notices that ssize_t which
is < 0 could be cast to positive int value. We know that this can't really
happen because only -1 can be returned, but OTOH, in principle a large
*positive* value cannot be cast properly. This is unlikely too, since xattrs
cannot be too large, but it seems cleaner to just use a size_t to return the
value and avoid the cast altogether. This makes the code simpler and gcc is
happy too.
The following warning goes away:
[113/1502] Compiling C object 'src/basic/basic@sta/xattr-util.c.o'.
In file included from ../src/basic/alloc-util.h:28:0,
from ../src/basic/xattr-util.c:30:
../src/basic/xattr-util.c: In function ‘fd_getcrtime_at’:
../src/basic/macro.h:207:60: warning: ‘b’ may be used uninitialized in this function [-Wmaybe-uninitialized]
UNIQ_T(A,aq) < UNIQ_T(B,bq) ? UNIQ_T(A,aq) : UNIQ_T(B,bq); \
^
../src/basic/xattr-util.c:155:19: note: ‘b’ was declared here
usec_t a, b;
^
$ sudo systemd-run -p RootDirectory=/usr -E LD_LIBRARY_PATH=/lib/systemd/ -E SYSTEMD_LOG_LEVEL=debug /bin/systemd-detect-virt
Before
systemd-detect-virt[18498]: No virtualization found in DMI
systemd-detect-virt[18498]: No virtualization found in CPUID
systemd-detect-virt[18498]: Virtualization XEN not found, /proc/xen does not exist
systemd-detect-virt[18498]: This platform does not support /proc/device-tree
systemd-detect-virt[18498]: Failed to check for virtualization: No such file or directory
The first four lines are at debug level, so the user would only see that last
one usually, which is not very enlightening.
This now becomes:
systemd-detect-virt[21172]: No virtualization found in DMI
systemd-detect-virt[21172]: No virtualization found in CPUID
systemd-detect-virt[21172]: Virtualization XEN not found, /proc/xen does not exist
systemd-detect-virt[21172]: This platform does not support /proc/device-tree
systemd-detect-virt[21172]: /proc/cpuinfo not found, assuming no UML virtualization.
systemd-detect-virt[21172]: This platform does not support /proc/sysinfo
systemd-detect-virt[21172]: Found VM virtualization none
systemd-detect-virt[21172]: none
We do more checks, which is good too.
Then it can be used in the asserts in logging functions without causing
infinite recursion. The error is just printed to stderr, it should be
good enough for the common case.
gcc-8 throws an error if it knows snprintf might truncate output and the
return value is ignored:
../src/udev/udev-builtin-net_id.c: In function 'dev_pci_slot':
../src/udev/udev-builtin-net_id.c:297:47: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
snprintf(str, sizeof str, "%s/%s/address", slots, dent->d_name);
^~
../src/udev/udev-builtin-net_id.c:297:17: note: 'snprintf' output between 10 and 4360 bytes into a destination of size 4096
snprintf(str, sizeof str, "%s/%s/address", slots, dent->d_name);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
Let's check all return values. This actually makes the code better, because there's
no point in trying to open a file when the name has been truncated, etc.
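The pattern is roughly the following. This is a sketch only: the helper
name format_slot_address() is made up for illustration, and the format
string is the one from the error output above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format "<slots>/<name>/address" into buf.
 * Returns false if snprintf() failed or the result was truncated, in
 * which case the caller should skip the entry instead of opening a
 * path that is known to be wrong. */
static bool format_slot_address(char *buf, size_t size, const char *slots, const char *name) {
        int n;

        n = snprintf(buf, size, "%s/%s/address", slots, name);

        /* snprintf() returns the length the full string would have had;
         * anything >= size means the output was cut short. */
        return n >= 0 && (size_t) n < size;
}
```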
If log_do_header() was called with overly long parameters, it'd generate
improper output. Essentially, it'd be truncated at a random point, in particular
missing a newline at the end, so it'd run into the next field, usually MESSAGE=.
log_do_header is called with parameters from compiled code (file name, line
number, etc), so in practice this was unlikely to ever be a problem, but it is
possible. In particular, if systemd was compiled from sources in some deeply
nested directory (which happens for example in mock and other build roots), the
filename could be very long.
As a safety measure, let's truncate all parameters to 256 bytes. So we have
5 fields which are 256 bytes (plus the field name prefix), and a few other
fields with fixed width. This must always fit in the 2048 byte buffer.
I don't think there's much gain in calculating the required length precisely,
since it's a lot of fields and a few bytes allocated on the stack don't matter.
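One way to cap a parameter is a printf precision. The sketch below
assumes that approach; the helper name format_field() and the
FIELD_VALUE_MAX constant are illustrative and not taken from the code:

```c
#include <stdio.h>

#define FIELD_VALUE_MAX 256 /* per-field cap, as described above */

/* Hypothetical helper: write "NAME=value\n" into buf, keeping at most
 * FIELD_VALUE_MAX bytes of value. Returns the number of bytes written,
 * or -1 if buf was too small even for the capped field. */
static int format_field(char *buf, size_t size, const char *name, const char *value) {
        int n;

        /* "%.*s" copies at most FIELD_VALUE_MAX bytes of value, so the
         * trailing newline is always emitted when the buffer is big enough. */
        n = snprintf(buf, size, "%s=%.*s\n", name, FIELD_VALUE_MAX, value);
        if (n < 0 || (size_t) n >= size)
                return -1;

        return n;
}
```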
log_dispatch_internal has only one caller where the extra_field/extra
params are not null: log_unit_full. When log_unit_full() was called,
when we got to log_dispatch_internal, our header would look like this:
PRIORITY=7
SYSLOG_FACILITY=3
CODE_FILE=../src/core/manager.c
CODE_LINE=2145
CODE_FUNC=manager_invoke_sigchld_event
USER_UNIT=gnome-terminal-server.service
65dffa7a3b984a6d9a46f0b8fb57710bUSER_INVOCATION_ID=
SYSLOG_IDENTIFIER=systemd
It took me a while to understand why I'm not seeing mangled messages in the
journal (after all, "" is a valid rvalue for log messages). The answer is that
journald rejects any field name which starts with a digit, and the MESSAGE_ID
that was used here starts with a digit. Hence, those lines would be silently
filtered out.
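The filtering rule can be sketched as below. Only the leading-digit check
comes from the description above. The helper name and the restriction to
uppercase letters, digits and underscores are assumptions of this sketch,
not a copy of journald's validation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check for a field name of length len (without "="). */
static bool field_name_plausible(const char *name, size_t len) {
        if (len == 0)
                return false;

        /* The rule from the text: a name must not start with a digit. */
        if (name[0] >= '0' && name[0] <= '9')
                return false;

        /* Assumption: only A-Z, 0-9 and '_' are allowed in the name. */
        for (size_t i = 0; i < len; i++) {
                char c = name[i];

                if (!((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_'))
                        return false;
        }

        return true;
}
```

With this check, the mangled line from the header above is dropped
because its name starts with "6", while USER_INVOCATION_ID on its own
passes.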
This patch adds safe_atoux16 for parsing an unsigned hexadecimal 16bit int, and
uses that for parsing USB device and vendor IDs.
This fixes a compile error with gcc-8 because while we know that USB IDs are 2 bytes,
the compiler does not know that.
../src/udev/udev-builtin-hwdb.c:80:38: error: '%04X' directive output may be
truncated writing between 4 and 8 bytes into a region of size between 2 and 6
[-Werror=format-truncation=]
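A parser of this kind might look as follows. This is a sketch built on
strtoul(), under a hypothetical name, and not the safe_atoux16() added by
the patch:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as an unsigned hexadecimal 16bit value.
 * Returns 0 on success, -EINVAL on malformed input, and -ERANGE if the
 * value does not fit into 16 bits. */
static int parse_hex_u16(const char *s, uint16_t *ret) {
        char *end = NULL;
        unsigned long l;

        /* Reject empty strings, leading whitespace and signs, all of
         * which strtoul() would otherwise accept. */
        if (!s || !isxdigit((unsigned char) s[0]))
                return -EINVAL;

        errno = 0;
        l = strtoul(s, &end, 16);
        if (errno > 0)
                return -errno;
        if (end == s || *end != '\0')
                return -EINVAL;
        if (l > UINT16_MAX)
                return -ERANGE;

        *ret = (uint16_t) l;
        return 0;
}
```

Because the result is stored in a uint16_t, the compiler can see that a
later "%04X" of that value needs exactly four characters.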
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Patrick Uiterwijk <puiterwijk@redhat.com>
This is an attempt to improve #8228 a bit, by extending the /run/nologin
a bit, but still keeping it somewhat brief.
On purpose I used the vague wording "unprivileged user" rather than
"non-root user" so that pam_nologin can be updated to disable its
behaviour for members of the "wheel" group one day, and our messages
would still make sense.
See #8228.
So far, we had two implementations of reboot-with-parameter doing pretty
much the same. Let's unify that in a generic implementation used by
both.
This is particularly nice as it unifies all /run/systemd/reboot-param
handling in a single .c file.
This is primarily preparation for a follow-up commit that adds a common
implementation of the other side of the reboot parameter file, i.e. the
code that reads the file and issues reboot() for it.
This mimics the raw_clone() call we have in place already and
establishes a new syscall wrapper raw_reboot() that wraps the kernel's
reboot() system call in a bit more low-level fashion than glibc's
reboot() wrapper. The main difference is that the extra "arg" argument
is supported.
Ultimately this just replaces the syscall wrapper implementation we
currently have at three places in our codebase by a single one.
With this change this means that all our syscall() invocations are
neatly separated out in static inline system call wrappers in our header
files.
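For illustration, such a wrapper can be as small as the following sketch.
It assumes the Linux headers <linux/reboot.h> and <sys/syscall.h> are
available, and it passes the two magic numbers the kernel requires ahead
of the command and the extra "arg" argument:

```c
#include <linux/reboot.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch of a raw reboot() wrapper. Unlike glibc's reboot(), it passes
 * the fourth "arg" parameter through to the kernel (used for example
 * by LINUX_REBOOT_CMD_RESTART2). Returns -1 with errno set on failure. */
static inline int raw_reboot(int cmd, const void *arg) {
        return (int) syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, cmd, arg);
}
```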
Previously, we'd try to open kmsg on failure of the journal/syslog even
if no automatic fallback to kmsg was requested — and we wouldn't even
use the open connection afterwards...
So, the kernel's management of cgroup/BPF programs is a bit misdesigned:
if you attach a BPF program to a cgroup and close the fd for it it will
stay pinned to the cgroup with no chance of ever removing it again (or
otherwise getting ahold of it again), because the fd is used for
selecting which BPF program to detach. The only way to get rid of the
program again is to destroy the cgroup itself.
This is particularly bad for the root cgroup (and in fact any other
cgroup that we cannot realistically remove during runtime, such as
/system.slice, /init.scope or /system.slice/dbus.service) as getting rid
of the program only works by rebooting the system.
To counter this let's keep close track of which cgroup a BPF program
is attached to and let's implicitly detach the BPF program when we are
about to close the BPF fd.
This hence changes the bpf_program_cgroup_attach() function to track
where we attached the program and changes bpf_program_cgroup_detach() to
use this information. Moreover bpf_program_unref() will now implicitly
call bpf_program_cgroup_detach().
In order to simplify things, bpf_program_cgroup_attach() will now
implicitly invoke bpf_program_load_kernel() when necessary, simplifying
the caller's side.
Finally, this adds proper reference counting to BPF programs. This
is useful for working with two BPF programs in parallel: the BPF program
we are preparing for installation and the BPF program we so far
installed, shortening the window when we detach the old one and reattach
the new one.
Let's "seal" off the BPF program as soon as bpf_program_load_kernel() is
called, which allows us to make it idempotent: since the program can't
be modified anymore after being turned into a kernel object it's safe to
shortcut behaviour if called multiple times.
In meson.build we check that functions are available using:
meson.get_compiler('c').has_function('foo')
which checks the following:
- if __stub_foo or __stub___foo are defined, return false
- if foo is declared (a pointer to the function can be taken), return true
- otherwise check for __builtin_memfd_create
_stub is documented by glibc as
It defines a symbol '__stub_FUNCTION' for each function
in the C library which is a stub, meaning it will fail
every time called, usually setting errno to ENOSYS.
So if __stub is defined, we know we don't want to use the glibc version, but
this doesn't tell us if the name itself is defined or not. If it _is_ defined,
and we define our replacement as an inline static function, we get an error:
In file included from ../src/basic/missing.h:1358:0,
from ../src/basic/util.h:47,
from ../src/basic/calendarspec.h:29,
from ../src/basic/calendarspec.c:34:
../src/basic/missing_syscall.h:65:19: error: static declaration of 'memfd_create' follows non-static declaration
static inline int memfd_create(const char *name, unsigned int flags) {
^~~~~~~~~~~~
.../usr/include/bits/mman-shared.h:46:5: note: previous declaration of 'memfd_create' was here
int memfd_create (const char *__name, unsigned int __flags) __THROW;
^~~~~~~~~~~~
To avoid this problem, name our inline functions differently from glibc's,
and use a #define to map the official name to our replacement.
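The resulting pattern looks roughly like this. It is a sketch:
HAVE_MEMFD_CREATE stands in for the result of the build system check and
is forced to 0 here so that the replacement path is the one compiled.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption for this sketch: pretend the configure check said "no". */
#ifndef HAVE_MEMFD_CREATE
#define HAVE_MEMFD_CREATE 0
#endif

#if !HAVE_MEMFD_CREATE
/* The replacement gets its own name, so it cannot clash with a
 * non-static declaration of memfd_create() from the libc headers. */
static inline int missing_memfd_create(const char *name, unsigned int flags) {
        return (int) syscall(SYS_memfd_create, name, flags);
}

/* Map the official name to the replacement for all later callers. */
#define memfd_create missing_memfd_create
#endif
```

Callers keep writing memfd_create(), and the preprocessor redirects them
to the replacement only when the libc version is not used.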
Fixes #8099.
v2:
- use "missing_" as the prefix instead of "_"
v3:
- rebase and update for statx()
Unfortunately "statx" is also present in "struct statx", so the define
causes issues. Work around this by using a typedef.
I checked that systemd compiles with current glibc
(glibc-devel-2.26-24.fc27.x86_64) if HAVE_MEMFD_CREATE, HAVE_GETTID,
HAVE_PIVOT_ROOT, HAVE_SETNS, HAVE_RENAMEAT2, HAVE_KCMP, HAVE_KEYCTL,
HAVE_COPY_FILE_RANGE, HAVE_BPF, HAVE_STATX are forced to 0.
Setting HAVE_NAME_TO_HANDLE_AT to 0 causes an issue, but it's not because of
the define, but because of struct file_handle.
The Linux kernel exposes the birth time now for files through statx()
hence make use of it where available. We keep the xattr logic in place
for this however, since only a subset of file systems on Linux currently
expose the birth time. NFS and tmpfs for example do not support it. OTOH
there are other file systems that do support the birth time but might
not support xattrs (smb…), hence make the best of the two, in particular
in order to deal with journal files copied between file system types and
to maintain compatibility with older file systems that are updated to
a newer version of the file system.
Let's make use of this at various places we call fsync(), to make things
fully reliable, as the kernel devs suggest to first fsync() files and
then fsync() the directories they are located in.
Let's add a common implementation for regular file checks that is
careful to return the right error code (EISDIR/ELOOP/EBADFD) when we
are encountering a wrong file node.