So it's apparently problematic that we use STRV_MAKE() (i.e. a compound
initializer) outside of the {} block we use it in (and that includes
outside of the ({}) block, too). Hence, let's rework the macro to not
need that.
This also makes the macro shorter, which is definitely a good and more
readable. Moreover, it will now complain if the iterator is a "char*"
instead of a "const char*", which is good too.
Fixes: #11394
Found by inspecting results of running this small program:
int main(int argc, const char **argv) {
for (int i = 1; i < argc; i++) {
FILE *f;
char line[1024], prev[1024], *r;
int lineno;
prev[0] = '\0';
lineno = 1;
f = fopen(argv[i], "r");
if (!f)
exit(1);
do {
r = fgets(line, sizeof(line), f);
if (!r)
break;
if (strcmp(line, prev) == 0)
printf("%s:%d: error: dup %s", argv[i], lineno, line);
lineno++;
strcpy(prev, line);
} while (!feof(f));
fclose(f);
}
}
This reverts commit b26c904113.
I don't see anythign wrong, but Ubuntu autopkgtest CI started failing fairly
consistently since this was merged. Let's see if reverting fixes things.
This affects systemd-journald and systemd-coredump.
Example entry:
$ journalctl -o export -n1 'MESSAGE=Something logged'
__CURSOR=s=976542d120c649f494471be317829ef9;i=34e;b=4871e4c474574ce4a462dfe3f1c37f06;m=c7d0c37dd2;t=57c4ac58f3b98;x=67598e942bd23dc0
__REALTIME_TIMESTAMP=1544035467475864
__MONOTONIC_TIMESTAMP=858200964562
_BOOT_ID=4871e4c474574ce4a462dfe3f1c37f06
PRIORITY=6
_UID=1000
_GID=1000
_CAP_EFFECTIVE=0
_SELINUX_CONTEXT=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
_AUDIT_SESSION=1
_AUDIT_LOGINUID=1000
_SYSTEMD_OWNER_UID=1000
_SYSTEMD_UNIT=user@1000.service
_SYSTEMD_SLICE=user-1000.slice
_SYSTEMD_USER_SLICE=-.slice
_SYSTEMD_INVOCATION_ID=1c4a469986d448719cb0f9141a10810e
_MACHINE_ID=08a5690a2eed47cf92ac0a5d2e3cf6b0
_HOSTNAME=krowka
_TRANSPORT=syslog
SYSLOG_FACILITY=17
SYSLOG_IDENTIFIER=syslog-caller
MESSAGE=Something logged
_COMM=poc
_EXE=/home/zbyszek/src/systemd-work3/poc
_SYSTEMD_CGROUP=/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
_SYSTEMD_USER_UNIT=gnome-terminal-server.service
SYSLOG_PID=4108
SYSLOG_TIMESTAMP=Dec 5 19:44:27
_PID=4108
_CMDLINE=./poc AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
_SOURCE_REALTIME_TIMESTAMP=1544035467475848
$ journalctl -o export -n1 'MESSAGE=Something logged' --output-fields=_CMDLINE|wc
6 2053 2097410
2MB might be hard for some clients to use meaningfully, but OTOH, it is
important to log the full commandline sometimes. For example, when the program
is crashing, the exact argument list is useful.
This fixes a crash where we would read the commandline, whose length is under
control of the sending program, and then crash when trying to create a stack
allocation for it.
CVE-2018-16864
https://bugzilla.redhat.com/show_bug.cgi?id=1653855
The message actually doesn't get written to disk, because
journal_file_append_entry() returns -E2BIG.
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html
This only works with clang, unfortunately gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).
Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
^ ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
^~~~~~~~~~~~
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.
Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`.
I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
"src/basic/time-util.c" is part of src/basic and should not include
"src/shared/serialize.h". It is one of the few cases where this
is done. Also, it's not even required.
Continuation of a3ebe5eb620e49f0d24082876cafc7579261e64f:
in other places we sometimes use assert_se(), and sometimes normal error
handling. sigfillset and sigaddset can only fail if mask is NULL (which cannot
happen if we are passing in a reference), or if the signal number is invalid
(which really shouldn't happen when we are using a constant like SIGCHLD. If
SIGCHLD is invalid, we have a bigger problem). So let's simplify things and
always use assert_se() in those cases.
In sigset_add_many() we could conceivably pass an invalid signal, so let's keep
normal error handling here. The caller can do assert_se() around the
sigprocmask_many() call if appropriate.
'>= 0' is used for consistency with the rest of the codebase.
All underlying glibc calls are free to return NULL if the size argument
is 0. We most often call those functions with a fixed argument, or at least
something which obviously cannot be zero, but it's too easy to forget.
E.g. coverity complains about "rows = new0(JsonVariant*, n_rows-1);" in
format-table.c There is an assert that n_rows > 0, so we could hit this
corner case here. Let's simplify callers and make those functions "safe".
CID #1397035.
The compiler is mostly able to optimize this away:
$ size build{,-opt}/src/shared/libsystemd-shared-239.so
(before)
text data bss dec hex filename
2643329 580940 3112 3227381 313ef5 build/src/shared/libsystemd-shared-239.so (-O0 -g)
2170013 578588 3089 2751690 29fcca build-opt/src/shared/libsystemd-shared-239.so (-03 -flto -g)
(after)
text data bss dec hex filename
2644017 580940 3112 3228069 3141a5 build/src/shared/libsystemd-shared-239.so
2170765 578588 3057 2752410 29ff9a build-opt/src/shared/libsystemd-shared-239.so
cgroup_no_v1=all doesn't make a whole lot of sense with legacy hierarchy
(where we use v1 hierarchy for everything), or hybrid hierarchy (where
we still use v1 hierarchy for resource control).
Right now we have to tell people to add both cgroup_no_v1=all and
systemd.unified_cgroup_hierarchy=1 to get the desired behaviour,
however in reality it's hard to imagine any situation where someone
passes cgroup_no_v1=all but *doesn't* want to use the unified cgroup
hierarchy.
Make it so that cgroup_no_v1=all produces intuitive behaviour in systemd
by default, although it can still be disabled by passing
systemd.unified_cgroup_hierarchy=0 explicitly.
$ build/systemctl --version
systemd 239-3555-g6178cbb5b5
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
$ git tag v240 -m 'v240'
$ ninja -C build
ninja: Entering directory `build'
[76/76] Linking target fuzz-unit-file.
$ build/systemctl --version
systemd 240
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
This is very useful during development, because a precise version string is
embedded in the build product and displayed during boot, so we don't have to
guess answers for questions like "did I just boot the latest version or the one
from before?".
This change creates an overhead for "noop" builds. On my laptop, 'ninja -C
build' that does nothing goes from 0.1 to 0.5 s. It would be nice to avoid
this, but I think that <1 s is still acceptable.
Fixes#7183.
PACKAGE_VERSION is renamed to GIT_VERSION, to make it obvious that this is the
more dynamically changing version string.
Why save to a file? It would be easy to generate the version tag using
run_command(), but we want to go through a file so that stuff gets rebuilt when
this file changes. If we just defined an variable in meson, ninja wouldn't know
it needs to rebuild things.
The function to replacement paths into the configuration file list was borked.
Apart from the crash with empty root prefix, it would incorrectly handle the
case where root *was* set, and the replacement file was supposed to override
an existing file.
prefix_root is used instead of path_join because prefix_root removes duplicate
slashes (when --root=dir/ is used).
A test is added.
Fixes#11124.
PACKAGE_VERSION is more explicit, and also, we don't pretend that changing the
project name in meson.build has any real effect. "systemd" is embedded in a
thousand different places, so let's just use the hardcoded string consistently.
This is mostly in preparation for future changes.
read_line() is a lot more careful and optimized than read_nul_string()
but does mostly the same thing. let's replace the latter by the former,
just with a special flag that toggles between the slightly different EOL
rules if both.
Let's first remove an item from the hashmap and only then destroy it.
This makes sure that destructor functions can mdoify the hashtables in
their own codee and we won't be confused by that.
The ?: operator is very useful for chaining comparison functions
(strcmp, memcmp, CMP), since its behavior is to return the result
of the comparison function call if non-zero, or continue evaluating
the chain of comparison functions.
This simplifies the code in that using a temporary `r` variable
to store the function results is no longer necessary and the checks
for non-zero to return are no longer needed either, resulting in a
typical three-fold reduction to the number of lines in the code.
Introduce a new memcmp_nn() to compare two memory buffers in
lexicographic order, taking length in consideration.
Tested: $ ninja -C build/ test
All test cases pass. In particular, test_multiple_neighbors_sorted()
in test-lldp would catch regressions introduced by this commit.
This is mostly paranoia, but let's better be safer than sorry. This of
course means there's always an implicit limit to how much we can read at
a time of 2G. But that should be ample.
u32ctz() was undefined for zero due to __builtin_ctz() [1].
Explicitly check for zero to make the behavior defined.
Note that this issue only affected in4_addr_netmask_to_prefixlen()
which is the only caller.
It may seem slightly odd, to return 32 (bits) for utz(0). But that
is what in4_addr_netmask_to_prefixlen() needs, and it probably makes
the most sense here.
[1] https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
Fixes: ba91431154
This has been irritating me for quite a while: let's prefix these enum
values with a common prefix, like we do for almost all other enums.
No change in behaviour, just some renaming.
fdopen doesn't accept "e", it's ignored. Let's not mislead people into
believing that it actually sets O_CLOEXEC.
From `man 3 fdopen`:
> e (since glibc 2.7):
> Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen()
As mentioned by @jlebon in #11131.
This has the side effect to upgrade the log level at which the log is emitted
from debug to warning.
This might be better since after all we didn't apply a tmpfiles.d/ rule and
that actually might end up being problematic eventually.
We're always interested into finding unsafe transitions so let's make the
helper return true when it finds such transitions so we don't need to negate
its results.
No functional changes.
We previously returned -EPERM but it can be returned for various other reasons
too.
Let's use -ENOLINK instead as this value shouldn't be used currently. This
allows users of CHASE_SAFE to detect without any ambiguities when unsafe
transitions are encountered by chase_symlinks().
All current users of CHASE_SAFE that explicitly reacted on -EPERM have been
converted to react on -ENOLINK.
Linux is stupid and sometimes returns a "struct sockaddr_un" that is
longer than its fields, as it NUL terminates .sun_path[] even if it has
full length. ubsan detects this, rightfully. Since this is a Linux
misdesign let's trick out ubsan a bit.
Fixes: #11024
We want it for global variables, which LLVM supports and GCC currently
does not (GCC does support it for functions, but we care about global
variables here).
Why is this relevant? When asan is used global variables are padded with
hotzones before and after. But we can't have that for the registration
variables we place in special ELF sections: we want them tightly packed
so that we can iterate through them.
Note that for gcc this isn't an issue, as it will pack stuff in
non-standard sections anyway, even if asan is used.
This cleans up a bit how we set up things for the ELF section magic:
1. Let's always use our gcc macros, instead of __attribute__ directly
2. Align our structures to sizeof(void*), i.e. the pointer size, rather
than a fixed 8 or __BIGGEST_ALIGNMENT__. The former is unnecessarily
high for 32bit systems, the latter too high for 64bit systems. gcc
seems to use ptr alignment for static variables itself, hence this
should be good enough for us too. Moreover, the Linux kernel also
uses pointer alginment for all its ELF section registration magic,
hence this should be good enough for us too.
3. Let's always prefix the sections we create ourself with SYSTEMD_,
just to make clear where they come from.
4. Always align the pointer we start from when iterating through these
lists. This should be unnecessary, but makes things nicely
systematic, as we'll align all pointers we use to access these
sections properly.
We follow no general rule, but in most cases we do not place a space
outside of macro.h. Hence let's stick to that, and adapt macro.h too,
and follow the rule systematically that there shall not be a space
between __attribute__ and ((...
Yes, this does not matter at all, and is purely OCD cosmetics. But then
again, the uses of __attribute__ are very local only, hence the changes
cleaning this up are small and are unlikely to have to be repeated too
often...
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.
No code changes, just some rearranging of source files.
Whenever we invoke external, foreign code from code that has
RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024
first. This is a safety precaution for compatibility with programs using
select() which cannot operate with fds > 1024.
This commit adds the call to rlimit_nofile_safe() to all invocations of
exec{v,ve,l}() and friends that either are in code that we know runs
with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for
starters) or that is part of shared code that might end up there.
The calls are placed as early as we can in processes invoking a flavour
of execve(), but after the last time we do fd manipulations, so that we
can still take benefit of the high fd limits for that.
This is really basic stuff and in a follow-up commit will use it all
across the codebase, including in process-util.[ch] which is in
src/basic/. Hence let's move it back to src/basic/ itself.
We are pretty careful to reject abstract sockets that are too long to fit in
the address structure as a NUL-terminated string. And since we parse sockets as
strings, it is not possible to embed a NUL in the the address either. But we
might receive an external socket (abstract or not), and we want to be able to
print its address in all cases. We would call socket_address_verify() and
refuse to print various sockets that the kernel considers legit.
Let's do the strict verification only in case of socket addresses we parse and
open ourselves, and do less strict verification when printing addresses of
existing sockets, and use c-escaping to print embedded NULs and such.
More tests are added.
This should make LGTM happier because on FIXME comment is removed.
It's pretty useful to allow parse_boolean() to take a NULL argument and
return an error in that case, rather than abort. i.e. making this a
runtime rather than programming error allows us to shorten code
elsewhere.
This flag can be used to make chase_symlinks() emit a warning when it
encounters an error.
Such flag can be useful for generating a comprehensive and detailed warning
since chase_symlinks() can generate a warning with a full context.
For now only warnings for unsafe transitions are produced.
device_path_make_{major_minor|canonical) generate device node paths
given a mode_t and a dev_t. We have similar code all over the place,
let's unify this in one place. The former will generate a "/dev/char/"
or "/dev/block" path, and never go to disk. The latter then goes to disk
and resolves that path to the actual path of the device node.
device_path_parse_major_minor() reverses device_path_make_major_minor(),
also withozut going to disk.
We have similar code doing something like this at various places, let's
unify this in a single set of functions. This also allows us to teach
them special tricks, for example handling of the
/run/systemd/inaccessible/{blk|chr} device nodes, which we use for
masking device nodes, and which do not exist in /dev/char/* and
/dev/block/*
We should probably drop path_join() entirely in the long run (and
then rename path_join_many() to it?), but for now let's make one a
wrapper for the other.
Let's be a bit more careful when parsing major/minor pairs, and filter
out more corner cases. This also means using safe_atou() rather than
sscanf() to avoid weird negative unsigned handling and such.
As it turns out glibc and the Linux kernel have different ideas about
the size of dev_t and how many bits exist for the major and the minor.
When validating major/minor numbers we should check against the kernel's
actual sizes, hence add macros for this.
Let's simplify things and drop the logic that /var/lib/machines is setup
as auto-growing btrfs loopback file /var/lib/machines.raw.
THis was done in order to make quota available for machine management,
but quite frankly never really worked properly, as we couldn't grow the
file system in sync with its use properly. Moreover philosophically it's
problematic overriding the admin's choice of file system like this.
Let's hence drop this, and simplify things. Deleting code is a good
feeling.
Now that regular file systems provide project quota we could probably
add per-machine quota support based on that, hence the btrfs quota
argument is not that interesting anymore (though btrfs quota is a bit
more powerful as it allows recursive quota, i.e. that the machine pool
gets an overall quota in addition to per-machine quota).
Fixes: #2728
This is also supposed to be preparation for doing #10234 eventually,
where a very similar operation is requested: instead of importing a tree
to /var/lib/machines it would need to be imported into
/var/lib/portables/.
This adds two optional functions that may be passed to the various copy
functions. One is invoked whenever we start copying a new file object,
the other while we copy file payload in each loop iteration.
When the caller passes one or both they can get notifications about copy
progress, for example to log where things are.
This is to startswith() what PATH_STARTSWITH_SET() is to
path_startswith().
Or in other words, checks if the specified string has any of the listed
prefixes, and if so, returns the remainder of the string.