Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
This way, we can extend the macro a bit with stuff pulled in from other
headers without this affecting everything which pulls in macro.h, which
is one of our most basic headers.
This is just refactoring, no change in behaviour, in prepartion for
later changes.
Let's fold get_user_creds_clean() into get_user_creds(), and introduce a
flags argument for it to select "clean" behaviour. This flags parameter
also learns to other new flags:
- USER_CREDS_SYNTHESIZE_FALLBACK: in this mode the user records for
root/nobody are only synthesized as fallback. Normally, the synthesized
records take precedence over what is in the user database. With this
flag set this is reversed, and the user database takes precedence, and
the synthesized records are only used if they are missing there. This
flag should be set in cases where doing NSS is deemed safe, and where
there's interest in knowing the correct shell, for example if the
admin changed root's shell to zsh or suchlike.
- USER_CREDS_ALLOW_MISSING: if set, and a UID/GID is specified by
numeric value, and there's no user/group record for it accept it
anyway. This allows us to fix#9767
This then also ports all users to set the most appropriate flags.
Fixes: #9767
[zj: remove one isempty() call]
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
Let's make use this at various places we call fsync(), to make things
fully reliable, as the kernel devs suggest to first fsync() files and
then fsync() the directories they are located in.
This commint adds a new command line parameter to sytemd-coredump. The
parameter should be mappend to core_pattern's placeholder %h - hostname.
The field _HOSTNAME holds the name from the kernel's namespaces which might be
different then the one comming from process' namespaces.
It is true that the real hostname is usually available in the field
COREDUMP_ENVIRON (environment variables) but I believe it is more reliable to
use the value passed by kernel.
----
The length of iovec is no longer static and hence I corrected the declarations
of the functions set_iovec_field and set_iovec_field_free.
Thank you @yuwata and @poettering!
First, let's rename it to disable_coredumps(), as in the rest of our
codebase we spell it "coredump" rather than "core_dump", so let's stick
to that.
However, also log about failures to turn off core dumpling on LOG_DEBUG,
because debug logging is always a good idea.
Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f,
FSETLOCKING_BYCALLER). This turns off locking for the entire FILE*,
instead of doing individual per-call decision whether to use normal
calls or _unlocked() calls.
This has various benefits:
1. It's easier to read and easier not to forget
2. It's more comprehensive, as fprintf() and friends are covered too
(as these functions have no _unlocked() counterpart)
3. Philosophically, it's a bit more correct, because it's more a
property of the file handle really whether we ever pass it on to another
thread, not of the operations we then apply to it.
This patch reworks all pieces of codes that so far used fxyz_unlocked()
calls to use __fsetlocking() instead. It also reworks all places that
use open_memstream(), i.e. use stdio FILE* for string manipulations.
Note that this in some way a revert of 4b61c87511.
Our CODING_STYLE suggests not comparing with NULL, but relying on C's
downgrade-to-bool feature for that. Fix up some code to match these
guidelines. (This is not comprehensive, the coccinelle output for this
is unfortunately kinda borked)
The "nobody" user might possibly be seen by the journal or coredumping
code if unmapped userns-using processes are somehow visible to them.
Let's make sure we don't do the ACL magic for this user either, since
this is a special system user that might be backed by different real
users in different contexts.
This adds uid_is_system() and gid_is_system(), similar in style to
uid_is_dynamic(). That a helper like this is useful is illustrated by
the fact that test-condition.c didn't get the check right so far, which
this patch fixes.
The advantage is that is the name is mispellt, cpp will warn us.
$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build
squash! build-sys: use #if Y instead of #ifdef Y everywhere
v2:
- fix incorrect setting of HAVE_LIBIDN2
This adds IOVEC_INIT() and IOVEC_MAKE() for initializing iovec structures
from a pointer and a size. On top of these IOVEC_INIT_STRING() and
IOVEC_MAKE_STRING() are added which take a string and automatically
determine the size of the string using strlen().
This patch removes the old IOVEC_SET_STRING() macro, given that
IOVEC_MAKE_STRING() is now useful for similar purposes. Note that the
old IOVEC_SET_STRING() invocations were two characters shorter than the
new ones using IOVEC_MAKE_STRING(), but I think the new syntax is more
readable and more generic as it simply resolves to a C99 literal
structure initialization. Moreover, we can use very similar syntax now
for initializing strings and pointer+size iovec entries. We canalso use
the new macros to initialize function parameters on-the-fly or array
definitions. And given that we shouldn't have so many ways to do the
same stuff, let's just settle on the new macros.
(This also converts some code to use _cleanup_ where dynamically
allocated strings were using IOVEC_SET_STRING() before, to modernize
things a bit)
As a follow-up for db3f45e2d2 let's do the
same for all other cases where we create a FILE* with local scope and
know that no other threads hence can have access to it.
For most cases this shouldn't change much really, but this should speed
dbus introspection and calender time formatting up a bit.
This was exposed by the previous commit. This could be potentially
unpleasant, but we are saved by the fact that this code path was only
taken for journald crashes, where we control COMM and know that it doesn't
contain any special characters. Use log_dispatch which does not do any
format processing to push the message out.
We check these a number of times, hence let's unify these checks here.
This also allows us to make the PID 1 check more elaborate as we can
check both the PID and the cgroup. Checking the PID has the benefit that
we'll also cover cases where PID 1 might still be in the root cgroup, and
the cgroup check has the benefit that we also cover crashes in forked
off crasher processes (the way we actually do it in systemd)
Given that this is a field primarily processed by computers, and not so
much by humans, assign "1" instead of "yes". Also, use parse_boolean()
as we usually do for parsing it again.
This makes things more alike udev options (as one example), such as
SYSTEMD_READY where we also spit out "1" and "0", and parse with
parse_boolean().
We would only log a terse message when pid1 or systemd-journald crashed.
It seems better to reuse the normal code paths as much as possible,
with the following differences:
- if pid1 crashes, we cannot launch the helper, so we don't analyze the
coredump, just write it to file directly from the helper invoked by the
kernel;
- if journald crashes, we can produce the backtrace, but we don't log full
structured messages.
With comparison to previous code, advantages are:
- we go through most of the steps, so for example vacuuming is performed,
- we gather and log more data. In particular for journald and pid1 crashes we
generate a backtrace, and for pid1 crashes we record the metadata (fdinfo,
maps, etc.),
- coredumpctl shows pid1 crashes.
A disavantage (inefficiency) is that we gather metadata for journald crashes
which is then ignored because _TRANSPORT=kernel does not support structued
messages.
Messages for the systemd-journald "crash" have _TRANSPORT=kernel, and
_TRANSPORT=journal for the pid1 "crash".
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=11/SEGV
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Unit entered failed state.
Feb 26 16:37:54 systemd-coredump[18801]: Process 18729 (systemd-journal) of user 0 dumped core.
Feb 26 16:37:54 systemd-coredump[18801]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.36c14bf3c6ce4c38914f441038990979.18729.1488145074000000.lz4
Feb 26 16:37:54 systemd-coredump[18801]: Stack trace of thread 18729:
Feb 26 16:37:54 systemd-coredump[18801]: #0 0x00007f46d6a06b8d fsync (libpthread.so.0)
Feb 26 16:37:54 systemd-coredump[18801]: #1 0x00007f46d71bfc47 journal_file_set_online (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #2 0x00007f46d71c1c31 journal_file_append_object (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #3 0x00007f46d71c3405 journal_file_append_data (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #4 0x00007f46d71c4b7c journal_file_append_entry (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #5 0x00005577688cf056 write_to_journal (systemd-journald)
Feb 26 16:37:54 systemd-coredump[18801]: #6 0x00005577688d2e98 dispatch_message_real (systemd-journald)
Feb 26 16:37:54 kernel: systemd-coredum: 9 output lines suppressed due to ratelimiting
Feb 26 16:37:54 systemd-journald[18810]: Journal started
Feb 26 16:50:59 systemd-coredump[19229]: Due to PID 1 having crashed coredump collection will now be turned off.
Feb 26 16:51:00 systemd[1]: Caught <SEGV>, dumped core as pid 19228.
Feb 26 16:51:00 systemd[1]: Freezing execution.
Feb 26 16:51:00 systemd-coredump[19229]: Process 19228 (systemd) of user 0 dumped core.
Stack trace of thread 19228:
#0 0x00007fab82075c47 kill (libc.so.6)
#1 0x000055fdf7c38b6b crash (systemd)
#2 0x00007fab824175c0 __restore_rt (libpthread.so.0)
#3 0x00007fab82148573 epoll_wait (libc.so.6)
#4 0x00007fab8366f84a sd_event_wait (libsystemd-shared-233.so)
#5 0x00007fab836701de sd_event_run (libsystemd-shared-233.so)
#6 0x000055fdf7c4a380 manager_loop (systemd)
#7 0x000055fdf7c402c2 main (systemd)
#8 0x00007fab82060401 __libc_start_main (libc.so.6)
#9 0x000055fdf7c3818a _start (systemd)
Poor machine ;)
Unit systemd-coredump@1-3854-0.service is failed/failed, not counting it.
TIME PID UID GID SIG COREFILE EXE
Fri 2017-02-24 11:11:00 EST 10002 1000 1000 6 none /home/zbyszek/src/systemd-work/.libs/lt-Sat 2017-02-25 00:49:32 EST 26921 0 0 11 error /usr/libexec/fprintd
Sat 2017-02-25 11:56:30 EST 30703 1000 1000 - - /usr/bin/python3.5
Sat 2017-02-25 13:16:54 EST 3275 1000 1000 11 present /usr/bin/bash
Sat 2017-02-25 17:25:40 EST 4049 1000 1000 11 truncated /usr/bin/bash
For info and gdb output, the filename is marked in red and "(truncated)" is
appended. (Red is necessary because the annotation is hard to see when running
under a pager.)
Fixed#3883.
We logged about this, but did not attach information directly to the log
entry. It *would* be nice to log the full untruncated size, but afaict, to do
this, we would have to read the full data from the kernel. Doing this just to
log that information seems a bit excessive, in particular when the limit could
be set quite low. So for now let's just add a boolean field.
Most of the fields in the context array come from the kernel (passed
through argv), but two are special: comm and exe. We allocate them
ourselves. We forgot to initialize context[CONTEXT_COMM] with the value
we allocated (introduced in 9aa8202314).
To simplify things, just set context[CONTEXT_COMM] and context[CONTEXT_EXE],
and free those two fields at the end.
Fixes#5442.
Our coredump handler operates on a "context" supplied by the kernel via
the core_pattern arguments. When we pass off a coredump for processing
to coredumpd we pass along enough information for this context to be
reconstructed. This information is passed in the usual journal fields,
and that means we extended the 1s granularity timestamp to 1µs
granularity by appending 6 zeroes. We need to chop them off again when
reconstructing the original kernel context.
Fixes: #4779
(Note that we only do this for the journal metadata, not for the xattrs,
as the xattrs are only supposed to store the original 1:1 info we
acquired from the kernel.)
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.