Now we first gather all process metadata and populate the process info cache
with them. In this way, the cache only references metadata recorded in iovecs[]
so there's no need to bother freeing (part of) cached metadata later.
The other advantage is that the coredump handler mode and the service mode are
more similar as the cache is populated in the same way for both cases.
It also renames the array indexes so it becomes clear which metadata are passed
by the kernel and which ones are retrieved from the runtime environment.
'input_fd' variable name is used mostly everywhere except in process_socket()
where it's named 'coredump_fd', which is pretty confusing since 'coredump_fd'
is used for the coredump filename in submit_coredump().
So let's use 'input_fd' consistently as name for the pipe fd passed by the
kernel.
No functional changes.
Previous code was allocating an array of iovecs big enough to store all the
fields added later by various functions.
This forced us to calculate the size of the array in advance which is too error
prone if for example one wants to add new fields or simply rework the
code. Various assertions were added to make sure there's no overflow but it's
still more code for no good reasons.
Instead, this patch switches to the new iovec array handling interface so the
array is grown dynamically when needed.
The other contraint was that some iovecs were supposed to be freed whereas some
others were not. This makes the code hard to (re)organize. The new code always
allocates fields so it becomes easier to rework the code.
I couldn't see any reason why the kernel could provide COMM to the coredump
handler via the core_pattern command line but could not make it available in
/proc. So let's assume that this info is always available in /proc.
For "backtrace" mode (when --backtrace option is passed), I assumed that the
crashing process still exists at the time systemd-coredump is called.
Also changing the core_pattern line is an API breakage for any users of the
backtrace mode but given that systemd-coredump is installed in
/usr/lib/systemd, it's a private tool which has no internal users. At least no
one complained when the hostname was added to the core_pattern line
(f45b801551)...
Indeed it's much easier to get it from /proc since the kernel substitutes '%e'
specifier with multiple strings if the process name contains spaces (!).
The functions to retrieve and print process cmdlines were based on the
assumption that they contain printable ASCII, and everything else
should be filtered out. That assumption doesn't hold in today's world,
where people are free to use unicode everywhere.
This replaces the custom cmdline reading code with a more generic approach
using utf8_escape_non_printable_full().
For kernel threads, truncation is done on the parenthesized name, so we'll
get "[worker]", "[worker…]", …, "[w…]", "[…", "…" as we reduce the number of
available columns.
This implementation is most likely slower for very long cmdlines, but I don't
think this is very important. The common case is to have short commandlines,
and should print those properly. Absurdly long cmdlines are the exception,
which needs to be handled correctly and safely, but speed is not too important.
Fixes#12532.
v2:
- use size_t for the number of columns. This change propagates into various
other functions that call get_process_cmdline(), increasing the size of the
patch, but the changes are rather trivial.
If creation of the message failed, we'd write a bogus entry:
systemd-coredump[1400]: Cannot store coredump of 416 (systemd-journal): No space left on device
systemd-coredump[1400]: MESSAGE=Process 416 (systemd-journal) of user 0 dumped core.
systemd-coredump[1400]: Coredump diverted to
This fixes a crash where we would read the commandline, whose length is under
control of the sending program, and then crash when trying to create a stack
allocation for it.
CVE-2018-16864
https://bugzilla.redhat.com/show_bug.cgi?id=1653855
The message actually doesn't get written to disk, because
journal_file_append_entry() returns -E2BIG.
systemd-coredump[9982]: MESSAGE=Process 771 (systemd-journal) of user 0 dumped core.
systemd-coredump[9982]: Coredump diverted to /var/lib/systemd/coredump/core...
log_dispatch() calls log_dispatch_internal() which calls write_to_journal()
which appends MESSAGE= on its own.
fdopen doesn't accept "e", it's ignored. Let's not mislead people into
believing that it actually sets O_CLOEXEC.
From `man 3 fdopen`:
> e (since glibc 2.7):
> Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen()
As mentioned by @jlebon in #11131.
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.
No code changes, just some rearranging of source files.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
This way, we can extend the macro a bit with stuff pulled in from other
headers without this affecting everything which pulls in macro.h, which
is one of our most basic headers.
This is just refactoring, no change in behaviour, in prepartion for
later changes.
Let's fold get_user_creds_clean() into get_user_creds(), and introduce a
flags argument for it to select "clean" behaviour. This flags parameter
also learns to other new flags:
- USER_CREDS_SYNTHESIZE_FALLBACK: in this mode the user records for
root/nobody are only synthesized as fallback. Normally, the synthesized
records take precedence over what is in the user database. With this
flag set this is reversed, and the user database takes precedence, and
the synthesized records are only used if they are missing there. This
flag should be set in cases where doing NSS is deemed safe, and where
there's interest in knowing the correct shell, for example if the
admin changed root's shell to zsh or suchlike.
- USER_CREDS_ALLOW_MISSING: if set, and a UID/GID is specified by
numeric value, and there's no user/group record for it accept it
anyway. This allows us to fix#9767
This then also ports all users to set the most appropriate flags.
Fixes: #9767
[zj: remove one isempty() call]
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
Let's make use this at various places we call fsync(), to make things
fully reliable, as the kernel devs suggest to first fsync() files and
then fsync() the directories they are located in.
This commint adds a new command line parameter to sytemd-coredump. The
parameter should be mappend to core_pattern's placeholder %h - hostname.
The field _HOSTNAME holds the name from the kernel's namespaces which might be
different then the one comming from process' namespaces.
It is true that the real hostname is usually available in the field
COREDUMP_ENVIRON (environment variables) but I believe it is more reliable to
use the value passed by kernel.
----
The length of iovec is no longer static and hence I corrected the declarations
of the functions set_iovec_field and set_iovec_field_free.
Thank you @yuwata and @poettering!
First, let's rename it to disable_coredumps(), as in the rest of our
codebase we spell it "coredump" rather than "core_dump", so let's stick
to that.
However, also log about failures to turn off core dumpling on LOG_DEBUG,
because debug logging is always a good idea.