Commit graph

27 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek b9aaa7f480 coredumpctl: print a hint when no journal files are found
[guest@fedora ~]$ coredumpctl
No coredumps found.

[guest@fedora ~]$ ./coredumpctl
Hint: You are currently not seeing messages from other users and the system.
      Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
      Pass -q to turn off this notice.
No coredumps found.

Fixes #1733.
2017-02-28 21:38:47 -05:00
Zbigniew Jędrzejewski-Szmek 7d8e7c0e19 coredumpctl: use a 3s timeout for checking units
This is just a hint, so we shouldn't wait too long. A short timeout
helps for the case where pid1 of dbus have crashed.
2017-02-28 21:34:53 -05:00
Zbigniew Jędrzejewski-Szmek cc4419ed92 coredumpctl,man: mark truncated messages as such in output
Unit systemd-coredump@1-3854-0.service is failed/failed, not counting it.
TIME                            PID   UID   GID SIG COREFILE  EXE
Fri 2017-02-24 11:11:00 EST   10002  1000  1000   6 none      /home/zbyszek/src/systemd-work/.libs/lt-Sat 2017-02-25 00:49:32 EST   26921     0     0  11 error     /usr/libexec/fprintd
Sat 2017-02-25 11:56:30 EST   30703  1000  1000   - -         /usr/bin/python3.5
Sat 2017-02-25 13:16:54 EST    3275  1000  1000  11 present   /usr/bin/bash
Sat 2017-02-25 17:25:40 EST    4049  1000  1000  11 truncated /usr/bin/bash

For info and gdb output, the filename is marked in red and "(truncated)" is
appended. (Red is necessary because the annotation is hard to see when running
under a pager.)

Fixed #3883.
2017-02-26 19:45:10 -05:00
Zbigniew Jędrzejewski-Szmek 7bbf2d8423 coredumpctl: add debug information which services count towards the warning
A few times I have seen the hint unexpectedly. Add this so debug info
so it's easier to see what's happening.

...
Unit systemd-coredump@0-3119-0.service is failed/failed, not counting it.
Unit systemd-coredump@1-3854-0.service is activating/start-pre, counting it.
...
-- Notice: 1 systemd-coredump@.service unit is running, output may be incomplete.
2017-02-26 19:45:10 -05:00
Giedrius Statkevičius 32485d0904 coredumpctl: implement --since/--until (-S/-U) for info/list verbs
Implement --since/--until (-S/-U) in the same fashion as journalctl.
This lets the user filter the results a bit so it would be easier to
find relevant info in case there were many core dumps.
2017-02-24 21:29:40 +02:00
Thomas H. P. Andersen 0f5a443e8c coredump: fix assign in while loop (#5417)
From: #5393
2017-02-22 00:14:54 +01:00
Zbigniew Jędrzejewski-Szmek 012f2b7de7 coredumpctl: print a hint if any coredumps are in flight (#5393)
Fixes #4685.
2017-02-21 11:08:35 +01:00
Zbigniew Jędrzejewski-Szmek a7581ff940 coredumpctl: display non-coredump coredump entries too
$ ./coredumpctl --no-pager -1
TIME                            PID   UID   GID SIG COREFILE EXE
Sun 2016-11-06 10:10:51 EST   29514  1002  1002   - -        /usr/bin/python3.5

$ ./coredumpctl info 29514
           PID: 29514 (python3)
           UID: 1002 (zbyszek)
           GID: 1002 (zbyszek)
        Reason: ZeroDivisionError
     Timestamp: Sun 2016-11-06 10:10:51 EST (3h 22min ago)
  Command Line: python3 systemd_coredump_exception_handler.py
    Executable: /usr/bin/python3.5
 Control Group: /user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
          Unit: user@1002.service
     User Unit: gnome-terminal-server.service
         Slice: user-1002.slice
     Owner UID: 1002 (zbyszek)
       Boot ID: 1531fd22ec84429e85ae888b12fadb91
    Machine ID: 519a16632fbd4c71966ce9305b360c9c
      Hostname: laptop
       Storage: none
       Message: Process 29514 (systemd_coredump_exception_handler.py) of user zbyszek failed with ZeroDivisionError: division by

                Traceback (most recent call last):
                  File "systemd_coredump_exception_handler.py", line 134, in <module>
                    g()
                  File "systemd_coredump_exception_handler.py", line 133, in g
                    f()
                  File "systemd_coredump_exception_handler.py", line 131, in f
                    div0 = 1 / 0
                ZeroDivisionError: division by zero

                Local variables in innermost frame:
                  a=3
                  h=<function f at 0x7efdc14b6ea0>
2017-02-15 00:45:12 -05:00
Zbigniew Jędrzejewski-Szmek 2b0445262a tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
  "conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.

Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:

$ size .libs/systemd{,-logind,-journald}{.old,}
   text	   data	    bss	    dec	    hex	filename
1265204	 149564	   4808	1419576	 15a938	.libs/systemd.old
1260268	 149564	   4808	1414640	 1595f0	.libs/systemd
 246805	  13852	    209	 260866	  3fb02	.libs/systemd-logind.old
 240973	  13852	    209	 255034	  3e43a	.libs/systemd-logind
 146839	   4984	     34	 151857	  25131	.libs/systemd-journald.old
 146391	   4984	     34	 151409	  24f71	.libs/systemd-journald

It is also much easier to check if a certain binary uses a certain MESSAGE_ID:

$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x

$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
2017-02-15 00:45:12 -05:00
Zbigniew Jędrzejewski-Szmek 5ab9ed0762 coredumpctl: just use argv instead of building a temporary set
No functional change, and we don't lose match order.
2017-02-15 00:33:10 -05:00
Namhyung Kim df65f77bb5 coredumpctl: Add -r/--reverse option
Like journalctl, users sometimes want to see coredump list in reverse
order.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
2017-02-13 22:55:25 +09:00
Namhyung Kim 06b76011d7 coredumpctl: Remove dubious newline in the help message
It seems the -o opiton and -D option can be printed together.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
2017-02-13 22:36:43 +09:00
Franck Bui 3e7bc89b8f coredumpctl: let gdb handle the SIGINT signal (#4901)
Even if pressing Ctrl-c after spawning gdb with "coredumpctl gdb" is not really
useful, we should let gdb handle the signal entirely otherwise the user can be
suprised to see a different behavior when gdb is started by coredumpctl vs when
it's started directly.

Indeed in the former case, gdb exits due to coredumpctl being killed by the
signal.

So this patch makes coredumpctl ignore SIGINT as long as gdb is running.
2016-12-17 09:49:17 -05:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Zbigniew Jędrzejewski-Szmek bb7c5bad4a coredumpctl: delay the "on tty" refusal until as late as possible
For the user, if the core file is missing or inaccessible, it is
more interesting that the fact that they forgot to pipe to a file.
So delay the failure from the check until after we have verified
that the file or the COREDUMP field are present.

Partially fixes #4161.

Also, error reporting on failure was duplicated. save_core() now
always prints an error message (because it knows the paths involved,
so can the most useful message), and the callers don't have to.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek 062b99e8be coredumpctl: tighten print_field() code
Propagate errors properly, so that if we hit oom or an error in the
journal, the whole command will fail. This is important when using
the output in scripts.

Support the output of multiple values for the same field with -F.
The journal supports that, and our official commands should too, as
far as it makes sense. -F can be used to print user-defined fields
(e.g. somebody could use a TAG field with multiple occurences), so
we should support that too. That seems better than silently printing
the last value found as was done before.

We would iterate trying to match the same field with all possible
field names. Once we find something, cut the loop short, since we
know that nothing else can match.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek 04de587942 coredumpctl: rework presence reporting
The column for "present" was easy to miss, especially if somebody had no
coredumps present at all, in which case the column of spaces of width one
wasn't visually distinguished from the neighbouring columns. Replace this
with an explicit text, one of: "missing", "journal", "present", "error".

$ coredumpctl
TIME                            PID   UID   GID SIG COREFILE EXE
Mon 2016-09-26 22:46:31 CEST   8623     0     0  11 missing  /usr/bin/bash
Mon 2016-09-26 22:46:35 CEST   8639  1001  1001  11 missing  /usr/bin/bash
Tue 2016-09-27 01:10:46 CEST  16110  1001  1001  11 journal  /usr/bin/bash
Tue 2016-09-27 01:13:20 CEST  16290  1001  1001  11 journal  /usr/bin/bash
Tue 2016-09-27 01:33:48 CEST  17867  1001  1001  11 present  /usr/bin/bash
Tue 2016-09-27 01:37:55 CEST  18549     0     0  11 error    /usr/bin/bash

Also, use access(…, R_OK), so that we can report a present but inaccessible
file different than a missing one.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek 47f5064207 coredumpctl: report corefile presence properly
In 'list', show present also for coredumps stored in the journal.

In 'status', replace "File" with "Storage" line that is always present.
Possible values:
   Storage: none
   Storage: journal
   Storage: /path/to/file (inacessible)
   Storage: /path/to/file

Previously the File field be only present if the file was accessible, so users
had to manually extract the file name precisely in the cases where it was
needed, i.e. when coredumpctl couldn't access the file. It's much more friendly
to always show something. This output is designed for human consumption, so
it's better to be a bit verbose.

The call to sd_j_set_data_threshold is moved, so that status is always printed
with the default of 64k, list uses 4k, and coredump retrieval is done with the
limit unset. This should make checking for the presence of the COREDUMP field
not too costly.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek 554ed50f90 coredumpctl: report user unit properly 2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek cfeead6c77 coredumpctl: fix spurious "more than one entry matches" warning
sd_journal_previous() returns 0 if it didn't do any move, so the
warning was stupidly always printed.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek 954d3a51af coredumpctl: fix handling of files written to fd
Added in 9fe13294a9 (by me :[```), and later obfuscated in d0c8806d4a, if an
uncompressed external file or an internally stored coredump was supposed to be
written to a file descriptor, nothing would be written.
2016-09-28 23:49:01 +02:00
Zbigniew Jędrzejewski-Szmek fc6cec8613 coredump: remove Storage=both option
Back when external storage was initially added in 34c10968cb, this mode of
storage was added. This could have made some sense back when XZ compression was
used, and an uncompressed core on disk could be used as short-lived cache file
which does require costly decompression. But now fast LZ4 compression is used
(by default) both internally and externally, so we have duplicated storage,
using the same compression and same default maximum core size in both cases,
but with different expiration lifetimes. Even the uncompressed-external,
compressed-internal mode is not very useful: for small files, decompression
with LZ4 is fast enough not to matter, and for large files, decompression is
still relatively fast, but the disk-usage penalty is very big.

An additional problem with the two modes of storage is that it complicates
the code and makes it much harder to return a useful error message to the user
if we cannot find the core file, since if we cannot find the file we have to
check the internal storage first.

This patch drops "both" storage mode. Effectively this means that if somebody
configured coredump this way, they will get a warning about an unsupported
value for Storage, and the default of "external" will be used.
I'm pretty sure that this mode is very rarely used anyway.
2016-09-28 23:49:01 +02:00
Topi Miettinen 646853bdd8 fileio: simplify mkostemp_safe() (#4090)
According to its manual page, flags given to mkostemp(3) shouldn't include
O_RDWR, O_CREAT or O_EXCL flags as these are always included. Beyond
those, the only flag that all callers (except a few tests where it
probably doesn't matter) use is O_CLOEXEC, so set that unconditionally.
2016-09-13 08:20:38 +02:00
Lennart Poettering 992e8f224b util-lib: rework /tmp and /var/tmp handling code
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a
matching tmp_dir() call (the former looks for the place for /var/tmp, the
latter for /tmp).

Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses.
All dirs are validated before use. secure_getenv() is used in order to limite
exposure in suid binaries.

This also ports a couple of users over to these new APIs.

The var_tmp() return parameter is changed from an allocated buffer the caller
will own to a const string either pointing into environ[], or into a static
const buffer. Given that environ[] is mostly considered constant (and this is
exposed in the very well-known getenv() call), this should be OK behaviour and
allows us to avoid memory allocations in most cases.

Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
2016-08-04 16:27:07 +02:00
Zbigniew Jędrzejewski-Szmek 0f8aeac255 coredumpctl: grammaro fix
Mentioned in #2901.
2016-04-02 11:35:08 -04:00
Alexander Kuleshov ea4b98e657 tree-wide: merge pager_open_if_enabled() to the pager_open()
Many subsystems define own pager_open_if_enabled() function which
checks '--no-pager' command line argument and open pager depends
on its value. All implementations of pager_open_if_enabled() are
the same. Let's merger this function with pager_open() from the
shared/pager.c and remove pager_open_if_enabled() from all subsytems
to prevent code duplication.
2016-02-26 01:13:23 +06:00
Lennart Poettering f50cd2b2f5 build-sys: move coredump logic into subdir of its own 2016-02-10 14:32:27 +01:00
Renamed from src/journal/coredumpctl.c (Browse further)