decompress_blob_zstd() would allocate ever bigger buffers in a loop trying to
get a buffer big enough to decompress the input data. This is wasteful, since
we can just query the size of the decompressed data from the compressed header.
Worse, it doesn't work when the output size is capped, i.e. when dst_max != 0.
If the decompressed blob happened to be bigger than dst_max, decompression
would fail with -ENOBUFS. We need to use "stream decompression" instead, and
only get min(uncompressed size, dst_max) bytes of output.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1856037 in a second way.
SD_JOURNAL_FOREACH_DATA() and SD_JOURNAL_FOREACH_UNIQUE() would immediately
terminate when a field couldn't be accessed. This can happen for example when a
field is compressed with an unavailable compression format. But it's likely
that this is the wrong thing to do: the caller for example might want to
iterate over the fields but isn't interested in all of them. coredumpctl is
like this: it uses SD_JOURNAL_FOREACH_DATA() but only uses a subset of the
fields.
Add two new functions sd_journal_enumerate_good_data() and
sd_journal_enumerate_good_unique() that retry sd_journal_enumerate_data() and
sd_journal_enumerate_unique() if the return value is something that applies to
a single field: ENOBUS, E2BIG, EOPNOTSUPP.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1856037.
An alternative would be to make the macros themselves smarter instead of adding
new symbols, and do the looping internally in the macro. I don't like that
approach for two reasons. First, it would embed the logic in the macro, so
recompilation would be required if we decide to update the logic. With the
current version of the patch, recompilation is required to use the new symbols,
but after that, library upgrades are enough. So the current approach is safer
in case further updates are needed. Second, our headers use primitive C, and it
is hard to do the macros without using newer features.
Even with the new keyed hash table journal feature: if an attacker
manages to get access to the journal file id it could synthesize records
that result in hash collisions. Let's rotate automatically when we
notice that, so that a new journal file ID is generated, our performance
is restored and the attacker has to guess a new file ID before being
able to trigger the issue again.
That said, untrusted peers should never get access to journal files in
the first case...
This adds a new (incompatible) feature to journal files: if enabled the
hash function used for the hash tables is no longer jenkins hash with a
zero key, but siphash keyed by the file uuid that is included in the
file header anyway. This should make our hash tables more robust against
collision attacks, as long as the attacker has no read access to the
journal files. We switch from jenkins to siphash simply because it's
more well-known and we standardize for the rest of our codebase onto it.
This is hardening in order to make collision attacks harder for clients
that can forge log messages but have no read access to the logs. It has
no effect on clients that have read access.
Let's prefix this with "jenkins_" since it wraps the jenkins hash. We
want to add support for other hash functions to journald soon, hence
better be clear with what this is. In particular as all other symbols
defined by lookup3.h actually are prefixed "jenkins_".
Let's clean this up a bit, following our usual nomenclature to name
return parameters ret-xyz.
This is mostly a bit of renaming, but there's also some minor other
changes: if we return a pointer to a mmap'ed object plus its offset, in
almost all cases we are happy if either parameter is NULL in case the
caller is not interested in it. Let's fix the remaining case to do this
too, to minimize surprises.
The object flags field is a bitmask, hence don't sloppily define
_OBJECT_COMPRESSED_MAX as one mor than the previous flag. That worked OK
as long as we only had two flags, but will fall apart as soon as we have
three. Let's fix this.
(It's kinda sloppy how the string table is built here, as it will be
quite sparse as soon as we have more enum entries, but let's keep it for
now.)
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
Patch contains a coccinelle script, but it only works in some cases. Many
parts were converted by hand.
Note: I did not fix errors in return value handing. This will be done separate
to keep the patch comprehensible. No functional change is intended in this
patch.
poll() sets POLLNVAL inside of the poll structures if an invalid fd is
passed. So far we generally didn't check for that, thus not taking
notice of the error. Given that this specific kind of error is generally
indication of a programming error, and given that our code is embedded
into our projects via NSS or because people link against our library,
let's explicitly check for this and convert it to EBADF.
(I ran into a busy loop because of this missing check when some of my
test code accidentally closed an fd it shouldn't close, so this is a
real thing)
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
It annoyed me for quite a while that running "journalctl --file=…" on a
file that is not readable failed with a "File not found" error instead
of a permission error. Let's fix that.
We make this work by using the GLOB_NOCHECK flag for glob() which means
that files are not accessible will be returned in the array as they are
instead of being filtered away. This then means that our later attemps
to open the files will fail cleanly with a good error message.
- If length of formatted string >= LONG_LINE_MAX then return -ENOBUFS
- Normal Case:
- length of formatted string < POSIX defined LINE_MAX
- Allocate sbuf to accomodate the message
- Rare case:
- LINE_MAX < length of formatted string < LONG_LINE_MAX
- Allocate the required length using alloca()
Let's introduce an explicit line ending marker for line endings due to
pid change.
Let's also make sure we don't get confused with buffer management.
Fixes: #15654
If the previous received buffer length is almost equal to the allocated
buffer size, before this change the next read can only receive a couple
of bytes (in the worst case only 1 byte), which is not efficient.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
HUGE_SIZE was defined inconsistently.
> In file included from ../src/basic/alloc-util.h:9,
> from ../src/journal/test-compress.c:9:
> ../src/journal/test-compress.c: In function ‘main’:
> ../src/journal/test-compress.c:280:33: error: ‘HUGE_SIZE’ undeclared (first use in this function)
> 280 | assert_se(huge = malloc(HUGE_SIZE));
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.
Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.
/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.
/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.
Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.
Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.