Even with the new keyed hash table journal feature: if an attacker
manages to get access to the journal file id it could synthesize records
that result in hash collisions. Let's rotate automatically when we
notice that, so that a new journal file ID is generated, our performance
is restored and the attacker has to guess a new file ID before being
able to trigger the issue again.
That said, untrusted peers should never get access to journal files in
the first case...
This adds a new (incompatible) feature to journal files: if enabled the
hash function used for the hash tables is no longer jenkins hash with a
zero key, but siphash keyed by the file uuid that is included in the
file header anyway. This should make our hash tables more robust against
collision attacks, as long as the attacker has no read access to the
journal files. We switch from jenkins to siphash simply because it's
more well-known and we standardize for the rest of our codebase onto it.
This is hardening in order to make collision attacks harder for clients
that can forge log messages but have no read access to the logs. It has
no effect on clients that have read access.
Let's prefix this with "jenkins_" since it wraps the jenkins hash. We
want to add support for other hash functions to journald soon, hence
better be clear with what this is. In particular as all other symbols
defined by lookup3.h actually are prefixed "jenkins_".
Let's clean this up a bit, following our usual nomenclature to name
return parameters ret-xyz.
This is mostly a bit of renaming, but there's also some minor other
changes: if we return a pointer to a mmap'ed object plus its offset, in
almost all cases we are happy if either parameter is NULL in case the
caller is not interested in it. Let's fix the remaining case to do this
too, to minimize surprises.
The object flags field is a bitmask, hence don't sloppily define
_OBJECT_COMPRESSED_MAX as one mor than the previous flag. That worked OK
as long as we only had two flags, but will fall apart as soon as we have
three. Let's fix this.
(It's kinda sloppy how the string table is built here, as it will be
quite sparse as soon as we have more enum entries, but let's keep it for
now.)
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
Patch contains a coccinelle script, but it only works in some cases. Many
parts were converted by hand.
Note: I did not fix errors in return value handing. This will be done separate
to keep the patch comprehensible. No functional change is intended in this
patch.
poll() sets POLLNVAL inside of the poll structures if an invalid fd is
passed. So far we generally didn't check for that, thus not taking
notice of the error. Given that this specific kind of error is generally
indication of a programming error, and given that our code is embedded
into our projects via NSS or because people link against our library,
let's explicitly check for this and convert it to EBADF.
(I ran into a busy loop because of this missing check when some of my
test code accidentally closed an fd it shouldn't close, so this is a
real thing)
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
It annoyed me for quite a while that running "journalctl --file=…" on a
file that is not readable failed with a "File not found" error instead
of a permission error. Let's fix that.
We make this work by using the GLOB_NOCHECK flag for glob() which means
that files are not accessible will be returned in the array as they are
instead of being filtered away. This then means that our later attemps
to open the files will fail cleanly with a good error message.
- If length of formatted string >= LONG_LINE_MAX then return -ENOBUFS
- Normal Case:
- length of formatted string < POSIX defined LINE_MAX
- Allocate sbuf to accomodate the message
- Rare case:
- LINE_MAX < length of formatted string < LONG_LINE_MAX
- Allocate the required length using alloca()
Let's introduce an explicit line ending marker for line endings due to
pid change.
Let's also make sure we don't get confused with buffer management.
Fixes: #15654
If the previous received buffer length is almost equal to the allocated
buffer size, before this change the next read can only receive a couple
of bytes (in the worst case only 1 byte), which is not efficient.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
HUGE_SIZE was defined inconsistently.
> In file included from ../src/basic/alloc-util.h:9,
> from ../src/journal/test-compress.c:9:
> ../src/journal/test-compress.c: In function ‘main’:
> ../src/journal/test-compress.c:280:33: error: ‘HUGE_SIZE’ undeclared (first use in this function)
> 280 | assert_se(huge = malloc(HUGE_SIZE));
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.
Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.
/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.
/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.
Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.
Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.
Our journal code is generally supposed to be written in a fashion that
the underlying file can be deallocated any time, i.e. our mmap of it
suddenly becomes all zeroes. The idea is that we catch that when parsing
everything. For that to work safely we need to make sure that when doing
arithmetics or comparisons on values read from the map we don't run into
TTOCTTOU issues when determining validity. Hence we need to copy out the
values before use and operate on the copies. This requires some special
care since the C compiler could suppress our copies as optimization.
Hence use the new READ_NOW() macro to force a copy by using memcpy(),
and use it whenever we start doing an arithmetic operation on it, or
validity checking of multiple steps.
Fixes: #14943