We had all kinds of indentation: 2 sp, 3 sp, 4 sp, 8 sp, and mixed.
4 sp was the most common, in particular the majority of scripts under test/
used that. Let's standardize on 4 sp, because many command lines are long and
there's a lot of nesting, and with 8 sp indentation less stuff fits. 4 sp
also seems to be the default indentation, so this will make it less likely
that people will mess up if they don't load the editor config. (I think people
often use vi, and vi has no support for loading project-wide configuration
automatically. We distribute a .vimrc file, but it is not loaded by default,
and even the instructions in it seem to discourage its use for security
reasons.)
Also remove the few vim config lines that were left. We should either have them
in all files or in none.
Also remove some strange stuff like '#!/bin/env bash', yikes.
The journal files might not be tiny, hence let's write them to /var/tmp/
instead of /tmp/. Also, let's turn on NOCOW on the files, as these tests
can apparently be slow on btrfs otherwise.
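For context, a minimal sketch of what turning on NOCOW boils down to, assuming
Linux's FS_IOC_GETFLAGS/FS_IOC_SETFLAGS interface (failures are ignored on
filesystems that don't support file flags):

#include <sys/ioctl.h>
#include <linux/fs.h>

static void enable_nocow(int fd) {
        unsigned long flags = 0;

        /* Read the current inode flags; unsupported on many filesystems. */
        if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
                return;

        /* On btrfs, NOCOW only takes effect if set while the file is
         * still empty, so this must happen right after creation. */
        flags |= FS_NOCOW_FL;
        (void) ioctl(fd, FS_IOC_SETFLAGS, &flags);
}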
Fixes: #12210
systemd-journal-remote always wrote the boot-id of the device it was running on
to the header of its journal files. When the source had a different boot-id
(because it was generated on a different boot, or a different device), the
boot-ids in the file were inconsistent. The _BOOT_ID field was that of the
source, but the journal file header and each entry object header carried that of
the device systemd-journal-remote ran on. This breaks journalctl --list-boots
on any of these files.
Set the boot-id in the header to be that of the source. This also fixes the
entry object headers.
Let's be helpful to static analyzers which care about whether we
knowingly ignore return values. We do in these cases, since they are
usually part of error paths.
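The pattern in question is the explicit (void) cast, e.g.:

#include <unistd.h>

static void fail_cleanup(int fd, const char *temp_path) {
        /* We knowingly ignore the return values: we are already on an
         * error path and there is nothing useful left to do about them. */
        (void) close(fd);
        (void) unlink(temp_path);
}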
Let's unify the two similar code paths to watch /run/systemd/journal.
The code in manager.c is similar, but it uses mkdir_p_label(), and unifying
that would be too much trouble, so let's just adjust the error messages to
be the same.
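A rough sketch of the shared pattern (illustrative only; the real code uses
systemd's internal helpers and logging, and %m below is a glibc extension):

#include <errno.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/stat.h>

static int watch_journal_directory(void) {
        int fd;

        (void) mkdir("/run/systemd/journal", 0755);

        fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
        if (fd < 0) {
                int r = -errno;
                fprintf(stderr, "Failed to create inotify fd: %m\n");
                return r;
        }

        if (inotify_add_watch(fd, "/run/systemd/journal", IN_CREATE) < 0) {
                int r = -errno;
                fprintf(stderr, "Failed to watch /run/systemd/journal: %m\n");
                return r;
        }

        return fd;
}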
CID #1400224.
The thread launched in journal_file_set_offline() accesses a memory
mapped file, so it needs to handle SIGBUS. Leave SIGBUS unblocked on the
offlining thread so that it uses the same handler as the main thread.
The result of triggering SIGBUS in a thread where it's blocked is
undefined in Linux. The tested implementations were observed to cause
the default handler to run, taking down the whole journald process.
We can leave SIGBUS unblocked in multiple threads since its handler is
thread-safe. If SIGBUS is sent to the journald process asynchronously
(i.e. with kill, sigqueue, or raise), either thread handling it will
result in the same behavior: it will install the default handler and
reraise the signal, killing the process.
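The gist, as a hedged sketch (the real code manages signal masks through
systemd's own helpers):

#include <pthread.h>
#include <signal.h>

static int start_offline_thread(pthread_t *t, void *(*fn)(void *), void *arg) {
        sigset_t mask, old;
        int r;

        /* Block everything except SIGBUS in the new thread, so mmap
         * faults reach the same thread-safe handler as in the main
         * thread instead of hitting undefined behavior. */
        sigfillset(&mask);
        sigdelset(&mask, SIGBUS);

        pthread_sigmask(SIG_SETMASK, &mask, &old);
        r = pthread_create(t, NULL, fn, arg);   /* inherits the mask */
        pthread_sigmask(SIG_SETMASK, &old, NULL);

        return r;
}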
Fixes: #12042
Let's use mkdir_parents() (because it's shorter), and 0755 as the access
mode, so that things have the access mode tmpfiles.d also suggests.
Prompted by: #11903
The option --cursor-file takes a file name as argument. If the file exists and
contains a valid cursor, it is used to start the output after this position.
At the end, the last cursor gets written to the file.
This allows for an easy implementation of a timer that regularly looks in the
journal for some messages.
    journalctl --cursor-file err-cursor -b -p err
    journalctl --cursor-file audit-cursor -t audit --grep DENIED
Or you might want to walk the journal in steps of 10 messages:
    journalctl --cursor-file ./curs -n10 --since=today -t systemd
We could potentially create an unterminated string and then call normal string
operations on it. Let's be more careful: first remove the suffix we ignore anyway,
then check whether the string is of acceptable length, and reject it if it
is too long. The code rejects lengths above 31 bytes. Language names that are
actually used are much shorter, so this doesn't matter much.
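Illustratively, the safer ordering looks something like this (the buffer size
and suffix handling are simplified assumptions, not the actual catalog parser):

#include <stdbool.h>
#include <string.h>

static bool copy_language(const char *s, size_t len, char lang[static 32]) {
        /* First strip the trailing newline (the suffix we ignore anyway)... */
        if (len > 0 && s[len-1] == '\n')
                len--;

        /* ...then validate the length before copying, so we never end up
         * with an unterminated string in lang[]. */
        if (len == 0 || len > 31)
                return false;

        memcpy(lang, s, len);
        lang[len] = '\0';
        return true;
}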
In normal use, this allows us to drop dead entries from the cache and reduces
the cache size so that we don't evict entries unnecessarily. The time limit is
there mostly to serve as a guard against malicious logging from many different
PIDs.
This is far from perfect, but should give mostly reasonable values. My
assumption is that if somebody has a few hundred MB of memory, they are
unlikely to have thousands of processes logging. A hundred would already be a
lot. So let's scale the cache size proportionally to the total memory size,
with clamping on both ends.
The formula gives 64 cache entries for each GB of RAM.
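In other words (the clamp bounds below are illustrative assumptions):

#include <stdint.h>

static unsigned cache_max_entries(uint64_t physical_memory_bytes) {
        /* 64 entries per GiB of RAM == one entry per 16 MiB. */
        uint64_t n = physical_memory_bytes / (16ULL * 1024 * 1024);

        /* Clamp on both ends so tiny and huge machines stay sane. */
        if (n < 64)
                return 64;
        if (n > 4096)
                return 4096;
        return (unsigned) n;
}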
Found by inspecting results of running this small program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char **argv) {
        for (int i = 1; i < argc; i++) {
                FILE *f;
                char line[1024], prev[1024];
                int lineno = 1;

                prev[0] = '\0';
                f = fopen(argv[i], "r");
                if (!f)
                        exit(1);
                /* Flag any line that repeats the previous one verbatim. */
                while (fgets(line, sizeof(line), f)) {
                        if (strcmp(line, prev) == 0)
                                printf("%s:%d: error: dup %s", argv[i], lineno, line);
                        lineno++;
                        strcpy(prev, line);
                }
                fclose(f);
        }
        return 0;
}
We immediately read the whole contents into memory, making things much more
expensive. Sealed fds should be used instead since they are more efficient
on our side.
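For reference, a minimal sketch of how a client would hand over a sealed fd
(error handling abbreviated; the name string is arbitrary):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static int make_sealed_fd(const void *data, size_t size) {
        int fd = memfd_create("journal-message", MFD_CLOEXEC | MFD_ALLOW_SEALING);
        if (fd < 0)
                return -1;

        /* Once sealed, the receiver can mmap the fd read-only and trust
         * that the contents can no longer change under it. */
        if (write(fd, data, size) != (ssize_t) size ||
            fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
                close(fd);
                return -1;
        }

        return fd;
}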
We'd first parse all or most of the message, and only then consider whether it
is too large. Also, when encountering a single field over the limit,
we'd still process the preceding part of the message. Let's be stricter,
and check size limits early, and let's refuse the whole message if it fails
any of the size limits.
We allocate an iovec entry for each field, so with many short entries,
our memory usage and processing time can be large, even with a relatively
small message size. Let's refuse overly long entries.
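Sketched below is the idea: walk the datagram once and refuse the whole thing
before any iovec is allocated. (The real native protocol also carries
length-prefixed binary fields, which this simplified newline-only walk
ignores; the limits are placeholders, not the actual journald constants.)

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FIELD_SIZE_MAX (64u * 1024u)     /* placeholder per-field cap */
#define ENTRY_SIZE_MAX (512u * 1024u)    /* placeholder whole-entry cap */

static bool entry_within_limits(const char *p, size_t size) {
        size_t total = 0;

        while (size > 0) {
                const char *nl = memchr(p, '\n', size);
                size_t flen = nl ? (size_t)(nl - p) : size;

                /* One oversized field fails the whole message... */
                if (flen > FIELD_SIZE_MAX)
                        return false;

                /* ...and so does the accumulated size. */
                total += flen;
                if (total > ENTRY_SIZE_MAX)
                        return false;

                if (!nl)
                        break;
                p = nl + 1;
                size -= flen + 1;
        }

        return true;
}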
CVE-2018-16865
https://bugzilla.redhat.com/show_bug.cgi?id=1653861
From what I can see, the problem is not from an alloca, despite what the CVE
description says, but from the attack multiplication that comes from creating
many very small iovecs: (void* + size_t) for each three bytes of input message.
This fixes a crash where we would read the commandline, whose length is under
control of the sending program, and then crash when trying to create a stack
allocation for it.
CVE-2018-16864
https://bugzilla.redhat.com/show_bug.cgi?id=1653855
The message actually doesn't get written to disk, because
journal_file_append_entry() returns -E2BIG.
Nitpicky, but we've used a lot of random spacings and names in the past;
we're trying to be completely consistent on "cgroup vN" now.
Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`.
I manually ignored places where it's not appropriate to replace (e.g. the
"cgroup2" fstype and in src/shared/linux).
Continuation of a3ebe5eb620e49f0d24082876cafc7579261e64f:
in other places we sometimes use assert_se(), and sometimes normal error
handling. sigfillset and sigaddset can only fail if mask is NULL (which cannot
happen if we are passing in a reference), or if the signal number is invalid
(which really shouldn't happen when we are using a constant like SIGCHLD. If
SIGCHLD is invalid, we have a bigger problem). So let's simplify things and
always use assert_se() in those cases.
In sigset_add_many() we could conceivably pass an invalid signal, so let's keep
normal error handling here. The caller can do assert_se() around the
sigprocmask_many() call if appropriate.
'>= 0' is used for consistency with the rest of the codebase.
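The resulting pattern, sketched with plain assert() standing in for systemd's
assert_se() (which, unlike assert(), is never compiled out, so the expression
is always evaluated):

#include <assert.h>
#include <signal.h>

static void block_sigchld(void) {
        sigset_t mask;

        /* Constant arguments: failure would indicate a programming
         * error, so assert instead of propagating an error code. */
        assert(sigemptyset(&mask) >= 0);
        assert(sigaddset(&mask, SIGCHLD) >= 0);
        assert(sigprocmask(SIG_BLOCK, &mask, NULL) >= 0);
}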
With cgroup v1, a zombie process is migrated to the root cgroup in all
hierarchies. This was changed for the unified hierarchy, and /proc/PID/cgroup
reports the cgroup to which the process belonged before it exited.
Be more suspicious about the cgroup path reported by the kernel, and use the
unit_id provided by the log client if the kernel reports that the process is
running in the root cgroup.
Users tend to care the most about the 'log->unit_id' mapping, so that systemctl
status can correctly report the last log lines. Also, we wouldn't be able to
infer anything useful from a "/" path anyway.
See: 2e91fa7f6d
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.
No code changes, just some rearranging of source files.