Commit graph

329 commits

Author SHA1 Message Date
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Michal Sekletar bb3ff70a86 journal: forward messages from /dev/log unmodified to syslog.socket 2018-06-11 21:26:22 +02:00
Zbigniew Jędrzejewski-Szmek d180c34998 journal: allow boot_id to be passed to journal_append_entry()
In this commit, this is done only in testing code, i.e. there is
no functional change apart from tests.
2018-05-31 14:30:23 +02:00
Zbigniew Jędrzejewski-Szmek 5a271b08b3 journal: remove unused args from journal_file_copy_entry() 2018-05-31 14:30:23 +02:00
Lennart Poettering da6053d0a7 tree-wide: be more careful with the type of array sizes
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.

Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.

So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.

This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:

1. strv_length()' return type becomes size_t

2. the unit file changes array size becomes size_t

3. DNS answer and query array sizes become size_t

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
2018-04-27 14:29:06 +02:00
Lennart Poettering 5d13a15b1d tree-wide: drop spurious newlines (#8764)
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.

It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
2018-04-19 12:13:23 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
David Tardon 3de8ff5a69 journald: bump rate limits (#8660)
Apparently, it is quite common to hit a problem, where systemd-journald
would drop messages because service is logging too fast.
2018-04-05 13:06:59 +02:00
Alex Gartrell 1b7cf0e587 journal: make the compression threshold tunable
Allow a user to set a number of bytes as Compress to use as the compression
threshold.
2018-03-20 14:54:07 -07:00
Alex Gartrell 57850536d5 journal: provide compress_threshold_bytes parameter
Previously the compression threshold was hardcoded to 512, which meant that
smaller values wouldn't be compressed. This left some storage savings on the
table, so instead, we make that number tunable.
2018-03-20 11:48:52 -07:00
Lennart Poettering db4a47e9fe coccinelle: O_NDELAY → O_NONBLOCK
Apparently O_NONBLOCK is the modern name used in most documentation and
for most cases in our sources. Let's hence replace the old alias
O_NDELAY and stick to O_NONBLOCK everywhere.
2018-01-24 11:09:29 +01:00
Yu Watanabe bb6b922f9f journal: coding style fix
This is originally pointed out by @cpsw.
2018-01-15 23:53:10 +09:00
Lennart Poettering 2fce06b0d6 journald: introduce new uid_for_system_journal() helper
We use the same check at two places, let's add a tiny helper function
for it, since it's not entirely trivialy, and we changes this before
multiple times, and it's a good thing if we can change it at one place
only instead of multiple.
2018-01-04 13:28:24 +01:00
Lennart Poettering 673192494c coccinelle: automatically rewrite memset() to zero() or memzero() where we can
We are pretty good at this already, hence only a single case is actually
found by this.
2017-12-14 19:47:46 +01:00
Lennart Poettering fbd0b64f44
tree-wide: make use of new STRLEN() macro everywhere (#7639)
Let's employ coccinelle to do this for us.

Follow-up for #7625.
2017-12-14 19:02:29 +01:00
Lennart Poettering 05fd2156b7 journal,coredump: do not do ACL magic for "nobody" user either
The "nobody" user might possibly be seen by the journal or coredumping
code if unmapped userns-using processes are somehow visible to them.
Let's make sure we don't do the ACL magic for this user either, since
this is a special system user that might be backed by different real
users in different contexts.
2017-12-06 13:40:50 +01:00
Lennart Poettering 4e72397b00 coredump,journal: do not do ACL magic for processes of dynamic UIDs
Dynamic UIDs should be treated like system users in this regard.
2017-12-06 13:40:50 +01:00
Lennart Poettering ece877d434 user-util: add new uid_is_system() helper
This adds uid_is_system() and gid_is_system(), similar in style to
uid_is_dynamic(). That a helper like this is useful is illustrated by
the fact that test-condition.c didn't get the check right so far, which
this patch fixes.
2017-12-06 13:40:50 +01:00
Lennart Poettering 5908ff1c4b journal: fix log message when dropping messages
Fixes: #7506
2017-11-29 22:11:59 +01:00
Lennart Poettering f643ae7171 journal: driver messages can now contain object fields, account for that
In some cases we can now log about processes, hence we must keep room
for that.
2017-11-29 11:36:22 +01:00
Zbigniew Jędrzejewski-Szmek f916819053 journal: use new helpers with journal_file_close
journal_file_close_set() is not necessary anymore.
2017-11-28 21:34:50 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering d3070fbdf6 core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald
And let's make use of it to implement two new unit settings with it:

1. LogLevelMax= is a new per-unit setting that may be used to configure
   log priority filtering: set it to LogLevelMax=notice and only
   messages of level "notice" and lower (i.e. more important) will be
   processed, all others are dropped.

2. LogExtraFields= is a new per-unit setting for configuring per-unit
   journal fields, that are implicitly included in every log record
   generated by the unit's processes. It takes field/value pairs in the
   form of FOO=BAR.

Also, related to this, one exisiting unit setting is ported to this new
facility:

3. The invocation ID is now pulled from /run/systemd/units/ instead of
   cgroupfs xattrs. This substantially relaxes requirements of systemd
   on the kernel version and the privileges it runs with (specifically,
   cgroupfs xattrs are not available in containers, since they are
   stored in kernel memory, and hence are unsafe to permit to lesser
   privileged code).

/run/systemd/units/ is a new directory, which contains a number of files
and symlinks encoding the above information. PID 1 creates and manages
these files, and journald reads them from there.

Note that this is supposed to be a direct path between PID 1 and the
journal only, due to the special runtime environment the journal runs
in. Normally, today we shouldn't introduce new interfaces that (mis-)use
a file system as IPC framework, and instead just an IPC system, but this
is very hard to do between the journal and PID 1, as long as the IPC
system is a subject PID 1 manages, and itself a client to the journal.

This patch cleans up a couple of types used in journal code:
specifically we switch to size_t for a couple of memory-sizing values,
as size_t is the right choice for everything that is memory.

Fixes: #4089
Fixes: #3041
Fixes: #4441
2017-11-16 12:40:17 +01:00
Lennart Poettering 131819424d journald: when logging about dropped messages, include more meta data
When we drop messages of a unit, we log about. Let's add some structured
data to that. Let's include how many messages we dropped, but more
importantly, let's link up the message we generate to the unit we
dropped the messages from by using the "OBJECT" logic, i.e. by
generating OBJECT_SYSTEMD_UNIT= fields and suchlike, that "journalctl
-u" and friends already look for.

Fixes: #6494
2017-11-16 12:40:17 +01:00
Lennart Poettering bcde742e78 conf-parser: turn three bool function params into a flags fields
This makes things more readable and fixes some issues with incorrect
flag propagation between the various flavours of config_parse().
2017-11-13 10:24:03 +01:00
Lennart Poettering 4aa1d31c89 Merge pull request #6974 from keszybz/clean-up-defines
Clean up define definitions
2017-10-04 19:25:30 +02:00
Yu Watanabe 4c70109600 tree-wide: use IN_SET macro (#6977) 2017-10-04 16:01:32 +02:00
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that is the name is mispellt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
Andreas Rammhold 3742095b27
tree-wide: use IN_SET where possible
In addition to the changes from #6933 this handles cases that could be
matched with the included cocci file.
2017-10-02 13:09:54 +02:00
Lennart Poettering e6a7ec4b8e io-util: add new IOVEC_INIT/IOVEC_MAKE macros
This adds IOVEC_INIT() and IOVEC_MAKE() for initializing iovec structures
from a pointer and a size. On top of these IOVEC_INIT_STRING() and
IOVEC_MAKE_STRING() are added which take a string and automatically
determine the size of the string using strlen().

This patch removes the old IOVEC_SET_STRING() macro, given that
IOVEC_MAKE_STRING() is now useful for similar purposes. Note that the
old IOVEC_SET_STRING() invocations were two characters shorter than the
new ones using IOVEC_MAKE_STRING(), but I think the new syntax is more
readable and more generic as it simply resolves to a C99 literal
structure initialization. Moreover, we can use very similar syntax now
for initializing strings and pointer+size iovec entries. We canalso use
the new macros to initialize function parameters on-the-fly or array
definitions. And given that we shouldn't have so many ways to do the
same stuff, let's just settle on the new macros.

(This also converts some code to use _cleanup_ where dynamically
allocated strings were using IOVEC_SET_STRING() before, to modernize
things a bit)
2017-09-22 15:28:04 +02:00
Lennart Poettering ec20fe5ffb journald: make maximum size of stream log lines configurable and bump it to 48K (#6838)
This adds a new setting LineMax= to journald.conf, and sets it by
default to 48K. When we convert stream-based stdout/stderr logging into
record-based log entries, read up to the specified amount of bytes
before forcing a line-break.

This also makes three related changes:

- When a NUL byte is read we'll not recognize this as alternative line
  break, instead of silently dropping everything after it. (see #4863)

- The reason for a line-break is now encoded in the log record, if it
  wasn't a plain newline. Specifically, we distuingish "nul",
  "line-max" and "eof", for line breaks due to NUL byte, due to the
  maximum line length as configured with LineMax= or due to end of
  stream. This data is stored in the new implicit _LINE_BREAK= field.
  It's not synthesized for plain \n line breaks.

- A randomized 128bit ID is assigned to each log stream.

With these three changes in place it's (mostly) possible to reconstruct
the original byte streams from log data, as (most) of the context of
the conversion from the byte stream to log records is saved now. (So,
the only bits we still drop are empty lines. Which might be something to
look into in a future change, and which is outside of the scope of this
work)

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=86465
See: #4863
Replaces: #4875
2017-09-22 10:22:24 +02:00
Lennart Poettering 22e3a02b9d journald: add minimal client metadata caching
Cache client metadata, in order to be improve runtime behaviour under
pressure.

This is inspired by @vcaputo's work, specifically:

https://github.com/systemd/systemd/pull/2280

That code implements related but different semantics.

For a longer explanation what this change implements please have a look
at the long source comment this patch adds to journald-context.c.

After this commit:

        # time bash -c 'dd bs=$((1024*1024)) count=$((1*1024)) if=/dev/urandom | systemd-cat'
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 11.2783 s, 95.2 MB/s

        real	0m11.283s
        user	0m0.007s
        sys	0m6.216s

Before this commit:

        # time bash -c 'dd bs=$((1024*1024)) count=$((1*1024)) if=/dev/urandom | systemd-cat'
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 52.0788 s, 20.6 MB/s

        real	0m52.099s
        user	0m0.014s
        sys	0m7.170s

As side effect, this corrects the journal's rate limiter feature: we now
always use the unit name as key for the ratelimiter.
2017-07-31 18:21:21 +02:00
Lennart Poettering df0ff12775 tree-wide: make use of getpid_cached() wherever we can
This moves pretty much all uses of getpid() over to getpid_raw(). I
didn't specifically check whether the optimization is worth it for each
replacement, but in order to keep things simple and systematic I
switched over everything at once.
2017-07-20 20:27:24 +02:00
Susant Sahani b2392ff31c journald: make reading /dev/kmsg optional (#6362)
Closes #6022
2017-07-15 13:57:52 +02:00
Evgeny Vereshchagin 4417e1a33d Merge pull request #5960 from keszybz/journald-memleak
Journald and journal-remote memleak fixes
2017-05-21 01:41:48 +03:00
Zbigniew Jędrzejewski-Szmek c6e9e16f77 journald: fix trivial memleak
Fixes #5516.
2017-05-19 19:15:26 -04:00
Gary Tierney 6d395665e5 Revert "selinux: split up mac_selinux_have() from mac_selinux_use()"
This reverts commit 6355e75610.

The previously mentioned commit inadvertently broke a lot of SELinux related
functionality for both unprivileged users and systemd instances running as
MANAGER_USER.  In particular, setting the correct SELinux context after a User=
directive is used would fail to work since we attempt to set the security
context after changing UID.  Additionally, it causes activated socket units to
be mislabeled for systemd --user processes since setsockcreatecon() would never
be called.

Reverting this fixes the issues with labeling outlined above, and reinstates
SELinux access checks on unprivileged user services.
2017-05-12 14:43:39 +01:00
Yu Watanabe da4128543f tree-wide: fix wrong indent (#5757)
Fixes wrong indent introduced by the commit 43688c49d1.
2017-04-19 08:48:29 +02:00
Namhyung Kim b4e7bdcb53 journal: avoid duplicated call to get cgroup path (#5404)
The cg_pid_get_path_shifted() is called twice during
server_dispatch_message().  We can get rid of the second by passing the
path to dispatch_message_real().
2017-02-23 13:04:57 +01:00
Zbigniew Jędrzejewski-Szmek 2b0445262a tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
  "conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.

Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:

$ size .libs/systemd{,-logind,-journald}{.old,}
   text	   data	    bss	    dec	    hex	filename
1265204	 149564	   4808	1419576	 15a938	.libs/systemd.old
1260268	 149564	   4808	1414640	 1595f0	.libs/systemd
 246805	  13852	    209	 260866	  3fb02	.libs/systemd-logind.old
 240973	  13852	    209	 255034	  3e43a	.libs/systemd-logind
 146839	   4984	     34	 151857	  25131	.libs/systemd-journald.old
 146391	   4984	     34	 151409	  24f71	.libs/systemd-journald

It is also much easier to check if a certain binary uses a certain MESSAGE_ID:

$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x

$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
2017-02-15 00:45:12 -05:00
Lennart Poettering f78273c8da journald: don't flush to /var/log/journal before we get asked to
This changes journald to not write to /var/log/journal until it received
SIGUSR1 for the first time, thus having been requested to flush the runtime
journal to disk.

This makes the journal work nicer with systems which have the root file system
writable early, but still need to rearrange /var before journald should start
writing and creating files to it, for example because ACLs need to be applied
first, or because /var is to be mounted from another file system, NFS or tmpfs
(as is the case for systemd.volatile=state).

Before this change we required setupts with /var split out to mount the root
disk read-only early on, and ship an /etc/fstab that remounted it writable only
after having placed /var at the right place. But even that was racy for various
preparations as journald might end up accessing the file system before it was
entirely set up, as soon as it was writable.

With this change we make scheduling when to start writing to /var/log/journal
explicit. This means persistent mode now requires
systemd-journal-flush.service in the mix to work, as otherwise journald would
never write to the directory.

See: #1397
2016-12-21 19:09:29 +01:00
Lennart Poettering 1d84ad9445 util-lib: various improvements to kernel command line parsing
This improves kernel command line parsing in a number of ways:

a) An kernel option "foo_bar=xyz" is now considered equivalent to
   "foo-bar-xyz", i.e. when comparing kernel command line option names "-" and
   "_" are now considered equivalent (this only applies to the option names
   though, not the option values!). Most of our kernel options used "-" as word
   separator in kernel command line options so far, but some used "_". With
   this change, which was a source of confusion for users (well, at least of
   one user: myself, I just couldn't remember that it's systemd.debug-shell,
   not systemd.debug_shell). Considering both as equivalent is inspired how
   modern kernel module loading normalizes all kernel module names to use
   underscores now too.

b) All options previously using a dash for separating words in kernel command
   line options now use an underscore instead, in all documentation and in
   code. Since a) has been implemented this should not create any compatibility
   problems, but normalizes our documentation and our code.

c) All kernel command line options which take booleans (or are boolean-like)
   have been reworked so that "foobar" (without argument) is now equivalent to
   "foobar=1" (but not "foobar=0"), thus normalizing the handling of our
   boolean arguments. Specifically this means systemd.debug-shell and
   systemd_debug_shell=1 are now entirely equivalent.

d) All kernel command line options which take an argument, and where no
   argument is specified will now result in a log message. e.g. passing just
   "systemd.unit" will no result in a complain that it needs an argument. This
   is implemented in the proc_cmdline_missing_value() function.

e) There's now a call proc_cmdline_get_bool() similar to proc_cmdline_get_key()
   that parses booleans (following the logic explained in c).

f) The proc_cmdline_parse() call's boolean argument has been replaced by a new
   flags argument that takes a common set of bits with proc_cmdline_get_key().

g) All kernel command line APIs now begin with the same "proc_cmdline_" prefix.

h) There are now tests for much of this. Yay!
2016-12-21 19:09:08 +01:00
Franck Bui 3099caf2b5 journal: make sure to initially populate the space info cache (#4807)
Make sure to populate the cache in cache_space_refresh() at least once
otherwise it's possible that the system boots fast enough (and the journal
flush service is finished) before the invalidate cache timeout (30 us) has
expired.

Fixes: #4790
2016-12-02 18:40:10 +01:00
Waldemar Brodkorb 9bab3b65b0 fix journald startup problem when code is compiled with -DNDEBUG (#4735)
Similar to this patch from here:
http://systemd-devel.freedesktop.narkive.com/AvfCbi6c/patch-0-3-using-assert-se-on-actions-with-side-effects-on-test-cases

If the code is compiled with -DNDEBUG which is the default for
some embedded buildsystems, systemd-journald does not startup
and silently fails.
2016-11-25 11:24:58 +01:00
Zbigniew Jędrzejewski-Szmek f97b34a629 Rename formats-util.h to format-util.h
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
2016-11-07 10:15:08 -05:00
Lennart Poettering 493fd52f1a Merge pull request #4510 from keszybz/tree-wide-cleanups
Tree wide cleanups
2016-11-03 13:59:20 -06:00
Lennart Poettering 229ba9fd57 Merge pull request #4459 from keszybz/commandline-parsing
Commandline parsing simplification and udev fix
2016-10-24 17:20:37 +02:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Zbigniew Jędrzejewski-Szmek d7f69e16f1 tree-wide: make parse_proc_cmdline() strip "rd." prefix automatically
This stripping is contolled by a new boolean parameter. When the parameter
is true, it means that the caller does not care about the distinction between
initrd and real root, and wants to act on both rd-dot-prefixed and unprefixed
parameters in the initramfs, and only on the unprefixed parameters in real
root. If the parameter is false, behaviour is the same as before.

Changes by caller:
log.c (systemd.log_*):      changed to accept rd-dot-prefix params
pid1:                       no change, custom logic
cryptsetup-generator:       no change, still accepts rd-dot-prefix params
debug-generator:            no change, does not accept rd-dot-prefix params
fsck:                       changed to accept rd-dot-prefix params
fstab-generator:            no change, custom logic
gpt-auto-generator:         no change, custom logic
hibernate-resume-generator: no change, does not accept rd-dot-prefix params
journald:                   changed to accept rd-dot-prefix params
modules-load:               no change, still accepts rd-dot-prefix params
quote-check:                no change, does not accept rd-dot-prefix params
udevd:                      no change, still accepts rd-dot-prefix params

I added support for "rd." params in the three cases where I think it's
useful: logging, fsck options, journald forwarding options.
2016-10-22 16:08:55 -04:00
Zbigniew Jędrzejewski-Szmek 5707ecf300 journald: convert journald to use parse_proc_cmdline
This makes journald use the common option parsing functionality.
One behavioural change is implemented:
"systemd.journald.forward_to_syslog" is now equivalent to
"systemd.journald.forward_to_syslog=1".
I think it's nicer to use this way.
2016-10-22 14:38:10 -04:00
Thomas Hindoe Paaboel Andersen b5331acc96 journal: remove unused variable 2016-10-22 16:00:11 +02:00
Umut Tezduyar Lindskog 863a5610c7 journald: systemd.journald.max_level_* kernel command line options (#4427)
The log forward levels can be configured through kernel command line.
2016-10-21 19:40:55 -04:00
Franck Bui 57f443a6d9 journal: rename determine_space_for() into cache_space_refresh()
Now that determine_space_for() only deals with storage space (cached) values,
rename it so it reflects the fact that only the cached storage space values are
updated.
2016-10-19 09:53:07 +02:00
Franck Bui 3a19f2150d journal: introduce patch_min_use() helper
Updating min_use is rather an unusual operation that is limited when we first
open the journal files, therefore extracts it from determine_space_for() and
create a function of its own and call this new function when needed.

determine_space_for() is now dealing with storage space (cached) values only.

There should be no functional changes.
2016-10-19 09:53:07 +02:00
Franck Bui a0edc477bd journal: introduce cache_space_invalidate()
Introduce a dedicated helper in order to reset the storage space cache.
2016-10-19 09:53:07 +02:00
Franck Bui 23aba34349 journal: cache used vfs stats as well
The set of storage space values we cache are calculated according to a couple
of filesystem statistics (free blocks, block size).

This patch caches the vfs stats we're interested in so these values are
available later and coherent with the rest of the space cached values.
2016-10-19 09:53:07 +02:00
Franck Bui 18e758bf25 journal: don't emit space usage message when opening the journal (#4190)
This patch makes system_journal_open() stop emitting the space usage
message. The caller is now free to emit this message when appropriate.

When restarting the journal, we can now emit the message *after*
flushing the journal (if required) so that all flushed log entries are
written in the persistent journal *before* the status message.

This is required since the status message is always younger than the
flushed entries.

Fixes #4190.
2016-10-19 09:53:07 +02:00
Franck Bui cba5629e87 journal: introduce server_space_usage_message()
This commit simply extracts from determine_space_for() the code which emits the
storage usage message and put it into a function of its own so it can be reused
by others paths later.

No functional changes.
2016-10-19 09:53:07 +02:00
Franck Bui 266a470005 journal: introduce JournalStorage and JournalStorageSpace structures
This structure keeps track of specificities for a given journal type
(persistent or volatile) such as metrics, name, etc...

The cached space values are now moved in this structure so that each
journal has its own set of cached values.

Previously only one set existed and we didn't know if the cached
values were for the runtime journal or the persistent one.

When doing:

   determine_space_for(s, runtime_metrics, ...);
   determine_space_for(s, system_metrics, ...);

the second call returned the cached values for the runtime metrics.
2016-10-19 09:53:07 +02:00
Franck Bui e0ed6db9cd journal: introduce determine_path_usage()
This commit simply extracts from determine_space_for() the code which
determines the FS usage where the passed path lives (statvfs(3)) and put it
into a function of its own so it can be reused by others paths later.

No functional changes.
2016-10-19 09:53:07 +02:00
Zbigniew Jędrzejewski-Szmek c1a9199ec4 Merge pull request #4362 from poettering/journalbootlistfix 2016-10-13 07:45:09 -04:00
Lennart Poettering ae739cc1ed journal: refuse opening journal files from the future for writing
Never permit that we write to journal files that have newer timestamps than our
local wallclock has. If we'd accept that, then the entries in the file might
end up not being ordered strictly.

Let's refuse this with ETXTBSY, and then immediately rotate to use a new file,
so that each file remains strictly ordered also be wallclock internally.
2016-10-12 20:25:20 +02:00
Lennart Poettering 7c07001711 journald: automatically rotate journal files when the clock jumps backwards
As soon as we notice that the clock jumps backwards, rotate journal files. This
is beneficial, as this makes sure that the entries in journal files remain
strictly ordered internally, and thus the bisection algorithm applied on it is
not confused.

This should help avoiding borked wallclock-based bisection on journal files as
witnessed in #4278.
2016-10-12 20:25:20 +02:00
Lennart Poettering 0f972d66d4 journald: use the event loop dispatch timestamp for journal entries
Let's use the earliest linearized event timestamp for journal entries we have:
the event dispatch timestamp from the event loop, instead of requerying the
timestamp at the time of writing.

This makes the time a bit more accurate, allows us to query the kernel time one
time less per event loop, and also makes sure we always use the same timestamp
for both attempts to write an entry to a journal file.
2016-10-12 20:25:20 +02:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Lennart Poettering 398a50cdd1 journal: fix format string used for usec_t 2016-10-07 20:14:38 +02:00
Lennart Poettering d473176a74 journal: complete slice info in journal metadata
We are already attaching the system slice information to log messages, now add
theuser slice info too, as well as the object slice info.
2016-10-07 20:14:38 +02:00
Felix Zhang dd8352659c journal: fix typo in comment (#4176) 2016-09-18 11:14:50 +02:00
Zbigniew Jędrzejewski-Szmek 43688c49d1 tree-wide: rename config_parse_many to …_nulstr
In preparation for adding a version which takes a strv.
2016-09-16 10:32:03 -04:00
Vito Caputo 6431c7e216 journal: add/use flushed_flag_is_set() helper (#4041)
Minor cleanup suggested by Lennart.
2016-08-26 17:51:13 +02:00
Vito Caputo 929eeb5498 journal: implicitly flush to var on recovery (#4028)
When the system journal becomes re-opened post-flush with the runtime
journal open, it implies we've recovered from something like an ENOSPC
situation where the system journal rotate had failed, leaving the system
journal closed, causing the runtime journal to be opened post-flush.

For the duration of the unavailable system journal, we log to the
runtime journal.  But when the system journal gets opened (space made
available, for example), we need to close the runtime journal before new
journal writes will go to the system journal.  Calling
server_flush_to_var() after opening the system journal with a runtime
journal present, post-flush, achieves this while preserving the runtime
journal's contents in the system journal.

The combination of the present flushed flag file and the runtime journal
being open is a state where we should be logging to the system journal,
so it's appropriate to resume doing so once we've successfully opened
the system journal.
2016-08-25 17:37:57 +02:00
Zbigniew Jędrzejewski-Szmek 61755fdae0 journald: do not create split journals for dynamic users
Dynamic users should be treated like system users, and their logs
should end up in the main system journal.
2016-08-18 23:34:40 -04:00
Vito Caputo 105bdb46b4 journal: ensure open journals from find_journal() (#3973)
If journals get into a closed state like when rotate fails due to
ENOSPC, when space is made available it currently goes unnoticed leaving
the journals in a closed state indefinitely.

By calling system_journal_open() on entry to find_journal() we ensure
the journal has been opened/created if possible.

Also moved system_journal_open() up to after open_journal(), before
find_journal().

Fixes https://github.com/systemd/systemd/issues/3968
2016-08-17 14:51:07 +02:00
Lennart Poettering 3bbaff3e08 tree-wide: use sd_id128_is_null() instead of sd_id128_equal where appropriate
It's a bit easier to read because shorter. Also, most likely a tiny bit faster.
2016-07-22 12:38:08 +02:00
Zbigniew Jędrzejewski-Szmek 2ed968802c tree-wide: get rid of selinux_context_t (#3732)
9eb9c93275
deprecated selinux_context_t. Replace with a simple char* everywhere.

Alternative fix for #3719.
2016-07-15 18:44:02 +02:00
Torstein Husebø 61233823aa treewide: fix typos and remove accidental repetition of words 2016-07-11 16:18:43 +02:00
Lennart Poettering fc2fffe770 tree-wide: introduce new SOCKADDR_UN_LEN() macro, and use it everywhere
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.

This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
2016-05-05 22:24:36 +02:00
Lennart Poettering 5d1ce25728 sd-journal: add API for opening journal files or directories by fd
Also, expose this via the "journalctl --file=-" syntax for STDIN. This feature
remains undocumented though, as it is probably not too useful in real-life as
this still requires fds that support mmaping and seeking, i.e. does not work
for pipes, for which reading from STDIN is most commonly used.
2016-04-25 15:24:46 +02:00
Zbigniew Jędrzejewski-Szmek ccddd104fc tree-wide: use mdash instead of a two minuses 2016-04-21 23:00:13 -04:00
Zbigniew Jędrzejewski-Szmek 6e1045e538 journald: rewrite function with switch, fix handling of -ESHUTDOWN
The comments and the log messages are next to one another, so it's easier
to check that the messages match the comments.

The sign was omitted in the check for -ESHUTDOWN, so it was never matched.
2016-04-16 18:40:21 -04:00
Vito Caputo 9ed794a32d tree-wide: minor formatting inconsistency cleanups 2016-02-23 14:20:34 -08:00
Vito Caputo b58c888f30 journal: defer journal closes on rotate
When we rotate journals, we must set offline and close the current one,
but don't generally need to wait for this to complete.

Instead, we'll initiate an asynchronous offline via
journal_file_set_offline(oldfile, false), and add the file to a
per-server set of deferred closes to be closed later when they
won't block.

There's one complication however; journal_file_open() via
journal_file_verify_header() assumes that any writable journal in the
online state is the product of an unclean shutdown or other form of
corruption.

Thus there's a need for journal_file_open() to be aware of deferred
closes and synchronize with their completion when opening preexisting
journals for writing.  To facilitate this the deferred closes set is
supplied to the journal_file_open() function where the deferred closes
may be closed synchronously before verifying the header in such
circumstances.
2016-02-19 18:50:20 -08:00
Vito Caputo ac2e41f510 journal: asynchronous journal_file_set_offline()
This adds a wait flag to journal_file_set_offline(), when false the offline is
performed asynchronously in a separate thread.

When wait is true, if an asynchronous offline is already in-progress it is
restarted and waited for.  Otherwise the offline is performed synchronously
without the use of a thread.

journal_file_set_online() cancels or waits for the asynchronous offline to
complete if in-flight, depending on where in the offline process the thread
happens to be.  If the thread is in the fsync() phase, it is cancelled and
waiting is unnecessary.  Otherwise, the thread is joined before proceeding.

A new offline_state member is added to JournalFile which is used via
atomic operations for communicating between the offline thread and the
journal_file_set_{offline,online}() functions.
2016-02-19 18:50:20 -08:00
Vito Caputo 69a3a6fd3d journal: add void cast to journal_file_close() calls 2016-02-19 18:50:16 -08:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Vito Caputo 089ed40bf4 journal: remove template from open_journal args
None of the callers take advantage of this parameter, it's always NULL,
this is just a private helper function to simplify the call sites so
drop the template parameter altogether.  If a caller emerges later who
needs it, it can be restored.
2016-02-05 21:30:53 -08:00
Tom Gundersen 9766c16bd0 Merge pull request #2440 from poettering/journal-fix
journald: minor fixes
2016-01-26 18:16:48 +01:00
Lennart Poettering 4850d39ab7 journald: add a couple of static asserts checking logging constants
Whenever we include a log level or facility in a journal string field, make sure the compiler checks for us that that's
actually the right thing to do.
2016-01-26 14:43:24 +01:00
Lennart Poettering 1d35b2d6e2 Merge pull request #2424 from keszybz/journald-disk-usage
Journald disk usage
2016-01-26 14:20:45 +01:00
Lennart Poettering e167d7fd8d journald: minor fixes
This primarily contains some minor coding style fixups for 7a24f3bf2f and earlier changes. Specifically:

* Don't log at log levels above LOG_DEBUG from "library" code like journal-file.c

* Don't negate errno values before passing them to log_debug_errno(), as the call can handle this fine anyway

* Cast some calls we knowingly ignore the return values of to (void)

* Don't clobber function call-by-ref return values on failure

* Don't mix function calls and variable declarations in one line

There's also one more relevant change: when failing to enqueue a journal change fs event, we'll run it immediately.
2016-01-26 14:13:30 +01:00
Zbigniew Jędrzejewski-Szmek 32917e3388 journald: restore oom safety
v2:
- use xsprintf
2016-01-25 10:53:51 -05:00
Zbigniew Jędrzejewski-Szmek 9d5a981398 Merge pull request #2318 from vcaputo/coalesce-ftruncates-redux
journal: coalesce ftruncate()s in 250ms windows
2016-01-23 22:09:51 -05:00
Zbigniew Jędrzejewski-Szmek 282c5c4e42 journald: use structured message + catalog entry for disk usage
The format of the journald disk usage log entry was changed back and
forth a few times. It is annoying to have a very verbose message, but
if it is short it is hard to understand. But we have a tool for this,
the catalogue.

$ journalctl -x -u systemd-journald
Jan 23 18:48:50 rawhide systemd-journald[891]: Runtime journal (/run/log/journal/) is 8.0M, max 196.2M, 188.2M free.
-- Subject: Disk space used by the journal
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Runtime journal (/run/log/journal/) is currently using 8.0M.
-- Maximum allowed usage is set to 196.2M.
-- Leaving at least 294.3M free (of currently available 1.9G of disk space).
-- Enforced usage limit is thus 196.2M, of which 188.2M are still available.
--
-- The limits controlling how much disk space is used by the journal may
-- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=,
-- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in
-- /etc/systemd/journald.conf. See journald.conf(5) for details.
Jan 23 18:48:50 rawhide systemd-journald[891]: System journal (/var/log/journal/) is 480.1M, max 1.6G, 1.2G free.
-- Subject: Disk space used by the journal
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- System journal (/var/log/journal/) is currently using 480.1M.
-- Maximum allowed usage is set to 1.6G.
-- Leaving at least 2.5G free (of currently available 5.8G of disk space).
-- Enforced usage limit is thus 1.6G, of which 1.2G are still available.
--
-- The limits controlling how much disk space is used by the journal may
-- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=,
-- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in
-- /etc/systemd/journald.conf. See journald.conf(5) for details.
2016-01-23 19:49:00 -05:00
Zbigniew Jędrzejewski-Szmek 8a03c9ef74 journald: allow additional payload in server_driver_message
The code to format the iovec is shared with log.c. All call sites to
server_driver_message are changed to include the additional "MESSAGE="
part, but the new functionality is not used and change in functionality
is not expected.

iovec is preallocated, so the maximum number of messages is limited.
In server_driver_message N_IOVEC_PAYLOAD_FIELDS is currently set to 1.

New code is not oom safe, it will fail if memory cannot be allocated.
This will be fixed in subsequent commit.
2016-01-23 19:49:00 -05:00
Zbigniew Jędrzejewski-Szmek ff82c36c79 journald: do not free uninitialized pointer in error path 2016-01-18 15:21:28 -05:00
Vito Caputo 7a24f3bf2f journal: coalesce ftruncate()s in 250ms windows
Prior to this change every journal append causes an ftruncate() for the
sake of inotify propagation of the mmap-based writes.

With this change the notification is deferred up to ~250ms, coalescing
any repeated journal writes during the deferred period into a single
ftruncate().  The ftruncate() call isn't free and doing it on every
append adds unnecessary overhead and latency in the journald event loop.

Introduces journal_file_enable_post_change_timer() which manages a
timer on the provided sd-event instance for scheduling coalesced
ftruncates.  The ftruncate() behavior is unchanged unless
journal_file_enable_post_change_timer() is called on the JournalFile.

While not a tremendous improvement, profiling systemd-journald event loop
latencies using instrumentation as introduced by 34b8751 it was observed that
coalescing the ftruncates was low-hanging fruit worth pursuing.

Note orders 12 and 13 shifting left into order 11 and order 6 dipping into
order 5:

Unmodified:
     log2(us)   1 2 3  4 5  6   7   8  9   10 11   12   13 14 15 16 17 18 19
                -----------------------------------------------------------
[10685.414572]  0 0 0  0 38 602 61  2  290 60 1643 2554 13 1  4  1  0  0  1
[10690.415114]  0 0 0  0 0  646 54  7  309 44 2073 2148 17 1  3  0  0  0  1
[10695.415509]  0 0 0  0 1  650 73  3  324 37 2071 2270 9  0  0  1  0  1  0
[10700.416297]  0 0 0  0 0  659 50  4  318 38 2111 2152 6  0  1  0  0  1  1
[10705.417136]  0 0 0  0 2  660 48  4  320 38 2129 2146 12 1  1  0  0  1  1
[10710.489114]  0 0 0  0 0  673 38  3  321 37 1925 2339 7  0  0  0  0  1  1
[10715.489613]  0 0 0  0 3  656 64  8  317 48 2365 2007 7  0  0  0  0  0  1

Coalesced:
     log2(us)   1 2 3  4 5  6   7   8  9   10 11   12   13 14 15 16 17 18 19
                -----------------------------------------------------------
[ 6169.161360]  0 0 0  1 24 786 54  11 389 24 4192 771  6  4  0  0  1  0  1
[ 6174.161705]  0 0 0  1 18 800 35  6  380 27 3977 893  3  1  0  0  1  0  1
[ 6179.162741]  0 0 0  1 28 768 51  4  391 16 3998 831  5  3  0  0  0  0  2
[ 6184.162856]  0 0 0  0 19 770 60  2  376 26 3795 1004 9  5  1  0  1  0  1
[ 6189.163279]  0 0 0  0 28 761 49  7  372 27 3729 1056 3  2  0  0  1  0  1
[ 6194.164255]  0 0 0  0 25 785 49  7  394 19 3996 908  6  3  2  0  0  0  1
[ 6199.164658]  0 0 0  0 29 797 35  5  389 18 3995 898  3  4  1  1  1  0  1

The remaining high-order delays are a result of the synchronous fsyncs in
systemd-journald, beyond the scope of this commit.
2016-01-14 16:36:07 -08:00
David Herrmann de418eb91c Merge pull request #2053 from poettering/selinux-fix
Two unrelated fixes
2015-11-30 19:30:03 +01:00
Zbigniew Jędrzejewski-Szmek 5c3bde3fa8 journal: move the gist of server_fix_perms to acl-util.[hc]
Most of the function is moved to acl-util.c to make it possible to
add tests in subsequent commit.

Setting of the mode in server_fix_perms is removed:
- we either just created the file ourselves, and the permission be better right,
- or the file was already there, and we should not modify the permissions.

server_fix_perms is renamed to server_fix_acls to better reflect new
meaning, and made static because it is only used in one file.
2015-11-27 23:32:32 -05:00
Lennart Poettering 6355e75610 selinux: split up mac_selinux_have() from mac_selinux_use()
Let's distuingish the cases where our code takes an active role in
selinux management, or just passively reports whatever selinux
properties are set.

mac_selinux_have() now checks whether selinux is around for the passive
stuff, and mac_selinux_use() for the active stuff. The latter checks the
former, plus also checks UID == 0, under the assumption that only when
we run priviliged selinux management really makes sense.

Fixes: #1941
2015-11-27 20:28:13 +01:00