Commit graph

1301 commits

Author SHA1 Message Date
Evgeny Vereshchagin 5535d8f7a9 cgroup: assume the use of v1 when all the preceding checks fail (#7366)
This patch restores the default that was changed in 2977724b09,
making the tools depending on it work again.

Closes: #6477 and https://github.com/lxc/lxc/issues/1669
2017-11-17 09:47:49 +01:00
Lennart Poettering d3070fbdf6 core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald
And let's make use of it to implement two new unit settings with it:

1. LogLevelMax= is a new per-unit setting that may be used to configure
   log priority filtering: set it to LogLevelMax=notice and only
   messages of level "notice" and lower (i.e. more important) will be
   processed, all others are dropped.

2. LogExtraFields= is a new per-unit setting for configuring per-unit
   journal fields, that are implicitly included in every log record
   generated by the unit's processes. It takes field/value pairs in the
   form of FOO=BAR.

Also, related to this, one exisiting unit setting is ported to this new
facility:

3. The invocation ID is now pulled from /run/systemd/units/ instead of
   cgroupfs xattrs. This substantially relaxes requirements of systemd
   on the kernel version and the privileges it runs with (specifically,
   cgroupfs xattrs are not available in containers, since they are
   stored in kernel memory, and hence are unsafe to permit to lesser
   privileged code).

/run/systemd/units/ is a new directory, which contains a number of files
and symlinks encoding the above information. PID 1 creates and manages
these files, and journald reads them from there.

Note that this is supposed to be a direct path between PID 1 and the
journal only, due to the special runtime environment the journal runs
in. Normally, today we shouldn't introduce new interfaces that (mis-)use
a file system as IPC framework, and instead just an IPC system, but this
is very hard to do between the journal and PID 1, as long as the IPC
system is a subject PID 1 manages, and itself a client to the journal.

This patch cleans up a couple of types used in journal code:
specifically we switch to size_t for a couple of memory-sizing values,
as size_t is the right choice for everything that is memory.

Fixes: #4089
Fixes: #3041
Fixes: #4441
2017-11-16 12:40:17 +01:00
Zbigniew Jędrzejewski-Szmek 9aa2113365 util-lib: add debug messages when checking cgroup layout
This has become very complex, let's make it a bit easier to diagnose.
2017-11-15 22:58:24 +01:00
Topi Miettinen 9c6888ac45 basic: remove redundant check (#7320)
The check is redundant as the whole block is only evaluated if
__IGNORE_pkey_mprotect is not defined. Change to #else.
2017-11-13 22:00:03 +01:00
Lennart Poettering f38326f21a
Merge pull request #7284 from poettering/cgroup-delegate-mask
add a concept of delegating cgroups per unit while enabling specific controllers
2017-11-13 12:14:23 +01:00
Lennart Poettering 7546145e26 string-util: add delete_trailing_chars() and skip_leading_chars() helpers
And let's port over a couple of users to the new APIs.
2017-11-13 10:47:15 +01:00
Lennart Poettering 57d6f70019 fileio: make use of DEFINE_TRIVIAL_CLEANUP_FUNC where it makes sense 2017-11-13 10:24:03 +01:00
Lennart Poettering ec635a2d21 cgroup: improve cg_mask_to_string a bit, and add tests for it 2017-11-13 10:24:03 +01:00
Lennart Poettering 00b4a24743 cgroup-util: add brief comments clarifying which controllers are v2-only and which v1-only 2017-11-13 10:24:03 +01:00
Zbigniew Jędrzejewski-Szmek 213f2883c0 basic/missing: add numbers for pkey_mprotect
Follow-up for b835eeb4ec.
2017-11-13 09:27:53 +01:00
Zbigniew Jędrzejewski-Szmek edc3eff50f
Merge pull request #7301 from poettering/loginctl-ellipsize
Fix loginctl seat sysfs tree ellipsation logic.

Simple reproducer:
loginctl --full seat-status seat0|cat
→ after this PR, all lines are shown in full. Before, lines were ellipsized to terminal width.
2017-11-12 16:25:54 +01:00
Zbigniew Jędrzejewski-Szmek f886559e55
Merge pull request #7186 from poettering/track-deps
rework unit dependency data structure to track why deps get created
2017-11-12 16:14:41 +01:00
Yu Watanabe cf26c4a722 parse-util: add parse_errno() and parse_syscall_and_errno() 2017-11-11 21:29:17 +09:00
Yu Watanabe 1850d0d271 basic/errno-list: remove errno_max() and define ERRNO_MAX as 4095
In Linux kernel code, MAX_ERRNO is defined as 4095.
Here, we use that value for ERRNO_MAX.
2017-11-11 21:28:25 +09:00
Lennart Poettering ddbc931986 string-util: when ellipsizing to a length if (size_t) -1, become a NOP
Let's say that (size_t) -1 (i.e. SIZE_T_MAX) is equivalent to
"unbounded" ellipsation, i.e. ellipsation as NOP. In which case the
relevant functions become little more than strdup()/strndup().

This is useful to simplify caller code in case we want to turn off
ellipsation in certain code paths with minimal caller-side handling for
this.
2017-11-10 21:41:53 +01:00
Lennart Poettering 45e1c7d5ca virt: trivial whitespace fixes 2017-11-10 19:00:06 +01:00
Zbigniew Jędrzejewski-Szmek 556c7bae0c basic/hashmap: add cleanup of memory pools (#7164)
It was dropped in 89439d4fc0. As a result, every
process that uses a hashmap allocates and then leaks the hashmap mempools.
The mempools are only allocated in the main thread, but we don't know where
the memory is used.

So let's check if we are the last thread and free the mempools then. This is
fairly heavy, because /proc/self/status has to be opened and parsed, but we do
it only when compiled for valgrind, i.e. not by default, and compared to running
under valgrind or asan, the extra cost is acceptable. The big advantage is that
we don't have to think or filter out this false positive.

As a micro-opt, cleanup is attempted only in the main thread. We could allow
any thread to check if it is the last one and perform cleanup, but that'd mean
that we'd have to _do_ the check in every thread. We don't use threads like
that, our non-main threads are always short-lived, so let's just accept the
possibility that we'll leak memory if a thread survives. The check is also
non-atomic, but it's called in a destructor of the main thread _and_ we do
cleanup only when there are no other threads, so the risk of some library
suddenly spawning another thread is very low. All in all, this is not perfect,
but should work in 999‰ of cases.

Fixes the following valgrind warning:

==22564== HEAP SUMMARY:
==22564==     in use at exit: 8,192 bytes in 2 blocks
==22564==   total heap usage: 243 allocs, 241 frees, 151,905 bytes allocated
==22564==
==22564== 4,096 bytes in 1 blocks are still reachable in loss record 1 of 2
==22564==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
==22564==    by 0x4F08A8C: mempool_alloc_tile (mempool.c:62)
==22564==    by 0x4F08B16: mempool_alloc0_tile (mempool.c:81)
==22564==    by 0x4EF8DE0: hashmap_base_new (hashmap.c:748)
==22564==    by 0x4EF8ED9: internal_hashmap_new (hashmap.c:782)
==22564==    by 0x11045D: test_hashmap_copy (test-hashmap-plain.c:87)
==22564==    by 0x115722: test_hashmap_funcs (test-hashmap-plain.c:914)
==22564==    by 0x10FC9D: main (test-hashmap.c:60)
==22564==
==22564== 4,096 bytes in 1 blocks are still reachable in loss record 2 of 2
==22564==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
==22564==    by 0x4F08A8C: mempool_alloc_tile (mempool.c:62)
==22564==    by 0x4F08B16: mempool_alloc0_tile (mempool.c:81)
==22564==    by 0x4EF8DE0: hashmap_base_new (hashmap.c:748)
==22564==    by 0x4EF8EF8: internal_ordered_hashmap_new (hashmap.c:786)
==22564==    by 0x10A2A0: test_ordered_hashmap_copy (test-hashmap-ordered.c:89)
==22564==    by 0x10F70F: test_ordered_hashmap_funcs (test-hashmap-ordered.c:916)
==22564==    by 0x10FCA2: main (test-hashmap.c:61)
==22564==
==22564== LEAK SUMMARY:
==22564==    definitely lost: 0 bytes in 0 blocks
==22564==    indirectly lost: 0 bytes in 0 blocks
==22564==      possibly lost: 0 bytes in 0 blocks
==22564==    still reachable: 8,192 bytes in 2 blocks
==22564==         suppressed: 0 bytes in 0 blocks

v2:
- check if we are the main thread

v3:
- check if there are no other threads
2017-11-10 15:44:58 +01:00
tblume ed457f1380 systemd-firstboot: add vconsole keymap support (#7035)
Enable systemd-firstboot to set the keymap.

RFE:

https://github.com/systemd/systemd/issues/6346
2017-11-10 10:31:44 +01:00
Zbigniew Jędrzejewski-Szmek 3f6914175c util-lib: mark variable with _unused_ to silence clang warning
_unused_ means "the variable is meant to be possible unused and gcc
will not generate a warning about it", which is exactly what we need here,
since we're only declaring it for the side effect of _cleanup_.
2017-11-01 23:10:25 +01:00
Zbigniew Jędrzejewski-Szmek ac097c841e Remove a bunch of unused variables
gcc does not warn about those, because of the _cleanup_ usage.
clang is smarter here.
2017-11-01 23:06:44 +01:00
Lennart Poettering 8c4a8ea2ac fs-util: small tweak in chase_symlinks()
If we follow an absolute symlink there's no need to prefix the path with
a "/", since by definition it already has one.

This helps suppressing double "/" in resolved paths containing absolute
symlinks.
2017-10-26 17:54:56 +02:00
Lennart Poettering f7c9f4a2a9 btrfs-util: when opening subvolume fds, always set O_NOFOLLOW
Some of the btrfs utility functions already used O_NOFOLLOW others
didn't. Let's streamline this, and refuse operation when we are called
for symlinks on "remove" and "snapshot" too.

In particular in the "remove" case following symlinks is a bad idea, and
is quite different from how unlink() and friends work, which always
remove the symlink, and not the destination, a logic we should follow
here too.
2017-10-26 17:54:56 +02:00
Razvan Cojocaru 530c1c3028 systemd-detect-virt: refine hypervisor detection (#7171)
Continue to try to get more details about the actual underlying
hypervisor with successive tests until none are available.
This fixes issue #7165.
2017-10-26 16:59:04 +02:00
Lennart Poettering 35682fd4a1 Merge pull request #7127 from keszybz/sundry-tweaks
Various unrelated small patches
2017-10-26 10:57:00 +02:00
Zbigniew Jędrzejewski-Szmek c47f86e660 util-lib: simplify kexec_loaded() 2017-10-18 17:14:05 +02:00
Zbigniew Jędrzejewski-Szmek 14ce0c25c2 timedatectl: stop using xstrftime
When using strftime in arbitrary locales, we cannot really say how big the
buffer should be. Let's make the buffer "large", which will work fine pretty
much always, and just print n/a if the timestamp does not fit. strftime returns
0 if the buffer is too small and a NUL-terminated string otherwise, so we
can drop the size specifications in string formatting.

$ export LANG=fa_IR.UTF-8
$ date
چهارشنبه ۱۸ اكتبر ۱۷، ساعت ۱۰:۵۴:۲۴ (+0330)
$ timedatectl
Assertion 'xstrftime: a[] must be big enough' failed at ../src/timedate/timedatectl.c:105, function print_status_info(). Aborting.

now:

$ timedatectl
        Local time: چهارشنبه 2017-10-18 16:29:40 CEST
    Universal time: چهارشنبه 2017-10-18 14:29:40 UTC
          RTC time: چهارشنبه 2017-10-18 14:29:40
…

https://bugzilla.redhat.com/show_bug.cgi?id=1503452
2017-10-18 16:30:37 +02:00
Zbigniew Jędrzejewski-Szmek 0051ca1f63 Merge pull request #7061 from lkundrak/lr/serialized-environment
Environment serialization/deserialization inconsistently validates the variables
2017-10-15 12:47:30 +02:00
Lennart Poettering 523578aa6d basic: split unit-name.[ch] into two (#7065)
It always bothered me a bit that unit-name.[ch] contains so many
definitions that aren't really have much to do with unit nameing, for
example all the unit state definitions.

With this patch unit-name.[ch] is split into two: the file now contains
only the unit naming related operations, and everything else is split
out into a new set of files unit-def.[ch]. That's mostly unit state
stuff as well as dbus path and interface name operations.

No functional changes. This just moves code around.

(Note as both .c files include each other's headers this doesn't make
the build simpler or anything. All it does is make the C files a bit
shorter, and medicate my pretend OCD)
2017-10-11 20:21:28 +02:00
Lubomir Rintel c7d797bbdf basic/env-util: don't relax unesaping of serialized environment strings
We wrote them ourselves -- they shouldn't contain invalid sequences.
2017-10-11 15:05:38 +02:00
Lubomir Rintel ea43bdd1d7 basic/env-util: drop the validation when deserializing environment
The environment variables we've serialized can quite possibly contain
characters outside the set allowed by env_assignment_is_valid(). In
fact, my environment seems to contain a couple of these:

  * TERMCAP set by screen contains a '\x7f' character
  * BASH_FUNC_module%% variable has a '%' character in name

Strict check of environment variables name and value certainly makes sense for
unit files, but not so much for deserialization of values we already had
in our environment.
2017-10-11 15:01:32 +02:00
Zbigniew Jędrzejewski-Szmek 651d47d14b tests: skip tests when cg_pid_get_path fails (#7033)
v2:
- cast the fstype_t type to ull, because it varies between arches.
  Making it long long should be on the safe side.
2017-10-10 20:55:20 +02:00
Lennart Poettering b74023db06 Merge pull request #7003 from yuwata/enable-dynamic-user
timesyncd, journal-upload: Enable DynamicUser=
2017-10-10 10:05:43 +02:00
Zbigniew Jędrzejewski-Szmek 232ac0d681 util-lib: introdude _cleanup_ macros for kmod objects 2017-10-08 22:04:07 +02:00
Yu Watanabe c31ad02403 mkdir: introduce follow_symlink flag to mkdir_safe{,_label}() 2017-10-06 16:03:33 +09:00
Lennart Poettering eae51da36e unit: when JobTimeoutSec= is turned off, implicitly turn off JobRunningTimeoutSec= too
We added JobRunningTimeoutSec= late, and Dracut configured only
JobTimeoutSec= to turn of root device timeouts before. With this change
we'll propagate a reset of JobTimeoutSec= into JobRunningTimeoutSec=,
but only if the latter wasn't set explicitly.

This should restore compatibility with older systemd versions.

Fixes: #6402
2017-10-05 13:06:44 +02:00
Zbigniew Jędrzejewski-Szmek 03d4358277 Merge pull request #6975 from sourcejedi/logind_pid_0_v2
Selectively revert "tree-wide: use pid_is_valid() at more places"
2017-10-04 21:33:52 +02:00
Zbigniew Jędrzejewski-Szmek 5991ce44dc udevadm,basic: replace nulstr_contains with STR_IN_SET (#6965)
STR_IN_SET is a newer approach which is easier to write and read, and which
seems to result in space savings too:

before:
4949848 build/src/shared/libsystemd-shared-234.so
 350704 build/systemctl
4967184 build/systemd
 826216 build/udevadm

after:
4949848 build/src/shared/libsystemd-shared-234.so
 350704 build/systemctl
4966888 build/systemd
 826168 build/udevadm
2017-10-04 19:32:12 +02:00
Lennart Poettering 4aa1d31c89 Merge pull request #6974 from keszybz/clean-up-defines
Clean up define definitions
2017-10-04 19:25:30 +02:00
Yu Watanabe 4c70109600 tree-wide: use IN_SET macro (#6977) 2017-10-04 16:01:32 +02:00
Zbigniew Jędrzejewski-Szmek f9fa32f09c build-sys: s/HAVE_SMACK/ENABLE_SMACK/
Same justification as for HAVE_UTMP.
2017-10-04 12:09:50 +02:00
Zbigniew Jędrzejewski-Szmek 392fd235fd build-sys: s/HAVE_IMA/ENABLE_IMA/
Same justification as for HAVE_UTMP.
2017-10-04 12:09:50 +02:00
Zbigniew Jędrzejewski-Szmek 3211da4bcb build-sys: s/HAVE_UTMP/ENABLE_UTMP/
"Have" should be about the external environment and dependencies. Anything
which is a pure yes/no choice should be "enable".
2017-10-04 12:09:50 +02:00
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that is the name is mispellt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
Alan Jenkins 07b38ba51e Revert "tree-wide: use pid_is_valid() at more places"
This reverts commit ee043777be.

It broke almost everywhere it touched.  The places that
handn't been converted, were mostly followed by special
handling for the invalid PID `0`.  That explains why they
tested for `pid < 0` instead of `pid <= 0`.

I think that one was the first commit I reviewed, heh.
2017-10-03 12:43:24 +01:00
Zbigniew Jędrzejewski-Szmek 4b9545f19e build-sys: change all HAVE_DECL_ macros to HAVE_
This is a legacy of autotools, where one detection routine used a different
prefix then the others.

$ git grep -e HAVE_DECL_ -l|xargs sed -i s/HAVE_DECL_/HAVE_/g
2017-10-03 10:32:34 +02:00
Yu Watanabe 8502cadd4c Merge pull request #6940 from poettering/magic-dirs
make sure StateDirectory= and friends play nicely with DynamicUser= and RootImage=/RootDirectory=
2017-10-03 13:28:48 +09:00
Zbigniew Jędrzejewski-Szmek ebfa2c158e Merge pull request #6943 from poettering/dissect-ro
Automatically recognize that "squashfs" and "iso9660" area always read-only
2017-10-02 20:39:08 +02:00
Lennart Poettering 2a5beb669f path-util: some updates to path_make_relative()
Don't miscount number of "../" to generate, if we "." is included in an
input path.

Also, refuse if we encounter "../" since we can't possibly follow that
up properly, without file system access.

Some other modernizations.
2017-10-02 17:41:44 +02:00
Zbigniew Jędrzejewski-Szmek d29ad81e3b Minor line wrapping adjustment 2017-10-02 14:52:12 +02:00
Andreas Rammhold ec2ce0c5d7
tree-wide: use !IN_SET(..) for a != b && a != c && …
The included cocci was used to generate the changes.

Thanks to @flo-wer for pointing this case out.
2017-10-02 13:09:56 +02:00