4847 commits

Author SHA1 Message Date
Frantisek Sumsal 4a2c3dc318
Merge pull request #11252 from evverx/use-asan-wrapper-on-travis-ci
travis: run PID1, journald and everything else under ASan+UBsan
2019-01-06 18:48:38 +01:00
Zbigniew Jędrzejewski-Szmek 3042bbebdd tree-wide: use c99 static for array size declarations
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html

This only works with clang; unfortunately, gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).

Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
                        candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
                                            ^                                           ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
                                                               ^~~~~~~~~~~~
2019-01-04 12:37:25 +01:00
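To illustrate the `static` array-size syntax from the commit above, here is a minimal sketch. The functions `key_checksum()` and `key_checksum_checked()` are hypothetical and not part of systemd; only the `k[static 16]` declaration style is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not systemd code: `static 16` tells the compiler
 * that callers must pass a pointer to at least 16 valid elements. */
uint32_t key_checksum(const uint8_t k[static 16]) {
        uint32_t sum = 0;

        for (size_t i = 0; i < 16; i++)
                sum += k[i];

        return sum;
}

/* For buffers whose size is only known at run time the compiler cannot
 * help, so the same guarantee is checked by hand before the call. */
uint32_t key_checksum_checked(const uint8_t *buf, size_t len) {
        assert(buf);
        assert(len >= 16);

        return key_checksum(buf);
}
```

Passing a `uint8_t short_key[15]` directly to `key_checksum()` is the kind of call that clang flags with -Warray-bounds, as in the simulated error quoted in the commit.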
YunQiang Su f5855697aa Pass separate dev_t var to device_path_parse_major_minor
On MIPS/O32, the st_rdev member of struct stat is unsigned long, which
is 32-bit, while dev_t is defined as 64-bit. This causes problems
in device_path_parse_major_minor.

Don't pass st.st_rdev and st.st_mode to device_path_parse_major_minor;
pass two separate variables instead. The result of stat is also copied
out into these two variables. Fixes: #11247
2019-01-03 15:04:08 +01:00
Yu Watanabe a92f2af28a
Merge pull request #11230 from keszybz/version-string-alt
Generate version string from git describe (alternative approach)
2019-01-03 21:33:55 +09:00
Chris Down 4e1dfa45e9 cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.

Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups?  ?v?([0-9])/cgroup v\1/gI'`.

I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
2019-01-03 11:32:40 +09:00
Evgeny Vereshchagin 7e11a95e41 tests: reproduce https://github.com/systemd/systemd/issues/11251 2018-12-29 19:14:28 +01:00
Dave Reisner 8ca9e92c74 Make default locale a compile time option
Default to a locale that's guaranteed to exist everywhere, but let
distros override this with something more exotic if they choose to.

Closes #11259.
2018-12-29 21:43:04 +09:00
Evgeny Vereshchagin 7334ade4a7 core: free lines after reading them
Closes https://github.com/systemd/systemd/issues/11251.
2018-12-28 15:36:00 +09:00
Patrick Williams 8eab766804 core: support %j in unit dependency resolution
Commit 250e9fadbc introduced
support for %j/%J specifier in unit files.  The function
unit_name_printf is used in unit dependency resolution,
such as Wants / After directives, but was missing support
for the %j.  Add to allow directives such as:

    [Unit]
    Wants=bar-%j.target

Fixes: systemd/systemd#11217
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
2018-12-22 17:21:13 +09:00
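As a rough illustration of what `%j` stands for in the commit above: systemd.unit(5) describes it as the final component of the unit name prefix, i.e. the text after the last "-". The sketch below assumes exactly that definition; `prefix_final_component()` is a hypothetical helper and not the code behind unit_name_printf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the systemd implementation: returns a newly
 * allocated copy of the final dash-separated component of the unit name
 * prefix. The caller frees the result. */
char *prefix_final_component(const char *unit_name) {
        assert(unit_name);

        /* The prefix ends at '@' for template/instance names, otherwise
         * at the last '.', which starts the unit type suffix. */
        size_t n = strcspn(unit_name, "@");
        if (unit_name[n] == '\0') {
                const char *dot = strrchr(unit_name, '.');
                n = dot ? (size_t) (dot - unit_name) : strlen(unit_name);
        }

        /* Skip everything up to and including the last '-' in the prefix. */
        size_t start = 0;
        for (size_t i = 0; i < n; i++)
                if (unit_name[i] == '-')
                        start = i + 1;

        char *r = malloc(n - start + 1);
        if (!r)
                return NULL;

        memcpy(r, unit_name + start, n - start);
        r[n - start] = '\0';
        return r;
}
```

Under this assumption, a unit named `foo-bar.service` containing `Wants=bar-%j.target` would end up wanting `bar-bar.target`.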
Zbigniew Jędrzejewski-Szmek 3d3075e309 meson: simplify handling of pkgconfigdatadir=no, pkgconfiglibdir=no
The idea was that those vars could be configured to 'no' to not install the .pc
files, or they could be set to '', and then they would be built but not
installed. This was inherited from the autoconf build system. This couldn't
work because '' is replaced by the default value. Also, having this level of
control doesn't seem necessary, since creating those files is very
quick. Skipping with 'no' was implemented only for systemd.pc and not the other
.pc files. Let's simplify things and skip installation if the target dir
is configured as 'no' for all .pc files.
2018-12-21 13:43:20 +01:00
Zbigniew Jędrzejewski-Szmek 681bd2c524 meson: generate version tag from git
$ build/systemctl --version
systemd 239-3555-g6178cbb5b5
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
$ git tag v240 -m 'v240'
$ ninja -C build
ninja: Entering directory `build'
[76/76] Linking target fuzz-unit-file.
$ build/systemctl --version
systemd 240
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid

This is very useful during development, because a precise version string is
embedded in the build product and displayed during boot, so we don't have to
guess answers for questions like "did I just boot the latest version or the one
from before?".

This change creates an overhead for "noop" builds. On my laptop, 'ninja -C
build' that does nothing goes from 0.1 to 0.5 s. It would be nice to avoid
this, but I think that <1 s is still acceptable.

Fixes #7183.

PACKAGE_VERSION is renamed to GIT_VERSION, to make it obvious that this is the
more dynamically changing version string.

Why save to a file? It would be easy to generate the version tag using
run_command(), but we want to go through a file so that stuff gets rebuilt when
this file changes. If we just defined a variable in meson, ninja wouldn't know
it needs to rebuild things.
2018-12-21 13:43:20 +01:00
Zbigniew Jędrzejewski-Szmek a67c318df8 meson: define PROJECT_VERSION as the "bare" project version
Let's not use atoi() if we can simply provide the project version as a number.

In C code, this is the numerical project version. In substitutions in other
files, this is just the bare substitution.

The "PACKAGE_" prefix is from autotools, and is strange. We call systemd a
"project", and "package" is something that distros build. Let's rename.

PACKAGE_URL is renamed to PROJECT_URL for the same reasons and for consistency.

(This leaves PACKAGE_VERSION as the stringified define for C code.)
2018-12-20 21:35:29 +01:00
Zbigniew Jędrzejewski-Szmek ec8126d723 Revert "core/mount: minimize impact on mount storm."
This reverts commit 89f9752ea0.

This patch causes various problems during boot, where a "mount storm" occurs
naturally. The current approach is flaky, and it seems very risky to push a
feature like this which impacts boot right before a release. So let's revert
for now, and consider a more robust solution later.

Fixes #11209.

> https://github.com/systemd/systemd/pull/11196#issuecomment-448523186:
"Reverting 89f9752ea0 and fcfb1f775e fixes this test."
2018-12-19 11:37:41 +01:00
Zbigniew Jędrzejewski-Szmek e36db50075 Revert "mount: disable mount-storm protection while mount unit is starting."
This reverts commit fcfb1f775e.
2018-12-19 11:32:17 +01:00
Zbigniew Jędrzejewski-Szmek d2aaf13099 Remove use of PACKAGE_STRING
PACKAGE_VERSION is more explicit, and also, we don't pretend that changing the
project name in meson.build has any real effect. "systemd" is embedded in a
thousand different places, so let's just use the hardcoded string consistently.
This is mostly in preparation for future changes.
2018-12-19 09:29:32 +01:00
NeilBrown fcfb1f775e mount: disable mount-storm protection while mount unit is starting.
The starting of mount units requires that changes to
/proc/self/mountinfo be processed before the SIGCHLD from the
completion of /sbin/mount is processed, as described by the comment
  /* Note that due to the io event priority logic, we can be sure the new mountinfo is loaded
   * before we process the SIGCHLD for the mount command. */

The recently-added mount-storm protection can defeat this as it
will sometimes deliberately delay processing of /proc/self/mountinfo.

So we need to disable mount-storm protection when a mount unit is starting.
We do this by keeping a counter of the number of pending
mounts, and disabling the protection when this is non-zero.

Thanks to @asavah for finding and reporting this problem.
2018-12-19 00:44:19 +01:00
Michal Sekletar 4c70a4a748 core: do cgroup migration first and only then connect to journald
Fixes #11162
2018-12-17 19:22:30 +01:00
Alexey Bogdanenko 8f9f3cb724 core: fix KeyringMode for user services
KeyringMode option is useful for user services. Also, documentation for the
option suggests that the option applies to user services. However, setting the
option to any of its allowed values has no effect.

This commit fixes that and removes EXEC_NEW_KEYRING flag. The flag is no longer
necessary: instead of checking if the flag is set we can check if keyring_mode
is not equal to EXEC_KEYRING_INHERIT.
2018-12-17 16:56:36 +01:00
Zbigniew Jędrzejewski-Szmek 582de70f2f
Merge pull request #11086 from poettering/nscd-cache-flush
flush nscd's caches when we register user/groups/hostnames
2018-12-17 11:29:58 +01:00
Khem Raj baa162cecd core: Fix use after free case in load_from_path()
Ensure that mfree() on filename is called after the logging function
which uses the string pointed to by filename.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2018-12-16 22:02:00 -08:00
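The ordering issue fixed by the commit above can be sketched as follows. This is not the code from load_from_path(): `report_and_release()` is a hypothetical function, the message text is invented, and `mfree()` is a simplified stand-in for the systemd helper of the same name (free the pointer, return NULL).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for systemd's mfree(): free and return NULL. */
static void *mfree(void *p) {
        free(p);
        return NULL;
}

/* Hypothetical example: format a message that mentions *filename first,
 * and release the string only afterwards. Swapping the two statements
 * would read freed memory. */
int report_and_release(char **filename, char *msg, size_t msg_size) {
        assert(filename);
        assert(msg);

        int n = snprintf(msg, msg_size, "Failed to load %s",
                         *filename ? *filename : "(null)");

        *filename = mfree(*filename);
        return n;
}
```

Resetting the pointer to NULL through the return value of `mfree()` also makes any later accidental use easier to spot.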
NeilBrown 89f9752ea0 core/mount: minimize impact on mount storm.
If we create 2000 mounts (on a 1-CPU qemu VM) with
  mkdir -p /MNT/{1..2000}
  time for i in {1..2000}; do mount --bind /etc /MNT/$i ; done

it takes around 20 seconds to complete.  Much of this time is taken up
by systemd repeatedly processing /proc/self/mountinfo.
If I disable the processing, the time drops to about 4 seconds.

I have reports that on a larger system with multiple active user sessions, each
with its own systemd, the impact can be higher.

One particular use-case where a large number of mounts can be expected in quick
succession is when the "clearcase" SCM starts up.

This patch modifies the handling of events from /proc/self/mountinfo so
that systemd backs off when a storm is detected.  Specifically the time to process
mountinfo is measured, and the process will not be repeated until 10 times
that duration has passed.  This ensures systemd won't use more than 10% of
real time processing mountinfo.

With this patch, my test above takes about 5 seconds.
2018-12-16 12:38:40 +01:00
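The back-off rule described in the commit above (measure how long processing took, then wait ten times that long before processing again) can be sketched like this. The `StormLimiter` type and both functions are hypothetical names used for illustration, not the structures from the patch, and timestamps are passed in explicitly rather than read from a clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Hypothetical state, not the structure used by the patch. */
typedef struct StormLimiter {
        usec_t next_allowed; /* earliest time the next run may start */
} StormLimiter;

/* May mountinfo be processed at time `now`? */
bool storm_limiter_may_run(const StormLimiter *l, usec_t now) {
        assert(l);

        return now >= l->next_allowed;
}

/* Record a processing run that lasted from `started` to `finished`;
 * the next run is held back for 10 times that duration. */
void storm_limiter_ran(StormLimiter *l, usec_t started, usec_t finished) {
        assert(l);
        assert(finished >= started);

        l->next_allowed = finished + 10 * (finished - started);
}
```

With a run of duration d followed by a pause of 10 d, processing takes at most d out of every 11 d of real time, which is within the 10% bound stated in the commit message.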
Lennart Poettering 2d41e9b7a0
Merge pull request #11143 from keszybz/enable-symlink
Runtime mask symlink confusion fix
2018-12-16 12:37:07 +01:00
Lennart Poettering 460ec54908 core: flush nscd's caches whenever we allocate/release a dynamic user
This should make dynamic users and nscd work together better.

Fixes: #10740
2018-12-15 12:10:19 +01:00
Zbigniew Jędrzejewski-Szmek 58d9d89b4b pid1: fix free of uninitialized pointer in unit_fail_if_noncanonical()
https://bugzilla.redhat.com/show_bug.cgi?id=1653068
2018-12-14 11:21:16 +01:00
Chris Down 0c17c00433
Merge pull request #11131 from poettering/make-lucab-happy
optionally relabel additional files/dirs for selinux after loading policy
2018-12-12 23:17:39 +00:00
Chris Down cb5e3bc37d cgroup: Don't explicitly check for member in UNIT_BEFORE
The parent slice is always filtered ahead of time from UNIT_BEFORE, so
checking if the current member is the same as the parent unit will never
pass.

I may also write a SLICE_FOREACH_CHILD macro to remove some more of the
parent slice checks, but this requires a bit of a rework and general
refactoring and may not be worth it, so let's just do this for now.
2018-12-12 20:50:10 +01:00
Chris Down e92aaed30e tree-wide: Remove O_CLOEXEC from fdopen
fdopen doesn't accept "e", it's ignored. Let's not mislead people into
believing that it actually sets O_CLOEXEC.

From `man 3 fdopen`:

> e (since glibc 2.7):
> Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen()

As mentioned by @jlebon in #11131.
2018-12-12 20:47:40 +01:00
Lennart Poettering 70a74ec645 mount-setup: don't consider it reason to fail if we can't relabel cgroupfs
We usually don't care much about relabel failures, let's not do that
here either.
2018-12-12 20:46:07 +01:00
Lennart Poettering c4217b43d1 mount-setup: use FOREACH_STRING where appropriate 2018-12-12 20:46:07 +01:00
Lennart Poettering 65e183d789 mount-setup: optionally, relabel a configured set of files/dirs after loading policy
Fixes: #10466
2018-12-12 20:46:07 +01:00
Zbigniew Jędrzejewski-Szmek 303ee60151 Mark *data and *userdata params to specifier_printf() as const
It would be very wrong if any of the specifier printf calls modified
any of the objects or data being printed. Let's mark all arguments as const
(primarily to make it easier for the reader to see where modifications cannot
occur).
2018-12-12 16:45:33 +01:00
Lennart Poettering a95c0505ad core: extend comments regarding coldplug() vs. catchup() 2018-12-12 11:20:53 +01:00
Lennart Poettering a1c7334b61 core: when a unit state changes only propagate to jobs after reloading is complete
Previously, we'd immediately propagate unit state changes into any jobs
pending for them, always. With this we only do this if the manager is
out of the "reload" state. This fixes the problem #8803 tried to
address, by simply not completing jobs until after the reload (and thus
reestablishment of the dbus connection) is complete.

Note that there's no need to later on explicitly catch up with the
missed job state changes (i.e. there's no need to call
unit_process_job() later on explicitly). That's because for jobs in
JOB_WAITING state on deserialization all jobs are requeued into the run
queue anyway, and thus checked again if they can complete now. And for
JOB_RUNNING jobs the unit_catchup() phase is going to trigger missed
state changes *after* the reload completes anyway (after all that's what
distinguishes it from unit_coldplug()).

Replaces: #8803
2018-12-12 11:15:07 +01:00
Lennart Poettering 16c74914d2 core: split out all logic that updates a Job on a unit's unit_notify() invocation
Just some refactoring, no change in behaviour.
2018-12-12 11:15:07 +01:00
Lennart Poettering b17c9620c8 core: rework how we deserialize jobs
Let's add a helper call unit_deserialize_job() for this purpose, and
let's move registration in the global jobs hash table into
job_install_deserialized() so that it is done after all superficial
checks are done, and before transitioning into installed states, so that
rollback code is not necessary anymore.
2018-12-12 11:15:07 +01:00
Lennart Poettering 48235ad6b7 job: be more careful when removing job object from jobs hash table
Let's validate that the ID is actually allocated to us before removing a
job.

This is relevant as various bits of code will call job_free() on
partially set up Job objects, and we really shouldn't remove another job
object accidentally from the hash table, when the set up didn't
complete.
2018-12-12 11:15:07 +01:00
Lennart Poettering 4a53080be6 core: don't track jobs-finishing-during-reload explicitly
Memory management is borked for this, and moreover this is unnecessary
since f0831ed2a0, i.e. since coldplug() and catchup() are two different
concepts: the former restoring the state from before a reload, the
latter then adjusting it again to the actual status in effect after the
reload.

Fixes: #10716
Mostly reverts: #8803
2018-12-12 11:15:06 +01:00
Lennart Poettering 728ba51e98 job: update job_free() to follow our usual return-NULL style 2018-12-12 11:14:26 +01:00
Zbigniew Jędrzejewski-Szmek d7ef125726 core: fix typo in comment 2018-12-11 22:20:07 +01:00
Lennart Poettering d742f4b54b cgroup: correct mangling of return values
Let's not return the unmangled return value before we actually mangle
it.

Fixes: #11062
2018-12-10 16:09:41 +01:00
Lennart Poettering 92a993041a cgroup: call cg_all_unified() right before using the result
Let's not query it before we actually need it.
2018-12-10 16:09:41 +01:00
Lennart Poettering 8f3fd07ac0
Merge pull request #11105 from keszybz/path-parsing
Some tightening of our path parsing code
2018-12-10 15:50:08 +01:00
Zbigniew Jędrzejewski-Szmek 1d4c6f5bef pid1: set Description even for devices which don't exist yet
We'd only set the description after the device appeared in sysfs, so
we'd always print
"A start job is running for dev-disk-by\x2duuid-aaaa ... aaaa.device (42s / 1min 30s)"
Let's make this
"A start job is running for /dev/disk/by-uuid/aaaa ... aaaa (42s / 1min 30s)"

https://bugzilla.redhat.com/show_bug.cgi?id=1655860
2018-12-10 14:00:42 +01:00
Lennart Poettering 2327f95499
Merge pull request #10984 from fbuihuu/tmpfiles-be-more-explicit-with-unsafe-transition
tmpfiles: be more explicit when an unsafe path transition is met
2018-12-10 12:31:56 +01:00
Zbigniew Jędrzejewski-Szmek 4cb06c5949 Use VLA instead of alloca
The test is the same, but an array is more readable.
2018-12-10 11:57:26 +01:00
Zbigniew Jędrzejewski-Szmek 60473f0c23 pid1: fix (harmless) off-by-one in PATH_MAX comparison
PATH_MAX is supposed to include the terminating NUL byte. But we already
check that there is no NUL byte in the specified path. Hence the maximum
length we can expect is PATH_MAX - 1.

This doesn't change much, but makes this use of PATH_MAX consistent with the
rest of the codebase.
2018-12-10 11:57:26 +01:00
Franck Bui 36c97decbe fs-util: make chase_symlinks() return -ENOLINK when unsafe transitions are met
We previously returned -EPERM but it can be returned for various other reasons
too.

Let's use -ENOLINK instead as this value shouldn't be used currently. This
allows users of CHASE_SAFE to detect without any ambiguities when unsafe
transitions are encountered by chase_symlinks().

All current users of CHASE_SAFE that explicitly reacted on -EPERM have been
converted to react on -ENOLINK.
2018-12-10 09:18:27 +01:00
Lennart Poettering 06721f39f6 swap: always add in extras when we load a swap unit
Much like for the mount units we need fields such as the slice
initialized by the time we activate the swap, hence when the kernel
lets us know about a new swap that appeared we need to initialize the
slice in any Swap object we allocated for that right-away, even if we
can't read the real unit file for the swap device.
2018-12-07 17:35:32 +01:00
Lennart Poettering a0a424083f swap: drop return value
We don't actually return any valid 'r' here, let's explicitly return 0
here hence instead.
2018-12-07 17:35:32 +01:00
Lennart Poettering eb04385dc5 swap: don't propagate issues with processing /proc/swaps
This follows similar recent changes in mount.c: errors should be considered
local, and not be propagated.
2018-12-07 17:35:32 +01:00
Lennart Poettering 46f944802c swap: when loading a unit from /proc/swaps, mark its load state as good
This follows similar logic in the mount unit.
2018-12-07 17:35:32 +01:00
Lennart Poettering 15332d738f swap: split out code adding in additional unit props into a function of its own
This adds swap_add_extras() similar to mount_add_extras().

No change in behaviour, just some refactoring.
2018-12-07 17:35:32 +01:00
Lennart Poettering ba6fbb2cee swap: fix misplaced comment 2018-12-07 17:35:32 +01:00
Lennart Poettering 31135818e4 swap: flush out state when activating a unit, not when deactivating it
This is similar to the previous commit which did the same change for
mount units.
2018-12-07 17:35:32 +01:00
Lennart Poettering 7eba1463de mount: flush out cycle state on DEAD→MOUNTED only, not the other way round
For services (and other units) we generally follow the rule that at the
beginning of each cycle, i.e. when the INACTIVE/FAILED state is left for
ACTIVATING/ACTIVE we flush out various state variables. Mount units
handled this differently so far when the unit state change was effected
outside of systemd: in that case these variables would be flushed out
when going back to INACTIVE/FAILED already.

Let's fix that, and flush out this state always during the activating
transition, not during the deactivating transition.
2018-12-07 17:35:32 +01:00
Lennart Poettering ec88d1ea05 mount: replace three closely related mount flags into a proper flags enum
We pass these flags around, and even created a structure for them. Let's
fix things properly, and make them a flags value of its own.
2018-12-07 17:35:32 +01:00
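The general pattern behind the flags-enum commit above, replacing several related booleans with one bit-flag type, might look like the sketch below. The flag names, the `MountState` struct and `mount_state_mark_mounted()` are assumptions made for illustration; the commit message does not list the three flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names: each bit stands for one of the former booleans. */
typedef enum MountProcFlags {
        MOUNT_PROC_IS_MOUNTED   = 1 << 0,
        MOUNT_PROC_JUST_MOUNTED = 1 << 1,
        MOUNT_PROC_JUST_CHANGED = 1 << 2,
} MountProcFlags;

/* Hypothetical holder for the combined state. */
typedef struct MountState {
        MountProcFlags proc_flags;
} MountState;

/* Set two related bits in one step instead of assigning two booleans. */
void mount_state_mark_mounted(MountState *m) {
        assert(m);

        m->proc_flags |= MOUNT_PROC_IS_MOUNTED | MOUNT_PROC_JUST_MOUNTED;
}

/* True only if every bit in `mask` is set. */
bool mount_state_has(const MountState *m, MountProcFlags mask) {
        assert(m);

        return (m->proc_flags & mask) == mask;
}
```

A single flags value can be passed around, compared and reset as a unit, which is harder to get wrong than keeping three separate fields in sync.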
Lennart Poettering b6418dc94e mount: strdup() device paths we collect
We never know what the changes triggered by mount_set_state() do to the
unit. Let's be safe and copy the device path into our set, so that we
are safe against that.
2018-12-07 17:35:32 +01:00
Lennart Poettering f8064c4fda mount: when the kernel reports a mount to be established reset all kinds of load failures
It doesn't matter what kind of precise failure we had earlier with
loading the unit, let's report that it loaded successfully now, after
all the kernel is an OK source for that, like any other.
2018-12-07 17:35:32 +01:00
Lennart Poettering a37422045f mount: regenerate all deps whenever a mount's parameters change
Whenever we notice a change on an existing /proc/self/mountinfo line,
let's update the deps generated from it. For that, let's flush out the
old deps generated this way, and add in the new ones.

This takes benefit of the fact that today (unlike a comment this patch
removes says) we can remove deps in a somewhat reasonable way.
2018-12-07 17:35:32 +01:00
Lennart Poettering 6d7e89b070 mount: when allocating a Mount object based on /proc/self/mountinfo mark it so
Let's set 'from_proc_self_mountinfo' right away, since we know it's from
there. This is important so that when the load queue is dispatched (and
thus mount_load() called) this fact is already known.
2018-12-07 17:35:32 +01:00
Lennart Poettering 26e35b164b mount: let mount_add_extras() take care of remote-fs.target deps
In a previous commit we added logic that mount_add_extras() (or more
precisely mount_add_default_dependencies()) adds in dependencies on
remote-fs.target and local-fs.target, hence we can drop this from
mount_setup_new_unit() and let the usual load queue dispatching take
care of this.
2018-12-07 17:34:29 +01:00
Lennart Poettering e10fe04266 mount: use free_and_strdup() over plain strdup()
Let's initialize two fields with free_and_strdup() rather than directly
with strdup(). The fields should not be initialized so far, but it's
still nicer to be prepared for future code changes and always free
what's stored before replacing it.
2018-12-07 17:34:29 +01:00
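The "always free what's stored before replacing it" idea from the commit above can be sketched with a small helper. `replace_string()` is a simplified, hypothetical function written in the spirit of free_and_strdup(); it is not the systemd implementation and its return convention is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Simplified sketch, not the systemd helper: replace *p with a copy of
 * `s` (or NULL if `s` is NULL), freeing whatever was stored before.
 * Returns 0 on success and -ENOMEM if the copy cannot be allocated, in
 * which case *p is left untouched. */
int replace_string(char **p, const char *s) {
        char *t = NULL;

        assert(p);

        if (s) {
                size_t n = strlen(s) + 1;

                t = malloc(n);
                if (!t)
                        return -ENOMEM;

                memcpy(t, s, n);
        }

        free(*p); /* harmless if the field was still NULL */
        *p = t;
        return 0;
}
```

Compared with a plain `field = strdup(s)`, this cannot leak an earlier value if the field ever becomes initialized before the assignment.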
Lennart Poettering 60b8c203f8 mount: generate error message matching the error location 2018-12-07 17:34:29 +01:00
Lennart Poettering 839ee05838 mount: move allocation of Unit object into mount_setup_new_unit()
This should encapsulate things in a nicer way.
2018-12-07 17:34:29 +01:00
Lennart Poettering bbee24bc96 mount: add a common helper for filling in info from /proc/self/mountinfo 2018-12-07 17:30:14 +01:00
Lennart Poettering ba0d56f55f mount: don't propagate errors from mount_setup_unit() further up
If we can't process a specific line in /proc/self/mountinfo we should
log about it (which we do), but this should not affect other lines, nor
further processing of mount units. Let's keep these failures local.

Fixes: #10874
2018-12-07 17:30:14 +01:00
Lennart Poettering 3f459cd96e mount: simplify de-serialization of control PID a bit 2018-12-07 17:30:14 +01:00
Lennart Poettering 2c09fb818f mount: serialize umount retry counter across reloads/reexec 2018-12-07 17:30:14 +01:00
Lennart Poettering 780ae0221a mount: make sure mount_add_extras() is always invoked when we load a mount unit
We need to make sure that the slice property is initialized whenever
mount_load() is invoked, even if we fail to load things properly off
disk. This is important since we generally don't allow changing the
slice after a unit has been started. But given that we must track the
state of external objects with mount units we must hence initialize the
property no matter what.
2018-12-07 17:30:14 +01:00
Lennart Poettering 1f73647647 mount: document that mount_add_extras() must work with active units, too 2018-12-07 17:30:14 +01:00
Lennart Poettering d54bab90e6 mount: set up local-fs.target/remote-fs.target deps in mount_add_default_dependencies() too
These deps are very similar to the -pre deps, hence establish them at the
same place, in particular as they should only be generated if default
deps are on.

This allows us to later on remove similar code that adds in these deps
whenever /proc/self/mountinfo changes.
2018-12-07 17:30:14 +01:00
Lennart Poettering 32a5f8fe22 mount: remove unnecessary initialization of device_wants_mount and 'if' 2018-12-07 17:30:14 +01:00
Lennart Poettering a26592cf58 mount: use mfree() where appropriate 2018-12-07 17:30:14 +01:00
Lennart Poettering 06e457b1a3 mount: rename needs_quota() → mount_needs_quota()
No change of logic, just some renaming, in order to match more closely
the naming of the other, similar functions.
2018-12-07 17:30:14 +01:00
Lennart Poettering 7121cbcffe mount: merge redundant call mount_needs_network() into mount_is_network() 2018-12-07 17:30:14 +01:00
Zbigniew Jędrzejewski-Szmek 9b674e2581 core/device: fix typo 2018-12-07 15:39:52 +01:00
Zbigniew Jędrzejewski-Szmek 871fa294ff Merge pull request #10935 from poettering/rlimit-nofile-safe
Merged by hand to resolve a trivial conflict in TODO.
2018-12-06 17:19:21 +01:00
Yu Watanabe f0a43eb821
Merge pull request #11063 from yuwata/update-missing-v3
missing: split missing.h into small pieces
2018-12-06 16:54:27 +01:00
Zbigniew Jędrzejewski-Szmek 2d479ff1cc
Merge pull request #10963 from poettering/bus-force-state-change-signal
force PropertiesChanged bus signal on all unit state changes
2018-12-06 16:42:21 +01:00
Yu Watanabe e93672eeac tree-wide: drop missing.h from headers and use relevant missing_*.h 2018-12-06 13:31:16 +01:00
Lennart Poettering ee589a1882
Merge pull request #11041 from yuwata/update-missing-v2
missing: separate missing.h more
2018-12-04 16:19:02 +01:00
Lennart Poettering 1e9e7196cb
Merge pull request #11042 from yuwata/tiny-coding-style-fixes
Tiny coding style fixes
2018-12-04 16:15:09 +01:00
Lennart Poettering ea900d2bfe
Merge pull request #11009 from poettering/root-cgroup-again
tweak root cgroup attribute fiddling for cgroupsv1 again
2018-12-04 12:33:03 +01:00
Yu Watanabe f2a3de0116 tree-wide: add whitespace between type and variable name 2018-12-04 09:29:54 +01:00
Yu Watanabe 3843e8260c missing: rename securebits.h to missing_securebits.h 2018-12-04 07:49:24 +01:00
Lennart Poettering a365325e04
Merge pull request #10567 from cdown/disable_controller
cgroup: Add DisableControllers= directive to disable controller in subtree
2018-12-03 18:58:29 +01:00
Lennart Poettering 76b31bbb24
Merge pull request #10920 from yuwata/hashmap-destructor
hashmap: make hashmap_free() call destructors of key or value
2018-12-03 17:59:44 +01:00
Chris Down c72703e26d cgroup: Add DisableControllers= directive to disable controller in subtree
Some controllers (like the CPU controller) have a performance cost that
is non-trivial on certain workloads. While this can be mitigated and
improved to an extent, there will for some controllers always be some
overheads associated with the benefits gained from the controller.
Inside Facebook, the fix applied has been to disable the CPU controller
forcibly with `cgroup_disable=cpu` on the kernel command line.

This presents a problem: to disable or reenable the controller, a reboot
is required, but this is quite cumbersome and slow to do for many
thousands of machines, especially machines where disabling/enabling a
stateful service on a machine is a matter of several minutes.

Currently systemd provides some configuration knobs for these in the
form of `[Default]CPUAccounting`, `[Default]MemoryAccounting`, and the
like. The limitation of these is that Default*Accounting is overrideable
by individual services, of which any one could decide to reenable a
controller within the hierarchy at any point just by using a controller
feature implicitly (eg. `CPUWeight`), even if the use of that CPU
feature could just be opportunistic. Since many services are provided by
the distribution, or by upstream teams at a particular organisation,
it's not a sustainable solution to simply try to find and remove
offending directives from these units.

This commit presents a more direct solution -- a DisableControllers=
directive that forcibly disallows a controller from being enabled within
a subtree.
2018-12-03 15:40:31 +00:00
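The effect described for DisableControllers= in the commit above, a controller disabled on a unit cannot be enabled anywhere in its subtree, can be modelled as a mask computation. This is a sketch of the idea only: the `Node` type, the `CONTROLLER_*` constants and both functions are hypothetical and do not mirror the cgroup code in systemd.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned ControllerMask;

enum {
        CONTROLLER_CPU    = 1 << 0,
        CONTROLLER_IO     = 1 << 1,
        CONTROLLER_MEMORY = 1 << 2,
};

/* Hypothetical model of a unit in the slice hierarchy. */
typedef struct Node {
        const struct Node *parent;
        ControllerMask requested; /* controllers this unit would like enabled */
        ControllerMask disabled;  /* controllers disabled on this unit */
} Node;

/* A controller counts as disabled if this node or any ancestor disables it. */
ControllerMask node_disabled_mask(const Node *n) {
        ControllerMask m = 0;

        for (; n; n = n->parent)
                m |= n->disabled;

        return m;
}

/* What may actually be enabled: the request minus everything disabled above. */
ControllerMask node_effective_mask(const Node *n) {
        assert(n);

        return n->requested & ~node_disabled_mask(n);
}
```

In this model a service that implicitly requests the CPU controller (for example through `CPUWeight`) simply loses that bit when its parent slice disables the controller, without the service's unit file having to change.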
Chris Down 4f6f62e468 cgroup: Traverse leaves to realised cgroup to release controllers
This adds a depth-first version of unit_realize_cgroup_now which can
only do depth-first disabling of controllers, in preparation for the
DisableControllers= directive.
2018-12-03 14:37:39 +00:00
Chris Down a57669d290 cgroup: Rework unit_realize_cgroup_now to explicitly be breadth-first
systemd currently doesn't really expend much effort in disabling
controllers. unit_realize_cgroup_now *may* be able to disable a
controller in the basic case when using cgroup v2, but generally won't
manage as downstream dependents may still use it.

This code doesn't add any logic to fix that, but it starts the process
of moving to have a breadth-first version of unit_realize_cgroup_now for
enabling, and a depth-first version of unit_realize_cgroup_now for
disabling.
2018-12-03 14:37:39 +00:00
Chris Down 0d2d6fbf15 cgroup: Move attribute application into unit_create_cgroup
We always end up doing these together, so just colocate them and require
manager state for unit_create_cgroup.
2018-12-03 14:37:38 +00:00
Lennart Poettering ad2bf5df89
Merge pull request #10992 from yuwata/follow-up-10948
network: make fib rule accept arbitrary ip protocol
2018-12-03 11:09:04 +01:00
Lennart Poettering 686d13b9f2 util-lib: split out env file parsing code into env-file.c
It's quite complex, let's split this out.

No code changes, just some file rearranging.
2018-12-02 13:22:29 +01:00
Lennart Poettering e4de72876e util-lib: split out all temporary file related calls into tmpfiles-util.c
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.

No code changes, just some rearranging of source files.
2018-12-02 13:22:29 +01:00
Lennart Poettering ee228be10c util-lib: don't include fileio.h from fileio-label.h
There's no reason for doing that, hence simply don't.
2018-12-02 13:22:29 +01:00
Yu Watanabe 7a08d314f2 tree-wide: make hash_ops typesafe 2018-12-02 07:53:27 +01:00
Yu Watanabe d2b42d63c4 core,run: make SocketProtocol= accept protocol name in upper case and protocol number 2018-12-02 06:13:47 +01:00
Yu Watanabe da96ad5ae2 util: rename socket_protocol_{from,to}_name() to ip_protocol_{from,to}_name() 2018-12-02 05:48:27 +01:00
Lennart Poettering 5dd9527883 tree-wide: remove various unused functions
All found with "cppcheck --enable=unusedFunction".
2018-12-02 13:35:34 +09:00
Lennart Poettering 67e2ea1542 cgroup: suffix unit file settings with "=" in log output
Let's follow our recommendations from CODING_STYLE and suffix unit file
settings with "=" everywhere.
2018-12-01 12:57:51 +01:00
Lennart Poettering be2c032781 core: don't try to write CPU quota and memory limit cgroup attrs on root cgroup
In the kernel sources attempts to write to either are refused with
EINVAL. Not sure why these attributes are exported anyway on cgroupsv1,
but this means we really should ignore them altogether.

This simplifies our code as this means cgroupsv1 is more alike cgroupsv2
in this regard.

Fixes: #10969
2018-12-01 12:57:51 +01:00
Lennart Poettering 3c4832ada4 core: enqueue unit earlier when state changes
Previously, we'd enqueue a unit to the dbus queue whenever the state
changed, after we processed the state change fully. This commit moves this to the
beginning of the state change. This has the benefit that when the state
change causes a job to complete the unit is already in the dbus queue,
and thus we get the guarantee that any unit change can be sent out to
clients before the job change.
2018-12-01 12:53:26 +01:00
Lennart Poettering e6d05912cb core: when we install a job, announce this via the bus
Whenever we enqueue a job, we should announce this on the bus, hence add
both the job and the unit to the dbus queues. (Why both? The former
should be obvious, the latter because we send out Job properties).

In most cases adding these to the queue is not necessary, as
other properties tend to change at the same time and result in a change
being sent out. However, let's clean this up and make it explicit.
2018-12-01 12:53:26 +01:00
Lennart Poettering 641e0d7a1b core: clarify that the CollectMode bus property is constant
It's configured from unit files only, and hence is constant.
2018-12-01 12:53:26 +01:00
Lennart Poettering af92c603bb core: send out unit change events when a new invocation ID is acquired
It's free, as this generally coincides with unit_start(), but let's make
this clean and explicit.
2018-12-01 12:53:26 +01:00
Lennart Poettering e18f8852f3 core: invalidate individual Assert/Condition properties when sending out change messages
Let's inform the clients about assert/condition property changes as they
happen, it's basically for free because assert/condition property
changes generally coincide with other unit state changes (after all
these checks are done on unit_start())
2018-12-01 12:53:26 +01:00
Lennart Poettering 13142276c1 core: before returning new job path to clients, force out JobNew signals
When a client requests a new job, let's make sure we force out the JobNew
signals for it, before we return successfully from the method call.
After all we shouldn't return a path that is not announced yet, as
announcement of jobs should be considered part of the job setup.
2018-12-01 12:53:26 +01:00
Lennart Poettering 17407bc28d core: before sending out a job new/change/removal message, send out unit change message for job's unit
We always want the state of the unit to be reflected first to the
client before we claim the job has changed state, after all the job is
the request to change unit state, and thus job changes are kinda the
confirmation that the state changed as requested.
2018-12-01 12:53:26 +01:00
Lennart Poettering 6fcbec6f9b core: whenever we change state of a unit, force out PropertiesChanged bus signal
This allows clients to follow our internal state changes safely.

Previously, quick state changes (for example, when we restart a unit due
to Restart= after it quickly transitioned through DEAD/FAILED states)
would be coalesced into one bus signal event, with this change there's
the guarantee that all state changes after the unit was announced once
are reflected on the bus.

Note we do this kind of guaranteed flushing only for unit state
changes, not for other unit property changes, where clients still have
to expect coalescing. This is because the unit state is a very
important, high-level concept.

Fixes: #10185
2018-12-01 12:53:26 +01:00
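The difference between coalesced property changes and force-flushed state changes can be illustrated with a toy model. This is only a sketch in Python, not systemd's C implementation; the `BusQueue` class and its method names are hypothetical and exist only for this illustration.

```python
class BusQueue:
    """Toy model of a per-unit change queue (hypothetical, for illustration)."""

    def __init__(self):
        self.pending = {}  # unit name -> {property: latest value}
        self.sent = []     # signals emitted so far, in order

    def enqueue(self, unit, prop, value):
        # Later values overwrite earlier ones: this is the coalescing.
        self.pending.setdefault(unit, {})[prop] = value

    def flush(self, unit):
        # Emit one signal carrying everything queued for the unit.
        props = self.pending.pop(unit, None)
        if props:
            self.sent.append((unit, dict(props)))

    def set_property(self, unit, prop, value):
        # Ordinary properties are only queued; clients must expect coalescing.
        self.enqueue(unit, prop, value)

    def set_state(self, unit, state):
        # State changes are queued and flushed right away, so every
        # transition becomes visible to clients.
        self.enqueue(unit, "ActiveState", state)
        self.flush(unit)
```

With this model, two quick `set_state()` calls produce two signals, while two `set_property()` calls followed by a single `flush()` produce one signal that carries only the last value.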
Lennart Poettering 37d0b962ef core: when we manage to resolve a user, only enqueue dbus event, don't send out message right-away
Let's only enqueue the dbus signal generation, let's not do it
right-away, after all we want coalescing to take effect here.
2018-12-01 12:53:26 +01:00
Lennart Poettering ece0fe12ad tree-wide: (void)ify some setsid() and related calls 2018-12-01 12:50:45 +01:00
Lennart Poettering 595225af7a tree-wide: invoke rlimit_nofile_safe() before various exec{v,ve,l}() invocations
Whenever we invoke external, foreign code from code that has
RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024
first. This is a safety precaution for compatibility with programs using
select() which cannot operate with fds >= 1024.

This commit adds the call to rlimit_nofile_safe() to all invocations of
exec{v,ve,l}() and friends that either are in code that we know runs
with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for
starters) or that is part of shared code that might end up there.

The calls are placed as early as we can in processes invoking a flavour
of execve(), but after the last time we do fd manipulations, so that we
can still take benefit of the high fd limits for that.
2018-12-01 12:50:45 +01:00
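The idea of lowering the soft limit right before exec can be sketched as follows. This is a Python illustration using the standard `resource` module, not the C helper rlimit_nofile_safe() itself; the function names `safe_nofile_limits` and `exec_with_safe_nofile` are made up for this sketch.

```python
import os
import resource

SAFE_NOFILE = 1024  # value named in the commit message; select() breaks beyond it


def safe_nofile_limits(soft, hard, cap=SAFE_NOFILE):
    """Return (soft, hard) with the soft limit lowered to cap if it is higher.

    The hard limit is left alone so the child may raise its soft limit again.
    """
    if soft == resource.RLIM_INFINITY or soft > cap:
        soft = cap
    return soft, hard


def exec_with_safe_nofile(argv):
    """Lower the soft RLIMIT_NOFILE, then replace this process with argv."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, safe_nofile_limits(soft, hard))
    os.execvp(argv[0], argv)
```

Keeping the computation in a pure function makes it easy to check without touching the limits of the running process; the exec wrapper would be called as late as possible, after the last fd manipulation.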
Zbigniew Jędrzejewski-Szmek b2ac2b01c8
Merge pull request #10996 from poettering/oci-prep
Preparation for the nspawn-OCI work
2018-11-30 10:09:00 +01:00
Zbigniew Jędrzejewski-Szmek 049af8ad0c Split out part of mount-util.c into mountpoint-util.c
The idea is that anything which is related to actually manipulating mounts is
in mount-util.c, but functions for mountpoint introspection are moved to the
new file. Anything which requires libmount must be in mount-util.c.

This was supposed to be a preparation for further changes, with no functional
difference, but it results in a significant change in linkage:

$ ldd build/libnss_*.so.2
(before)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff77bf5000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffc19cc0000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe8ef8e000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffda6d17000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f610b995000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000)

(after)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff0b5e2000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffdc30a7000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe95747000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fa56a808000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffe67b51000)
	librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000)

I don't quite understand what is going on here, but let's not be too picky.
2018-11-29 21:03:44 +01:00
Lennart Poettering 30874dda3a dev-setup: generalize logic we use to create "inaccessible" device nodes
Let's generalize this, so that we can use this in nspawn later on, which
is pretty useful as we need to be able to mask files from the inner
child of nspawn too, where the host's /run/systemd/inaccessible
directory is not visible anymore. Moreover, if nspawn can create these
nodes on its own before the payload this means the payload can run with
fewer privileges.
2018-11-29 20:21:40 +01:00
Lennart Poettering d5aecba6e0 cgroup: use device_path_parse_major_minor() also for block device paths
Not only when we populate the "devices" cgroup controller we need
major/minor numbers, but for the io/blkio one it's the same, hence let's
use the same logic for both.
2018-11-29 20:21:39 +01:00
Lennart Poettering 846b3bd61e stat-util: add new APIs device_path_make_{major_minor|canonical}() and device_path_parse_major_minor()
device_path_make_{major_minor|canonical}() generate device node paths
given a mode_t and a dev_t. We have similar code all over the place,
let's unify this in one place. The former will generate a "/dev/char/"
or "/dev/block/" path, and never go to disk. The latter then goes to disk
and resolves that path to the actual path of the device node.

device_path_parse_major_minor() reverses device_path_make_major_minor(),
also without going to disk.

We have similar code doing something like this at various places, let's
unify this in a single set of functions. This also allows us to teach
them special tricks, for example handling of the
/run/systemd/inaccessible/{blk|chr} device nodes, which we use for
masking device nodes, and which do not exist in /dev/char/* and
/dev/block/*
2018-11-29 20:21:39 +01:00
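The round trip between a (mode, device number) pair and a "/dev/char/MAJOR:MINOR" or "/dev/block/MAJOR:MINOR" path can be sketched without any disk access. The Python functions below are hypothetical stand-ins for illustration, not the C API added by the commit, and they leave out the /run/systemd/inaccessible/{blk|chr} special case.

```python
import os
import re
import stat

_DEV_PATH = re.compile(r"/dev/(char|block)/(\d+):(\d+)")


def make_major_minor_path(mode, dev):
    """Build /dev/char/M:m or /dev/block/M:m from a file mode and device number."""
    if stat.S_ISCHR(mode):
        kind = "char"
    elif stat.S_ISBLK(mode):
        kind = "block"
    else:
        raise ValueError("mode does not describe a device node")
    return f"/dev/{kind}/{os.major(dev)}:{os.minor(dev)}"


def parse_major_minor_path(path):
    """Reverse make_major_minor_path() purely by parsing, never calling stat()."""
    m = _DEV_PATH.fullmatch(path)
    if m is None:
        raise ValueError(f"not a /dev/char or /dev/block path: {path!r}")
    kind, major, minor = m.groups()
    mode = stat.S_IFCHR if kind == "char" else stat.S_IFBLK
    return mode, os.makedev(int(major), int(minor))
```

Because parsing never touches the file system, such a path can name a device that does not exist on the current machine.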
Lennart Poettering 8e8b5d2e6d cgroups: beef up DeviceAllow= syntax a bit
Previously we'd allow pattern expressions such as "char-input" to match
all input devices. Internally, this would look up the right major to
test in /proc/devices. With this commit the syntax is slightly extended:

- "char-*" can be used to match any kind of character device, and
  similarly "block-*" for block devices. This expression would work
  previously already, but instead of actually installing a wildcard
  match it would install many individual matches for everything listed
  in /proc/devices.

- "char-<MAJOR>" with "<MAJOR>" being a numerical parameter works now
  too. This allows clients to install whitelist items by specifying the
  major directly.

The main reason to add these is to provide limited compat support for
clients that for some reason contain whitelists with major/minor numbers
(such as OCI containers).
2018-11-29 20:21:39 +01:00
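The three pattern forms described above ("char-*", "char-<MAJOR>", and a named class such as "char-input") could be told apart roughly like this. It is a minimal Python sketch; the function name and the returned tuples are assumptions made for illustration and do not mirror systemd's internal data structures.

```python
def classify_device_pattern(pattern):
    """Split a DeviceAllow=-style class pattern into (kind, match).

    match is ("any",) for a wildcard, ("major", N) for a numeric major,
    or ("name", NAME) for a driver name to look up in /proc/devices.
    """
    kind, sep, rest = pattern.partition("-")
    if kind not in ("char", "block") or not sep or not rest:
        raise ValueError(f"unsupported device pattern: {pattern!r}")
    if rest == "*":
        return kind, ("any",)
    if rest.isascii() and rest.isdigit():
        return kind, ("major", int(rest))
    return kind, ("name", rest)
```

Only the last form needs a lookup in /proc/devices; the wildcard and numeric forms can be turned into a rule directly.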
Lennart Poettering 74c48bf5a8 core: add special handling for devices cgroup allow lists for /dev/block/* and /dev/char/* device nodes
This adds some code to handle /dev/block/* and /dev/char/* device node
paths specially: instead of actually stat()ing them we'll just parse the
major/minor numbers from the name. This is a useful 'hack' to allow clients
to install whitelists for devices that don't actually have to exist.

Also, let's similarly handle /run/systemd/inaccessible/{blk|chr}. This
allows us to simplify our built-in default whitelist to not require a
"ignore_enoent" mode for these nodes.

In general we should be careful with hardcoding major/minor numbers, but
in this case this should be safe.
2018-11-29 20:03:56 +01:00
Zbigniew Jędrzejewski-Szmek 8b4e51a60e
Merge pull request #10797 from poettering/run-generator
add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch
2018-11-28 22:40:55 +01:00
Yu Watanabe 50ae773f85
Merge pull request #10970 from yuwata/from-name-return-negative-errno
util: make *_from_name() return negative errno on error
2018-11-29 03:18:03 +09:00
Yu Watanabe acf4d15893 util: make *_from_name() return negative errno on error 2018-11-28 20:20:50 +09:00
Lennart Poettering b4525804a1 core: USB function properties do not change dynamically, don't claim so
This reduces our PropertiesChanged signals a bit in size as we don't
keep blasting out properties that cannot change anyway all the time.
2018-11-28 10:29:51 +01:00
Lennart Poettering 4917894417
Merge pull request #10944 from poettering/redirect-file-fix
StandardOutput=file: fixes
2018-11-27 13:18:26 +01:00
Zbigniew Jędrzejewski-Szmek 6fa158f55c
Merge pull request #10902 from poettering/highlight-status
Highlight status
2018-11-27 12:53:43 +01:00
Lennart Poettering 41fc585a7a core: be more careful when inheriting stdout fds to stderr
We need to compare the fd name/file name if we inherit an fd from stdout
to stderr. Let's do that.

Fixes: #10875
2018-11-27 10:06:51 +01:00
Lennart Poettering 1704fba92f dbus-execute: generate the correct transient unit setting 2018-11-27 10:06:50 +01:00
Lennart Poettering dbe6c4b657 dbus-execute: fix indentation 2018-11-27 10:06:50 +01:00
Lennart Poettering 922ce049d1 core: drop references to 'StandardOutputFileToCreate'
This property never existed, let's drop any reference to it.
2018-11-27 10:06:50 +01:00
Lennart Poettering 7af67e9a8b core: allow to set exit status when using SuccessAction=/FailureAction=exit in units
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate when
SuccessAction=exit or FailureAction=exit is used.

When not specified let's also propagate the exit status of the main
process we fork off for the unit.
2018-11-27 09:44:40 +01:00
Lennart Poettering 78f93209fc core: when Delegate=yes is set for a unit, run ExecStartPre= and friends in a subcgroup of the unit
Otherwise we might conflict with the "no-processes-in-inner-cgroup" rule
of cgroupsv2. Consider nspawn starting up and initializing its cgroup
hierarchy with "supervisor/" and "payload/" as subcgroup, with itself
moved into the former and the payload into the latter. Now, if an
ExecStartPre= is run right after it cannot be placed in the main cgroup,
because that is now an inner cgroup with populated children.

Hence, let's run these helpers in another sub-cgroup .control/ below it.

This is somewhat ugly since it weakens the clear separation of
ownership, but given that this is an explicit contract and a double
opt-in, it should be acceptable.

Fixes: #10482
2018-11-26 18:43:23 +01:00
Lennart Poettering 5b262f74e4 unit: tweak status output a bit
Let's highlight the unit description string in the status updates, to
separate them a bit more from the English sentence they are part of, and thus
make the different casing less surprising.
2018-11-26 18:24:12 +01:00
Lennart Poettering ccfc08d4bc show-status: use free_and_replace() where we can 2018-11-26 18:24:12 +01:00
Lennart Poettering a885727a64 show-status: fold two bool flags function arguments into a flags parameter 2018-11-26 18:24:12 +01:00
Yu Watanabe 938dbb292a
Merge pull request #10901 from poettering/startswith-list
add new STARTSWITH_SET() macro
2018-11-26 22:40:51 +09:00
Lennart Poettering 9630d4dd68
Merge pull request #10894 from poettering/root-cgroup-fix
A multitude of cgroup fixes
2018-11-26 14:13:01 +01:00
Lennart Poettering da9fc98ded tree-wide: port more code over to PATH_STARTSWITH_SET() 2018-11-26 14:08:46 +01:00
Lennart Poettering 49fe5c0996 tree-wide: port various places over to STARTSWITH_SET() 2018-11-26 14:08:46 +01:00
Lennart Poettering b8b6f32104 cgroup: when we unload a unit, also update all its parent's members mask
This way we can correctly ensure that when a unit that requires some
controller goes away, we propagate the removal of it all the way up, so
that the controller is turned off in all the parents too.
2018-11-23 13:41:37 +01:00
Lennart Poettering 5af8805872 cgroup: drastically simplify caching of cgroups members mask
Previously we tried to be smart: when a new unit appeared and it only
added controllers to the cgroup mask we'd update the cached members mask
in all parents by ORing in the controller flags in their cached values.
Unfortunately this was quite broken, as we missed some conditions when
this cache had to be reset (for example, when a unit got unloaded),
moreover the optimization doesn't work when a controller is removed
anyway (as in that case there's no other way for the parent than to
iterate through all children to check if any other, remaining child unit
still needs it).
Hence, let's simplify the logic substantially: instead of updating the
cache on the right events (which we didn't get right), let's simply
invalidate the cache, and generate it lazily when we encounter it later.
This should actually result in better behaviour as we don't have to
calculate the new members mask for a whole subtree whenever we have the
suspicion something changed, but can delay it to the point where we
actually need the members mask.

This allows us to simplify things quite a bit, which is good, since
validating this cache for correctness is hard enough.

Fixes: #9512
2018-11-23 13:41:37 +01:00
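The "invalidate now, recompute lazily" approach can be shown with a small tree model. This is a Python sketch under simplifying assumptions: the `Unit` class, its fields, and the two controller bits are hypothetical and far simpler than systemd's real unit and mask handling.

```python
CPU = 1 << 0
MEMORY = 1 << 1


class Unit:
    """Toy unit with a lazily computed, cached members mask."""

    def __init__(self, name, own_mask=0, parent=None):
        self.name = name
        self.own_mask = own_mask
        self.parent = parent
        self.children = []
        self._members_mask = None  # None means "cache is invalid"
        if parent is not None:
            parent.children.append(self)
            parent.invalidate_members_mask()

    def invalidate_members_mask(self):
        # Drop the cache on this unit and on every ancestor; no recomputation here.
        unit = self
        while unit is not None:
            unit._members_mask = None
            unit = unit.parent

    def members_mask(self):
        # Recompute only when somebody actually asks for the value.
        if self._members_mask is None:
            mask = 0
            for child in self.children:
                mask |= child.own_mask | child.members_mask()
            self._members_mask = mask
        return self._members_mask

    def unload(self):
        # Removing a unit can only be handled by rescanning the remaining siblings.
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent.invalidate_members_mask()
```

Loading or unloading a unit costs a walk up to the root that clears cached values; the OR over the children happens at most once per unit afterwards, and only if the mask is read.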
Lennart Poettering 8a0d538815 cgroup: extend comment on what unit_release_cgroup() is for 2018-11-23 13:41:37 +01:00
Lennart Poettering 1fd3a10c38 cgroup: extend reasons when we realize the enable mask
After creating a cgroup we need to initialize its
"cgroup.subtree_control" file with the controllers its children want to
use. Currently we do so whenever the mkdir() on the cgroup succeeded,
i.e. when we know the cgroup is "fresh". Let's update the condition
slightly so that we also do so when internally we assume a cgroup doesn't
exist yet, even if it already does (maybe left-over from a previous
run).

This shouldn't change anything IRL but makes things a bit more robust.
2018-11-23 13:41:37 +01:00
Lennart Poettering d5095dcd30 cgroup: tighten call that detects whether we need to realize a unit's cgroup a bit, and comment why 2018-11-23 13:41:37 +01:00
Lennart Poettering 5a62e5e2ac cgroup: document what the various masks variables are used for 2018-11-23 13:41:37 +01:00
Lennart Poettering 27c4ed790a cgroup: simplify check whether it makes sense to realize a cgroup 2018-11-23 13:41:37 +01:00
Lennart Poettering e00068e71f cgroup: in unit_invalidate_cgroup() actually modify invalidation mask
Previously this would manipulate the realization mask for invalidating
the realization. This is a bit ugly though as the realization mask's
primary purpose is to reflect in which hierarchies a cgroup currently
exists, and it's probably a good idea to keep that in sync with
realities.

We nowadays have an explicit field for invalidating cgroup
controller information, the "cgroup_invalidated_mask", let's use this
one instead.

The effect is pretty much the same, as the main consumer of these masks
(unit_has_mask_realize()) checks both anyway.
2018-11-23 13:41:37 +01:00
Lennart Poettering 27adcc9737 cgroup: be more careful with which controllers we can enable/disable on a cgroup
This changes cg_enable_everywhere() to return which controllers are
enabled for the specified cgroup. This information is then used to
correctly track the enablement mask currently in effect for a unit.
Moreover, when we try to turn off a controller, and this works, then
this indicates that the parent unit might successfully turn it off
now, too, as our unit might have kept it busy.

So far, when realizing cgroups, i.e. when syncing up the kernel
representation of relevant cgroups with our own idea we would strictly
work from the root to the leaves. This is generally a good approach, as
when controllers are enabled this has to happen in root-to-leaves order.
However, when controllers are disabled this has to happen in the
opposite order: in leaves-to-root order (this is because controllers can
only be enabled in a child if it is already enabled in the parent, and
if it shall be disabled in the parent then it has to be disabled in the
child first, otherwise it is considered busy when it is attempted to
remove it in the parent).

To make things complicated, when invalidating a unit's cgroup membership
systemd can actually turn off some controllers previously turned on at
the very same time as it turns on other controllers previously turned
off. In such a case we have to work up leaves-to-root *and*
root-to-leaves right after each other. With this patch this is
implemented: we still generally operate root-to-leaves, but as soon as
we notice we successfully turned off a controller previously turned on
for a cgroup we'll re-enqueue the cgroup realization for all parents of
a unit, thus implementing leaves-to-root where necessary.
2018-11-23 13:41:37 +01:00
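The two traversal orders can be made concrete with a short sketch. The Python helpers below are hypothetical and work on plain cgroup path strings; they only show the ordering constraint, not how systemd queues cgroup realization.

```python
def depth(path):
    """Number of path components below the root cgroup "/"."""
    return 0 if path == "/" else path.strip("/").count("/") + 1


def enable_order(paths):
    """Root-to-leaves: a controller must be enabled in the parent first."""
    return sorted(paths, key=lambda p: (depth(p), p))


def disable_order(paths):
    """Leaves-to-root: a controller stays busy in the parent until children drop it."""
    return sorted(paths, key=lambda p: (-depth(p), p))
```

When one invalidation both enables and disables controllers, neither order alone is enough, which is why the commit re-enqueues the parents after a successful disable.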
Zbigniew Jędrzejewski-Szmek e5e0a79623 pid1,sd-device: use PATH_STARTSWITH_SET more 2018-11-23 13:37:47 +01:00
Lennart Poettering 26a17ca280 cgroup: add explanatory comment 2018-11-23 12:24:37 +01:00
Lennart Poettering 442ce7759c cgroup: units that aren't loaded properly should not result in cgroup controllers being pulled in
This shouldn't make much difference in real life, but is a bit cleaner.
2018-11-23 12:24:37 +01:00
Lennart Poettering 0adf88b68c cgroup: dump delegation mask too 2018-11-23 12:24:37 +01:00
Lennart Poettering 1649244588 cgroup: make unit_get_needs_bpf_firewall() static too 2018-11-23 12:24:37 +01:00
Lennart Poettering 53aea74a60 cgroup: make some functions static 2018-11-23 12:24:37 +01:00
Lennart Poettering 52fecf20b9 cgroup: fine tune when to apply cgroup attributes to the root cgroup
Let's tweak when precisely to apply cgroup attributes on the root
cgroup.

With this we now follow the following rules:

1. On cgroupsv2 we never apply any regular cgroup attributes to the host
   root, since the attributes generally do not exist there.

2. On cgroupsv1 we do not apply any "weight" or "shares" style
   attributes to the host root cgroup, since they don't make much sense
   on the top level where there's only one group, hence no need to
   compare weights against each other. The other attributes are applied
   to the host root cgroup however.

3. In any case we don't apply attributes to the root of container
   environments (and --user roots), under the assumption that this is
   managed by the manager further up. (Note that on cgroupsv2 this is
   even enforced by the kernel)

4. BPF pseudo-attributes are applied in all cases (since we can have as
   many of them as we want)
2018-11-23 12:24:37 +01:00
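The four rules above can be read as a single decision function. The following Python sketch is an interpretation of the commit message only; the function name, the three attribute kinds, and the keyword arguments are assumptions introduced for illustration.

```python
def should_apply_attribute(kind, *, unified, is_root, is_host_root):
    """Decide whether a cgroup attribute is written for a unit's cgroup.

    kind is "bpf", "weight" (weight/shares style) or "regular".
    unified is True on cgroupsv2; is_root marks the root cgroup we manage;
    is_host_root is True only when that root is the host's root cgroup.
    """
    if kind == "bpf":
        return True  # rule 4: BPF pseudo-attributes always apply
    if not is_root:
        return True  # the rules only restrict root cgroups
    if not is_host_root:
        return False  # rule 3: container and --user roots belong to the outer manager
    if unified:
        return False  # rule 1: attributes do not exist on the cgroupsv2 host root
    return kind != "weight"  # rule 2: no weights/shares on the cgroupsv1 host root
```

For example, a memory limit on the cgroupsv1 host root is applied, while a CPU weight there is skipped.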
Lennart Poettering 589a5f7a38 cgroup: append \n to static strings we write to cgroup attributes
This is a bit cleaner since when we format numeric limits we append
it. And this way write_string_file() doesn't have to append it.
2018-11-23 12:24:37 +01:00
Lennart Poettering 28cfdc5aeb cgroup: tighten manager_owns_host_root_cgroup() a bit
This tightening is not strictly necessary (as the m->cgroup_root check
further down does the same), but let's make this explicit.
2018-11-23 12:24:37 +01:00
Lennart Poettering 611c4f8afb cgroup: rename {manager_owns|unit_has}_root_cgroup() → .._host_root_cgroup()
Let's emphasize that this function checks for the host root cgroup, i.e.
returns false for the root cgroup when we run in a container where
CLONE_NEWCGROUP is used. There has been some confusion around this
already, for example cgroup_context_apply() uses the function
incorrectly (which we'll fix in a later commit).

Just some refactoring, no change in behaviour.
2018-11-23 12:24:37 +01:00
Lennart Poettering 293d32df39 cgroup: add a common routine for writing to attributes, and logging about it
We can use this at quite a few places, and this allows us to shorten our
code quite a bit.
2018-11-23 12:24:37 +01:00
Lennart Poettering 39b9fefb2e cgroup: add a new macro for determining log level for cgroup attr write failures
For now, let's use it only at one place, but a follow-up commit will
make more use of it.
2018-11-23 12:24:37 +01:00
Lennart Poettering 2c74e12bb3 cgroup: ignore EPERM for a couple of more attribute writes 2018-11-23 12:24:37 +01:00
Lennart Poettering 8c83840772 cgroup: add comment explaining why we ignore EINVAL at two places
These are just copies from further down.
2018-11-23 12:24:37 +01:00
Lennart Poettering 73fe5314bf cgroup: suffix settings with "=" in log messages where appropriate 2018-11-23 12:24:37 +01:00
Lennart Poettering a0c339ed4b cgroup: only install cgroup release agent when we own the root cgroup
If we run in a container we shouldn't patch around this, and most likely
we can't anyway, and there's not much point in complaining about this.
Hence let's strictly say: the agent is private property of the host's
system instance, nothing else.
2018-11-23 12:24:37 +01:00
Lennart Poettering de8a711a58 cgroup: use structured initialization 2018-11-23 12:24:37 +01:00
Lennart Poettering 00e7b3c8e5 unit: minor optimization, use stack over heap, when we can 2018-11-23 00:46:56 +01:00
Lennart Poettering 27da878e7e unit: drop an unused fields from Unit struct 2018-11-23 00:37:00 +01:00
Lennart Poettering 66fa4bdd70 core: add two minor comments (#10890) 2018-11-23 06:25:27 +09:00
Zbigniew Jędrzejewski-Szmek baaa35ad70 coccinelle: make use of SYNTHETIC_ERRNO
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelihood of bugs in such spots.

I also removed part of the patch that coccinelle generated for udev, where we
return -1 for failure. This should be fixed independently.
2018-11-22 10:54:38 +01:00
Michal Koutný aa1f95d264 core: Detect initial timer state from serialized data
We keep a mark whether a single-shot timer was triggered in the caller's
variable initial. When such a timer elapses while we are
serializing/deserializing the inner state, we consider the timer
incorrectly as elapsed and don't trigger it later.

This patch exploits last_trigger timestamp that we already serialize,
hence we can eliminate the argument initial completely.

A reproducer for OnBootSec= timers:
        cat >repro.c <<EOD
        /*
         * Compile:	gcc repro.c -o repro
         * Run:		./repro
         */
        #include <errno.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/stat.h>
        #include <sys/types.h>
        #include <time.h>
        #include <unistd.h>

        int main(int argc, char *argv[]) {
        	char command[1024];
        	int pause;

        	struct timespec now;

        	while (1) {
        		usleep(rand() % 200000); // prevent periodic repeats
        		clock_gettime(CLOCK_MONOTONIC, &now);
        		printf("%ld\n", (long) now.tv_sec);

        		system("rm -f $PWD/mark");
        		snprintf(command, sizeof(command), "systemd-run --user --on-boot=%ld --timer-property=AccuracySec=100ms "
        					"touch $PWD/mark", (long) now.tv_sec + 1);
        		system(command);
        		system("systemctl --user list-timers");
        		pause = (1000000000 - now.tv_nsec)/1000 - 70000; // fiddle to hit the middle of reloading
        		usleep(pause > 0 ? pause : 0);
        		system("systemctl --user daemon-reload");
        		sync();
        		sleep(2);
        		if (open("./mark", 0) < 0)
        			if (errno == ENOENT) {
        				printf("mark file does not exist\n");
        				break;
        			}
        	}

        	return 0;
        }
        EOD
2018-11-21 11:28:33 +01:00
Lennart Poettering 818623aca5
Merge pull request #10860 from keszybz/more-cleanup-2
Do more stuff from main macros
2018-11-21 11:07:31 +01:00
Lennart Poettering e3b8d0637d core: run env generators with non-zero umask
For PID 1 we adjust the umask to 0, but generators should not run that
way, given that they might be implemented as shell scripts and such.
Let's hence explicitly adjust the umask for them.

We already do this for unit generators. Let's do this for env
generators, too.
2018-11-20 23:35:04 +01:00
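Running a helper with a sane umask while the parent keeps umask 0 can be sketched like this. The Python snippet is an illustration only: the commit message does not say which umask value is used, so the default of 0022 here is an assumption, and `run_generator` is a hypothetical name.

```python
import os
import subprocess
import sys


def run_generator(argv, umask=0o022):
    """Run argv as a child process whose umask differs from the parent's.

    The umask value is an assumption for this sketch; it is set in the
    child only, right before exec, so the parent's umask stays untouched.
    """
    def set_umask():
        os.umask(umask)

    return subprocess.run(
        argv, preexec_fn=set_umask, capture_output=True, text=True, check=True
    )
```

A shell script started this way creates files that are not world-writable by default, even if the calling process itself runs with umask 0.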
Zbigniew Jędrzejewski-Szmek 294bf0c34a Split out pretty-print.c and move pager.c and main-func.h to shared/
This is high-level functionality, and fits better in shared/ (which is for
our executables), than in basic/ (which is also for libraries).
2018-11-20 18:40:02 +01:00
Lennart Poettering bb25977244 main: don't freeze PID 1 in containers, exit with non-zero instead
After all we have a nice way to propagate total failures, hence let's
use it.
2018-11-20 17:04:07 +01:00
Lennart Poettering bb85a58208 main: use EXIT_EXCEPTION instead of EXIT_FAILURE at two more exceptional places 2018-11-20 17:04:07 +01:00
Lennart Poettering 79a224c460 main: when reloading PID 1 let's reset the default environment
Otherwise we keep collecting stuff from env generators, and we really
shouldn't.

This was working properly on reexec but not on reload, as for reexec we
would always start fresh, but for reload would reuse the Manager object
and hence its default environment set.

Fixes: #10671
2018-11-19 13:01:19 +01:00
Lennart Poettering 2fbbbf9a5f manager: log on two OOM occasions 2018-11-19 12:22:56 +01:00
Lennart Poettering b08020893a
Merge pull request #10809 from keszybz/unit-log-result
Add helper function for logging unit results
2018-11-19 11:07:07 +01:00
Chris Down a88c5b8ac4 cgroup v2: DefaultCPUAccounting=yes if CPU controller isn't required
We now don't enable the CPU controller just for CPU accounting if we are
on 4.15+ and using pure unified hierarchy, as this is provided
externally to the CPU controller. This makes CPUAccounting=yes
essentially free, so enabling it by default when it's cheap seems like a
good idea.
2018-11-18 12:21:41 +00:00
Chris Down f98c25850f cgroup v2: Don't require CPU controller for CPU accounting in 4.15+
systemd only uses functions that are as of Linux 4.15+ provided
externally to the CPU controller (currently usage_usec), so if we have a
new enough kernel, we don't need to set CGROUP_MASK_CPU for
CPUAccounting=true as the CPU controller does not need to necessarily be
enabled in this case.

Part of this patch is modelled on an earlier patch by Ryutaroh Matsumoto
(see PR #9665).
2018-11-18 12:21:41 +00:00
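The condition described here (pure unified hierarchy and kernel 4.15 or newer) could be checked along these lines. This Python sketch is not how systemd detects it in C; both function names are hypothetical and the kernel release string would come from something like `os.uname().release`.

```python
import re


def kernel_at_least(release, major, minor):
    """Compare the leading "MAJOR.MINOR" of a kernel release string."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


def cpu_accounting_needs_controller(release, unified):
    """True if CPUAccounting= still requires enabling the CPU controller."""
    return not (unified and kernel_at_least(release, 4, 15))
```

On a 4.19 kernel with the unified hierarchy the function returns False, which is the case where turning CPU accounting on by default is cheap.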
Zbigniew Jędrzejewski-Szmek 1c2c7c6cb3 pkgconfig: define variables relative to ${prefix}/${rootprefix}/${sysconfdir}
Fixes #4549.

People want to be able to redefine the prefixes relative to which the other
variables are defined. Something like
  pkgconf --define-variable=prefix=/home/user/installpath --variable=systemdsystemunitdir systemd

I'm not convinced that this is entirely useful, because the installed systemd will
not look at those paths, but maybe it's OK as an alternative type of $DESTDIR.
This has been requested a few times over the years, so let's just provide this.

I thought this would be more complicated, since we allow all kinds of directories
to be overridden in the compilation configuration. But it turns out that all the
directories defined in systemd.pc are relative to three prefixes:
$prefix, $rootprefix, and $sysconfdir. So this patch adds $rootprefix and $sysconfdir
to the .pc file and then changes the subsequent definitions in the .pc file to use
them. In the end we define each path twice using the same rules: once in meson.build
and once in the .pc file.
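
How defining paths relative to three prefixes makes them overridable can be sketched with a tiny variable expander. This Python snippet only imitates the ${name} substitution that pkg-config style tools perform; the `expand` function is hypothetical and the entries in `PC` are an assumed excerpt, not the literal contents of systemd.pc.

```python
import re

# Assumed excerpt of variable definitions, written relative to the prefixes.
PC = {
    "prefix": "/usr",
    "rootprefix": "/usr",
    "sysconfdir": "/etc",
    "systemdutildir": "${rootprefix}/lib/systemd",
    "systemdsystemunitdir": "${rootprefix}/lib/systemd/system",
    "systemdsystemconfdir": "${sysconfdir}/systemd/system",
}


def expand(variables, name, overrides=None):
    """Resolve ${var} references in variables[name], honouring overrides."""
    merged = dict(variables)
    merged.update(overrides or {})

    def resolve(var, seen=()):
        if var in seen:
            raise ValueError(f"cyclic definition of {var}")
        return re.sub(
            r"\$\{(\w+)\}",
            lambda m: resolve(m.group(1), seen + (var,)),
            merged[var],
        )

    return resolve(name)
```

Overriding `rootprefix` then changes every path that was defined in terms of it, which is the effect the second listing in this message demonstrates.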

Without overrides:
$ for i in $(pkgconf --with-path=build/src/core systemd --print-variables); do
     echo -n "$i = "; pkgconf --with-path=$PWD/build/src/core --variable=$i systemd
  done
containeruidbasemax = 1878982656
containeruidbasemin = 524288
dynamicuidmax = 65519
dynamicuidmin = 61184
systemgidmax = 999
systemuidmax = 999
catalogdir = /usr/lib/systemd/catalog
modulesloaddir = /usr/lib/modules-load.d
binfmtdir = /usr/lib/binfmt.d
sysctldir = /usr/lib/sysctl.d
sysusersdir = /usr/lib/sysusers.d
tmpfilesdir = /usr/lib/tmpfiles.d
systemdshutdowndir = /usr/lib/systemd/system-shutdown
systemdsleepdir = /usr/lib/systemd/system-sleep
systemdusergeneratordir = /usr/lib/systemd/user-generators
systemdsystemgeneratordir = /usr/lib/systemd/system-generators
systemduserunitpath = /etc/systemd/user:/etc/systemd/user:/run/systemd/user:/usr/local/lib/systemd/user:/usr/local/share/systemd/user:/usr/lib/systemd/user:/usr/lib/systemd/user:/usr/share/systemd/user
systemdsystemunitpath = /etc/systemd/system:/etc/systemd/system:/run/systemd/system:/usr/local/lib/systemd/system:/usr/lib/systemd/system:/usr/lib/systemd/system:/lib/systemd/system
systemduserconfdir = /etc/systemd/user
systemdsystemconfdir = /etc/systemd/system
systemduserpresetdir = /usr/lib/systemd/user-preset
systemduserunitdir = /usr/lib/systemd/user
systemdsystempresetdir = /usr/lib/systemd/system-preset
systemdsystemunitdir = /usr/lib/systemd/system
systemdutildir = /usr/lib/systemd
sysconfdir = /etc
rootprefix = /usr
prefix = /usr
pcfiledir = /usr/share/pkgconfig

With overrides:
$ for i in $(pkgconf --with-path=build/src/core systemd --print-variables); do
     echo -n "$i = "; pkgconf --with-path=$PWD/build/src/core \
     --define-variable=prefix=/PREFIX \
     --define-variable=rootprefix=/ROOTPREFIX \
     --define-variable=sysconfdir=/SYSCONF --variable=$i systemd
  done
containeruidbasemax = 1878982656
containeruidbasemin = 524288
dynamicuidmax = 65519
dynamicuidmin = 61184
systemgidmax = 999
systemuidmax = 999
catalogdir = /PREFIX/lib/systemd/catalog
modulesloaddir = /PREFIX/lib/modules-load.d
binfmtdir = /PREFIX/lib/binfmt.d
sysctldir = /PREFIX/lib/sysctl.d
sysusersdir = /PREFIX/lib/sysusers.d
tmpfilesdir = /PREFIX/lib/tmpfiles.d
systemdshutdowndir = /ROOTPREFIX/lib/systemd/system-shutdown
systemdsleepdir = /ROOTPREFIX/lib/systemd/system-sleep
systemdusergeneratordir = /PREFIX/lib/systemd/user-generators
systemdsystemgeneratordir = /ROOTPREFIX/lib/systemd/system-generators
systemduserunitpath = /SYSCONF/systemd/user:/etc/systemd/user:/run/systemd/user:/usr/local/lib/systemd/user:/usr/local/share/systemd/user:/PREFIX/lib/systemd/user:/usr/lib/systemd/user:/usr/share/systemd/user
systemdsystemunitpath = /SYSCONF/systemd/system:/etc/systemd/system:/run/systemd/system:/usr/local/lib/systemd/system:/ROOTPREFIX/lib/systemd/system:/usr/lib/systemd/system:/lib/systemd/system
systemduserconfdir = /SYSCONF/systemd/user
systemdsystemconfdir = /SYSCONF/systemd/system
systemduserpresetdir = /PREFIX/lib/systemd/user-preset
systemduserunitdir = /PREFIX/lib/systemd/user
systemdsystempresetdir = /ROOTPREFIX/lib/systemd/system-preset
systemdsystemunitdir = /ROOTPREFIX/lib/systemd/system
systemdutildir = /usr/lib/systemd
sysconfdir = /SYSCONF
rootprefix = /ROOTPREFIX
prefix = /PREFIX
pcfiledir = /usr/share/pkgconfig

(pkgconf doesn't provide a way to print all variables together with their definitions,
according to the man page. Disappointing.)
2018-11-17 15:38:15 +01:00
Lennart Poettering 46f2d09f31 conf-parse: drop unused prototype 2018-11-17 08:47:27 +01:00
Zbigniew Jędrzejewski-Szmek aac99f303a core: introduce a helper function to wrap unit_log_{success,failure}
It's inline so that the compiler can easily optimize away the call to get
status string.
2018-11-16 19:47:07 +01:00
Lennart Poettering ae3cc6ec0d
Merge pull request #10770 from poettering/unit-done-log
improvements to structured log events from PID1
2018-11-16 17:54:19 +01:00
Lennart Poettering 6415fecd4c
Merge pull request #10785 from poettering/cgroup-join-removal
remove JoinControllers= setting
2018-11-16 17:53:26 +01:00
Lennart Poettering 3382cf28b6
Merge pull request #10802 from poettering/hide-only-on
man: let's deprecate PermissionsStartOnly=
2018-11-16 17:53:01 +01:00
Lennart Poettering f20db19954 cocci: simplify some if checks 2018-11-16 16:05:29 +01:00
Lennart Poettering ff30a86bd4 job: simplify status message extraction
As @keszybz points out these fields are always here, there's no point in
checking if they are NULL or not.
2018-11-16 15:30:36 +01:00
Lennart Poettering 9a80f2f453 job: when a job was skipped due to a failed condition, log about it
Previously we'd neither show console status output nor log output. Let's
fix that, and still log something.
2018-11-16 15:30:36 +01:00
Lennart Poettering 6e64994d69 core: make unit_start() return a distinguishable error code in case conditions didn't hold
Ideally we'd even propagate this all the way to the client, by having a
separate JobType enum value for this. But it's hard to add this without
breaking compat, hence for now let's at least internally propagate this
case differently from the case "already on it".

This is then used to call job_finish_and_invalidate() slightly
differently, with the already= parameter false, as in the failed
condition case no message was likely produced so far.
2018-11-16 15:22:48 +01:00
Lennart Poettering 0e2b4a822e job: add two explanatory comments 2018-11-16 15:22:48 +01:00
Lennart Poettering a69b3872ac job: let's remove one comparison and reduce indentation level by one 2018-11-16 15:22:48 +01:00
Lennart Poettering 523ee2d414 core: log a recognizable message when a unit succeeds, too
We already are doing it on failure, let's do it on success, too.

Fixes: #10265
2018-11-16 15:22:48 +01:00
Lennart Poettering 91bbd9b796 core: make log messages about unit processes exiting recognizable 2018-11-16 15:22:48 +01:00
Lennart Poettering 7c047d7443 core: make log messages about units entering a 'failed' state recognizable
Let's make this recognizable, and carry result information in a
structured fashion.
2018-11-16 15:22:48 +01:00
Lennart Poettering b344b363ce job: also include job ID in log messages when we begin with a job 2018-11-16 15:22:48 +01:00
Lennart Poettering 33a3fdd978 core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c
This call is only used by job.c and very specific to job handling.
Moreover the very similar logic of job_emit_status_message() is already
in job.c.

Hence, let's clean this up, and move both sets of functions to job.c,
and rename them a bit so that they express precisely what they do:

1. unit_status_emit_starting_stopping_reloading() →
   job_emit_begin_status_message()
2. job_emit_status_message() → job_emit_done_status_message()

The first call is after all what we call when we begin with the
execution of a job, and the second call what we call when we are done
with it.

Just some moving and renaming, no other changes, and hence no change in
behaviour.
2018-11-16 15:22:48 +01:00
Lennart Poettering f8c34706f5 job: add log message when we can't enable the job run event source 2018-11-16 15:22:48 +01:00
Lennart Poettering 8ebd9175db job: add comment for EAGAIN job run case 2018-11-16 15:22:48 +01:00
Lennart Poettering ea2c0e4526 job: minor coding style tweaks 2018-11-16 15:22:48 +01:00
Lennart Poettering 1cd81629e1 job: include JOB_ID field in log message about jobs 2018-11-16 15:22:48 +01:00
Lennart Poettering fae9bc298a cgroup: when determining which controllers we need, always extend the mask according to cpu/cpuacct joint mounting
Note that for cgroup_context_get_mask() this doesn't actually change
much, but it does prepare the ground for #10507 later on.
2018-11-16 14:54:13 +01:00
Lennart Poettering 143fadf369 core: remove JoinControllers= configuration setting
This removes the ability to configure which cgroup controllers to mount
together. Instead, we'll now hardcode that "cpu" and "cpuacct" are
mounted together as well as "net_cls" and "net_prio".

The concept of mounting controllers together has no future as it does
not exist in cgroup v2. Moreover, the current logic is systematically
broken, as revealed by the discussions in #10507. Also, we surveyed Red
Hat customers and couldn't find a single user of the concept (which
isn't particularly surprising, as it is broken...)

This simplifies the (already way too complex) cgroup handling for us, since
we now know whenever we make a change to a cgroup for one controller to
which other controllers it applies.
2018-11-16 14:54:13 +01:00
Lennart Poettering 8d33dca2ff core: fix capitalization of CPUShares= settings 2018-11-16 14:46:49 +01:00
Lennart Poettering 899feb7225 man: let's deprecate PermissionsStartOnly=
The concept is redundant and predates the special chars that do the same
in ExecStart=. Let's settle on advertising just the latter, and hide
PermissionsStartOnly= from the docs (even if we continue supporting it).
2018-11-16 14:31:37 +01:00
Lennart Poettering b39cfe21b4 core: show error code even when debug logging 2018-11-16 12:22:11 +01:00
Zbigniew Jędrzejewski-Szmek cd5a29ce98
Merge pull request #10742 from poettering/c-utf8
default to C.UTF-8 locale, and many improvements to env var file parsing/kernel cmdline parsing
2018-11-15 12:47:17 +01:00
Lennart Poettering 042cad5737
Merge pull request #10753 from keszybz/pager-no-interrupt
Add mode in journalctl where ^C is handled by the pager
2018-11-14 20:09:39 +01:00
Lennart Poettering 3dafa6bc76 core: drop dbus queue recursion check
We don't dispatch the queue recursively anymore, hence let's simplify
things a bit.

As pointed out by @fbuihuu:

https://github.com/systemd/systemd/pull/10763#discussion_r233209550
2018-11-14 20:09:11 +01:00
Lennart Poettering e6755a3350 locale-util: introduce common helper locale_variables_free() for freeing locale variable arrays 2018-11-14 17:01:55 +01:00
Lennart Poettering 13df9c398d fileio: automatically add NULL sentinel to parse_env_file()
Let's modernize things a bit.
2018-11-14 17:01:55 +01:00
Lennart Poettering aa8fbc74e3 fileio: drop "newline" parameter for env file parsers
Now that we don't (mis-)use the env file parser to parse kernel command
lines there's no need anymore to override the used newline character
set. Let's hence drop the argument and just use "\n\r" always. This nicely
simplifies our code.
2018-11-14 17:01:54 +01:00
Lennart Poettering 0db9bd200f locale-setup: default to C.UTF-8
Most distributions already were shipping a C.UTF-8 locale and even Fedora
now supports the C.UTF-8 locale, and there's clear indication that this
is going upstream too. Hence, let's default to it now too, if nothing
else is set.

Note that this is only a fallback if nothing else is set, and since
distros generally configure a default for this, behaviour shouldn't
really change in installed systems.

On new systems this makes vconsole.conf redundant.
2018-11-14 17:01:54 +01:00
Lennart Poettering cd45734f87 locale-setup: rework to use new proc_cmdline_get_key_many() API
This way we use the same code to access /proc/cmdline as everywhere
else, and thus support it in containers and the SYSTEMD_PROC_CMDLINE env
var.
2018-11-14 17:01:54 +01:00
Lennart Poettering 6176be15ac locale-setup: use precise type for locale variable iterator 2018-11-14 17:01:54 +01:00
Lennart Poettering 05e062cb89 locale-setup: use _cleanup_strv_free_ 2018-11-14 17:01:54 +01:00
Zbigniew Jędrzejewski-Szmek fde32028a4 Move LONG_LINE_MAX definition to fileio.h
All users of the macro (except for one, in serialize.c), use the macro in
connection with read_line(), so they must include fileio.h.  Let's not play
libc games and require multiple header files to be included for the most common
use of a function.

The removal of def.h includes is not exact. I mostly went over the commits that
switch over to use read_line() and add def.h at the same time and reverted the
addition of def.h in those files.
2018-11-14 16:25:32 +01:00
Zbigniew Jędrzejewski-Szmek 0221d68a13 basic/pager: convert the pager options to a flags argument
Pretty much everything uses just the first argument, and this doesn't make this
common pattern more complicated, but makes it simpler to pass multiple options.
2018-11-14 16:25:11 +01:00
Zbigniew Jędrzejewski-Szmek eff162bf96
Merge pull request #10763 from poettering/pending-reload-fix
when reloading, don't throttle unit/job dbus change signal generation
2018-11-14 10:44:50 +01:00
Lennart Poettering 8204470252 unit: don't claim there was no IP traffic generated by a unit when we don't know
Only if we have some IP traffic accounting at all we should claim that.
2018-11-14 09:53:50 +01:00
Lennart Poettering 6eb65e7ca4 core: split out audit message generation from unit_notify()
Just some refactoring, no change in behaviour.
2018-11-14 09:51:47 +01:00
Pavel Hrdina 2af3eed1aa bpf-devices: fix order of removing and adding BPF programs
The current code has multiple issues and it should never be done like
that.  If someone updates the list of allowed devices we should attach the new
program before we remove the old one for two reasons:

1. It takes some time to attach the new program so there is a period of time
when all devices are allowed.

2. BPF programs have a limit on the number of instructions (4096) and if the user
adds a lot of devices we might hit the instruction limit and the new
program will not be accepted, which will result in all devices being allowed
because the old program was already removed.

In order to attach the new program before we remove the old one we need
to use BPF_F_ALLOW_MULTI flag every time.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2018-11-13 14:03:01 +01:00
Pavel Hrdina 0b82cd2502 bpf-devices: fix cgroup v2 devices detection
If cgroup v2 bpf devices is supported we need to return 1, not -1.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2018-11-13 12:58:05 +01:00
Lennart Poettering b8d381c477 core: make sure we don't throttle change signal generator when a reload is pending
Fixes: #10627
2018-11-13 12:51:48 +01:00
Lennart Poettering 4b66bccab0 core: when we can't send the pending reload message, say we ignore it in the warning we log
No change in behaviour, just better wording.
2018-11-13 12:01:22 +01:00
Lennart Poettering 209de5256b core: rename queued_message → pending_reload_message
This field is only used for pending Reload() replies, hence let's rename
it to be more descriptive and precise.

No change in behaviour.
2018-11-13 11:59:06 +01:00
INSUN PYO 8724defeae core: use local variable m instead of u->manager 2018-11-13 10:39:35 +01:00
Tommi Rantala 429926e9cc core: include unit name in emergency_action() reason message
Add unit name in StartLimitAction=, FailureAction= and SuccessAction=
emergency_action() reason messages, so that the problematic unit is
easily visible, for example:

    "unit dbus.service failed"
2018-11-12 16:36:03 +01:00
Lennart Poettering f57dfcba08
Merge pull request #10722 from yuwata/sd-device-monitor-change-arguments
sd-device-monitor: slightly re-design API arguments
2018-11-11 15:43:45 +01:00
Lennart Poettering a9353a5c5b core: log about /var/run/ prefix used in PIDFile=, patch it to be /run instead
In a way this is a follow-up for
a2d1fb882c, but adds a similar warning for
PIDFile=.

There's a much stronger case for doing this kind of notification in
tmpfiles.d (since it helps relating lines to each other for the purpose
of merging them). Doing this for PIDFile= is mostly about being
systematic and copying tmpfiles.d/ behaviour here.

While we are at it, let's also support relative filenames in PIDFile=
now, and prefix them with /run, to make them absolute.

Fixes: #10657
2018-11-10 19:17:00 +01:00
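
The path patching described for PIDFile= above could look roughly like the sketch below. The function name sketch_patch_pidfile() and the warning text are hypothetical and chosen for illustration only; the actual systemd implementation and its log message may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, patched copy of "path", or NULL if allocation
 * fails. Hypothetical helper, not systemd's actual code. */
static char *sketch_patch_pidfile(const char *path) {
        static const char legacy[] = "/var/run/";
        const char *prefix = "/run/", *tail;
        size_t n;
        char *out;

        assert(path);

        if (path[0] != '/')
                /* Relative filename: make it absolute below /run/. */
                tail = path;
        else if (strncmp(path, legacy, sizeof(legacy) - 1) == 0) {
                /* Legacy prefix: warn and replace /var/run/ with /run/. */
                fprintf(stderr, "PIDFile= path %s uses legacy /var/run/ prefix, patching to /run/\n", path);
                tail = path + sizeof(legacy) - 1;
        } else {
                /* Any other absolute path is kept unchanged. */
                prefix = "";
                tail = path;
        }

        n = strlen(prefix) + strlen(tail) + 1;
        out = malloc(n);
        if (!out)
                return NULL;

        snprintf(out, n, "%s%s", prefix, tail);
        return out;
}
```

The caller owns the returned string and releases it with free().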
Yu Watanabe deb2b7348e sd-device: drop priority and description from sd_device_monitor_attach_event() and sd_device_monitor_start()
Now we have sd_device_monitor_get_event_source(). So, it is not
necessary to include these parameters in the functions for sd_device_monitor.
2018-11-10 22:53:00 +09:00
Lennart Poettering bedea99dce core: expose bus client names currently reffing a unit as property
This is useful for debugging client-side ref counting of units: for each
ref taken on a unit the client's sender name is listed. If a client has
multiple refs on the same unit it is listed multiple times.
2018-11-09 17:15:34 +01:00
Lennart Poettering e5c36295d8 unit: enqueue cgroup empty check event if the last ref on a unit is dropped 2018-11-09 17:08:59 +01:00
Lennart Poettering c20076a8c1 pid1: add a new AbandonScope() method call on the Manager object
This is the same as Abandon() on the Scope object, but saves clients
from first translating a unit name into a unit object path. This logic
matches how all the other unit methods have counterparts on the Manager
object too (e.g. StopUnit() on the Manager object matching Stop() on the
Unit object), this one was simply forgotten so far.
2018-11-09 17:08:59 +01:00
Lennart Poettering c2baf11c36 cgroup: actually reset the cgroup invalidation mask after we made our changes
Previously we never unmasked the mask after it was set once. Let's fix
that.
2018-11-08 15:20:52 +01:00
Lennart Poettering 6529ccfa20 unit: replace three non-type-safe macros by type-safe inline functions
Behaviour is pretty much the same, but there's some additional type
checking done on the input parameters.

(In the case of UNIT_WRITE_FLAGS_NOOP() the C compiler won't actually do
the type checking necessarily, but static checkers at least could)
2018-11-08 13:55:25 +01:00
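
To illustrate the difference the commit above refers to, here is a minimal sketch contrasting a function-like macro with an inline function. The type SketchWriteFlags, its values, and both helpers are invented for this example and do not reproduce systemd's actual flags or macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag type for illustration; the values are made up. */
typedef enum SketchWriteFlags {
        SKETCH_WRITE_RUNTIME    = 1 << 0,
        SKETCH_WRITE_PERSISTENT = 1 << 1,
} SketchWriteFlags;

/* Macro form: the argument has no declared type, so any integer expression
 * is accepted and nothing documents what is expected. */
#define SKETCH_WRITE_FLAGS_NOOP_MACRO(flags) \
        (((flags) & (SKETCH_WRITE_RUNTIME | SKETCH_WRITE_PERSISTENT)) == 0)

/* Inline function form: the parameter type is part of the signature. A C
 * compiler still converts plain integers to the enum implicitly, but static
 * checkers can flag mismatched arguments. */
static inline bool sketch_write_flags_noop(SketchWriteFlags flags) {
        return (flags & (SKETCH_WRITE_RUNTIME | SKETCH_WRITE_PERSISTENT)) == 0;
}
```

Both forms compute the same result; the inline function simply gives tools a declared type to check against.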
Yu Watanabe 64af816228 core/device: use LIST_FOREACH_AFTER() or friend 2018-11-08 10:58:25 +01:00
Lennart Poettering bbf1120623 unit: make UNIT() cast function deal with NULL pointers
Fixes: #10681
2018-11-08 10:47:08 +01:00
Yu Watanabe b9c04eafb8 core: introduce exec_params_clear()
Follow-up for 1ad6e8b302.

Fixes #10677.
2018-11-08 09:36:37 +01:00
Zbigniew Jędrzejewski-Szmek e44c5a3ba6
Merge pull request #10594 from poettering/env-reload-fix
change handling of environment block of PID1's manager object
2018-11-07 12:49:13 +01:00
Lennart Poettering ed63705975
Merge pull request #10650 from yuwata/udevadm-trigger-use-write-string-file
udevadm: use write_string_file() helper function
2018-11-06 16:46:25 +03:00
Giuseppe Scrivano 875622c39e core, sysctl: skip ENOENT for /proc/sys/net/unix/max_dgram_qlen
sysctl is disabled for /proc mounted from a user namespace, thus entries like
/proc/sys/net/unix/max_dgram_qlen do not exist.  In this case, skip the error
and do not try to change the default for the AF_UNIX datagram queue length.
2018-11-06 16:41:34 +03:00
Yu Watanabe 57512c893e tree-wide: set WRITE_STRING_FILE_DISABLE_BUFFER flag when we write files under /proc or /sys 2018-11-06 21:24:03 +09:00
Lennart Poettering 1ad6e8b302 core: split environment block maintained by PID 1's Manager object in two
This splits the "environment" field of Manager into two:
transient_environment and client_environment. The former is generated
from configuration file, kernel cmdline, environment generators. The
latter is the one the user can control with "systemctl set-environment"
and similar.

Both sets are merged transparently whenever needed. Separating the two
sets has the benefit that we can safely flush out the former while
keeping the latter during daemon reload cycles, so that env var settings
from env generators or configuration files do not accumulate, but
dynamic API changes are kept around.

Note that this change is not entirely transparent to users: if the user
first uses "set-environment" to override a transient variable, and then
uses "unset-environment" to unset it again things will revert to the
original transient variable now, while previously the variable was fully
removed. This change in behaviour should not matter too much though I
figure.

Fixes: #9972
2018-10-31 18:00:53 +01:00
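
The lookup semantics described above (the client set overrides the transient set, and unsetting a client variable reveals the transient one again) can be sketched as follows. The function names are hypothetical and the environment blocks are modelled as plain NULL-terminated arrays of "KEY=VALUE" strings; systemd's actual code merges the two sets with its own string-vector helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up "key" in a NULL-terminated array of "KEY=VALUE" strings and
 * returns a pointer to the value, or NULL if the key is not present. */
static const char *sketch_env_get(const char *const *env, const char *key) {
        size_t n;

        assert(key);
        n = strlen(key);

        for (; env && *env; env++)
                if (strncmp(*env, key, n) == 0 && (*env)[n] == '=')
                        return *env + n + 1;

        return NULL;
}

/* Effective lookup: entries set by the client (e.g. via
 * "systemctl set-environment") take precedence over transient ones. */
static const char *sketch_effective_env_get(const char *const *transient,
                                            const char *const *client,
                                            const char *key) {
        const char *v = sketch_env_get(client, key);

        return v ? v : sketch_env_get(transient, key);
}
```

Because the transient set is kept separately, it can be flushed and regenerated on reload without touching the client set, which is the benefit the commit message names.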
Lennart Poettering bea1a01310 strv: wrap strv_new() in a macro so that NULL sentinel is implicit 2018-10-31 18:00:52 +01:00
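
The idea of hiding the NULL sentinel behind a macro can be sketched like this. The helper below only counts its arguments and uses hypothetical names (sketch_count_internal(), sketch_count()); it is not systemd's strv_new(), which allocates a string vector.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Counts the strings passed before the terminating NULL pointer. */
static size_t sketch_count_internal(const char *first, ...) {
        size_t n = 0;
        va_list ap;

        va_start(ap, first);
        for (const char *s = first; s; s = va_arg(ap, const char *))
                n++;
        va_end(ap);

        return n;
}

/* The wrapper macro appends the sentinel itself, so callers cannot forget it.
 * The cast ensures a pointer is passed even where NULL is defined as plain 0. */
#define sketch_count(...) sketch_count_internal(__VA_ARGS__, (const char *) NULL)
```

A call such as sketch_count("a", "b") then expands to sketch_count_internal("a", "b", (const char *) NULL).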
Joerg Behrmann 56ef8db9f5 core: apply WorkingDirectory after enforce_user
If WorkingDirectory is on NFS, root might only have the privileges of
nobody and the chdir to the WorkingDirectory might fail, even if the
user running the service would have the proper privileges to chdir to
that directory.

Fixes #10568
2018-10-31 12:07:24 +01:00
Insun fee04d7f3a core: don't drop timer expired but not yet processed when system date is changed
There is a difference between the time set by the user and the real elapsed time because of the accuracy feature.
If you change the system date (or time) between these times, the timer is dropped.

You can easily reproduce it with the following command.
-----------------------------------------------------------
$ systemd-run --on-active=3s ls; sleep 3; date -s "`date`"
-----------------------------------------------------------

In the following command, the problem is rarely reproduced. But it exists.
---------------------------------------------------------------------------------------------
$ systemd-run --on-active=3s --timer-property=AccuracySec=1us ls ; sleep 1; date -s "`date`"
---------------------------------------------------------------------------------------------

Note: global AccuracySec value.
----------------------------------------------------------------------
$ cat /etc/systemd/system.conf
DefaultTimerAccuracySec=1min
----------------------------------------------------------------------
2018-10-30 15:38:45 +01:00
Franck Bui 651b3b6c92 core: skip unit deserialization and move to the next one when unit_deserialize() fails
If unit_deserialize() fails (because one read line is overly long), it returns
an error and we would have assumed that the next read would point to the next
unit to deserialize.

But instead unit_deserialize() can leave the file offset in the middle of a
line.

Therefore we need to ignore and skip the current unit in this case too.

While at it, move unit deserialization into a dedicated function. That should
make the code easier to read.
2018-10-30 14:52:13 +01:00
Lennart Poettering 6897dfe85a core: add free_and_replace() at one more place 2018-10-26 19:49:15 +02:00
Lennart Poettering 7593c3ecbf tree-wide: add a few missing _sentinel_ decorators 2018-10-26 19:49:15 +02:00
Lennart Poettering cd6b7d50c3 chown-recursive: TAKE_FD() is your friend 2018-10-26 15:13:26 +02:00