Commit graph

4637 commits

Author SHA1 Message Date
Chris Down 4e1dfa45e9 cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.

Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups?  ?v?([0-9])/cgroup v\1/gI'`.

I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
2019-01-03 11:32:40 +09:00
Dave Reisner 8ca9e92c74 Make default locale a compile time option
Default to a locale that's guaranteed to exist everywhere, but let
distros override this with something more exotic if they choose to.

Closes #11259.
2018-12-29 21:43:04 +09:00
Evgeny Vereshchagin 7334ade4a7 core: free lines after reading them
Closes https://github.com/systemd/systemd/issues/11251.
2018-12-28 15:36:00 +09:00
Patrick Williams 8eab766804 core: support %j in unit dependency resolution
Commit 250e9fadbc introduced
support for %j/%J specifier in unit files.  The function
unit_name_printf is used in unit dependency resolution,
such as Wants / After directives, but was missing support
for the %j.  Add to allow directives such as:

    [Unit]
    Wants=bar-%j.target

Fixes: systemd/systemd#11217
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
2018-12-22 17:21:13 +09:00
Zbigniew Jędrzejewski-Szmek ec8126d723 Revert "core/mount: minimize impact on mount storm."
This reverts commit 89f9752ea0.

This patch causes various problems during boot, where a "mount storm" occurs
naturally. Current approach is flakey, and it seems very risky to push a
feature like this which impacts boot right before a release. So let's revert
for now, and consider a more robust solution after later.

Fixes #11209.

> https://github.com/systemd/systemd/pull/11196#issuecomment-448523186:
"Reverting 89f9752ea0 and fcfb1f775e fixes this test."
2018-12-19 11:37:41 +01:00
Zbigniew Jędrzejewski-Szmek e36db50075 Revert "mount: disable mount-storm protection while mount unit is starting."
This reverts commit fcfb1f775e.
2018-12-19 11:32:17 +01:00
NeilBrown fcfb1f775e mount: disable mount-storm protection while mount unit is starting.
The starting of mount units requires that changes to
/proc/self/mountinfo be processed before the SIGCHILD from the
completion of /sbin/mount is processed, as described by the comment
  /* Note that due to the io event priority logic, we can be sure the new mountinfo is loaded
   * before we process the SIGCHLD for the mount command. */

The recently-added mount-storm protection can defeat this as it
will sometimes deliberately delay processing of /proc/self/mountinfo.

So we need to disable mount-storm protection when a mount unit is starting.
We do this by keeping a counter of the number of pending
mounts, and disabling the protection when this is non-zero.

Thanks to @asavah for finding and reporting this problem.
2018-12-19 00:44:19 +01:00
Michal Sekletar 4c70a4a748 core: do cgroup migration first and only then connect to journald
Fixes #11162
2018-12-17 19:22:30 +01:00
Alexey Bogdanenko 8f9f3cb724 core: fix KeyringMode for user services
KeyringMode option is useful for user services. Also, documentation for the
option suggests that the option applies to user services. However, setting the
option to any of its allowed values has no effect.

This commit fixes that and removes EXEC_NEW_KEYRING flag. The flag is no longer
necessary: instead of checking if the flag is set we can check if keyring_mode
is not equal to EXEC_KEYRING_INHERIT.
2018-12-17 16:56:36 +01:00
Zbigniew Jędrzejewski-Szmek 582de70f2f
Merge pull request #11086 from poettering/nscd-cache-flush
flush nscd's caches when we register user/groups/hostnames
2018-12-17 11:29:58 +01:00
Khem Raj baa162cecd core: Fix use after free case in load_from_path()
ensure that mfree() on filename is called after the logging function
which uses the string pointed by filename

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2018-12-16 22:02:00 -08:00
NeilBrown 89f9752ea0 core/mount: minimize impact on mount storm.
If we create 2000 mounts (on a 1-CPU qemu VM) with
  mkdir -p /MNT/{1..2000}
  time for i in {1..2000}; do mount --bind /etc /MNT/$i ; done

it takes around 20 seconds to complete.  Much of this time is taken up
by systemd repeatedly processing /proc/self/mountinfo.
If I disable the processing, the time drops to about 4 seconds.

I have reports that on a larger system with multiple active user sessions, each
with it's own systemd, the impact can be higher.

One particular use-case where a large number of mounts can be expected in quick
succession is when the "clearcase" SCM starts up.

This patch modifies the handling up events from /proc/self/mountinfo so
that systemd backs off when a storm is detected.  Specifically the time to process
mountinfo is measured, and the process will not be repeated until 10 times
that duration has passed.  This ensures systemd won't use more than 10% of
real time processing mountinfo.

With this patch, my test above takes about 5 seconds.
2018-12-16 12:38:40 +01:00
Lennart Poettering 2d41e9b7a0
Merge pull request #11143 from keszybz/enable-symlink
Runtime mask symlink confusion fix
2018-12-16 12:37:07 +01:00
Lennart Poettering 460ec54908 core: flush nscd's caches whenever we allocate/release a dynamic user
This should make dynamic users and nscd work together better.

Fixes: #10740
2018-12-15 12:10:19 +01:00
Zbigniew Jędrzejewski-Szmek 58d9d89b4b pid1: fix free of uninitialized pointer in unit_fail_if_noncanonical()
https://bugzilla.redhat.com/show_bug.cgi?id=1653068
2018-12-14 11:21:16 +01:00
Chris Down 0c17c00433
Merge pull request #11131 from poettering/make-lucab-happy
optionally relabel additional files/dirs for selinux after loading policy
2018-12-12 23:17:39 +00:00
Chris Down cb5e3bc37d cgroup: Don't explicitly check for member in UNIT_BEFORE
The parent slice is always filtered ahead of time from UNIT_BEFORE, so
checking if the current member is the same as the parent unit will never
pass.

I may also write a SLICE_FOREACH_CHILD macro to remove some more of the
parent slice checks, but this requires a bit of a rework and general
refactoring and may not be worth it, so let's just do this for now.
2018-12-12 20:50:10 +01:00
Chris Down e92aaed30e tree-wide: Remove O_CLOEXEC from fdopen
fdopen doesn't accept "e", it's ignored. Let's not mislead people into
believing that it actually sets O_CLOEXEC.

From `man 3 fdopen`:

> e (since glibc 2.7):
> Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen()

As mentioned by @jlebon in #11131.
2018-12-12 20:47:40 +01:00
Lennart Poettering 70a74ec645 mount-setup: don't consider it reason to fail if we can't relabel cgroupfs
We usually don't care much about relabel failures, let's not do that
here either.
2018-12-12 20:46:07 +01:00
Lennart Poettering c4217b43d1 mount-setup: use FOREACH_STRING where appropriate 2018-12-12 20:46:07 +01:00
Lennart Poettering 65e183d789 mount-setup: optionally, relabel a configured set of files/dirs after loading policy
Fixes: #10466
2018-12-12 20:46:07 +01:00
Zbigniew Jędrzejewski-Szmek 303ee60151 Mark *data and *userdata params to specifier_printf() as const
It would be very wrong if any of the specfier printf calls modified
any of the objects or data being printed. Let's mark all arguments as const
(primarily to make it easier for the reader to see where modifications cannot
occur).
2018-12-12 16:45:33 +01:00
Lennart Poettering a95c0505ad core: extend comments regarding coldplug() vs. catchup() 2018-12-12 11:20:53 +01:00
Lennart Poettering a1c7334b61 core: when a unit state changes only propagate to jobs after reloading is complete
Previously, we'd immediately propagate unit state changes into any jobs
pending for them, always. With this we only do this if the manager is
out of the "reload" state. This fixes the problem #8803 tried to
address, by simply not completing jobs until after the reload (and thus
reestablishment of the dbus connection) is complete.

Note that there's no need to later on explicitly catch up with the
missed job state changes (i.e. there's no need to call
unit_process_job() later one explicitly). That's because for jobs in
JOB_WAITING state on deserialization all jobs are requeued into the run
queue anyway, and thus checked again if they can complete now. And for
JOB_RUNNING jobs unit_catchup() phase is going to trigger missed out
state changes *after* the reload complete anyway (after all that's what
distinguishes from unit_coldplug()).

Replaces: #8803
2018-12-12 11:15:07 +01:00
Lennart Poettering 16c74914d2 core: split out all logic that updates a Job on a unit's unit_notify() invocation
Just some refactoring, no change in behaviour.
2018-12-12 11:15:07 +01:00
Lennart Poettering b17c9620c8 core: rework how we deserialize jobs
Let's add a helper call unit_deserialize_job() for this purpose, and
let's move registration in the global jobs hash table into
job_install_deserialized() so that it it is done after all superficial
checks are done, and before transitioning into installed states, so that
rollback code is not necessary anymore.
2018-12-12 11:15:07 +01:00
Lennart Poettering 48235ad6b7 job: be more careful when removing job object from jobs hash table
Let's validate that the ID is actually allocated to us before remove a
job.

This is relevant as various bits of code will call job_free() on
partially set up Job objects, and we really shouldn't remove another job
object accidentally from the hash table, when the set up didn't
complete.
2018-12-12 11:15:07 +01:00
Lennart Poettering 4a53080be6 core: don't track jobs-finishing-during-reload explicitly
Memory management is borked for this, and moreover this is unnecessary
since f0831ed2a0, i.e. since coldplug() and catchup() are two different
concepts: the former restoring the state from before a reload, the
latter than adjusting it again to the actual status in effect after the
reload.

Fixes: #10716
Mostly reverts: #8803
2018-12-12 11:15:06 +01:00
Lennart Poettering 728ba51e98 job: update job_free() to follow our usual return-NULL style 2018-12-12 11:14:26 +01:00
Zbigniew Jędrzejewski-Szmek d7ef125726 core: fix typo in comment 2018-12-11 22:20:07 +01:00
Lennart Poettering d742f4b54b cgroup: correct mangling of return values
Let's nor return the unmangled return value before we actually mangle
it.

Fixes: #11062
2018-12-10 16:09:41 +01:00
Lennart Poettering 92a993041a cgroup: call cg_all_unified() right before using the result
Let's not query it before we actually need it.
2018-12-10 16:09:41 +01:00
Lennart Poettering 8f3fd07ac0
Merge pull request #11105 from keszybz/path-parsing
Some tightening of our path parsing code
2018-12-10 15:50:08 +01:00
Zbigniew Jędrzejewski-Szmek 1d4c6f5bef pid1: set Description even for devices which don't exist yet
We'd only set the description after the device appeared in sysfs, so
we'd always print
"A start job is running for dev-disk-by\x2duuid-aaaa ... aaaa.device (42s / 1min 30s)"
Let's make this
"A start job is running for /dev/disk/by-duuid/aaaa ... aaaa (42s / 1min 30s)"

https://bugzilla.redhat.com/show_bug.cgi?id=1655860
2018-12-10 14:00:42 +01:00
Lennart Poettering 2327f95499
Merge pull request #10984 from fbuihuu/tmpfiles-be-more-explicit-with-unsafe-transition
tmpfiles: be more explicit when an unsafe path transition is met
2018-12-10 12:31:56 +01:00
Zbigniew Jędrzejewski-Szmek 4cb06c5949 Use VLA instead of alloca
The test is the same, but an array is more readable.
2018-12-10 11:57:26 +01:00
Zbigniew Jędrzejewski-Szmek 60473f0c23 pid1: fix (harmless) off-by-one in PATH_MAX comparison
PATH_MAX is supposed to include the terminating NUL byte. But we already
check that there is no NUL byte in the specified path. Hence the maximum
length we can expect is PATH_MAX - 1.

This doesn't change much, but makes this use of PATH_MAX consistent with the
rest of the codebase.
2018-12-10 11:57:26 +01:00
Franck Bui 36c97decbe fs-util: make chase_symlink() returns -ENOLINK when unsafe transitions are met
We previously returned -EPERM but it can be returned for various other reasons
too.

Let's use -ENOLINK instead as this value shouldn't be used currently. This
allows users of CHASE_SAFE to detect without any ambiguities when unsafe
transitions are encountered by chase_symlinks().

All current users of CHASE_SAFE that explicitly reacted on -EPERM have been
converted to react on -ENOLINK.
2018-12-10 09:18:27 +01:00
Lennart Poettering 06721f39f6 swap: always add in extras when we load a swap unit
Much like for the mount units we need fields such as the slice
initialized by the time we activate the swap, hence when the kernel
let's us know about a new swap that appeared we need to initialize the
slice in any Swap object we allocated for that right-away, even if we
can't read the real unit file for the swap device.
2018-12-07 17:35:32 +01:00
Lennart Poettering a0a424083f swap: drop return value
We don't actually return any valid 'r' here, let's explicitly return 0
here hence instead.
2018-12-07 17:35:32 +01:00
Lennart Poettering eb04385dc5 swap: don't propagate issues with processing /proc/swaps
This follows similar recent changes in mount.c: error should be consider
local, and not be propagated.
2018-12-07 17:35:32 +01:00
Lennart Poettering 46f944802c swap: when loading a unit from /proc/swaps, mark its load state as good
This follows similar logic in the mount unit.
2018-12-07 17:35:32 +01:00
Lennart Poettering 15332d738f swap: split out code adding in additional unit props into a function of its own
This adds swap_add_extras() similar to mount_add_extras().

No change in behaviour, just some refactoring.
2018-12-07 17:35:32 +01:00
Lennart Poettering ba6fbb2cee swap: fix misplaced comment 2018-12-07 17:35:32 +01:00
Lennart Poettering 31135818e4 swap: flush out state when activating a unit, not when deactivating it
This is similar to the previous commit which did the same change for
mount units.
2018-12-07 17:35:32 +01:00
Lennart Poettering 7eba1463de mount: flush out cycle state on DEAD→MOUNTED only, not the other way round
For services (and other units) we generally follow the rule that at the
beginning of each cycle, i.e. when the INACTIVE/FAILED state is left for
ACTIVATING/ACTIVE we flush out various state variables. Mount units
handled this differently so far when the unit state change was effected
outside of systemd: in that case these variables would be flushed out
when going back to INACTIVE/FAILED already.

Let's fix that, and flush out this state always during the activating
transition, not during the deactivating transition.
2018-12-07 17:35:32 +01:00
Lennart Poettering ec88d1ea05 mount: replace three closely related mount flags into a proper flags enum
We pass these flags around, and even created a structure for them. Let's
fix things properly, and make them a flags value of its own.
2018-12-07 17:35:32 +01:00
Lennart Poettering b6418dc94e mount: strdup() device paths we collect
We never know what the changes triggered by mount_set_state() do to the
unit. Let's be safe and copy the device path into our set, so that we
are safe against that.
2018-12-07 17:35:32 +01:00
Lennart Poettering f8064c4fda mount: when the kernel reports a mount to be established reset all kinds of load failures
It doesn't matter what kind of precise failure we had earlier with
loading the unit, let's report that it loaded successfully now, after
all the kernel is an OK source for that, like any other.
2018-12-07 17:35:32 +01:00
Lennart Poettering a37422045f mount: regenerate all deps whenever a mount's parameters changes
Whenever we notice a change on an existing /proc/self/mountinfo line,
let's update the deps generated from it. For that, let's flush out the
old deps generated this way, and add in the new ones.

This takes benefit of the fact that today (unlike a comment this patch
removes says) we can remove deps in a somewhat reasonable way.
2018-12-07 17:35:32 +01:00