3117 commits

Author SHA1 Message Date
Yu Watanabe 3536f49e8f core: add {State,Cache,Log,Configuration}Directory= (#6384)
This introduces {State,Cache,Log,Configuration}Directory=, which are
similar to RuntimeDirectory=. They create the directories under
/var/lib, /var/cache/, /var/log, or /etc, respectively, with the mode
specified in {State,Cache,Log,Configuration}DirectoryMode=.

This also fixes #6391.
2017-07-18 14:34:52 +02:00
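
The mapping described in this commit message can be sketched roughly as follows. This is only an illustration, not systemd code: the table, the function name resolve_directory, and the rule that only simple relative names are accepted are all assumptions made for the example.

```python
import posixpath

# Base directories as listed in the commit message; the table and the
# function name are hypothetical and only illustrate the mapping.
BASE_DIRECTORIES = {
    "StateDirectory": "/var/lib",
    "CacheDirectory": "/var/cache",
    "LogDirectory": "/var/log",
    "ConfigurationDirectory": "/etc",
}


def resolve_directory(directive: str, name: str) -> str:
    """Return the path a directive would create for a relative name."""
    if posixpath.isabs(name) or ".." in name.split("/"):
        # Assumption: only simple relative names are accepted.
        raise ValueError(f"invalid directory name: {name!r}")
    return posixpath.join(BASE_DIRECTORIES[directive], name)
```

For instance, resolve_directory("StateDirectory", "foo") yields "/var/lib/foo" under these assumptions.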
Lennart Poettering e758bc9132 Merge pull request #6387 from keszybz/fix-timeout-0
Fix x-systemd.timeout=0 in fstab
2017-07-18 00:04:24 +02:00
Zbigniew Jędrzejewski-Szmek 4a06cbf838 Use config_parse_sec_fix_0() also for JobRunningTimeoutSec
2d79a0bbb9 did that for TimeoutSec=,
89beff89ed did that for JobTimeoutSec=,
and 0004f698df did that for
x-systemd.device-timeout=. But after parsing x-systemd.device-timeout=xxx
we write it out as JobRunningTimeoutSec=xxx. Two options:
- write out JobRunningTimeoutSec=<a very big number>,
- change JobRunningTimeoutSec= to behave like the other options.

I think it would be confusing for JobRunningTimeoutSec= to have different
syntax than TimeoutSec= and JobTimeoutSec=, so this patch implements the
second option.

Fixes #6264, https://bugzilla.redhat.com/show_bug.cgi?id=1462378.
2017-07-17 16:03:49 -04:00
Lennart Poettering 688230d3a7 Merge pull request #6354 from walyong/smack_process_label_free
core: modify resource leak and missed security context dump
2017-07-17 10:04:12 +02:00
Yu Watanabe 23a7448efa core: support subdirectories in RuntimeDirectory= option 2017-07-17 16:30:53 +09:00
Yu Watanabe 53f47dfc7b core: allow preserving contents of RuntimeDirectory= over process restart
This introduces RuntimeDirectoryPreserve= option which takes a boolean
argument or 'restart'.

Closes #6087.
2017-07-17 16:22:25 +09:00
IPv4v6 cb4069d95e add version argument to help function (#6377)
Signed-off-by: Stefan Pietsch <mail.ipv4v6+gh@gmail.com>
2017-07-15 13:53:21 +02:00
Tejun Heo 4095205ecc core: support "nsdelegate" cgroup v2 mount option (#6294)
cgroup namespace wasn't useful for delegation because it allowed resource
control interface files (e.g. memory.high) to be written from inside the
namespace - this allowed the namespace parent's resource distribution to be
disturbed by its namespace-scoped children.

A new mount option, "nsdelegate", was added to cgroup v2 to address this issue.
The flag is meaningful only when mounting cgroup v2 in the init namespace and
makes a cgroup namespace a delegation boundary.  The kernel feature is pending
for v4.13.

This should have been the default behavior on cgroup namespaces and this commit
makes systemd try "nsdelegate" first when trying to mount cgroup v2 and fall
back if the option is not supported.

Note that this risks breaking usages which depend on modifying the
parent's resource settings from the namespace root, which isn't a valid thing
to do, but such usages may still exist.
2017-07-14 19:27:13 +02:00
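
The "try nsdelegate first, then fall back" behaviour described above might look like the sketch below. The try_mount callback is hypothetical and stands in for the real mount attempt; nothing here reflects systemd's actual function names.

```python
from typing import Callable, Optional


def mount_cgroup2(try_mount: Callable[[str], bool]) -> Optional[str]:
    """Try "nsdelegate" first, then fall back to mounting without options.

    try_mount is a hypothetical callback that attempts the mount with the
    given option string and returns True on success. The return value is
    the option string that worked ("" for none), or None if both failed.
    """
    for options in ("nsdelegate", ""):
        if try_mount(options):
            return options
    return None
```

On a kernel without the option, the first attempt fails and the second one, without options, is used.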
WaLyong Cho 80c21aea11 core: dump also missed security context 2017-07-13 13:12:24 +09:00
WaLyong Cho 5b8e1b7755 core: modify resource leak by SmackProcessLabel= 2017-07-13 13:12:15 +09:00
Lennart Poettering 782c925f7f Revert "core: link user keyring to session keyring (#6275)" (#6342)
This reverts commit 437a85112e.

The outcome of this isn't that clear, let's revert this for now, see
discussion on #6286.
2017-07-12 10:00:43 -04:00
Lennart Poettering 6297d07b82 Merge pull request #6300 from keszybz/refuse-to-load-some-units
Refuse to load some units
2017-07-12 09:28:20 +02:00
Zbigniew Jędrzejewski-Szmek 2c75fb7330 core/load-fragment: refuse units with errors in RootDirectory/RootImage/DynamicUser
Behaviour of the service is completely different with the option off, so the
service would probably mess up state on disk and do unexpected things.
2017-07-11 13:38:13 -04:00
Zbigniew Jędrzejewski-Szmek bb28e68477 core/load-fragment: refuse units with errors in certain directives
If an error is encountered in any of the Exec* lines, WorkingDirectory,
SELinuxContext, ApparmorProfile, SmackProcessLabel, Service (in .socket
units), User, or Group, refuse to load the unit. If the config stanza
has support, ignore the failure if '-' is present.

For those configuration directives, even if we started the unit, it's
pretty likely that it'll do something unexpected (like write files
in a wrong place, or with a wrong context, or run with wrong permissions,
etc). It seems better to refuse to start the unit and have the admin
clean up the configuration without giving the service a chance to mess
up stuff.

Note that all "security" options that restrict what the unit can do
(Capabilities, AmbientCapabilities, Restrict*, SystemCallFilter, Limit*,
PrivateDevices, Protect*, etc) are _not_ treated like this. Such options are
only supplementary, and are not always available depending on the architecture
and compilation options, so unit authors have to make sure that the service
runs correctly without them anyway.

Fixes #6237, #6277.
2017-07-11 13:38:02 -04:00
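
A minimal sketch of the rule described above: an error in a critical directive refuses the unit, unless the value carries a '-' prefix. The helper names parse_critical and require_absolute_path and the exception type are hypothetical and chosen only for this illustration.

```python
from typing import Callable, Optional


class UnitLoadError(Exception):
    """Raised when a critical directive cannot be parsed."""


def require_absolute_path(value: str) -> str:
    """Example parser: accept only absolute paths."""
    if not value.startswith("/"):
        raise ValueError(f"not an absolute path: {value!r}")
    return value


def parse_critical(value: str, parse: Callable[[str], str]) -> Optional[str]:
    """Parse a critical directive value.

    A leading '-' marks the setting as optional: a parse error is ignored
    and None is returned. Without it, a parse error refuses the whole unit.
    """
    ignore_failure = value.startswith("-")
    payload = value[1:] if ignore_failure else value
    try:
        return parse(payload)
    except ValueError as exc:
        if ignore_failure:
            return None
        raise UnitLoadError(f"refusing to load unit: {exc}") from exc
```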
Zbigniew Jędrzejewski-Szmek 13ddc3fc2b systemd: do not stop units bound to inactive units while coldplugging (#6316)
When running systemd-analyze verify I would get a random subset of warnings
(sometimes none, sometimes one or two):

dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.swap: Unit is bound to inactive unit dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device. Stopping, too.
home.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device. Stopping, too.
boot.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-56c56bfd\x2d93f0\x2d48fb\x2dbc4b\x2d90aa67144ea5.device. Stopping, too.

When running with debug on, it's pretty obvious what is happening:

home.mount: Changed dead -> mounted
home.mount: Unit is bound to inactive unit dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device. Stopping, too.
home.mount: Trying to enqueue job home.mount/stop/fail
home.mount: Installed new job home.mount/stop as 27
home.mount: Enqueued job home.mount/stop as 27
...
dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Installed new job dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device/start as 47
dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Changed dead -> plugged
dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device: Job dev-disk-by\x2duuid-75751556\x2d6e31\x2d438b\x2d99c9\x2dd626330d9a1b.device/start finished, result=done

Fixes #2206, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=808151.
2017-07-11 10:45:03 +02:00
Christian Hesse 437a85112e core: link user keyring to session keyring (#6275)
Commit 74dd6b515f (core: run each system
service with a fresh session keyring) broke adding keys to user keyring.
Added keys could not be accessed with error message:

keyctl_read_alloc: Permission denied

So link the user keyring to our session keyring.
2017-07-04 09:38:31 +02:00
Franck Bui 9e7bc4dc3a automount: don't lstat(2) upon umount request (#6086)
When umounting an NFS filesystem, it is not safe to lstat(2) the mountpoint at
all as that can block indefinitely if the NFS server is down.

umount() will not block, but lstat() will.

This patch therefore removes the call to lstat(2) and defers the handling of
any error to the child process which will issue the umount call.
2017-07-03 14:48:21 +02:00
Zbigniew Jędrzejewski-Szmek 0004f698df Parse "timeout=0" as infinity in various generators (#6264)
This extends 2d79a0bbb9 to the kernel
command line parsing.

The parsing is changed a bit to only understand "0" as infinity. If units are
specified, parse normally, e.g. "0s" is just 0. This makes it possible to
provide a zero timeout if necessary.

Simple test is added.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1462378.
2017-07-03 14:29:32 +02:00
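
The parsing rule in this message (a bare "0" means infinity, while "0s" is a real zero) can be sketched as below. The unit table is a small hypothetical subset, and treating a unitless number as seconds is an assumption of this example.

```python
import math

# Hypothetical subset of time units; the real parser accepts many more.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "min": 60.0, "h": 3600.0}


def parse_timeout(value: str) -> float:
    """Return a timeout in seconds; a bare "0" means no timeout at all."""
    text = value.strip()
    if text == "0":
        return math.inf
    # Check longer suffixes first so that "ms" is not read as "s".
    for suffix in sorted(UNIT_SECONDS, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNIT_SECONDS[suffix]
    # Assumption: a number without a unit is in seconds.
    return float(text)
```

With this rule a zero timeout stays expressible: "0s" parses to 0.0, while "0" parses to infinity.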
Zbigniew Jędrzejewski-Szmek c23218ae25 core/loopback-setup: make log messages nicer
Under nspawn, systemd would print:

Got address error code: Operation not permitted
Got address error code: Operation not permitted
Got start error code: Operation not permitted

which is quite unclear out of context. Change that to:

Failed to add address 127.0.0.1 to loopback interface: Operation not permitted
Failed to add address ::1 to loopback interface: Operation not permitted
Failed to bring loopback interface up: Operation not permitted
2017-07-02 12:03:56 -04:00
Zbigniew Jędrzejewski-Szmek 047d7219fd core: do not print color console message about gc-ed jobs
This is just a cosmetic issue.

Garbage collection of jobs (especially the ones that we create automatically)
is something of an internal implementation detail and should not be made
visible to the users. But it's probably still useful to log this in the
journal, so the code is rearranged to skip one of the messages if we log to the
console and the journal separately, and to keep the message if we log
everything to the console.

Fixes #6254.
2017-07-01 19:10:48 -04:00
NeilBrown ca5b440aa7 core/automount: revise the "fun fact" comments. (#6245)
Fun fact 1 suggests that a "close()" is needed, but that close() has long since been
removed. So the comment is now meaningless and possibly confusing.

Fun fact 2 refers to a bug that has been fixed in Linux prior to v4.12

Commit: 9fa4eb8e490a ("autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL")

so revise the comment so that no-one goes pointlessly looking for the bug.
2017-06-30 12:51:33 -04:00
Lennart Poettering 0d787d5ff8 Only drop the capabilities from the bounding set if we are running as PID1 (#6204)
The CapabilityBoundingSet option only makes sense if we are running as
PID1.

The system.conf.d(5) manpage already states that the CapabilityBoundingSet
option:
  Controls which capabilities to include in the capability bounding set
  for PID 1 and its children.

https://github.com/systemd/systemd/issues/6080
2017-06-28 13:29:45 -04:00
Lennart Poettering 9af868392b Merge pull request #5976 from fbuihuu/swap-fix
Swap fix
2017-06-27 20:28:56 +02:00
Zbigniew Jędrzejewski-Szmek 23e4a234fc Merge pull request #6200 from poettering/ioprio-transient 2017-06-26 21:30:36 -04:00
Evgeny Vereshchagin 82891136d7 Merge pull request #6160 from poettering/non-pollable-fdstore
support for non-pollable fds in the service fdstore
2017-06-27 00:22:58 +03:00
WaLyong Cho 217f95db7b smack: handling smack onlycap list (#5542)
To support smack onlycap, read onlycap list from /etc/smack/onlycap and write
to /sys/fs/smackfs/onlycap.

https://lwn.net/Articles/292128/
http://lkml.iu.edu/hypermail/linux/kernel/1505.2/04718.html
http://schaufler-ca.com/description_from_the_linux_source_tree
2017-06-26 21:00:03 +02:00
Lennart Poettering 8d1dd6abe2 core: when parsing integers passed over the bus use int32_t, not int
sd-bus will return "i" integers as "int32_t", not "int". Now on all
archs we care about that's the same, but we still should do this
properly.
2017-06-26 17:43:18 +02:00
Lennart Poettering 7f452159b8 core: make IOSchedulingClass= and IOSchedulingPriority= settable for transient units
This patch is a bit more complex than I hoped. In particular the single
IOScheduling= property exposed on the bus is split up into
IOSchedulingClass= and IOSchedulingPriority= (though compat is
retained). Otherwise the asymmetry between setting props and getting
them is a bit too nasty.

Fixes #5613
2017-06-26 17:43:18 +02:00
Franck Bui d31ae54818 swap: make use of IN_SET()
Minor cleanups, no functional changes.
2017-06-26 15:26:39 +02:00
Franck Bui cabf58b24f swap: make use of manager_get_unit() in swap.c
Minor cleanup, no functional changes.
2017-06-26 15:26:39 +02:00
Lennart Poettering 9efb9df9e3 core: make NotifyAccess= and FileDescriptorStoreMax= available to transient services
This is helpful for debugging/testing #5606.
2017-06-26 15:14:41 +02:00
Lennart Poettering 3ceb72e558 core: permit FDSTORE=1 messages with non-pollable fds
This also alters the documentation to recommend memfds rather than /run
for serializing state across reboots. That's because /run doesn't
actually have the same lifecycle as the fd store, as it is cleared out
on restarts.

Fixes: #5606
2017-06-26 15:14:41 +02:00
Zbigniew Jędrzejewski-Szmek f900f5825d core: allow specifiers to be resolved in Environment= passed over dbus (#6144)
When specifiers are included in the Environment block in StartTransientUnit,
we resolve specifiers on the PID1 side. Nevertheless we store the unresolved
version in the transient unit file, so that it'll be resolved when loading
the unit. I think this looks nicer.

I also removed the writing of the merged Environment block to the transient
file. Afaict, this resulted in variables being written multiple times, but
this needs to be tested properly.

Fixes #5699.
2017-06-26 11:26:59 +02:00
Zbigniew Jędrzejewski-Szmek 2d2a815c90 core/loopback-setup: set a timeout for the kernel reply (#6187)
This shouldn't happen, but just in case for any reason the kernel
doesn't answer, it's probably much better to continue.

Follow-up for fb8939274a.
2017-06-26 09:33:37 +02:00
Zbigniew Jędrzejewski-Szmek 7e867138f5 Merge pull request #5600 from fbuihuu/make-logind-restartable
Make logind restartable.
2017-06-24 18:58:36 -04:00
Evgeny Vereshchagin a4bdea5e31 Merge pull request #6185 from keszybz/env-serialization
Environment serialization fixes
2017-06-24 19:37:45 +03:00
Zbigniew Jędrzejewski-Szmek d233c99ac8 manager: just warn about an invalid environment entry
Apart from bugs (as in #6152), this can happen if we ever make
our requirements for environment entries more stringent. As with
the rest of deserialization, we should just warn and continue.
2017-06-23 20:46:33 -04:00
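
The "warn and continue" approach to deserialization might be sketched like this. The function name and the validity check (a non-empty key followed by "=") are assumptions for the example, not the manager's actual rules.

```python
import logging
from typing import Dict, Iterable

log = logging.getLogger(__name__)


def deserialize_environment(lines: Iterable[str]) -> Dict[str, str]:
    """Rebuild an environment from KEY=VALUE lines, skipping invalid ones."""
    env: Dict[str, str] = {}
    for line in lines:
        key, sep, value = line.partition("=")
        if not sep or not key:
            # Warn and carry on instead of failing the whole deserialization.
            log.warning("Ignoring invalid environment entry: %r", line)
            continue
        env[key] = value
    return env
```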
Zbigniew Jędrzejewski-Szmek 62c460c6e2 manager: raise level of notices about failed deserialization
This is something that shouldn't happen. If it does, we want to
know about it.
2017-06-23 20:46:33 -04:00
Torstein Husebø 3561eafa92 core: fix typo (#6183) 2017-06-23 18:53:29 -04:00
Lennart Poettering fb8939274a loopback-setup: also add IP addresses to loopback devices
This changes loopback setup to not only start the loopback device but
also add the relevant IP addresses to it. This way, we can synchronously
wait until that's complete, and properly guarantee that loopback setup
is complete at the time we start our first processes.

This is a semi-revert of f3fc48150b, but
heavily updated.

Fixes: #5641
2017-06-23 12:07:03 +02:00
Zbigniew Jędrzejewski-Szmek 9fe4a3d98e Merge pull request #6176 from poettering/timer-boottime-monotonic
WakeSystem= and other timer fixes.
2017-06-22 21:34:39 -04:00
Zbigniew Jędrzejewski-Szmek 53ec2a3226 Merge pull request #6164 from poettering/bus-non-existant-object
sd-bus: don't treat requests for non-existent objects as connection errors
2017-06-22 20:56:59 -04:00
Lennart Poettering 00c83b4300 core: return a friendlier error for a dbus path referring to a non-existent unit
See: #6059
2017-06-22 20:54:54 -04:00
Michal Sekletar 05e33aa1d5 core: unset sysfs path after transition to dead state (#6174)
The device is gone and most likely it will get garbage collected. However, in
cases when it doesn't get gc'ed (because it is referenced by some
other unit, e.g. a mount from fstab) we need to unset the sysfs path. This is
because when the device appears next time, possibly with a different sysfs
path, we need to update the sysfs path. The current code could end up caching
a stale sysfs path forever.

In reality this is not a problem for normal disks (unless you swap them
during system runtime). However this issue causes failures to mount
filesystems on LVM where sysfs path depends on activation
order (i.e. logical volumes from volume group that is activated first
get assigned lower dm-X numbers and corresponding syspaths).

Fixes #6126.
2017-06-22 20:29:15 -04:00
Michal Koutný b007626897 core: dbus: Interpret released names properly (#6175)
When a DBus name is released, NameOwnerChanged signal contains an empty string
as new_owner. Commit bbc2908 changed interpretation of the empty string to a
valid name, which is not consistent with values that are sent by dbus-daemon.

As a side effect, this masks symptoms of systemd-logind dbus disconnections
(#2925) by completely restarting it so it can freshly reconnect to dbus.
2017-06-22 20:26:04 -04:00
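
A small sketch of the interpretation this commit restores: an empty owner string in NameOwnerChanged means "no owner", not a valid name. The helper names are hypothetical.

```python
from typing import Optional


def owner_or_none(owner: str) -> Optional[str]:
    """Map an owner argument of NameOwnerChanged to an owner or None.

    An empty string means that the name has no owner on that side.
    """
    return owner if owner else None


def describe_change(name: str, old_owner: str, new_owner: str) -> str:
    """Summarize a NameOwnerChanged signal in one line."""
    old, new = owner_or_none(old_owner), owner_or_none(new_owner)
    if new is None:
        return f"{name} released"
    if old is None:
        return f"{name} acquired by {new}"
    return f"{name} moved from {old} to {new}"
```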
Lennart Poettering c54be90b28 timer: make sure we use the right monotonic timestamp
This reworks timer_enter_waiting() in a couple of ways in order to clean
it up a bit and fix #5629.

Most importantly, we previously initialized ts_monotonic to either
the current time in CLOCK_MONOTONIC or in CLOCK_BOOTTIME, depending on
t->wake_system. Then given specific conditions we'd use this time as
base for our timers. And afterwards, if t->wake_system was on we'd
convert the resulting value from CLOCK_MONOTONIC to CLOCK_BOOTTIME again
— which of course is wrong since we already were in CLOCK_BOOTTIME! This
fixes this logic, by using a triple timestamp so that we always have the
right base around, and initially only calculate in CLOCK_MONOTONIC and
only convert as last step.

Conversion between the clocks is now done with the generic
usec_shift_clock(), and additions via usec_add() making these
calculations a bit safer.

Fixes: #5629
2017-06-22 21:04:20 +02:00
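
The idea of calculating in one clock and converting only as the last step can be sketched as follows. The two function names come from the message, but their bodies here are assumptions written for illustration (saturating addition, and shifting by the difference between the current readings of the two clocks), not the actual implementation.

```python
USEC_INFINITY = 2**64 - 1  # assumed sentinel meaning "never"


def usec_add(a: int, b: int) -> int:
    """Add two microsecond values, saturating at USEC_INFINITY."""
    if a == USEC_INFINITY or b == USEC_INFINITY:
        return USEC_INFINITY
    return min(a + b, USEC_INFINITY)


def usec_shift_clock(x: int, now_from: int, now_to: int) -> int:
    """Translate timestamp x between clocks, given "now" on both clocks."""
    if x == USEC_INFINITY:
        return USEC_INFINITY
    if x >= now_from:
        # x lies in the future: keep the same distance from "now".
        return usec_add(now_to, x - now_from)
    # x lies in the past: clamp at zero instead of going negative.
    return max(now_to - (now_from - x), 0)
```

A next elapse time would then be computed entirely against the monotonic reading and passed through usec_shift_clock() once, only if the timer needs the other clock.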
Lennart Poettering 79fc8b9623 timer: convert property_get_next_elapse_monotonic() to use usec_shift_clock()
Let's use the generic clock shifting logic here.
2017-06-22 20:59:48 +02:00
Lennart Poettering 6c223c6719 Merge pull request #6113 from keszybz/shell-quoting
Use "dollar-single-quotes" to escape shell-sensitive strings
2017-06-20 20:17:03 +02:00
Zbigniew Jędrzejewski-Szmek 804ee07c13 Use "dollar-single-quotes" to escape shell-sensitive strings
Also called "ANSI-C Quoting" in info:(bash) ANSI-C Quoting.

The escaping rules are a POSIX proposal, and are described in
http://austingroupbugs.net/view.php?id=249. There's a lot of back-and-forth on
the details of escaping of control characters, but we'll be only using a small
subset of the syntax that is common to all proposals and is widely supported.
Unfortunately dash and fish and maybe some other shells do not support it (see
the man page patch for a list).

This allows environment variables to be safely exported using show-environment
and imported into the shell. Shells which do not support this syntax will have
to do something like
    export $(systemctl show-environment|grep -v '=\$')
or whatever is appropriate in their case. I think csh and fish do not support
the A=B syntax anyway, so the change is moot for them.

Fixes #5536.

v2:
- also escape newlines (which are currently disallowed in shell values, so this
  doesn't really matter), and tabs (as $'\t'), and ! (as $'!'). This way quoted
  output can be included directly in both interactive and noninteractive bash.
2017-06-19 19:39:43 -04:00
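
A rough sketch of "dollar-single-quotes" escaping follows. The set of characters treated as shell-sensitive and the function name are assumptions for this example; the rules actually used by the patch may differ in detail.

```python
# Hypothetical set of characters that trigger quoting in this sketch.
SHELL_SENSITIVE = set(" \t\n\"'\\`$*?[]<>|&;!(){}")


def shell_maybe_quote(value: str) -> str:
    """Return value unchanged if it is shell-safe, else wrap it in $'...'."""
    if not any(ch in SHELL_SENSITIVE for ch in value):
        return value
    out = ["$'"]
    for ch in value:
        if ch == "\n":
            out.append("\\n")
        elif ch == "\t":
            out.append("\\t")
        elif ch in ("'", "\\"):
            # Single quotes and backslashes need a backslash inside $'...'.
            out.append("\\" + ch)
        else:
            out.append(ch)
    out.append("'")
    return "".join(out)
```

Values without sensitive characters pass through untouched, so simple A=B lines stay readable.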
Michal Koutný 171f12ced8 job: Ensure JobRunningTimeoutSec= survives serialization (#6128)
This is a fixup of commit a2df3ea4ae.
When there is a running job with JobRunningTimeoutSec= and systemd serializes
its state (e.g. during daemon-reload), the timer event source won't be properly
restored in job_coldplug().
Thus save and serialize begin_running_usec too and reinitialize the timer based
on that value.
2017-06-18 11:51:17 -04:00
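
Restoring the timer from the serialized begin_running_usec could look roughly like this. The function name and parameters are hypothetical; the sketch only shows why the start time has to survive serialization.

```python
def remaining_running_timeout(
    begin_running_usec: int, timeout_usec: int, now_usec: int
) -> int:
    """Return how many microseconds are left before the job times out.

    The deadline is anchored to when the job started running, so a reload
    in the middle of the job does not restart the countdown.
    """
    deadline = begin_running_usec + timeout_usec
    return max(deadline - now_usec, 0)
```

Without the saved start time, the only options after a reload would be to drop the timeout or to restart it from zero.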