Commit graph

5539 commits

Author SHA1 Message Date
Lennart Poettering 04d1ee0f7e main: bump RLIMIT_MEMLOCK by physical RAM size
Let's allow more memory to be locked on beefy machines than on small
ones. The previous limit of 64M is the lower bound still. This
effectively means on a 4GB machine we can lock 512M, which should be
more than enough, but still not lock up the machine entirely under
pressure.

Fixes: #15053
2020-04-28 19:54:21 +02:00
Lennart Poettering 035e3cb9f9
Merge pull request #15563 from keszybz/wait-callback-tweak
Add NULL callback check in one more place
2020-04-23 19:50:36 +02:00
Lennart Poettering 0f4a141744
Merge pull request #15504 from poettering/cmsg-find-pure
just the recvmsg_safe() stuff from #15457
2020-04-23 17:28:19 +02:00
Lennart Poettering a9ab5cdb50
Merge pull request #15472 from keszybz/dbus-api-docs
A few more dbus api documentation updates
2020-04-23 17:01:11 +02:00
Zbigniew Jędrzejewski-Szmek 8f3e342fa9 core: fix unused variable warning when !HAVE_SECCOMP 2020-04-23 14:42:09 +02:00
Lennart Poettering 3691bcf3c5 tree-wide: use recvmsg_safe() at various places
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.

This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.

(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
2020-04-23 09:41:47 +02:00
Lennart Poettering 0cd41757d0 sd-bus: work around ubsan warning
ubsan complains that we add an offset to a NULL ptr here in some cases.
Which isn't really a bug though, since we only use it as the end
condition for a for loop, but we can still fix it...

Fixes: #15522
2020-04-23 08:54:30 +02:00
Zbigniew Jędrzejewski-Szmek 9786b27cbb
Merge pull request #15546 from poettering/pid1-serialize-comand-fix
core: make sure to restore the control command id, too
2020-04-23 08:51:58 +02:00
Zbigniew Jędrzejewski-Szmek 1943d50e4e
Merge pull request #15507 from poettering/bus-log-api
add generic bus interface for setting log level that can be implemented by any daemon
2020-04-22 23:30:09 +02:00
Lennart Poettering e8cf09b2a2 core: make sure we don't get confused when setting TERM for a tty fd
Fixes: #15344
2020-04-22 22:59:41 +02:00
Lennart Poettering e9da62b18a core: make sure to restore the control command id, too
Fixes: #15356
2020-04-22 20:34:02 +02:00
Lennart Poettering 5b99bd5fd4 core: some minor clean-ups/modernizations 2020-04-22 20:33:57 +02:00
Zbigniew Jędrzejewski-Szmek eecf4f7e04
Merge pull request #15526 from bluca/start_stop_wait_debug
Add debug log when a job in the activation queue is not runnable
2020-04-22 14:24:07 +02:00
Luca Boccassi c03fbd37d6 core: add debug log when a job in the activation queue is not runnable
When a job is skipped due its dependencies not being ready, log
a debug message saying what is holding it back.
This was very useful with transient units timing out to figure
out where the problem was.
2020-04-22 09:58:12 +01:00
Lennart Poettering df3d3bdfe8 core: minor error code handling fixes 2020-04-22 08:56:05 +02:00
Frantisek Sumsal 86b52a3958 tree-wide: fix spelling errors
Based on a report from Fossies.org using Codespell.

Followup to #15436
2020-04-21 23:21:08 +02:00
Luca Boccassi c5bc2c01ee core: add log_get_max_level check optimization in log_unit_full
Just as log_full already does, check if the log level would result in
logging immediately in the macro in order to avoid doing
unnecessary work that adds up in hot spots.
2020-04-21 18:05:24 +01:00
Lennart Poettering 893f801d67 core: implement generic log control API in PID1 too
It has slightly different setters in place, so it needs some special
love, which is easy enough though.
2020-04-21 17:08:23 +02:00
Lennart Poettering 25141692e9 core: use generic implementations of log level/target bus propertier getters
The setters are slightly different, hence keep them as they are for now.
2020-04-21 17:08:23 +02:00
Zbigniew Jędrzejewski-Szmek 38b38500c6 tree-wide: use "hostname" spelling everywhere
It's not that I think that "hostname" is vastly superior to "host name". Quite
the opposite — the difference is small, and in some context the two-word version
does fit better. But in the tree, there are ~200 occurrences of the first, and
>1600 of the other, and consistent spelling is more important than any particular
spelling choice.
2020-04-21 16:58:04 +02:00
Lennart Poettering 33b58dfb41 core: automatically add udev dependency for units using RootImage=
We use udev to wait for /dev/loopX devices to be fully proped hence we
need an implicit ordering dependency on it, for RootImage= to work
reliably in early boot, too.

Fixes: #14972
2020-04-21 16:31:06 +02:00
Lennart Poettering 09f8722801
Merge pull request #15396 from keszybz/dbus-api-docs
D-bus API docs
2020-04-17 23:40:50 +02:00
Lennart Poettering 4444e8533f
Merge pull request #15454 from keszybz/codespell-fixes
Spelling fixes based on codespell suggestions
2020-04-17 15:53:01 +02:00
Zbigniew Jędrzejewski-Szmek dad97f0425 manager: add dbus parameter names 2020-04-16 19:46:40 +02:00
Zbigniew Jędrzejewski-Szmek 162392b75a tree-wide: spellcheck using codespell
Fixes #15436.
2020-04-16 18:00:40 +02:00
Alin Popa c5f8a179a2 watchdog: reduce watchdog pings in timeout interval
The watchdog ping is performed for every iteration of manager event
loop. This results in a lot of ioctls on watchdog device driver
especially during boot or if services are aggressively using sd_notify.
Depending on the watchdog device driver this may have performance
impact on embedded systems.
The patch skips sending the watchdog to device driver if the ping is
requested before half of the watchdog timeout.
2020-04-16 16:32:05 +02:00
Lennart Poettering daf8f72b4e core: make sure ProtectHostname= is handled gracefully in containers lacking seccomp
Fixes: #15408
2020-04-13 17:32:27 +02:00
Zbigniew Jędrzejewski-Szmek 38cd55b007 Remove unneded {}s
$ perl -i -0pe 's|\s+{\n([^\n]*;)\n\s+}\n|\n\1\n|gms' **/*.c

Inspired by ea7cbf5bdd.
2020-04-13 09:31:49 +02:00
Zbigniew Jędrzejewski-Szmek a1a11d5610
Merge pull request #15365 from poettering/remount-fs-pstore-fix
pid1: automatically generate systemd-remount-fs.service deps, plus enable systemd-pstore from sysinit.target
2020-04-12 17:16:07 +02:00
Lennart Poettering 9b3c65ed36
Merge pull request #15352 from poettering/user-group-name-valdity-rework
user/group name validity rework
2020-04-09 18:49:22 +02:00
Lennart Poettering 611cb82612
Merge pull request #15318 from fbuihuu/inherit-umask-for-user-units
pid1: by default make user units inherit their umask from the user ma…
2020-04-09 17:15:55 +02:00
Lennart Poettering f58921bde3
Merge pull request #15332 from keszybz/coredump-filter
CoredumpFilter=
2020-04-09 17:15:26 +02:00
Franck Bui 5a7c4f4f3b mount: introduce mount_is_nofail() helper 2020-04-09 15:26:21 +02:00
Franck Bui 2ec15c4f8a mount: default startup dependencies and default network ones are orthogonal
Regardless of whether a mount is setup in initrd or int the main system,
the network default dependencies _netdev should still be honored.

IOW if a mount unit use the following options "x-initrd.mount,_netdev", it
should be ordered against initrd-fs.target, network.target,
network-online.target.

/dev/vdb1 /mnt ext4 x-initrd.mount,_netdev defaults 0 0

Before this patch:

Before=umount.target initrd-fs.target
After=system.slice sysroot.mount dev-vdb1.device -.mount systemd-journald.socket blockdev@dev-vdb1.target

After this patch:

Before=initrd-fs.target umount.target
After=network-online.target -.mount blockdev@dev-vdb1.target dev-vdb1.device sysroot.mount system.slice network.target systemd-journald.socket
2020-04-09 15:26:14 +02:00
Franck Bui 61154cf953 mount: introduce mount_add_default_ordering_dependencies()
Move the handling of the usual startup/shutdown dependencies in a dedicated
funtion.

No functional change.
2020-04-09 15:25:14 +02:00
Franck Bui b3d7aef525 automount: fix handling of default dependencies for automount units
First After=local-fs-pre.target wasn't described in the man page although it's
part of the default dependencies automatically set by pid1.

Secondly, Before=local-fs.target was only set if the automount unit was
generated from the fstab-generator because the dep was explicitly
generated. It was also not documented as a default dependency.

Fix it by managing the dep from pid1 instead.
2020-04-09 15:17:17 +02:00
Franck Bui 83cdc87094 mount: let pid1 alone handle the default dependencies for mount units
fstab-generator was also handling the default ordering dependencies for mount
units setup in initrd. To do that it was turning the defaults dependencies off
completely and ordered the mount unit against either local-fs.target or
initrd-fs.target or initrd-root-fs.target itself.

But it had the bad side effect to also remove all other default dependencies as
well. Thus if an initrd mount was using _netdev, the network dependencies were
missing.

In general fstab-generator shouldn't use DefaultDependecies=no because it can
handle only a small set of the default dependencies the rest are dealt by pid1.

So this patch makes pid1 handle all default dependencies.
2020-04-09 15:17:09 +02:00
Franck Bui 5e37d1930b pid1: by default make user units inherit their umask from the user manager
This patch changes the way user managers set the default umask for the units it
manages.

Indeed one can expect that if user manager's umask is redefined through PAM
(via /etc/login.defs or pam_umask), all its children including the units it
spawns have their umask set to the new value.

Hence make user units inherit their umask value from their parent instead of
the hard coded value 0022 but allow them to override this value via their unit
file.

Note that reexecuting managers with 'systemctl daemon-reexec' after changing
UMask= has no effect. To take effect managers need to be restarted with
'systemct restart' instead. This behavior was already present before this
patch.

Fixes #6077.
2020-04-09 14:17:07 +02:00
Zbigniew Jędrzejewski-Szmek b94fb74bae manager: also log at debug level failure to read oom_score_adj 2020-04-09 14:08:48 +02:00
Zbigniew Jędrzejewski-Szmek ad21e542b2 manager: add CoredumpFilter= setting
Fixes #6685.
2020-04-09 14:08:48 +02:00
Wen Yang f74349d88b mount-setup: change the system mount propagation to shared by default only at bootup
The commit b3ac5f8cb9 has changed the system mount propagation to
shared by default, and according to the following patch:
https://github.com/opencontainers/runc/pull/208
When starting the container, the pouch daemon will call runc to execute
make-private.

However, if the systemctl daemon-reexec is executed after the container
has been started, the system mount propagation will be changed to share
again by default, and the make-private operation above will have no chance
to execute.
2020-04-09 10:14:20 +02:00
Lennart Poettering 7a8867abfa user-util: rework how we validate user names
This reworks the user validation infrastructure. There are now two
modes. In regular mode we are strict and test against a strict set of
valid chars. And in "relaxed" mode we just filter out some really
obvious, dangerous stuff. i.e. strict is whitelisting what is OK, but
"relaxed" is blacklisting what is really not OK.

The idea is that we use strict mode whenver we allocate a new user
(i.e. in sysusers.d or homed), while "relaxed" mode is when we process
users registered elsewhere, (i.e. userdb, logind, …)

The requirements on user name validity vary wildly. SSSD thinks its fine
to embedd "@" for example, while the suggested NAME_REGEX field on
Debian does not even allow uppercase chars…

This effectively liberaralizes a lot what we expect from usernames.

The code that warns about questionnable user names is now optional and
only used at places such as unit file parsing, so that it doesn't show
up on every userdb query, but only when processing configuration files
that know better.

Fixes: #15149 #15090
2020-04-08 17:11:20 +02:00
Lennart Poettering f3b7a79b97 core: automatically add dependency on systemd-remount-fs.service if StateDirectory= is used
And similar for other settings that require a writable /var/.

Rationale: if these options are used for early-boot services (such as
systemd-pstore.service) we need /var/ writable. And if /var/ is on the
root fs, then systemd-remount-fs.service is the service that ensures
that /var/ is writable.

This allows us to remove explicit deps in services such as
systemd-pstore.service.
2020-04-08 16:29:25 +02:00
root f9d29f6d06 fix manager_state 2020-04-07 15:27:50 +02:00
Topi Miettinen b8b7b838fd
dbus-execute: show also ProtectClock
Show also `ProtectClock=` setting. This also lets `systemd-analyze security`
get the correct state for it.
2020-04-03 16:32:10 +03:00
Franck Bui 457d65932b mount: mount unit activated by automount unit should be only ordered against the automount unit
Both fstab-generator and pid1 are duplicating the handling of
"Before=local-fs.target" dependency for mount units.

fstab-generator is correctly skipping this dep if the mount unit is activated
by an automount unit.

However the condition used by pid1 was incorrect and missed the case when a mount
unit uses "x-systemd.automount" since in this case the mount unit should be
(only) ordered against its automount unit counterpart instead.
2020-04-02 10:56:30 +02:00
Franck Bui bc9e5a4c67 fstab-util: introduce fstab_is_extrinsic() 2020-04-01 10:36:28 +02:00
Franck Bui b862c25716 device: drop refuse_after
Scheduling devices after a given unit can be useful to start device *jobs* at a
specific time in the transaction, see commit 4195077ab4.

This (hidden) change was introduced by commit eef85c4a3f.
2020-04-01 10:35:14 +02:00
Vito Caputo b46c3e4913 *: use _cleanup_close_ with fdopen() where trivial
Also convert these to use take_fdopen().
2020-03-31 06:48:03 -07:00
Lennart Poettering 01262d0d9e
Merge pull request #13084 from ddstreet/log_time
log: add support for prefixing console log messages with current timestamp
2020-03-30 17:57:34 +02:00
Zbigniew Jędrzejewski-Szmek 4a874560c5
Merge pull request #15217 from keszybz/beef-up-sd-path
Export sd-path functions and beef up systemd-path to show more items
2020-03-29 22:57:53 +02:00
Zbigniew Jędrzejewski-Szmek 3e118d135b
Merge pull request #15233 from msekletar/device-dbus
device: make sure we don't generate bogus PropertiesChanged signals
2020-03-29 20:52:31 +02:00
Michal Sekletár 2e129d5d6b device: don't emit PropetiesChanged needlessly
Functions called from device_setup_unit() already make sure that unit is
enqueued in case it is a new unit or properties exported on the bus have
changed.

This should prevent unnecessary DBus wakeups and associated DBus traffic
when device_setup_unit() was called while reparsing /proc/self/mountinfo
due to the mountinfo notifications. Note that we parse
/proc/self/mountinfo quite often on the busy systems (e.g. k8s container
hosts) but majority of the time mounts didn't change, only some mount
got added. Thus we don't need to generate PropertiesChanged for devices
associated with the mounts that didn't change.

Thanks to Renaud Métrich <rmetrich@redhat.com> for debugging the
problem and providing draft version of the patch.
2020-03-28 21:26:40 +01:00
Michal Sekletár 7c4d139485 device: make sure we emit PropertiesChanged signal once we set sysfs 2020-03-28 21:26:40 +01:00
Christian Göttsche 1b7d1cae0f selinux: print enforcing state in access check debug message 2020-03-27 20:54:58 +01:00
Zbigniew Jędrzejewski-Szmek 385093b702 Split out generator directory setup to a src/core/generator-setup.c
Those functions have only one non-test user, so we can move them to src/core/.
2020-03-27 20:12:44 +01:00
Zbigniew Jędrzejewski-Szmek 51327bcc74 sd-path: rename the two functions
I think the two names were both pretty bad. They did not give a proper hint
what the difference between the two functions is, and sd_path_home sounds like
it is somehow related to /home or home directories or whatever, when in fact
both functions return the same set of paths as either a colon-delimited string
or a strv. "_strv" suffix is used by various functions in sd-bus, so let's
reuse that.

Those functions are not public yet, so let's rename.
2020-03-27 20:12:44 +01:00
Zbigniew Jędrzejewski-Szmek 4908de44b0 {systemd,udev}.pc: add names with underscores, deprecate the old ones
The names with multiple lowercase words run together are hard to read.  We
started that way with very short names like rootprefix, but then same pattern
was applied to longer and longer names. Looking at the body of .pc files
available on my machine, many packages use underscores; let's do the same. Old
names are kept for compatiblity, so this is backwards compatible.
2020-03-27 20:12:44 +01:00
Wen Yang acd1987a18 core/unit: print info when unit_add_name failed
When there are hundreds of mounts on the server, it will take a long
time to analyze the failure of a certain mount unit. So it is useful
to print the reason why unit_add_name() failed.
2020-03-27 18:46:13 +01:00
Anita Zhang 613328c3e2 cgroup-util: helper to cg_get_attribute and convert to uint64_t
A common pattern in the codebase is reading a cgroup memory value
and converting it to a uint64_t. Let's make it a helper and refactor a
few places to use it so it's more concise.
2020-03-24 16:05:16 -07:00
Vito Caputo d4a3494e65 swap: check p->what for NULL
Commit 61f9cf4e4c introduced swap_get_parameters(s) but only
checked its return for NULL and not its ->what.

Fixes https://github.com/systemd/systemd/issues/15070
2020-03-23 08:30:32 +01:00
Christian Göttsche 194fe32296 selinux: delay mac_selinux_enforcing call after SELinux was determined to be enabled
Calling `mac_selinux_enforcing()`, which calls `security_getenforce()`, on a SELinux disabled system causes the following error message to be printed:
    Failed to get SELinux enforced status: No such file or directory

Fixes: 257188f80c ("selinux: cache enforced status and treat retrieve failure as enforced mode")
Supersedes: #15145
2020-03-19 08:56:30 +01:00
Zbigniew Jędrzejewski-Szmek 8c357762c7
Merge pull request #14749 from msekletar/cpu-aff-numa-v3
core: add support for setting CPUAffinity= to special "numa" value
2020-03-18 13:35:58 +01:00
Michal Sekletár e2b2fb7f56 core: add support for setting CPUAffinity= to special "numa" value
systemd will automatically derive CPU affinity mask from NUMA node
mask.

Fixes #13248
2020-03-16 08:57:28 +01:00
Michal Sekletár 1808f76870 shared: split out NUMA code from cpu-set-util.c to numa-util.c 2020-03-16 08:23:18 +01:00
Zbigniew Jędrzejewski-Szmek cd57038a30 core: consistently emit oom warning when parsing args
In practice we are very unlikely to fail at this point, but for
consistency, we should always warn when allocation fails, and
we have free_and_strdup_warn() for this.
2020-03-13 15:54:27 +01:00
Anita Zhang 557b9716f6
Merge pull request #15097 from keszybz/job-logging-and-news-update
Job logging and news update
2020-03-12 13:01:00 -07:00
Zbigniew Jędrzejewski-Szmek c86c31d9ca core: log an info line about the default target
If the default target is not specified on the kernel command line, it
may not be obvious what we are booting into. See for example
https://github.com/systemd/systemd/issues/15062#issuecomment-597577464.
2020-03-12 15:09:22 +01:00
Topi Miettinen c3151977d7 namespace: fix MAC labels of /dev when PrivateDevices=yes
Without changing the SELinux label for private /dev of a service, it will take
a generic file system label:
system_u:object_r:tmpfs_t:s0

After this change it is the same as without `PrivateDevices=yes`:
system_u:object_r:device_t:s0

This helps writing SELinux policies, as the same rules for `/dev` will apply
despite any `PrivateDevices=yes` setting.
2020-03-12 08:23:27 +00:00
Zbigniew Jędrzejewski-Szmek f1d075dcb7 core: shorten local variable names in do_queue_default_job()
There is really no risk of confusion here: it's a simple function
that queues one job.
2020-03-11 13:43:19 +01:00
Anita Zhang c1566ef0d2 core: transition to FINAL_SIGTERM state after ExecStopPost=
Fixes #14566
2020-03-11 10:15:33 +01:00
Topi Miettinen efa2f3a18b
execute: don't create /tmp and /var/tmp if both are inaccessible
If both /tmp and either /var/tmp or whole /var are inaccessible, there's no
need to create the temporary directories.
2020-03-10 16:51:29 +02:00
Topi Miettinen de46b2be07
namespace: ignore prefix chars when comparing paths
Other callers of path_strv_contains() or PATH_IN_SET() don't seem to handle
paths prefixed with -+.
2020-03-10 16:48:34 +02:00
Topi Miettinen 3b5b6826aa mount-setup: make /dev noexec
/dev used to be mounted with "exec" flag due to /dev/MAKEDEV script but that's
history and it's now located in /sbin. mmap() with file descriptor to
"/dev/zero" (instead of modern mmap(,,,MAP_ANON...))  will still work.
2020-03-09 19:08:42 +01:00
Zbigniew Jędrzejewski-Szmek 908055f61f
Merge pull request #15033 from yuwata/state-directory-migrate-issue
execute: Fix migration from DynamicUser=yes to no
2020-03-09 17:34:55 +01:00
Chris Down 0d14eefb0d
Merge pull request #14893 from benzea/benzea/generator-path-environment-variables
Generator path environment variables
2020-03-06 16:15:08 +00:00
Chris Down dfb3303b6c
Merge pull request #15010 from cgzones/selinux_reload_cache_enforce
SELinux: add trigger for policy reload and cache enforced status
2020-03-06 16:12:48 +00:00
Benjamin Berg b7cf4b4ef5 core: Fix resolution of nested DM devices for cgroups
When using the cgroups IO controller, the device that is controlled
should always be the toplevel block device. This did not get resolved
correctly for an LVM volume inside a LUKS device, because the code would
only resolve one level of indirection.

Fix this by recursively looking up the originating block device for DM
devices.

Resolves: #15008
2020-03-06 16:11:44 +00:00
Yu Watanabe 5197b41fac
Merge pull request #15022 from keszybz/make-units-remain-after-exit
Make units remain after exit
2020-03-06 21:14:36 +09:00
Yu Watanabe 578dc69f2a execute: Fix migration from DynamicUser=yes to no
Closes #12131.
2020-03-06 21:02:26 +09:00
Zbigniew Jędrzejewski-Szmek eda0cbf071
Use Finished instead of Started for Type=oneshot services (#14851)
UnitStatusMessageFormats.finished_job, if present,
will be called with the same arguments as
job_get_done_status_message_format() to provide a format string
appropriate for the context

This commit replaces "Started" with "Finished" for started oneshot
units, as mentioned in the referenced issue

Closes #2458.
2020-03-05 17:24:19 +01:00
Zbigniew Jędrzejewski-Szmek 44e5d00603 pid1: remove unnecessary terminator
We specify the number of items as the first argument already.
2020-03-05 08:13:49 +01:00
Yu Watanabe df883de98a pid1, nspawn: voidify loopback_setup() 2020-03-04 14:18:55 +01:00
Benjamin Berg cccf570355 core: Move environment generator path lookup into path-lookup.c 2020-03-04 11:24:33 +01:00
Yu Watanabe dd0395b565 make namespace_flags_to_string() not return empty string
This improves the following debug log.

Before:
systemd[1162]: Restricting namespace to: .

After:
systemd[1162]: Restricting namespace to: n/a.
2020-03-03 21:17:38 +01:00
Christian Göttsche 257188f80c selinux: cache enforced status and treat retrieve failure as enforced mode 2020-03-03 20:30:59 +01:00
Christian Göttsche 61f3e897f1 selinux: add trigger for policy reload to refresh internal selabel cache
Reload the internal selabel cache automatically on SELinux policy reloads so non pid-1 daemons are participating.

Run the reload function `mac_selinux_reload()` not manually on daemon-reload, but rather pass it as callback to libselinux.
Trigger the callback prior usage of the systemd internal selabel cache by depleting the selinux netlink socket via `avc_netlink_check_nb()`.

Improves: a9dfac21ec ("core: reload SELinux label cache on daemon-reload")
Improves: #13363
2020-03-03 20:25:49 +01:00
Zbigniew Jędrzejewski-Szmek 105a1a36cd tree-wide: fix spelling of lookup and setup verbs
"set up" and "look up" are the verbs, "setup" and "lookup" are the nouns.
2020-03-03 15:02:53 +01:00
Zbigniew Jędrzejewski-Szmek 52c222db11
Merge pull request #14992 from keszybz/syslog-address-length-fix
Syslog address length fix
2020-03-02 21:31:24 +01:00
Zbigniew Jędrzejewski-Szmek 86fca584c3 core/execute: use return value from sockaddr_un_set_path(), remove duplicate check 2020-03-02 15:56:30 +01:00
Zbigniew Jędrzejewski-Szmek f36a9d5909 tree-wide: use the return value from sockaddr_un_set_path()
It fully initializes the address structure, so no need for pre-initialization,
and also returns the length of the address, so no need to recalculate using
SOCKADDR_UN_LEN().

socklen_t is unsigned, so let's not use an int for it. (It doesn't matter, but
seems cleaner and more portable to not assume anything about the type.)
2020-03-02 15:55:44 +01:00
Zbigniew Jędrzejewski-Szmek df46dbca92
Merge pull request #14976 from keszybz/show-status-message
core: fix message about show status state
2020-03-01 23:37:41 +01:00
Zbigniew Jędrzejewski-Szmek 0d066dd1a4 pid1: add new mode systemd.show-status=error and use it when 'quiet' is passed
systemd.show-status=error is useful for the case where people care about errors
only.

If people want to have a quiet boot, they most likely don't want to see all
status output even if there is a delay in boot, so make "quiet" imply
systemd.show-status=error instead of systemd.show-status=auto.

Fixes #14976.
2020-03-01 11:48:23 +01:00
Zbigniew Jędrzejewski-Szmek 5bcf34ebf3 pid1: when showing error status, do not switch to status=temporary
We would flip to status=temporary mode on the first error, and then switch back
to status=auto after the initial transaction was done. This isn't very useful,
because usually all the messages about successfully started units and not
related to the original failure. In fact, all those messages most likely cause
the information about the prime error to scroll off screen. And if the user
requested quiet boot, there's no reason to think that they care about those
success messages.

Also, when logging about dependency cycles, treat this similarly to a unit
error and show the message even if the status is "soft disabled" (before we
wouldn't show it in that case).
2020-03-01 11:42:42 +01:00
Zbigniew Jędrzejewski-Szmek 1b4154a891 pid1: make cylon timeout significantly bigger when not showing any messages
When we are booting with show-status=on, normally new status updates happen a
few times per second. Thus, it is reasonable to start showing the cylon eye
after 5 s, because that means a significant delay has happened. When we are
running with show-status=off or show-status=auto (and no error had occured),
the user is expecting maybe 15 to 90 seconds with no output (because that's
usually how long the whole boot takes). So we shouldn't bother the user with
information about a few seconds of delay. Let's make the timeout 25s if we are
not showing any messages.

Conversly, when we are outputting status messages, we can show the cylon eye
with a shorter delay, now that we removed the connection to enablement status.
Let's make this 2s, so users get feedback about delays more quickly.
2020-03-01 11:42:35 +01:00
Zbigniew Jędrzejewski-Szmek ef15d3e1ab pid1: touch the /run/systemd/show-status just once
We know if we created the file before, no need to repeat the operation. The
state in /run should always match our internal state. Since we call
manager_set_show_status() quite often internally, this saves quite a few
pointless syscalls.
2020-03-01 11:42:26 +01:00
Zbigniew Jędrzejewski-Szmek 7365a29670 pid1: when printing status message status, give reason 2020-03-01 11:42:19 +01:00
Zbigniew Jędrzejewski-Szmek 5ca02bfc39 core: fix message about show status state
We would say "Enabling" also for SHOW_STATUS_AUTO, which is actually
"soft off". So just print the exact state to make things easier to understand.
Also add a helper function to avoid repeating the enum value list.

For #14814.
2020-03-01 11:42:12 +01:00
Topi Miettinen aeac9dd647 Revert "namespace: fix MAC labels of /dev when PrivateDevices=yes"
This reverts commit e6e81ec0a5.
2020-02-29 23:35:43 +09:00
Zbigniew Jędrzejewski-Szmek ee00d1e95e pid1: do not fail if we get EPERM while setting up network name
In a user namespace container:
Feb 28 12:45:53 0b2420135953 systemd[1]: Starting Home Manager...
Feb 28 12:45:53 0b2420135953 systemd[21]: systemd-homed.service: Failed to set up network namespacing: Operation not permitted
Feb 28 12:45:53 0b2420135953 systemd[21]: systemd-homed.service: Failed at step NETWORK spawning /usr/lib/systemd/systemd-homed: Operation not permitted
Feb 28 12:45:53 0b2420135953 systemd[1]: systemd-homed.service: Main process exited, code=exited, status=225/NETWORK
Feb 28 12:45:53 0b2420135953 systemd[1]: systemd-homed.service: Failed with result 'exit-code'.
Feb 28 12:45:53 0b2420135953 systemd[1]: Failed to start Home Manager.

We should treat this similarly to the case where network namespace are not
supported at all.

https://bugzilla.redhat.com/show_bug.cgi?id=1807465
2020-02-29 19:33:19 +09:00
Nate Jones ecf63c9102 execute: Make '+' exec prefix ignore PrivateTmp=yes
The man pages state that the '+' prefix in Exec* directives should
ignore filesystem namespacing options such as PrivateTmp. Now it does.

This is very similar to #8842, just with PrivateTmp instead of
PrivateDevices.
2020-02-29 19:32:01 +09:00
Topi Miettinen e6e81ec0a5 namespace: fix MAC labels of /dev when PrivateDevices=yes
Without changing the SELinux label for private /dev of a service, it will take
a generic file system label:
system_u:object_r:tmpfs_t:s0

After this change it is the same as without `PrivateDevices=yes`:
system_u:object_r:device_t:s0

This helps writing SELinux policies, as the same rules for `/dev` will apply
despite any `PrivateDevices=yes` setting.
2020-02-28 14:17:48 +00:00
HATAYAMA Daisuke 9c1f969d40 swap: finish the secondary swap units' jobs if deactivation of the primary swap unit fails
Currently, if deactivation of the primary swap unit fails:

    # LANG=C systemctl --no-pager stop dev-mapper-fedora\\x2dswap.swap
    Job for dev-mapper-fedora\x2dswap.swap failed.
    See "systemctl status "dev-mapper-fedora\\x2dswap.swap"" and "journalctl -xe" for details.

then there are still the running stop jobs for all the secondary swap units
that follow the primary one:

    # systemctl list-jobs
     JOB UNIT                                                                                                         TYPE STATE
     3233 dev-disk-by\x2duuid-2dc8b9b1\x2da0a5\x2d44d8\x2d89c4\x2d6cdd26cd5ce0.swap                                    stop running
     3232 dev-dm\x2d1.swap                                                                                             stop running
     3231 dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dyuXWpCCIurGzz2nkGCVnUFSi7GH6E3ZcQjkKLnF0Fil0RJmhoLN8fcOnDybWCMTj.swap stop running
     3230 dev-disk-by\x2did-dm\x2dname\x2dfedora\x2dswap.swap                                                          stop running
     3234 dev-fedora-swap.swap                                                                                         stop running

    5 jobs listed.

This remains endlessly because their JobTimeoutUSec is infinity:

    # LANG=C systemctl show -p JobTimeoutUSec dev-fedora-swap.swap
    JobTimeoutUSec=infinity

If this issue happens during system shutdown, the system shutdown appears to
get hang and the system will be forcibly shutdown or rebooted 30 minutes later
by the following configuration:

    # grep -E "^JobTimeout" /usr/lib/systemd/system/reboot.target
    JobTimeoutSec=30min
    JobTimeoutAction=reboot-force

The scenario in the real world seems that there is some service unit with
KillMode=none, processes whose memory is being swapped out are not killed
during stop operation in the service unit and then swapoff command fails.

On the other hand, it works well in successful case of swapoff command because
the secondary jobs monitor /proc/swaps file and can detect deletion of the
corresponding swap file.

This commit fixes the issue by finishing the secondary swap units' jobs if
deactivation of the primary swap unit fails.

Fixes: #11577
2020-02-27 08:40:51 +01:00
Dan Streetman c5673ed0de log: add support for prefixing console log messages with current timestamp 2020-02-10 07:01:30 -05:00
Anita Zhang 72545ae057 core: sync SeccompParseFlags between dbus-execute and load-fragment
9e48626571 added some new syscalls to the
filter lists. However, on systems that do not yet support the new calls,
running systemd-run with the filter set results in error:

```
$ sudo systemd-run -t -r -p "SystemCallFilter=~@mount" /bin/true
Failed to start transient service unit: Invalid argument
```

Having the same properties in a unit file will start the service
without issue. This is because the load-fragment code will parse the
syscall filters in permissive mode:
https://github.com/systemd/systemd/blob/master/src/core/load-fragment.c#L2909
whereas the dbus-execute equivalent of the code does not.

Since the permissive mode appears to be the right setting to support
older kernels/libseccomp, this will update the dbus-execute parsing
to also be permissive.
2020-02-07 13:39:35 +09:00
Zbigniew Jędrzejewski-Szmek cff789b746 core/selinux-access: use _cleanup_ and improve logging
Instead of setting the bus error structure and then freeing it, let's only set
it if used. If we will ignore the selinux denial, say ", ignore" to make this
clear. Also, use _cleanup_ to avoid gotos.
2020-02-06 23:06:55 +01:00
Zbigniew Jędrzejewski-Szmek 22cd7aabec core/selinux-access: do not use NULL for %s
../src/core/selinux-access.c: In function ‘mac_selinux_generic_access_check’:
../src/basic/log.h:223:27: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
../src/core/selinux-access.c:235:85: note: format string is defined here
  235 |                         log_warning_errno(errno, "SELinux getcon_raw failed (tclass=%s perm=%s): %m", tclass, permission);
      |                                                                                     ^~

I wonder why nobody ever noticed this.

Fixes #14691 (other issues listed in that ticket have already been fixed).
2020-02-06 22:27:12 +01:00
Yu Watanabe 2b6a90d17f selinux: update log message to suppress warning by coverity
Fixes CID#1417440 and CID#1417438.
2020-02-06 16:04:50 +01:00
Zbigniew Jędrzejewski-Szmek 5650ec7a11
Merge pull request #14156 from fbuihuu/deal-with-aliases-when-disabling
Consider aliases in /usr when disabling units
2020-02-06 10:46:21 +01:00
Christian Göttsche 1acf344dfa core: do not prepare a SELinux context for dummy files for devicenode bind-mounting
Let systemd create the dummy file where a device node will be mounted on with the default label for the parent directory (e.g. /tmp/namespace-dev-yTMwAe/dev/).

Fixes: #13762
2020-02-06 10:20:14 +01:00
Yu Watanabe fae403f1dc
Merge pull request #14778 from cgzones/split_selinux_logging
Improve SELinux logging in permissive mode
2020-02-05 15:20:50 +09:00
Christian Göttsche c0f765cac8 core: move bus-util include out of selinux-access header 2020-02-04 19:26:38 +01:00
Christian Göttsche f156e60c66 core: unit_label_path(): take const unit 2020-02-04 18:36:19 +01:00
Christian Göttsche 1648233dce selinux-access: log warning on context acquisition failure
Relevant when testing in permissive mode, where the function does not return a failure to the client.
This helps to configure a system in permissive mode, without getting surprising failures when switching to enforced mode.
2020-02-04 18:11:42 +01:00
Zbigniew Jędrzejewski-Szmek dc9fd22d3d Merge pull request #14398 from poettering/mount-prep 2020-02-04 16:28:51 +01:00
Yu Watanabe 50152bb1c5 core: call dynamic_user_acquire() only when 'group' is non-null
When unit is reloaded, and the reloaded unit has bad-setting, then
unit_patch_contexts() is not called and exec_context::user and group
may not be configured.

A minimum reproducer for the case is:
- step 1.
$ sudo systemctl edit --full hoge.service
[Service]
oneshot
ExecStart=sleep 1h

- step 2.
$ sudo systemctl start hoge.service

- step 3.
$ sudo systemctl edit --full hoge.service
[Service]
Type=oneshot
ExecStart=@bindir@/sleep 1h
DynamicUser=yes

Then pid1 crashed.

Fixes #14733.
2020-02-03 21:51:07 +09:00
Yu Watanabe 60d0a5098b util: uid_t, gid_t, and pid_t must be 32bit
We already have assert_cc(sizeof(uid_t) == sizeof(uint32_t)) or friends
at various places.
2020-02-02 17:13:08 +01:00
Lennart Poettering dc5437c78b journald: add ability to activate by varlink socket
If we have exit on idle, then operations such as "journalctl
--namespace=foo --rotate" should work even if the journal daemon is
currently not running.

(Note that we don't do activation by varlink for the main instance of
journald, I am not sure the deadlocks it might introduce are worth it)
2020-01-31 15:03:55 +01:00
Lennart Poettering 91dd5f7cbe core: add new LogNamespace= execution setting 2020-01-31 15:01:43 +01:00
Lennart Poettering 8d251485fa core: fsck images specified as RootImage= too before using them 2020-01-29 19:29:59 +01:00
Zbigniew Jędrzejewski-Szmek d2b45da40a
Merge pull request #14633 from poettering/logind-switch-polkit
add polkit hookup for VT switching in logind
2020-01-28 16:09:09 +01:00
Kevin Kuehler fc64760dda core: shared: Add ProtectClock= to systemd.exec 2020-01-26 12:23:33 -08:00
Lennart Poettering 269e4d2d6b shared: split out polkit stuff from bus-util.c → bus-polkit.c
It's enough, complex stuff to warrant its own source file.

No other changes, just splitting out.
2020-01-22 12:34:10 +01:00
Lennart Poettering 44b0d1fd59 core: add implicit ordering dep on blockdev@.target from all mount units
This way we shuld be able to order mounts properly against their backing
services in case complex storage is used (i.e. LUKS), even if the device
path used for mounting the devices is different from the expected device
node of the backing service.

Specifically, if we have a LUKS device /dev/mapper/foo that is mounted
by this name all is trivial as the relationship can be established a
priori easily. But if it is mounted via a /dev/disk/by-uuid/ symlink or
similar we only can relate the device node generated to the one mounted
at the moment the device is actually established. That's because the
UUID of the fs is stored inside the encrypted volume and thus not
knowable until the volume is set up. This patch tries to improve on this
situation: a implicit After=blockdev@.target dependency is generated for
all mounts, based on the data from /proc/self/mountinfo, which should be
the actual device node, with all symlinks resolved. This means that as
soon as the mount is established the ordering via blockdev@.target will
work, and that means during shutdown it is honoured, which is what we
are looking for.

Note that specifying /etc/fstab entries via UUID= for LUKS devices still
sucks and shouldn't be done, because it means we cannot know which LUKS
device to activate to make an fs appear, and that means unless the
volume is set up at boot anyway we can't really handle things
automatically when putting together transactions that need the mount.
2020-01-21 20:23:44 +01:00
Lennart Poettering e3e6f99689 core: downgrade swap → device dep to Requires=
This catches up with 9d06297e26 and adapts
the change made to swap units. We generally don't want to react
a-posteriori to swap devices disappearing, bad things will happen
anyway.
2020-01-21 20:23:40 +01:00
Lennart Poettering 61f9cf4e4c swap: generate automatic dependencies also for /proc/swaps devices
This catches up with the logic we do for mounts: we create deps based on
/proc/swaps now too, with the right flags set.
2020-01-21 20:23:37 +01:00
Lennart Poettering 5de0acf40d core: let's be defensive, /dev/nfs is also a special mount source, filter it out 2020-01-21 20:23:34 +01:00
Lennart Poettering 219f3cd941 core: drop _pure_ from static functions
For static functions the compiler should figure this out on its own.
2020-01-21 20:23:30 +01:00
Yu Watanabe 5029912157 network,udev: use uint64_t for bit rate
Fixes #14620.
2020-01-21 16:51:19 +01:00
Lennart Poettering b90cf10245 core: make a number of functions not used externally static 2020-01-21 11:51:45 +01:00
Lennart Poettering 96462ae998 core: show the UID we cannot parse 2020-01-21 11:51:26 +01:00
Lennart Poettering 3b7f79dc9f core: make sure StandardInput=file: doesn't get dup'ed to stdout/stderr by default
Fixes: #14560
2020-01-20 17:19:42 +01:00
Sascha Dewald fc57f105d9 pkgconf: add full generator paths 2020-01-18 17:48:28 +01:00
Lennart Poettering 0879fbd6fe mount: make checks on perpetual mount units more lax
We don#t really care where perpetual mounts are mounted from, since they
have to exist since before we run anyway.
2020-01-17 15:09:18 +01:00
Lennart Poettering 88414eed6f core: never allow perpetual units to be masked
Fixes: #14550
2020-01-17 15:02:15 +01:00
Zbigniew Jędrzejewski-Szmek 4ca739e20a core: reduce indentation a bit 2020-01-17 08:13:09 +01:00
Zbigniew Jędrzejewski-Szmek 251d3d20c3
Merge pull request #14581 from poettering/setcred-alternative-fix
alternative pam_setcred() fix
2020-01-16 09:53:26 +01:00
Lennart Poettering 4bb68f2fee core: on each iteration processing /proc/self/mountinfo merge all discovery flags for each path
This extends on d253a45e1c, and instead of
merging just a single flag from previous mount entries of
/proc/self/mountinfo for the same path we merge all three.

This shouldn't change behaviour, but I think make things more readable.

Previously we'd set MOUNT_PROC_IS_MOUNTED unconditionally, we still do.

Previously we'd inherit MOUNT_PROC_JUST_MOUNTED from a previous entry on
the same line, we still do.

MOUNT_PROC_JUST_CHANGED should generally stay set too. Why that? If we
have two mount entries on the same mount point we'd first process one
and then the other, and the almost certainly different mount parameters
of the two would mean we'd set MOUNT_PROC_JUST_CHANGED for the second.
And with this we'll definitely do that still.

This also adds a comment explaining the situation a bit, and why we get
into this situation.
2020-01-15 17:42:12 +01:00
Lennart Poettering 46d7c6afbf execute: allow pam_setcred() to fail, ignore errors
Fixes: #14567
Alternative-To: #14569
2020-01-15 17:10:43 +01:00
Lennart Poettering 5b8d1f6b77 execute: add const to array parameters, where possible 2020-01-15 17:10:28 +01:00
Lennart Poettering 19d22d433d core: add user/group resolution varlink interface to PID 1 2020-01-15 15:28:55 +01:00
Lennart Poettering 4bad7eedae core: make return parameter of dynamic_user_lookup_name() optional 2020-01-15 15:28:52 +01:00
Lennart Poettering eea45a3399
Merge pull request #14424 from poettering/watch-bus-name-rework
pid1: simplify drastically how we watch bus names for service's BusName= setting
2020-01-15 11:46:11 +01:00
Lennart Poettering c238a2f889 cgroup: minor comment improvement
As pointed out here:

https://github.com/systemd/systemd/pull/14564#discussion_r366305882
2020-01-14 16:57:51 +01:00
Zbigniew Jędrzejewski-Szmek 952805a9e3
Merge pull request #14564 from poettering/cgroup-realize-fix
pid1: cgroup realization fix
2020-01-14 15:52:56 +01:00
Zbigniew Jędrzejewski-Szmek ea7fe1d1c2
Merge pull request #14390 from poettering/gpt-var-tmp
introduce GPT partition types for /var and /var/tmp and support them for auto-discovery
2020-01-14 15:37:53 +01:00
Zbigniew Jędrzejewski-Szmek 7c286cd6a6
Merge pull request #14505 from poettering/refuse-on-failure
refuse OnFailure= deps on units that have no failure state
2020-01-14 14:19:04 +01:00
Jun'ichi Nomura 1d086a6e59 mount: mark an existing "mounting" unit from /proc/self/mountinfo as "just_mounted"
When starting a mount unit, systemd invokes mount command and moves the
unit's internal state to "mounting".  Then it watches for updates of
/proc/self/mountinfo.  When the expected mount entry newly appears in
mountinfo, the unit internal state is changed to "mounting-done".
Finally, when systemd finds the mount command has finished, it checks
whether the unit internal state is "mounting-done" and changes the state
to "mounted".
If the state was not "mounting-done" in the last step though mount command
was successfully finished, the unit is marked as "failed" with following
log messages:
  Mount process finished, but there is no mount.
  Failed with result 'protocol'.

If daemon-reload is done in parallel with starting mount unit, it is
possible that things happen in following order and result in above failure.
  1. the mount unit state changes to "mounting"
  2. daemon-reload saves the unit state
  3. kernel completes the mount and /proc/self/mountinfo is updated
  4. daemon-reload restores the saved unit state, that is "mounting"
  5. systemd notices the mount command has finished but the unit state
     is still "mounting" though it should be "mounting-done"

mount_setup_existing_unit() should take into account that MOUNT_MOUNTING
is transitional state and set MOUNT_PROC_JUST_MOUNTED flag if the unit
comes from /proc/self/mountinfo so that mount_process_proc_self_mountinfo()
later can make state transition from "mounting" to "mounting-done".

Fixes: #10872
2020-01-14 12:15:09 +01:00
Lennart Poettering 48fd01e5f3 cgroup: drop redundant if check 2020-01-14 10:44:58 +01:00
Lennart Poettering e1e98911a8 cgroup: update only siblings that got realized once
Fixes: #14475
Replaces: #14554
2020-01-14 10:44:19 +01:00
Lennart Poettering 95ae4d1420 cgroup: drop unnecessary {} 2020-01-14 10:44:19 +01:00
Lennart Poettering a0d6590c4e cgroup: no need to cast dev_t to dev_t 2020-01-14 10:44:19 +01:00
Lennart Poettering 57f1030b13 cgroup: use log_warning_errno() where possible 2020-01-14 10:44:19 +01:00
Lennart Poettering 0bdd282a4e killall: update reference to root storage daemon interface docs 2020-01-13 18:53:46 +01:00
Lennart Poettering 765d88698f
Merge pull request #14400 from keszybz/alias-check
Alias check rework
2020-01-13 18:03:13 +01:00
Lennart Poettering 04d8507f68
Merge pull request #14381 from keszybz/ifindex-cleanup
Resolve alternative names
2020-01-13 17:57:59 +01:00
Lennart Poettering 200427cf7c
Merge pull request #11199 from dargad/restore-pam-setcred
Restore call to pam_setcred
2020-01-13 17:44:34 +01:00
Lennart Poettering 575a915a74
Merge pull request #14532 from poettering/namespace-dynamic-user-fix
Make DynamicUser=1 work in a userns container
2020-01-13 16:47:15 +01:00
Dariusz Gadomski afb11bf1b8 execute: Detect groups added by PAM and merge them with supplementary groups 2020-01-13 10:29:20 +01:00
Dariusz Gadomski 3bb39ea936 execute: Restore call to pam_setcred 2020-01-13 10:29:20 +01:00
Zbigniew Jędrzejewski-Szmek 5c3fa98db6 util-lib: move things that parse ifnames to shared/
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
2020-01-11 12:07:28 +01:00
Lennart Poettering cd6cf81b3f
Merge pull request #14533 from poettering/swap-prio-fixes
Fix Priority= parsing in .swap units
2020-01-10 17:06:16 +01:00
Lennart Poettering a5292dfdf7
Merge pull request #14508 from poettering/namespace-ro-fix
core: be nicer to read-only images
2020-01-10 17:03:01 +01:00
Zbigniew Jędrzejewski-Szmek 1bf1558552 core,install: allow one more case of "instance propagation"
If we have a template unit template@.service, it should be allowed to specify a
dependency on a unit without an instance, bar@.service. When the unit is created,
the instance will be propagated into the target, so template@inst.service will
depend on bar@inst.service.

This commit changes unit_dependency_name_compatible(), which makes the manager
accept links like that, and unit_file_verify_alias(), so that the installation
function will agree to create a symlink like that, and finally the tests are
adjusted to pass.
2020-01-10 14:31:01 +01:00
Franck Bui 29a743f993 core: explicit mention of unit ID is redundant with log_unit_*() 2020-01-10 14:20:28 +01:00
Zbigniew Jędrzejewski-Szmek 9a4f9e69e1 shared/unit-file: expose function to check .wants/.requires symlink validity
No functional change.
2020-01-10 12:20:18 +01:00
Anita Zhang e127d90210
Merge pull request #14534 from poettering/cgroup-sibling-fix
core: fix re-realization of cgroup siblings
2020-01-09 23:57:39 -08:00
Anita Zhang 79bb680cbf
Merge pull request #14527 from poettering/socket-state-fix
core: propagate service state to socket in more load states
2020-01-09 17:10:48 -08:00
Lennart Poettering 65f6b6bdcb core: fix re-realization of cgroup siblings
This is a fix-up for eef85c4a3f which
broke this.

Tracked down by @w-simon

Fixes: #14453
2020-01-09 17:31:41 +01:00
Lennart Poettering 6fca66a7f1 core: set error value correctly 2020-01-09 17:08:40 +01:00
Lennart Poettering af4454cb17 core: use unit-based logging instead of generic logging where appropriate 2020-01-09 17:08:36 +01:00
Lennart Poettering eb34a981d6 core: initialize priority_set when parsing swap unit files
Fixes: #14524
2020-01-09 17:08:31 +01:00
Lennart Poettering 6afc31615e core: no need to initialize swap structure fields if all zeroes anyway 2020-01-09 17:08:27 +01:00
Lennart Poettering 6d9e0ca400 core: expose swap priority value via dbus only if it is set 2020-01-09 17:08:10 +01:00
Lennart Poettering 404308486a core: be more restrictive on the dependency types we allow to be created transiently
We should allow the ones that the [Unit] section of regular unit files
may accet, but no other, in particular not the internal deps we
synthesize as reverse of explicitly configured ones, such was WantedBy=.

Fixes: #14251
2020-01-09 15:42:34 +01:00
Lennart Poettering 7cce68e1e0 core: make sure we use the correct mount flag when re-mounting bind mounts
When in a userns environment we cannot take away per-mount point flags
set on a mount point that was passed to us. Hence we need to be careful
to always check the actual mount flags in place and manipulate only
those flags of them that we actually want to change and not reset more
as side-effect.

We mostly got this right already in
bind_remount_recursive_with_mountinfo(), but didn't in the simpler
bind_remount_one_with_mountinfo(). Catch up.

(The old code assumed that the MountEntry.flags field contained the
right flag settings, but it actually doesn't for new mounts we just
established as for those mount() establishes the initial flags for us,
and we have to read them back to figure out which ones the kernel
picked.)

Fixes: #13622
2020-01-09 15:18:08 +01:00
Lennart Poettering 4c2ef32767 core: propagate service state to socket in more load states
It makes sense to filter state changes for some load states that
shouldn't happen, but the common cases should be accepted, because they
might happen during runtime when "systemctl daemon-reload" is issued and
unit files changed state in between. Otherwise we lose events.

Fixes: #4708
2020-01-09 11:53:26 +01:00
Lennart Poettering b0a94268f8 core: when we cannot open an image file for write, try read-only
Closes: #14442
2020-01-09 11:18:06 +01:00
Lennart Poettering c8c535d589 namespace: tweak checks whether we can mount image read-only
So far we set up a loopback file read-only iff ProtectSystem= and
ProtectHome= both where set to values that mark these dirs read-only.
Let's extend that and also be happy if /home and the root dir are marked
read-only by some other means.

Fixes: #14442
2020-01-09 11:18:02 +01:00
Lennart Poettering c80a9a33d0 core: clearly refuse OnFailure= deps on units that can't fail
Similar, refuse triggering deps on units that cannot trigger.

And rework how we ignore After= dependencies on device units, to work
the same way.

See: #14142
2020-01-09 11:03:53 +01:00
Lennart Poettering 48dfa8b0e1
Merge pull request #14512 from poettering/root-image-devices
Make RootImage= work reliable with DeviceAllow= in the mix
2020-01-08 12:18:38 +01:00
Zbigniew Jędrzejewski-Szmek 06ae8800d0
Merge pull request #14465 from poettering/setprio-rework
When Nice= is used, clamp to RLIMIT_NICE
2020-01-08 11:06:46 +01:00
Lennart Poettering 867af7282b unit: make sure to pull in modprobe@loop.service when RootImage= is used with DeviceAllow=
Fixes: #14214
2020-01-07 18:53:31 +01:00
Lennart Poettering 07141aa005 bpf-devices: line-break some overly long function signatures 2020-01-07 18:41:36 +01:00
Lennart Poettering fc67a943d9 core: drop initial ListNames() bus call from PID 1
Previously, when first connecting to the bus after connecting to it we'd
issue a ListNames() bus call to the driver to figure out which bus names
are currently active. This information was then used to initialize the
initial state for services that use BusName=.

This change removes the whole code for this and replaces it with
something vastly simpler.

First of all, the ListNames() call was issues synchronosuly, which meant
if dbus was for some reason synchronously calling into PID1 for some
reason we'd deadlock. As it turns out there's now a good chance it does:
the nss-systemd userdb hookup means that any user dbus-daemon resolves
might result in a varlink call into PID 1, and dbus resolves quite a lot
of users while parsing its policy. My original goal was to fix this
deadlock.

But as it turns out we don't need the ListNames() call at all anymore,
since #12957 has been merged. That PR was supposed to fix a race where
asynchronous installation of bus matches would cause us missing the
initial owner of a bus name when a service is first started. It fixed it
(correctly) by enquiring with GetOwnerName() who currently owns the
name, right after installing the match. But this means whenever we start watching a bus name we anyway
issue a GetOwnerName() for it, and that means also when first connecting
to the bus we don't need to issue ListNames() anymore since that just
tells us the same info: which names are currently owned.

hence, let's drop ListNames() and instead make better use of the
GetOwnerName() result: if it failed the name is not owned.

Also, while we are at it, let's simplify the unit's owner_name_changed()
callback(): let's drop the "old_owner" argument. We never used that
besides logging, and it's hard to synthesize from just the return of a
GetOwnerName(), hence don't bother.
2020-01-06 15:21:47 +01:00
Lennart Poettering a5b0784795 core: create/remove unit bus name slots always together
When a service unit watches a bus name (i.e. because of BusName= being
set), then we do two things: we install a match slot to watch how its
ownership changes, and we inquire about the current owner. Make sure we
always do both together or neither.

This in particular fixes a corner-case memleak when destroying bus
connections, since we never freed the GetNameOwner() bus slots when
destroying a bus when they were still ongoing.
2020-01-06 15:21:44 +01:00
Lennart Poettering 5085ef0d71 core: no need to eat up error
This is a method call reply. We might as well propagate the error. The
worst that happens is that sd-bus logs about it.
2020-01-06 15:21:40 +01:00
Lennart Poettering 17bda1f19d core: shorten code a bit
The return parameter here cannot be NULL, the bus call either succeeds
or fails but will never uceed and return an empty owner.
2020-01-06 15:21:37 +01:00
Lennart Poettering a54654ba70 core: don't check potentially NULL error, it's not gonna work anyway 2020-01-06 15:21:33 +01:00
Lennart Poettering 42837b8134 core: don't check error parameter of get_name_owner_handler()
It's a *return* parameter, not an input parameter. Yes, this is a bit
confusing for method call replies, but we try to use the same message
handler for all incoming messages, hence the parameter. We are supposed
to write any error into it we encounter, if we want, and our caller will
log it, but that's it.
2020-01-06 15:21:30 +01:00
Topi Miettinen ec04aef442 dbus-execute: avoid extra strdup()
bind_mount_add does the strdup(), so we can avoid
strdup()ing the strings.
2020-01-04 10:47:28 +01:00
Kevin Kuehler 097537f07a job: Don't mark as redundant if deps are relevant
In the steps given in #13850, the resulting graph looks like:

    C (Anchor) -> B -> A

Since B is inactive, it will be flagged as redundant and removed from
the transaction, causing A to get garbage collected. The proposed fix is
to not mark nodes as redundant if doing so causes a relevant node to be
garbage collected.

Fixes #13850
2020-01-03 15:58:10 +01:00
Dimitri John Ledkov 390902012c core: in execute, Never fail setting Nice priority
Instead, push to the closest possible Nice priority setting.

Replaces: #11397
2020-01-02 20:50:14 +01:00
Leonid Evdokimov 6e3c443b56 Fix typo 2019-12-26 12:07:08 +09:00
Lennart Poettering d4dffb8533 dissect: introduce new recognizable partition types for /var and /var/tmp
This has been requested many times before. Let's add it finally.

GPT auto-discovery for /var is a bit more complex than for other
partition types: the other partitions can to some degree be shared
between multiple OS installations on the same disk (think: swap, /home,
/srv). However, /var is inherently something bound to an installation,
i.e. specific to its identity, or actually *is* its identity, and hence
something that cannot be shared.

To deal with this this new code is particularly careful when it comes to
/var: it will not mount things blindly, but insist that the UUID of the
partition matches a hashed version of the machine-id of the
installation, so that each installation has a very specific /var
associated with it, and would never use any other. (We actually use
HMAC-SHA256 on the GPT partition type for /var, keyed by the machine-id,
since machine-id is something we want to keep somewhat private).

Setting the right UUID for installations takes extra care. To make
things a bit simpler to set up, we avoid this safety check for nspawn
and RootImage= in unit files, under the assumption that such container
and service images unlikely will have multiple installations on them.
The check is hence only required when booting full machines, i.e. in
in systemd-gpt-auto-generator.

To help with putting together images for full machines, PR #14368
introduces a repartition tool that can automatically fill in correctly
calculated UUIDs on first boot if images have the var partition UUID
initialized to all zeroes. With that in place systems can be put
together in a way that on first boot the machine ID is determined and
the partition table automatically adjusted to have the /var partition
with the right UUID.
2019-12-23 14:43:59 +01:00
Lennart Poettering b895fa08e6 Revert "Drop dbus activation stub service"
This reverts commit 07125d24ee.

In contrast to what is claimed in #13396 dbus-broker apparently does
care for the service file to be around, and otherwise will claim
"Service Not Activatable" in the time between systemd starting up the
broker and connecting to it, which the stub service file is supposed to
make go away.

Reverting this makes the integration test suite pass again on host with
dbus-broker (i.e. current Fedora desktop).

Tested with dbus-broker-21-6.fc31.x86_64.
2019-12-20 17:28:12 +01:00
Lennart Poettering 1cb342447d
Merge pull request #14329 from anitazha/user_invocation_id
export user units' InvocationID and use as _SYSTEMD_INVOCATION_ID
2019-12-20 10:20:21 +01:00
Anita Zhang 2f8c48b605 core,journal: export user units' InvocationID and use as _SYSTEMD_INVOCATION_ID
Write a user unit's invocation ID to /run/user/<uid>/systemd/units/ similar
to how a system unit's invocation ID is written to /run/systemd/units/.

This lets the journal read and add a user unit's invocation ID to the
_SYSTEMD_INVOCATION_ID field of logs instead of the user manager's
invocation ID.

Fixes #12474
2019-12-19 17:42:17 -08:00
Anita Zhang e5f10cafe0 core: create inaccessible nodes for users when making runtime dirs
To support ProtectHome=y in a user namespace (which mounts the inaccessible
nodes), the nodes need to be accessible by the user. Create these paths and
devices in the user runtime directory so they can be used later if needed.
2019-12-18 11:09:30 -08:00
Anita Zhang 5749f855a7 core: PrivateUsers=true for (unprivileged) user managers
Let per-user service managers have user namespaces too.

For unprivileged users, user namespaces are set up much earlier
(before the mount, network, and UTS namespaces vs after) in
order to obtain capbilities in the new user namespace and enable use of
the other listed namespaces. However for privileged users (root), the
set up for the user namspace is still done at the end to avoid any
restrictions with combining namespaces inside a user namespace (see
inline comments).

Closes #10576
2019-12-18 11:09:30 -08:00