Commit Graph

616 Commits

Author SHA1 Message Date
Lennart Poettering 8d251485fa core: fsck images specified as RootImage= too before using them 2020-01-29 19:29:59 +01:00
Kevin Kuehler fc64760dda core: shared: Add ProtectClock= to systemd.exec 2020-01-26 12:23:33 -08:00
Lennart Poettering 46d7c6afbf execute: allow pam_setcred() to fail, ignore errors
Fixes: #14567
Alternative-To: #14569
2020-01-15 17:10:43 +01:00
Lennart Poettering 5b8d1f6b77 execute: add const to array parameters, where possible 2020-01-15 17:10:28 +01:00
Zbigniew Jędrzejewski-Szmek ea7fe1d1c2
Merge pull request #14390 from poettering/gpt-var-tmp
introduce GPT partition types for /var and /var/tmp and support them for auto-discovery
2020-01-14 15:37:53 +01:00
Dariusz Gadomski afb11bf1b8 execute: Detect groups added by PAM and merge them with supplementary groups 2020-01-13 10:29:20 +01:00
Dariusz Gadomski 3bb39ea936 execute: Restore call to pam_setcred 2020-01-13 10:29:20 +01:00
Dimitri John Ledkov 390902012c core: in execute, Never fail setting Nice priority
Instead, push to the closest possible Nice priority setting.

Replaces: #11397
2020-01-02 20:50:14 +01:00
Lennart Poettering d4dffb8533 dissect: introduce new recognizable partition types for /var and /var/tmp
This has been requested many times before. Let's add it finally.

GPT auto-discovery for /var is a bit more complex than for other
partition types: the other partitions can to some degree be shared
between multiple OS installations on the same disk (think: swap, /home,
/srv). However, /var is inherently something bound to an installation,
i.e. specific to its identity, or actually *is* its identity, and hence
something that cannot be shared.

To deal with this this new code is particularly careful when it comes to
/var: it will not mount things blindly, but insist that the UUID of the
partition matches a hashed version of the machine-id of the
installation, so that each installation has a very specific /var
associated with it, and would never use any other. (We actually use
HMAC-SHA256 on the GPT partition type for /var, keyed by the machine-id,
since machine-id is something we want to keep somewhat private).

Setting the right UUID for installations takes extra care. To make
things a bit simpler to set up, we avoid this safety check for nspawn
and RootImage= in unit files, under the assumption that such container
and service images unlikely will have multiple installations on them.
The check is hence only required when booting full machines, i.e. in
in systemd-gpt-auto-generator.

To help with putting together images for full machines, PR #14368
introduces a repartition tool that can automatically fill in correctly
calculated UUIDs on first boot if images have the var partition UUID
initialized to all zeroes. With that in place systems can be put
together in a way that on first boot the machine ID is determined and
the partition table automatically adjusted to have the /var partition
with the right UUID.
2019-12-23 14:43:59 +01:00
Anita Zhang 5749f855a7 core: PrivateUsers=true for (unprivileged) user managers
Let per-user service managers have user namespaces too.

For unprivileged users, user namespaces are set up much earlier
(before the mount, network, and UTS namespaces vs after) in
order to obtain capbilities in the new user namespace and enable use of
the other listed namespaces. However for privileged users (root), the
set up for the user namspace is still done at the end to avoid any
restrictions with combining namespaces inside a user namespace (see
inline comments).

Closes #10576
2019-12-18 11:09:30 -08:00
Kevin Kuehler 943800f4e7 execute: Call capability_ambient_set_apply even if ambient set is 0
The function capability_ambient_set_apply() now drops capabilities not
in the capability_ambient_set(), so it is necessary to call it when
the ambient set is empty.

Fixes #13163
2019-11-27 10:57:23 -08:00
Lennart Poettering 6d19b71876 core: don't insist on ProtectHostname= if unshare() is blocked
Previously we'd only skip ProtectHostname= if kernel support for
namespaces was lacking. With this change we also accept if unshare()
fails because it is blocked.
2019-11-20 12:49:06 +01:00
Lennart Poettering 4e67759960 core: be more lenient when checking whether sandboxing is necessary
In some containers unshare() is made unavailable entirely. Let's deal
with this that more gracefully and disable our sandboxing of services
then, so that we work in a container, under the assumption the container
manager is then responsible for sandboxing if we can't do it ourselves.

Previously, we'd insist on sandboxing as soon as any form of BindPath=
is used. With this change we only insist on it if we have a setting like
that where source and destination differ, i.e. there's a mapping
established that actually rearranges things, and thus would result in
systematically different behaviour if skipped (as opposed to mappings
that just make stuff read-only/writable that otherwise arent').

(Let's also update a test that intended to test for this behaviour with
a more specific configuration that still triggers the behaviour with
this change in place)

Fixes: #13955

(For testing purposes unshare() can easily be blocked with
systemd-nspawn --system-call-filter=~unshare.)
2019-11-20 12:30:04 +01:00
Kevin Kuehler 94a7b2759d core: ProtectKernelLogs= mask kmsg in proc and sys
Block access to /dev/kmsg and /proc/kmsg when ProtectKernelLogs is set.
2019-11-14 12:58:43 -08:00
Kevin Kuehler 8470304018 core: Add ProtectKernelLogs
If seccomp is enabled, load the SYSCALL_FILTER_SET_SYSLOG into the
seccomp filter set. Drop the CAP_SYSLOG capability.
2019-11-11 12:12:02 -08:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe e30e8b5073 tree-wide: drop stat.h or statfs.h when stat-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 92133d2878 tree-wide: drop socket.h when socket-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 927d2351d7 tree-wide: drop pwd.h and grp.h when user-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe dc54946cea tree-wide: drop capability.h when capability-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 99e0d09731 tree-wide: drop glob.h when glob-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 455fa9610c tree-wide: drop string.h when string-util.h or friends are included 2019-11-04 00:30:32 +09:00
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Zbigniew Jędrzejewski-Szmek 86e94d95d0
Merge pull request #13246 from keszybz/add-SystemdOptions-efi-variable
Add efi variable to augment /proc/cmdline
2019-10-03 12:19:44 +02:00
Frantisek Sumsal 54756dce57 execute: explicitly ignore fd_wait_for_event()'s return value
Fixes CID#1402316
2019-10-01 10:25:36 +02:00
Zbigniew Jędrzejewski-Szmek 5ac1530eca tree-wide: say "ratelimit" not "rate_limit"
"ratelimit" is a real word, so we don't need to use the other form anywhere.
We had both forms in various places, let's standarize on the shorter and more
correct one.
2019-09-20 16:05:53 +02:00
Zbigniew Jędrzejewski-Szmek fdb3decaa7 util-lib: move some functions from basic/cgroup-util to shared/cgroup-setup
This way less stuff needs to be in basic. Initially, I wanted to move all the
parts of cgroup-utils.[ch] that depend on efivars.[ch] to shared, because
efivars.[ch] is in shared/. Later on, I decide to split efivars.[ch], so the
move done in this patch is not necessary anymore. Nevertheless, it is still
valid on its own. If at some point we want to expose libbasic, it is better to
to not have stuff that belong in libshared there.
2019-09-16 18:08:00 +02:00
Yu Watanabe 12213aed12 core: move timeout_clean_usec from Service to ExecContext 2019-08-28 23:09:54 +09:00
Yu Watanabe 7f622a19d9 core: also remove private directories by systemctl clean
Fixes #13355.
2019-08-28 23:09:44 +09:00
Yu Watanabe 494d0247f9 core: introduce exec_directory_is_private() helper function
Also, this follows up 40cd2ecc26.
2019-08-25 16:27:42 +09:00
Lennart Poettering e04ed6db6b exit-status: rename EXIT_STATUS_GLIBC → EXIT_STATUS_LIBC
After all these two exit codes are defined by ISO C as part of the C
library, and it's not the GNU implementation defines them.
2019-07-29 19:05:25 +02:00
Zbigniew Jędrzejewski-Szmek e1714f0250 shared/exit-status: turn status level into a bitmask, add "test"
The "test" doesn't really test much automatically, but it is still useful
to look at the mappings.
2019-07-29 15:54:45 +02:00
Lennart Poettering 2caa38e99f tree-wide: some more [static] related fixes
let's add [static] where it was missing so far

Drop [static] on parameters that can be NULL.

Add an assert() around parameters that have [static] and can't be NULL
hence.

Add some "const" where it was forgotten.
2019-07-12 16:40:10 +02:00
Lennart Poettering b910cc72c0 tree-wide: get rid of strappend()
It's a special case of strjoin(), so no need to keep both. In particular
as typing strjoin() is even shoert than strappend().
2019-07-12 14:31:12 +09:00
Lennart Poettering 66855de739 tree-wide: make use of errno_or_else() everywhere 2019-07-11 23:20:31 +02:00
Lennart Poettering 6b7b2ed96b core: add type of resource string table 2019-07-11 12:18:51 +02:00
Lennart Poettering 4c2f584230 core: hook up service unit type with the new clean operation
The implementation is pretty straight-foward: when we get a request to
clean some type of resources we fork off a process doing that, and while
it is running we are in the "cleaning" state.
2019-07-11 12:18:51 +02:00
Yu Watanabe 4bbccb02ea tree-wide: introduce strerror_safe() 2019-07-05 02:43:56 +09:00
Michal Sekletar 33fe9e3fd0 execute: drop SYNTHETIC_ERRNO because error code was received from the apply_numa_policy() 2019-06-25 21:52:28 +02:00
Joerg Behrmann fa97f63067 core: factor root_directory application out of apply_working_directory
Fixes: #12498
2019-06-25 22:53:33 +09:00
Lennart Poettering 2193f17c08 core: mention why we do migration for everything but ConfigurationDirectory= 2019-06-25 10:47:46 +02:00
Lennart Poettering cf52c45d6b core: log when we convert from DynamicUser=1 to =0 or vice versa 2019-06-25 10:47:46 +02:00
Lennart Poettering c7e42ceb7a
Merge pull request #12869 from poettering/dynamic-user-re-migrate
DynamicUser=1 state directory back migration
2019-06-25 10:06:03 +02:00
Michal Sekletar b070c7c0e1 core: introduce NUMAPolicy and NUMAMask options
Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.

Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
2019-06-24 16:58:54 +02:00
Lennart Poettering 5c6d40d132 core: migrate service directories back from private if needed
Fixes: #12131
2019-06-24 16:20:34 +02:00
Lennart Poettering 3f5b15084e core: add missing space to DynamicUser=1 directory comment
(also line break again)
2019-06-24 16:20:34 +02:00
Yu Watanabe 657ee2d82b tree-wide: replace strjoin() with path_join() 2019-06-21 03:26:16 +09:00
Michal Sekletar e7fca352ba execute: dump CPUAffinity as a range string instead of a list of CPUs
We do this already when printing the property in systemctl so be
consistent and do the same for systemd-analyze dump.
2019-06-03 15:21:52 +02:00
Michal Sekletar 3f09629c22
Merge pull request #12628 from keszybz/dbus-execute
Rework cpu affinity parsing
2019-05-30 12:32:53 +02:00
Zbigniew Jędrzejewski-Szmek 0985c7c4e2 Rework cpu affinity parsing
The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size
of the array, but operations which take two (CPU_EQUAL_S) or even three arrays
(CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must
be of the same size, or buffer overruns will occur. This is exactly what our
code would do, if it received an array of unexpected size over the network.
("Unexpected" here means anything different from what cpu_set_malloc() detects
as the "right" size.)

Let's rework this, and store the size in bytes of the allocated storage area.

The code will now parse any number up to 8191, independently of what the current
kernel supports. This matches the kernel maximum setting for any architecture,
to make things more portable.

Fixes #12605.
2019-05-29 10:20:42 +02:00