Commit graph

564 commits

Author SHA1 Message Date
Lennart Poettering ccc16c7842 core: prefer SCMP_ACT_KILL_PROCESS for SystemCallFilter= behaviour
If we have it, use it. It makes a ton more sense.

Fixes: #11967
2019-05-24 10:48:28 +02:00
Zbigniew Jędrzejewski-Szmek 7cc5ef5f18 pid1: improve message when setting up namespace fails
I covered the most obvious paths: those where there's a clear problem
with a path specified by the user.

Prints something like this (at error level):
May 21 20:00:01.040418 systemd[125871]: bad-workdir.service: Failed to set up mount namespacing: /run/systemd/unit-root/etc/tomcat9/Catalina: No such file or directory
May 21 20:00:01.040456 systemd[125871]: bad-workdir.service: Failed at step NAMESPACE spawning /bin/true: No such file or directory

Fixes #10972.
2019-05-22 16:28:02 +02:00
Ben Boeckel 5238e95759 codespell: fix spelling errors 2019-04-29 16:47:18 +02:00
Lennart Poettering f69567cbe2 core: expose SUID/SGID restriction as new unit setting RestrictSUIDSGID= 2019-04-02 16:56:48 +02:00
Zbigniew Jędrzejewski-Szmek e1af3bc62a
Merge pull request #12106 from poettering/nosuidns
add "nosuid" flag to exec directory mounts of DynamicUser=1 services
2019-03-26 08:58:00 +01:00
Lennart Poettering 607b358ef2 core: drop suid/sgid bit of files/dirs when doing recursive chown
This adds some extra paranoia: when we recursively chown a directory for
use with DynamicUser=1 services we'll now drop suid/sgid from all files
we chown().

Of course, such files should not exist in the first place, and no one
should get access to those dirs who isn't root anyway, but let's better
be safe than sorry, and drop everything we come across.
2019-03-26 08:29:37 +01:00
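
The suid/sgid drop described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual systemd code: the helper names mode_without_suid_sgid() and fchown_and_drop_suid_sgid() are hypothetical, and the real recursive chown logic handles more cases (directory traversal, symlinks, ACLs).

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the mode with the suid and sgid bits cleared.
 * All other bits (permissions, sticky bit, file type) are left untouched. */
mode_t mode_without_suid_sgid(mode_t mode) {
        return mode & ~(mode_t) (S_ISUID | S_ISGID);
}

/* Hypothetical helper: change ownership of an already opened file and make
 * sure no suid/sgid bit survives. Returns 0 on success, -1 with errno set. */
int fchown_and_drop_suid_sgid(int fd, uid_t uid, gid_t gid) {
        struct stat st;

        assert(fd >= 0);

        if (fstat(fd, &st) < 0)
                return -1;

        if (fchown(fd, uid, gid) < 0)
                return -1;

        /* Clear the bits explicitly instead of relying on whatever the
         * kernel did to them during the ownership change. */
        return fchmod(fd, mode_without_suid_sgid(st.st_mode) & 07777);
}
```

For instance, a file with mode 04755 would end up as 0755 after this treatment, while a file that never had the bits set keeps its mode.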
Lennart Poettering 9ce4e4b0f6 namespace: when DynamicUser=1 is set, mount StateDirectory= bind mounts "nosuid"
Add even more suid/sgid protection to DynamicUser= environments: the
state directories we bind mount from the host will now have the nosuid
flag set, to disable the effect of suid/sgid bits on them.
2019-03-25 19:57:15 +01:00
Lennart Poettering 6f765baf23 core: rework how we reset the TTY after use by a service
This makes two changes:

1. Instead of resetting the configured service TTY each time after a
   process exited, let's do so only when the service goes back to "dead"
   state. This should be preferable in case the started processes leave
   background child processes around that still reference the TTY.

2. chmod() and chown() the TTY at the same time. This should make it
   safe to run "systemd-run -p DynamicUser=1 -p StandardInput=tty -p
   TTYPath=/dev/tty8 /bin/bash" without leaving a TTY owned by a dynamic
   user around.
2019-03-20 21:28:02 +01:00
Lennart Poettering 6c0ae73956 execute: split check if we might touch a tty out of exec_context_may_touch_console()
Some simple refactoring that'll come in handy in a later commit.
2019-03-20 21:20:00 +01:00
Lennart Poettering 955f1c852e execute: use path_equal() to compare tty names
After all they might be strings such as pts/1 which we really should
consider the same as pts//1.
2019-03-20 21:18:59 +01:00
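
Why a plain string comparison is not enough for TTY names such as pts/1 and pts//1 can be shown with a simplified component-wise comparison. The function below is a hypothetical stand-in written for this illustration only. It is not systemd's path_equal(), and it makes no attempt to handle "." or ".." components.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified path comparison: two paths are equal when their
 * components match one by one, so any run of '/' counts as one separator. */
bool path_equal_sketch(const char *a, const char *b) {
        assert(a);
        assert(b);

        /* An absolute path never equals a relative one */
        if ((*a == '/') != (*b == '/'))
                return false;

        for (;;) {
                size_t la, lb;

                /* Skip any number of separators on both sides */
                a += strspn(a, "/");
                b += strspn(b, "/");

                /* Equal only if both paths end at the same time */
                if (*a == '\0' || *b == '\0')
                        return *a == *b;

                /* Compare the next component */
                la = strcspn(a, "/");
                lb = strcspn(b, "/");
                if (la != lb || memcmp(a, b, la) != 0)
                        return false;

                a += la;
                b += lb;
        }
}
```

With this sketch, "pts/1" and "pts//1" compare equal, whereas strcmp() would report them as different.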
Lennart Poettering 08f6769675 execute: generalize uid/gid handling in two cases for any kind of uid/gid 2019-03-19 16:57:33 +01:00
Lennart Poettering 206e9864de core: change ownership/mode of the execution directories also for static users
It's probably unexpected if we do a recursive chown() when dynamic users
are used but not on static users.

Hence, let's tweak the logic slightly, and recursively chown in both
cases, except when operating on the configuration directory.

Fixes: #11842
2019-03-19 16:57:33 +01:00
Lennart Poettering d484580ca6 execute: remove one redundant comparison check 2019-03-19 16:52:28 +01:00
Lennart Poettering 40cd2ecc26 execute: also do the private/ symlink dance when runtime dir preservation is requested
In that case it's not safe to leave a regular dir around, hence, move it
to private/ too.
2019-03-19 16:52:28 +01:00
Lennart Poettering edbfeb1204 execute: use path_join() where appropriate 2019-03-19 16:52:28 +01:00
Lennart Poettering 7bc4bf4a69 execute: use path_join() where appropriate 2019-03-13 17:38:43 +01:00
Lennart Poettering 0a9707187b util: split out memcmp()/memset() related calls into memory-util.[ch]
Just some source rearranging.
2019-03-13 12:16:43 +01:00
Lennart Poettering 9e73208afc execute: no need to synthesize $HOME for uid==0 again, get_home_dir() already does that 2019-03-12 16:10:55 +01:00
Lennart Poettering 7bbead1d0b execute: simplify paths we set as HOME/SHELL for invoked programs 2019-03-12 16:10:55 +01:00
Zbigniew Jędrzejewski-Szmek fb6692ed33
Merge pull request #11927 from poettering/network-namespace-path
Add NetworkNamespacePath= to unit files
2019-03-12 14:29:14 +01:00
Lennart Poettering 4cea310fc7 execute: remove one aa profile output from context dump
The same data is output a few lines further up already, drop one.
2019-03-11 11:05:22 +09:00
Lennart Poettering a8d08f39d1 core: add new setting NetworkNamespacePath= for configuring a netns by path for a service
Fixes: #2741
2019-03-07 16:55:23 +01:00
Lennart Poettering da6bc6ed05 execute: no need to check for NULL when function right after does anyway 2019-03-07 16:55:19 +01:00
Lennart Poettering 2fa3742d96 execute: make things a tiny bit shorter 2019-03-07 16:53:45 +01:00
Lennart Poettering 8e8009dc50 execute: use structured initialization 2019-03-07 16:53:45 +01:00
Anita Zhang 7ca69792e5 core: add ':' prefix to ExecXYZ= skip env var substitution 2019-02-20 17:58:14 +01:00
Lennart Poettering eb5149ba74
Merge pull request #11682 from topimiettinen/private-utsname
core: ProtectHostname feature
2019-02-20 14:12:15 +01:00
Topi Miettinen aecd5ac621 core: ProtectHostname= feature
Let services use a private UTS namespace. In addition, a seccomp filter is
installed on set{host,domain}name and read-only bind mounts are placed on
/proc/sys/kernel/{host,domain}name.
2019-02-20 10:50:44 +02:00
Franck Bui 37ed15d7ed namespace: make MountFlags=shared work again
Since commit 0722b35934, the root mountpoint is
unconditionally turned to slave, which breaks units that explicitly use
MountFlags=shared (and no other options that would implicitly require a slave
root mountpoint).

Here is a test case:

  $ systemctl cat test-shared-mount-flag.service
  # /etc/systemd/system/test-shared-mount-flag.service
  [Service]
  Type=simple
  ExecStartPre=/usr/bin/mkdir -p /mnt/tmp
  ExecStart=/bin/sh -c "/usr/bin/mount -t tmpfs -o size=10M none /mnt/tmp && sleep infinity"
  ExecStop=-/bin/sh -c "/usr/bin/umount /mnt/tmp"
  MountFlags=shared

  $ systemctl start test-shared-mount-flag.service
  $ findmnt /mnt/tmp
  $

Mount on /mnt/tmp is not visible from the host although MountFlags=shared was
used.

This patch fixes that and turns the root mountpoint to slave only when it's
really required.
2019-02-20 06:20:40 +09:00
Taro Yamada 6cff72eb0a Add a warning about the difference in permissions between existing directories and unit settings.
To follow the intent of 30c81ce, this change does not execute chmod() and just adds warnings.
2019-01-29 09:52:21 +09:00
Taro Yamada ff9e7900c0 Revert "Fixes #11128"
This reverts commit 0bf05f0122 because it breaks 30c81ce.
Please see #11540.
2019-01-27 13:43:30 +09:00
Taro Yamada 0bf05f0122 Fixes #11128 2019-01-22 11:14:51 +01:00
Lennart Poettering ce932d2d33 execute: make sure to call into PAM after initializing resource limits
We want that pam_limits takes precedence over our settings, after all.

Fixes: #11386
2019-01-18 17:31:36 +01:00
Zbigniew Jędrzejewski-Szmek 3042bbebdd tree-wide: use c99 static for array size declarations
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html

This only works with clang; unfortunately, gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).

Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
                        candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
                                            ^                                           ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
                                                               ^~~~~~~~~~~~
2019-01-04 12:37:25 +01:00
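
The declaration style this commit introduces can be illustrated with a small, self-contained function. The name xor_fold_key() is hypothetical and exists only for this sketch. The point is the parameter declaration: "static 16" tells the compiler that callers must pass a pointer to at least 16 elements, which is what lets clang emit the warning shown in the simulated error above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function: "k[static 16]" promises the callee that at least
 * 16 elements are readable, so indices 0..15 may be accessed safely. */
uint64_t xor_fold_key(const uint8_t k[static 16]) {
        uint64_t lo = 0, hi = 0;

        assert(k);

        /* Load both 8-byte halves in little-endian order and XOR them */
        for (size_t i = 0; i < 8; i++) {
                lo |= (uint64_t) k[i] << (8 * i);
                hi |= (uint64_t) k[8 + i] << (8 * i);
        }

        return lo ^ hi;
}
```

Passing a 15-element array to such a function is the kind of mistake the check is meant to catch at compile time.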
Chris Down 4e1dfa45e9 cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.

Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups?  ?v?([0-9])/cgroup v\1/gI'`.

I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
2019-01-03 11:32:40 +09:00
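
What the substitution does to individual strings can be reproduced with POSIX regular expressions. The sketch below is only an approximation of the sed invocation: normalize_cgroup_naming() is a hypothetical helper, REG_ICASE stands in for sed's I flag, and the loop stands in for the g flag.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite every match of "cgroups? ?v?([0-9])" in "in"
 * to "cgroup v<digit>", ignoring case. The result is written to "out".
 * Returns 0 on success, -1 on regex failure or if "out" is too small. */
int normalize_cgroup_naming(const char *in, char *out, size_t out_size) {
        regex_t re;
        regmatch_t m[2];
        size_t used = 0;
        int r = -1;

        assert(in);
        assert(out);

        if (regcomp(&re, "cgroups? ?v?([0-9])", REG_EXTENDED | REG_ICASE) != 0)
                return -1;

        for (;;) {
                int n;

                if (regexec(&re, in, 2, m, 0) != 0) {
                        /* No further match: copy the remainder verbatim */
                        n = snprintf(out + used, out_size - used, "%s", in);
                        if (n >= 0 && (size_t) n < out_size - used)
                                r = 0;
                        break;
                }

                /* Copy the text before the match, then the normalized form
                 * with the captured digit appended */
                n = snprintf(out + used, out_size - used, "%.*scgroup v%c",
                             (int) m[0].rm_so, in, in[m[1].rm_so]);
                if (n < 0 || (size_t) n >= out_size - used)
                        break;

                used += (size_t) n;
                in += m[0].rm_eo;
        }

        regfree(&re);
        return r;
}
```

Note that the pattern also matches "cgroup2", which is why places such as the "cgroup2" fstype had to be excluded by hand.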
Michal Sekletar 4c70a4a748 core: do cgroup migration first and only then connect to journald
Fixes #11162
2018-12-17 19:22:30 +01:00
Alexey Bogdanenko 8f9f3cb724 core: fix KeyringMode for user services
KeyringMode option is useful for user services. Also, documentation for the
option suggests that the option applies to user services. However, setting the
option to any of its allowed values has no effect.

This commit fixes that and removes EXEC_NEW_KEYRING flag. The flag is no longer
necessary: instead of checking if the flag is set we can check if keyring_mode
is not equal to EXEC_KEYRING_INHERIT.
2018-12-17 16:56:36 +01:00
Yu Watanabe 3843e8260c missing: rename securebits.h to missing_securebits.h 2018-12-04 07:49:24 +01:00
Lennart Poettering 686d13b9f2 util-lib: split out env file parsing code into env-file.c
It's quite complex, let's split this out.

No code changes, just some file rearranging.
2018-12-02 13:22:29 +01:00
Lennart Poettering 4917894417
Merge pull request #10944 from poettering/redirect-file-fix
StandardOutput=file: fixes
2018-11-27 13:18:26 +01:00
Lennart Poettering 41fc585a7a core: be more careful when inheriting stdout fds to stderr
We need to compare the fd name/file name if we inherit an fd from stdout
to stderr. Let's do that.

Fixes: #10875
2018-11-27 10:06:51 +01:00
Lennart Poettering 78f93209fc core: when Delegate=yes is set for a unit, run ExecStartPre= and friends in a subcgroup of the unit
Otherwise we might conflict with the "no-processes-in-inner-cgroup" rule
of cgroupsv2. Consider nspawn starting up and initializing its cgroup
hierarchy with "supervisor/" and "payload/" as subcgroup, with itself
moved into the former and the payload into the latter. Now, if an
ExecStartPre= is run right after it cannot be placed in the main cgroup,
because that is now in inner cgroup with populated children.

Hence, let's run these helpers in another sub-cgroup .control/ below it.

This is somewhat ugly since it weakens the clear separation of
ownership, but given that this is an explicit contract and double opt-in,
it should be acceptable.

Fixes: #10482
2018-11-26 18:43:23 +01:00
Lennart Poettering aa8fbc74e3 fileio: drop "newline" parameter for env file parsers
Now that we don't (mis-)use the env file parser to parse kernel command
lines there's no need anymore to override the used newline character
set. Let's hence drop the argument and just use "\n\r" always. This nicely
simplifies our code.
2018-11-14 17:01:54 +01:00
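
The effect of a fixed "\n\r" newline set can be shown with a very small line splitter. This is only a sketch of the line-splitting aspect: ENV_NEWLINE_SKETCH and count_lines_sketch() are hypothetical names, and a real env file parser additionally deals with quoting, comments and line continuation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: every character in this set terminates a line */
#define ENV_NEWLINE_SKETCH "\n\r"

/* Hypothetical helper: count the non-empty lines in "s", treating any run
 * of '\n' and '\r' characters as a single line break. */
size_t count_lines_sketch(const char *s) {
        size_t n = 0;

        assert(s);

        for (;;) {
                /* Skip line terminators, including "\r\n" pairs */
                s += strspn(s, ENV_NEWLINE_SKETCH);
                if (*s == '\0')
                        return n;

                /* Consume one line up to the next terminator */
                s += strcspn(s, ENV_NEWLINE_SKETCH);
                n++;
        }
}
```

Since the set is a constant, callers no longer need to pass it around, which is the simplification the commit message refers to.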
Yu Watanabe b9c04eafb8 core: introduce exec_params_clear()
Follow-up for 1ad6e8b302.

Fixes #10677.
2018-11-08 09:36:37 +01:00
Joerg Behrmann 56ef8db9f5 core: apply WorkingDirectory after enforce_user
If WorkingDirectory is on NFS, root might only have the privileges of
nobody and the chdir to the WorkingDirectory might fail, even if the
user running the service would have the proper privileges to chdir to
that directory.

Fixes #10568
2018-10-31 12:07:24 +01:00
Lennart Poettering 6897dfe85a core: add free_and_replace() at one more place 2018-10-26 19:49:15 +02:00
Lennart Poettering 2194547e3b execute: if we fail to do namespacing, explain why we refuse to continue in a debug message 2018-10-24 17:08:12 +02:00
Evgeny Vereshchagin 2ac1ff68f2 core: stop ignoring errors in connect_logger_as
When journald reaches the maximum number of active streams, it,
basically, starts to decline new connections. On the client
side it can be detected by getting EPIPE and, if the writing
process isn't lucky enough, getting SIGPIPE soon afterwards.
systemd has always ignored EPIPE, which makes it very hard
to keep track of services losing logs. This patch should make
it easier to detect such services by just staring at the logs
carefully.

In case anyone is interested, the following one-liner run as any user
can be used to paralyze all the stream logging on a machine:

for i in {1..4096}; do systemd-cat -t HEY-$i & done
2018-10-19 10:32:21 +02:00
Anita Zhang 90fc172e19 core: implement per unit journal rate limiting
Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for
services. If provided, these values get passed to the journald
client context, and those values are used in the rate limiting
function in the journal over the journald.conf values.

Part of #10230
2018-10-18 09:56:20 +02:00
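
How an interval/burst pair limits messages can be sketched with a simple fixed-window counter. This is an illustration under stated assumptions, not journald's implementation: the type RateLimitSketch and the function ratelimit_sketch_allow() are hypothetical, timestamps are passed in explicitly, and a value of 0 for either setting is assumed to turn limiting off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-unit rate limit state */
typedef struct RateLimitSketch {
        uint64_t interval_usec; /* window length, 0 disables limiting */
        unsigned burst;         /* messages allowed per window, 0 disables */
        uint64_t begin_usec;    /* start of the current window */
        unsigned num;           /* messages accepted in the current window */
} RateLimitSketch;

/* Returns true if a message arriving at "now_usec" may be logged */
bool ratelimit_sketch_allow(RateLimitSketch *r, uint64_t now_usec) {
        assert(r);

        if (r->interval_usec == 0 || r->burst == 0)
                return true;

        /* Open a new window on first use, on clock jumps backwards, or
         * once the current window has expired */
        if (r->num == 0 ||
            now_usec < r->begin_usec ||
            now_usec - r->begin_usec >= r->interval_usec) {
                r->begin_usec = now_usec;
                r->num = 0;
        }

        if (r->num < r->burst) {
                r->num++;
                return true;
        }

        return false;
}
```

With an interval of 1000 microseconds and a burst of 2, the first two messages in a window pass, further ones are dropped, and the counter resets once the window has elapsed.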
Lennart Poettering 7d853ca6bc execute: shorten things a bit 2018-10-17 21:18:09 +02:00