Commit graph

583 commits

Author SHA1 Message Date
Taro Yamada 0bf05f0122 Fixes #11128 2019-01-22 11:14:51 +01:00
Lennart Poettering ce932d2d33 execute: make sure to call into PAM after initializing resource limits
We want that pam_limits takes precedence over our settings, after all.

Fixes: #11386
2019-01-18 17:31:36 +01:00
Zbigniew Jędrzejewski-Szmek 3042bbebdd tree-wide: use c99 static for array size declarations
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html

This only works with clang, unfortunately gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).

Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
                        candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
                                            ^                                           ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
                                                               ^~~~~~~~~~~~
2019-01-04 12:37:25 +01:00
Chris Down 4e1dfa45e9 cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.

Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups?  ?v?([0-9])/cgroup v\1/gI'`.

I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
2019-01-03 11:32:40 +09:00
Michal Sekletar 4c70a4a748 core: do cgroup migration first and only then connect to journald
Fixes #11162
2018-12-17 19:22:30 +01:00
Alexey Bogdanenko 8f9f3cb724 core: fix KeyringMode for user services
KeyringMode option is useful for user services. Also, documentation for the
option suggests that the option applies to user services. However, setting the
option to any of its allowed values has no effect.

This commit fixes that and removes EXEC_NEW_KEYRING flag. The flag is no longer
necessary: instead of checking if the flag is set we can check if keyring_mode
is not equal to EXEC_KEYRING_INHERIT.
2018-12-17 16:56:36 +01:00
Yu Watanabe 3843e8260c missing: rename securebits.h to missing_securebits.h 2018-12-04 07:49:24 +01:00
Lennart Poettering 686d13b9f2 util-lib: split out env file parsing code into env-file.c
It's quite complex, let's split this out.

No code changes, just some file rearranging.
2018-12-02 13:22:29 +01:00
Lennart Poettering 4917894417
Merge pull request #10944 from poettering/redirect-file-fix
StandardOutput=file: fixes
2018-11-27 13:18:26 +01:00
Lennart Poettering 41fc585a7a core: be more careful when inheriting stdout fds to stderr
We need to compare the fd name/file name if we inherit an fd from stdout
to stderr. Let's do that.

Fixes: #10875
2018-11-27 10:06:51 +01:00
Lennart Poettering 78f93209fc core: when Delegate=yes is set for a unit, run ExecStartPre= and friends in a subcgroup of the unit
Otherwise we might conflict with the "no-processes-in-inner-cgroup" rule
of cgroupsv2. Consider nspawn starting up and initializing its cgroup
hierarchy with "supervisor/" and "payload/" as subcgroup, with itself
moved into the former and the payload into the latter. Now, if an
ExecStartPre= is run right after it cannot be placed in the main cgroup,
because that is now in inner cgroup with populated children.

Hence, let's run these helpers in another sub-cgroup .control/ below it.

This is somewhat ugly since it weakens the clear separation of
ownership, but given that this is an explicit contract, and double opt-in should be acceptable.

Fixes: #10482
2018-11-26 18:43:23 +01:00
Lennart Poettering aa8fbc74e3 fileio: drop "newline" parameter for env file parsers
Now that we don't (mis-)use the env file parser to parse kernel command
lines there's no need anymore to override the used newline character
set. Let's hence drop the argument and just "\n\r" always. This nicely
simplifies our code.
2018-11-14 17:01:54 +01:00
Yu Watanabe b9c04eafb8 core: introduce exec_params_clear()
Follow-up for 1ad6e8b302.

Fixes #10677.
2018-11-08 09:36:37 +01:00
Joerg Behrmann 56ef8db9f5 core: apply WorkingDirectory after enforce_user
If WorkingDirectory is on NFS, root might only have the privileges of
nobody and the chdir to the WorkingDirectory might fail, even if the
user running the service would have the proper privileges to chdir to
that directory.

Fixes #10568
2018-10-31 12:07:24 +01:00
Lennart Poettering 6897dfe85a core: add free_and_replace() at one more place 2018-10-26 19:49:15 +02:00
Lennart Poettering 2194547e3b execute: if we fail to do namespacing, explain why we refuse to continue in a debug message 2018-10-24 17:08:12 +02:00
Evgeny Vereshchagin 2ac1ff68f2 core: stop ignoring errors in connect_logger_as
When journald reaches the maximum number of active streams, it,
basically, starts to decline new connections. On the client
side it can be detected by getting EPIPE and, if the writing
process isn't lucky enough, getting SIGPIPE soon afterwards.
systemd has always ignored EPIPE, which makes it very hard
to keep track of services losing logs. This patch should make
it easier to detect such services by just staring at the logs
carefully.

In case anyone is interested, the following one-liner run as any user
can be used to paralyze all the stream logging on a machine:

for i in {1..4096}; do systemd-cat -t HEY-$i & done
2018-10-19 10:32:21 +02:00
Anita Zhang 90fc172e19 core: implement per unit journal rate limiting
Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for
services. If provided, these values get passed to the journald
client context, and those values are used in the rate limiting
function in the journal over the the journald.conf values.

Part of #10230
2018-10-18 09:56:20 +02:00
Lennart Poettering 7d853ca6bc execute: shorten things a bit 2018-10-17 21:18:09 +02:00
Lennart Poettering 15a3e96f92 tree-wide: port various users over to sockaddr_un_set_path()
CID 1396140
CID 1396141
2018-10-15 19:40:51 +02:00
Lennart Poettering ee8d493cbd
Merge pull request #10158 from keszybz/seccomp-log-tightening
Seccomp log tightening
2018-09-26 15:56:32 +02:00
Lennart Poettering 7c428bb5d5
Merge pull request #10059 from yuwata/env-exec-directory
core: introduce $RUNTIME_DIRECTORY= or friends
2018-09-25 12:34:30 +02:00
Yu Watanabe 6c9c51e5e2 fs-util: make symlink_idempotent() optionally create relative link 2018-09-24 18:52:53 +03:00
Zbigniew Jędrzejewski-Szmek b54f36c604 seccomp: reduce logging about failure to add syscall to seccomp
Our logs are full of:
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldstat() / -10037, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call get_thread_area() / -10076, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call set_thread_area() / -10079, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldfstat() / -10034, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldolduname() / -10036, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldlstat() / -10035, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call waitpid() / -10073, ignoring: Numerical argument out of domain
...
This is pointless and makes debug logs hard to read. Let's keep the logs
in test code, but disable it in nspawn and pid1. This is done through a function
parameter because those functions operate recursively and it's not possible to
make the caller to log meaningfully.


There should be no functional change, except the skipped debug logs.
2018-09-24 17:21:09 +02:00
Yu Watanabe aca835ed2e core/execute: do not use the negative errno when setup_namespace() returns -ENOANO
Without this, log shows meaningless error message 'No anode', e.g.,
===
Failed to unshare the mount namespace: Operation not permitted
foo.service: Failed to set up mount namespacing: No anode
foo.service: Failed at step NAMESPACE spawning /usr/bin/test: No anode
===

Follow-up for 1beab8b0d0.
2018-09-18 14:31:09 +09:00
Yu Watanabe fb2042dd55 core: add new environment variable $RUNTIME_DIRECTORY= or friends
The variable is generated from RuntimeDirectory= or friends.
If multiple directories are set, then they are concatenated with
the separator ':'.
2018-09-13 17:02:58 +09:00
Yu Watanabe 7c1cb6f198 core: add one more assert() 2018-09-13 17:02:58 +09:00
Yu Watanabe 76a9460d44 core: fix assert() about number of built environment variables
Follow-up for 4b58153dd2 and
fd63e712b2.
2018-09-13 17:02:58 +09:00
Yu Watanabe 52e4d62550
Merge pull request #9852 from poettering/namespace-errno
namespace: be more careful when handling namespacing failures
2018-08-22 11:16:29 +09:00
Lennart Poettering 1beab8b0d0 namespace: be more careful when handling namespacing failures gracefully
This makes two changes to the namespacing code:

1. We'll only gracefully skip service namespacing on access failure if
   exclusively sandboxing options where selected, and not mount-related
   options that result in a very different view of the world. For example,
   ignoring RootDirectory=, RootImage= or Bind= is really probablematic,
   but ReadOnlyPaths= is just a weaker sandbox.

2. The namespacing code will now return a clearly recognizable error
   code when it cannot enforce its namespacing, so that we cannot
   confuse EPERM errors from mount() with those from unshare(). Only the
   errors from the first unshare() are now taken as hint to gracefully
   disable namespacing.

Fixes: #9844 #9835
2018-08-21 20:00:33 +02:00
Zbigniew Jędrzejewski-Szmek 7692fed98b
Merge pull request #9783 from poettering/get-user-creds-flags
beef up get_user_creds() a bit and other improvements
2018-08-21 10:09:33 +02:00
Lennart Poettering fafff8f1ff user-util: rework get_user_creds()
Let's fold get_user_creds_clean() into get_user_creds(), and introduce a
flags argument for it to select "clean" behaviour. This flags parameter
also learns to other new flags:

- USER_CREDS_SYNTHESIZE_FALLBACK: in this mode the user records for
  root/nobody are only synthesized as fallback. Normally, the synthesized
  records take precedence over what is in the user database.  With this
  flag set this is reversed, and the user database takes precedence, and
  the synthesized records are only used if they are missing there. This
  flag should be set in cases where doing NSS is deemed safe, and where
  there's interest in knowing the correct shell, for example if the
  admin changed root's shell to zsh or suchlike.

- USER_CREDS_ALLOW_MISSING: if set, and a UID/GID is specified by
  numeric value, and there's no user/group record for it accept it
  anyway. This allows us to fix #9767

This then also ports all users to set the most appropriate flags.

Fixes: #9767

[zj: remove one isempty() call]
2018-08-20 15:58:21 +02:00
Lennart Poettering 3cd24c1aa9 core: when setting up PAM, try to get tty of STDIN_FILENO if not set explicitly
When stdin/stdout/stderr is initialized from an fd, let's read the tty
name of it if we can, and pass that to PAM.

This makes sure that "machinectl shell" sessions have proper TTY fields
initialized that "loginctl" then shows.
2018-08-20 12:28:17 +02:00
Yu Watanabe 4c3a2b84d8 core/execute: fix dump format for Limit*=
Fixes #9846.
2018-08-10 11:59:16 +02:00
Yu Watanabe 7e8d494b33 core: use memcpy_safe()
Fixes #9738.
2018-08-08 17:11:43 +09:00
Zbigniew Jędrzejewski-Szmek 5b316330be
Merge pull request #9624 from poettering/service-state-flush
flush out ExecStatus structures when a new service cycle begins
2018-08-02 09:50:39 +02:00
Zbigniew Jędrzejewski-Szmek 54fe2ce1b9
Merge pull request #9504 from poettering/nss-deadlock
some nss deadlock love
2018-07-26 10:16:25 +02:00
Lennart Poettering 5686391b00 core: introduce new Type=exec service type
Users are often surprised that "systemd-run" command lines like
"systemd-run -p User=idontexist /bin/true" will return successfully,
even though the logs show that the process couldn't be invoked, as the
user "idontexist" doesn't exist. This is because Type=simple will only
wait until fork() succeeded before returning start-up success.

This patch adds a new service type Type=exec, which is very similar to
Type=simple, but waits until the child process completed the execve()
before returning success. It uses a pipe that has O_CLOEXEC set for this
logic, so that the kernel automatically sends POLLHUP on it when the
execve() succeeded but leaves the pipe open if not. This means PID 1
waits exactly until the execve() succeeded in the child, and not longer
and not shorter, which is the desired functionality.

Making use of this new functionality, the command line
"systemd-run -p User=idontexist -p Type=exec /bin/true" will now fail,
as expected.
2018-07-25 22:48:11 +02:00
Lennart Poettering 25b583d7ff core: swap order of "n_storage_fds" and "n_socket_fds" parameters
When process fd lists to pass to activated programs we always place the
socket activation fds first, and the storage fds last. Irritatingly in
almost all calls the "n_storage_fds" parameter (i.e. the number of
storage fds to pass) came first so far, and the "n_socket_fds" parameter
second. Let's clean this up, and specify the number of fds in the order
the fds themselves are passed.

(Also, let's fix one more case where "unsigned" was used to size an
array, while we should use "size_t" instead.)
2018-07-25 22:48:11 +02:00
Lennart Poettering 6a1d4d9fa6 core: properly reset all ExecStatus structures when entering a new unit cycle
Whenever a unit is started fresh we should flush out any runtime data
from the previous cycle. We are pretty good at that already, but what so
far we missed was the ExecStart=/ExecStop=/… command exit status data.
Let's fix that, and properly flush out that stuff too.

Consider this service:

    [Service]
    ExecStart=/bin/sleep infinity
    ExecStop=/bin/false

When this service is started, then stopped and then started again
"systemctl status" would show the ExecStop= results of the previous run
along with the ExecStart= results of the current one, which is very
confusing. With this patch this is corrected: the data is kept right
until the moment the new service cycle starts, and then flushed out.
Hence "systemctl status" in that case will only show the ExecStart=
data, but no ExecStop= data, like it should be.

This should fix part of the confusion of #9588
2018-07-23 13:36:47 +02:00
Lennart Poettering ee39ca20c6 core: drop "argv" field from ExecParameter structure
We always initialize it from the same field in ExecCommand anyway, hence
there's no point in passing it separately to exec_spawn(), after all we
already pass the ExecCommand structure itself anyway.

No change in behaviour.
2018-07-23 13:36:47 +02:00
Lennart Poettering 2ed26ed065 execute: use structure initialization when filling in exec status 2018-07-23 13:36:47 +02:00
Lennart Poettering d521916d0f pid1: tell PAM/NSS modules why we are calling them 2018-07-20 16:57:35 +02:00
Zsolt Dollenstein 566b7d23eb Add support for opening files for appending
Addresses part of #8983
2018-07-20 03:54:22 -07:00
Chris Lamb 3fe910794b Correct a number of trivial typos. 2018-06-18 22:44:44 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Lennart Poettering 228af36fff core: add new PrivateMounts= unit setting
This new setting is supposed to be useful in most cases where
"MountFlags=slave" is currently used, i.e. as an explicit way to run a
service in its own mount namespace and decouple propagation from all
mounts of the new mount namespace towards the host.

The effect of MountFlags=slave and PrivateMounts=yes is mostly the same,
as both cause a CLONE_NEWNS namespace to be opened, and both will result
in all mounts within it to be mounted MS_SLAVE. The difference is mostly
on the conceptual/philosophical level: configuring the propagation mode
is nothing people should have to think about, in particular as the
matter is not precisely easyto grok. Moreover, MountFlags= allows configuration
of "private" and "slave" modes which don't really make much sense to use
in real-life and are quite confusing. In particular PrivateMounts=private means
mounts made on the host stay pinned for good by the service which is
particularly nasty for removable media mount. And PrivateMounts=shared
is in most ways a NOP when used a alone...

The main technical difference between setting only MountFlags=slave or
only PrivateMounts=yes in a unit file is that the former remounts all
mounts to MS_SLAVE and leaves them there, while that latter remounts
them to MS_SHARED again right after. The latter is generally a nicer
approach, since it disables propagation, while MS_SHARED is afterwards
in effect, which is really nice as that means further namespacing down
the tree will get MS_SHARED logic by default and we unify how
applications see our mounts as we always pass them as MS_SHARED
regardless whether any mount namespacing is used or not.

The effect of PrivateMounts=yes was implied already by all the other
mount namespacing options. With this new option we add an explicit knob
for it, to request it without any other option used as well.

See: #4393
2018-06-12 16:12:10 +02:00
Zbigniew Jędrzejewski-Szmek a1230ff972 basic/log: add the log_struct terminator to macro
This way all callers do not need to specify it.
Exhaustively tested by running test-log under valgrind ;)
2018-06-04 13:46:03 +02:00
Yu Watanabe 37c56f89d2 core: setup mount namespace when RootDirectory= and RuntimeDirectory= or friends are set
The directories specified by RuntimeDirectory= or friends are created
on host. So, it is necessary to bind-mount them on root directory.
2018-05-25 17:33:03 +09:00
Yu Watanabe 5609f6888b core: make StateDirectory= or friends works with DynamicUser= and RootDirectory=/RootImage=
The symbolic links to private directories specified by StateDirectory=
or its friends are created on the host. So, when DynamicUser= and
RootDirectory=/RootImage= are set, then the executed process cannot
access private directory.
This makes the private directories are mounted on the non-private place
when both DynamicUser= and RootDirectory=/RootImage= are set.

Fixes #8965.
2018-05-25 17:25:17 +09:00
Lennart Poettering cdc0f9be92
Merge pull request #8817 from yuwata/cleanup-nsflags
core: allow to specify RestrictNamespaces= multiple times
2018-05-24 16:49:13 +02:00
Yu Watanabe fdff1da299 core: chown RuntimeDirectory= if DynamicUser= is set
When DynamicUser= is set, then RuntimeDirectory= should be always
chowned, as the service unit may enable RuntimeDirectoryPreserve=,
and the uid or gid may changed from the last run.
This also makes easier to migrate the service to use DynamicUser=.
2018-05-22 22:26:22 +09:00
Lennart Poettering 9f8168eb23 process-util: add new helper call for adjusting the OOM score
And let's make use of it in execute.c
2018-05-17 20:47:21 +02:00
Lennart Poettering 34a5df58da rlimit-util: introduce setrlimit_closest_all()
This new call applies all configured resource limits in one.
2018-05-17 20:40:04 +02:00
Lennart Poettering 31ce987c2b rlimit-util: add a common destructor call for arrays of struct rlimit 2018-05-17 20:36:52 +02:00
Lennart Poettering 6550c24c7f rlimit-util: rework rlimit_{from|to}_string() to work without "Limit" prefix
let's make the call more generic, so that we can also easily use it for
parsing "RLIMIT_xyz" style constants.
2018-05-17 20:36:52 +02:00
Felipe Sateler 57b7a260c2 core: undo the dependency inversion between unit.h and all unit types 2018-05-15 14:24:34 -04:00
Yu Watanabe 130d3d22e9 tree-wide: use strv_free_and_replace() macro 2018-05-10 00:57:34 +09:00
Yu Watanabe aa9d574de9 load-fragment: allow to specify RestrictNamespaces= multiple times
If multiple RestrictNamespaces= settings are set, then merge the settings.
This also drops supporting "~yes" and "~no".
2018-05-05 11:07:37 +09:00
Yu Watanabe 86c2a9f1c2 nsflsgs: drop namespace_flag_{from,to}_string()
This also drops namespace_flag_to_string_many_with_check(), and
renames namespace_flag_{from,to}_string_many() to
namespace_flags_{from,to}_string().
2018-05-05 11:07:37 +09:00
Yu Watanabe b5a33299b0 core: disable namespace sandboxing for '+' prefixed lines
Fixes #8842.
2018-05-01 13:44:06 +09:00
Lennart Poettering da6053d0a7 tree-wide: be more careful with the type of array sizes
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.

Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.

So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.

This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:

1. strv_length()' return type becomes size_t

2. the unit file changes array size becomes size_t

3. DNS answer and query array sizes become size_t

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
2018-04-27 14:29:06 +02:00
Lennart Poettering 5d13a15b1d tree-wide: drop spurious newlines (#8764)
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.

It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
2018-04-19 12:13:23 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Yu Watanabe 1cc6c93a95 tree-wide: use TAKE_PTR() and TAKE_FD() macros 2018-04-05 14:26:26 +09:00
Dimitri John Ledkov e64c2d0b5f core: use setreuid/setregid trick to create session keyring with right ownership (#8447)
Re-use the hacks used to link user keyring, when creating the session
keyring. This way changing ownership of the keyring is not required, and thus
incovation_id can be correctly created in restricted environments.

Creating invocation_id with root permissions works and linking it into session
keyring works, as at that point session keyring is possessed.

Simple way to validate this is with following commands:

$ journalctl -f &
$ sudo systemd-run --uid 1000 /bin/sh -c 'keyctl describe @s; keyctl list @s; keyctl read `keyctl search @s user invocation_id`'

which now works in LXD containers as well as on the host.

Fixes: https://github.com/systemd/systemd/issues/7655
2018-03-27 12:58:10 +02:00
Lennart Poettering 959071cac2
Merge pull request #8552 from keszybz/test-improvements
Test and diagnostics improvements
2018-03-23 15:26:54 +01:00
Zbigniew Jędrzejewski-Szmek 37c1d5e97d tree-wide: warn when a directory path already exists but has bad mode/owner/type
When we are attempting to create directory somewhere in the bowels of /var/lib
and get an error that it already exists, it can be quite hard to diagnose what
is wrong (especially for a user who is not aware that the directory must have
the specified owner, and permissions not looser than what was requested). Let's
print a warning in most cases. A warning is appropriate, because such state is
usually a sign of borked installation and needs to be resolved by the adminstrator.

$ build/test-fs-util

Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing.
   (or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing.
   (or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing.

Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting.

No functional change except for the new log lines.
2018-03-23 10:26:38 +01:00
Lennart Poettering ae2a15bc14 macro: introduce TAKE_PTR() macro
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.

This takes inspiration from Rust:

https://doc.rust-lang.org/std/option/enum.Option.html#method.take

and was suggested by Alan Jenkins (@sourcejedi).

It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
2018-03-22 20:21:42 +01:00
Zbigniew Jędrzejewski-Szmek d50b5839b0 basic/mkdir: convert bool flag to enum
In preparation for subsequent changes...
2018-03-22 15:57:56 +01:00
Lennart Poettering 2b33ab0957 tree-wide: port various places over to use new rearrange_stdio() 2018-03-02 11:42:10 +01:00
Zbigniew Jędrzejewski-Szmek 30c81ce2ce pid1: when creating service directories, don't chown existing files (#8181)
This partially reverts 3536f49e8f and
3536f49e8f.

When the user is dynamic, and we are setting up state, cache, or logs dirs,
behaviour is unchanged, we always do a recursive chown. This is necessary
because the user number might change between invocations.

But when setting up a directory for non-dynamic user, or a runtime directory
for a dynamic user, do any ownership or mode changes only when the directory
is initially created. Nothing says that the files under those directories have
to be all recursively owned by our user. This restores behaviour before
3536f49e8f, so modifications to the state of
the runtime directory persist between ExecStartPre's and ExecStart's, and even
longer in case the directory is persistent.

I think it _would_ be a nice property if setting a user would automatically
propagate to ownership of any Runtime/Logs/Cache directories. But this is
incompatible with another nice property, namely preserving changes to those
directories made by an admin, and with allowing change of ownership of files
in those directories by the service (e.g. to allow other users to access them).
Of the two, I think the second property is more important. Also, it's backwards
compatible.

https://bugzilla.redhat.com/show_bug.cgi?id=1508495

There is no need to chmod a directory we just created, so move that step
up into a branch. After that, 'effective' is only used once, so get rid of
it too.
2018-02-22 11:30:59 +01:00
Yu Watanabe 2abd4e388a core: add new setting TemporaryFileSystem=
This introduces a new setting TemporaryFileSystem=. This is useful
to hide files not relevant to the processes invoked by unit, while
necessary files or directories can be still accessed by combining
with Bind{,ReadOnly}Paths=.
2018-02-21 09:17:52 +09:00
Yu Watanabe 4ca763a902 core/namespace: make '-' prefix in Bind{,ReadOnly}Paths= work
Each path in `Bind{ReadOnly}Paths=` accept '-' prefix. However,
the prefix is completely ignored.
This makes it work as expected.
2018-02-21 09:07:56 +09:00
Yu Watanabe 8e06d57ccb core/execute: clear bind_mounts 2018-02-21 09:05:37 +09:00
Yu Watanabe a635a7aec6 core/execute: simplify compile_bind_mounts()
It is not necessary to re-assign error code.
2018-02-21 09:05:35 +09:00
Lennart Poettering 7b91264852 terminal-util: make resolve_dev_console() less weird
Let's normalize the behaviour: return a negative errno style error code,
and return the resolved string directly as argument.
2018-02-14 17:30:37 +01:00
Lennart Poettering 8854d79504 terminal-util: rework acquire_terminal()
This modernizes acquire_terminal() in a couple of ways:

1. The three boolean arguments are replaced by a flags parameter, that
   should be more descriptive in what it does.

2. We now properly handle inotify queue overruns

3. We use _cleanup_ for closing the fds now, to shorten the code quite a
   bit.

Behaviour should not be altered by this.
2018-02-13 21:24:37 +01:00
Yu Watanabe 34cf6c4340 core/execute: make arguments constant if possible
Also make functions static if possible.
2018-02-06 16:00:50 +09:00
Yu Watanabe e8a565cb66 core: make ExecRuntime be manager managed object
Before this, each ExecRuntime object is owned by a unit. However,
it may be shared with other units which enable JoinsNamespaceOf=.
Thus, by the serialization/deserialization process, its sharing
information, more specifically, reference counter is lost, and
causes issue #7790.

This makes ExecRuntime objects be managed by manager, and changes
the serialization/deserialization process.

Fixes #7790.
2018-02-06 16:00:34 +09:00
Zbigniew Jędrzejewski-Szmek 5035800495
Merge pull request #7763 from yuwata/fix-7761
Revert "core/execute: RuntimeDirectory= or friends requires mount namespace"
2018-01-05 12:38:29 +01:00
Lennart Poettering 2e87a1fde9 tree-wide: make use of wait_for_terminate_and_check() at various places
Using wait_for_terminate_and_check() instead of wait_for_terminate()
let's us simplify, shorten and unify the return value checking and
logging of waitid().  Hence, let's use it all over the place.
2018-01-04 13:27:27 +01:00
Yu Watanabe 4657abb5d4 execute: make "runtime" argument const in exec_needs_mount_namespace()
The argument can be const, then let's make so.
2018-01-04 00:48:18 +09:00
Yu Watanabe b43ee82fc1 core: RuntimeDirectory= does not request new mount namespace
Now RuntimeDirectory= does not create 'private' directory.
Thus, it is not neccessary to request new mount namespace.

Follow-up for 8092a48cc1.
2018-01-04 00:26:35 +09:00
Yu Watanabe 42b1d8e0f5 Revert "core/execute: RuntimeDirectory= or friends requires mount namespace"
This reverts commit 652bb2637a.

Fixes #7761.
2018-01-04 00:26:11 +09:00
Lennart Poettering 4c253ed1ca tree-wide: introduce new safe_fork() helper and port everything over
This adds a new safe_fork() wrapper around fork() and makes use of it
everywhere. The new wrapper does a couple of things we previously did
manually and separately in a safer, more correct and automatic way:

1. Optionally resets signal handlers/mask in the child

2. Sets a name on all processes we fork off right after forking off (and
   the patch assigns useful names for all processes we fork off now,
   following a systematic naming scheme: always enclosed in () – in order
   to indicate that these are not proper, exec()ed processes, but only
   forked off children, and if the process is long-running with only our
   own code, without execve()'ing something else, it gets am "sd-" prefix.)

3. Optionally closes all file descriptors in the child

4. Optionally sets a PR_SET_DEATHSIG to SIGTERM in the child, in a safe
   way so that the parent dying before this happens being handled
   safely.

5. Optionally reopens the logs

6. Optionally connects stdin/stdout/stderr to /dev/null

7. Debug logs about the forked off processes.
2017-12-25 11:48:21 +01:00
Yu Watanabe 586290017d tree-wide: use !strv_isempty() instead of strv_length() > 0 2017-12-19 10:43:57 +09:00
Lennart Poettering f1d34068ef tree-wide: add DEBUG_LOGGING macro that checks whether debug logging is on (#7645)
This makes things a bit easier to read I think, and also makes sure we
always use the _unlikely_ wrapper around it, which so far we used
sometimes and other times we didn't. Let's clean that up.
2017-12-15 11:09:00 +01:00
Lennart Poettering 234519ae6d tree-wide: drop a few == NULL and != NULL comparison
Our CODING_STYLE suggests not comparing with NULL, but relying on C's
downgrade-to-bool feature for that. Fix up some code to match these
guidelines. (This is not comprehensive, the coccinelle output for this
is unfortunately kinda borked)
2017-12-11 16:05:40 +01:00
Yu Watanabe da681e1bd2 tree-wide: use cpu_set_mfree() 2017-12-06 10:32:38 +09:00
Yu Watanabe 7f59dd3566 execute: define the variable mac_selinux_contex_net only when build with SELinux 2017-12-05 14:08:09 +09:00
Yu Watanabe 92b423b9b4 execute: define setup_smack() only if SMACK is enabled
This suppresses the following warning
```
execute.c:2149:12: warning: ‘setup_smack’ defined but not used [-Wunused-function]
 static int setup_smack(
            ^~~~~~~~~~~
```
2017-12-05 14:04:15 +09:00
Lennart Poettering 949befd3f0
core: support upgrading from DynamicUser=0 to DynamicUser=1 for unit directories (#7507)
This makes sure we migrate /var/lib/<foo> if it exists to
/var/lib/private/<foo> if DynamicUser=1 is set. This is useful to allow
turning on DynamicUser= on services that previously didn't use it, and
we can deal with this, and migrate the relevant directories as
necessary.

Note that "downgrading" from DynamicUser=1 backto DynamicUser=0 works
too. However in that case we simply continue to use
/var/lib/private/<foo>, which works because /var/lib/<foo> is a symlink
there after all.
2017-11-30 11:52:39 +01:00
Lennart Poettering 62b9bb2661 cgroup-util: merge cg_set_tasks_access() and cg-set_group_access() into one
We never use these functions seperately, hence don't bother splitting
them into to.

Also, simplify things a bit, and maintain tables for the attribute files
to chown. Let's also update those tables a bit, and include thenew
"cgroup.threads" file in it, that needs to be delegated too, according
to the documentation.
2017-11-25 17:08:21 +01:00
jobol 37ac2744cc core/exec: Restore SmackProcessLabel setting (#7378)
Smack LSM needs the capability CAP_MAC_ADMIN to allow
setting of the current Smack exec label. Consequently,
dropping capabilities must be done after changing the
current exec label.

This is only related to Smack LSM. But for clarity and
regularity, all setting of security context moved before
dropping capabilities.

See Issue 7108
2017-11-21 12:01:13 +01:00
Lennart Poettering 0133d5553a
Merge pull request #7198 from poettering/stdin-stdout
Add StandardInput=data, StandardInput=file:... and more
2017-11-19 19:49:11 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering befc4a800e core: add exec_context_dump() support for fd: and file: stdio settings
This was missing for using fdnames as stdio, let's add support for
fdnames as well as file paths in one go.
2017-11-17 11:13:44 +01:00
Lennart Poettering 2038c3f584 core: add support for StandardInputFile= and friends
These new settings permit specifiying arbitrary paths as
stdin/stdout/stderr locations. We try to open/create them as necessary.
Some special magic is applied:

1) if the same path is specified for both input and output/stderr, we'll
   open it only once O_RDWR, and duplicate them fd instead.

2) If we an AF_UNIX socket path is specified, we'll connect() to it,
   rather than open() it. This allows invoking systemd services with
   stdin/stdout/stderr connected to arbitrary foreign service sockets.

Fixes: #3991
2017-11-17 11:13:44 +01:00