Commit graph

152 commits

Author SHA1 Message Date
Lennart Poettering 0133d5553a
Merge pull request #7198 from poettering/stdin-stdout
Add StandardInput=data, StandardInput=file:... and more
2017-11-19 19:49:11 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering 99be45a46f fs-util: rename path_is_safe() → path_is_normalized()
Already, path_is_safe() refused paths container the "." dir. Doing that
isn't strictly necessary to be "safe" by most definitions of the word.
But it is necessary in order to consider a path "normalized". Hence,
"path_is_safe()" is slightly misleading a name, but
"path_is_normalize()" is more descriptive, hence let's rename things
accordingly.

No functional changes.
2017-11-17 11:13:44 +01:00
Lennart Poettering 2038c3f584 core: add support for StandardInputFile= and friends
These new settings permit specifiying arbitrary paths as
stdin/stdout/stderr locations. We try to open/create them as necessary.
Some special magic is applied:

1) if the same path is specified for both input and output/stderr, we'll
   open it only once O_RDWR, and duplicate them fd instead.

2) If we an AF_UNIX socket path is specified, we'll connect() to it,
   rather than open() it. This allows invoking systemd services with
   stdin/stdout/stderr connected to arbitrary foreign service sockets.

Fixes: #3991
2017-11-17 11:13:44 +01:00
Lennart Poettering 0664775c84 core: fix handling of transient StandardOutputFileDescriptorName= and friends
Let's make sure to process the fdname first, before changing the actual
input/output setting, since the fdname part can fail due to OOM.

This way we don't leave half-initialized bits around.
2017-11-17 11:13:44 +01:00
Lennart Poettering 5073ff6bec core: fold property_get_input_fdname() and property_get_output_fdname() into one
property_get_output_fdname() already had two different control flows for
stdout and stderr, it might as well handle stdin too, thus shortening
our code a bit.
2017-11-17 11:13:44 +01:00
Lennart Poettering 08f3be7a38 core: add two new unit file settings: StandardInputData= + StandardInputText=
Both permit configuring data to pass through STDIN to an invoked
process. StandardInputText= accepts a line of text (possibly with
embedded C-style escapes as well as unit specifiers), which is appended
to the buffer to pass as stdin, followed by a single newline.
StandardInputData= is similar, but accepts arbitrary base64 encoded
data, and will not resolve specifiers or C-style escapes, nor append
newlines.

This may be used to pass input/configuration data to services, directly
in-line from unit files, either in a cooked or in a more raw format.
2017-11-17 11:13:44 +01:00
Lennart Poettering d3070fbdf6 core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald
And let's make use of it to implement two new unit settings with it:

1. LogLevelMax= is a new per-unit setting that may be used to configure
   log priority filtering: set it to LogLevelMax=notice and only
   messages of level "notice" and lower (i.e. more important) will be
   processed, all others are dropped.

2. LogExtraFields= is a new per-unit setting for configuring per-unit
   journal fields, that are implicitly included in every log record
   generated by the unit's processes. It takes field/value pairs in the
   form of FOO=BAR.

Also, related to this, one exisiting unit setting is ported to this new
facility:

3. The invocation ID is now pulled from /run/systemd/units/ instead of
   cgroupfs xattrs. This substantially relaxes requirements of systemd
   on the kernel version and the privileges it runs with (specifically,
   cgroupfs xattrs are not available in containers, since they are
   stored in kernel memory, and hence are unsafe to permit to lesser
   privileged code).

/run/systemd/units/ is a new directory, which contains a number of files
and symlinks encoding the above information. PID 1 creates and manages
these files, and journald reads them from there.

Note that this is supposed to be a direct path between PID 1 and the
journal only, due to the special runtime environment the journal runs
in. Normally, today we shouldn't introduce new interfaces that (mis-)use
a file system as IPC framework, and instead just an IPC system, but this
is very hard to do between the journal and PID 1, as long as the IPC
system is a subject PID 1 manages, and itself a client to the journal.

This patch cleans up a couple of types used in journal code:
specifically we switch to size_t for a couple of memory-sizing values,
as size_t is the right choice for everything that is memory.

Fixes: #4089
Fixes: #3041
Fixes: #4441
2017-11-16 12:40:17 +01:00
Yu Watanabe 3df90f24cc core: allow to specify errno number in SystemCallErrorNumber= 2017-11-11 21:54:24 +09:00
Yu Watanabe 8cfa775f4f core: add support to specify errno in SystemCallFilter=
This makes each system call in SystemCallFilter= blacklist optionally
takes errno name or number after a colon. The errno takes precedence
over the one given by SystemCallErrorNumber=.

C.f. #7173.
Closes #7169.
2017-11-11 21:54:12 +09:00
Zbigniew Jędrzejewski-Szmek ac097c841e Remove a bunch of unused variables
gcc does not warn about those, because of the _cleanup_ usage.
clang is smarter here.
2017-11-01 23:06:44 +01:00
Yu Watanabe 3f856a2813 core: fix D-Bus API
This fixes bugs introduced by cffaed83e8.

Closes #7129.
2017-10-24 17:21:27 +09:00
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that is the name is mispellt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
Lennart Poettering 091e9efed3 core: fix StateDirectory= (and friends) safety checks when decoding transient unit properties
Let's make sure relative directories such as "foo/bar" are accepted, by
using the same validation checks as in unit file parsing.
2017-10-02 17:41:44 +02:00
Lennart Poettering 72fd17682d core: usually our enum's _INVALID and _MAX special values are named after the full type
In most cases we followed the rule that the special _INVALID and _MAX
values we use in our enums use the full type name as prefix (in contrast
to regular values that we often make shorter), do so for
ExecDirectoryType as well.

No functional changes, just a little bit of renaming to make this code
more like the rest.
2017-10-02 17:41:43 +02:00
Zbigniew Jędrzejewski-Szmek 61ceaea5c8 Move one space from dbus-execute.c to execute.c
The number of spaces is conserved ;)
2017-09-16 08:45:02 +02:00
Lennart Poettering b1edf4456e core: add new per-unit setting KeyringMode= for controlling kernel keyring setup
Usually, it's a good thing that we isolate the kernel session keyring
for the various services and disconnect them from the user keyring.
However, in case of the cryptsetup key caching we actually want that
multiple instances of the cryptsetup service can share the keys in the
root user's user keyring, hence we need to be able to disable this logic
for them.

This adds KeyringMode=inherit|private|shared:

    inherit: don't do any keyring magic (this is the default in systemd --user)
    private: a private keyring as before (default in systemd --system)
    shared: the new setting
2017-09-15 16:53:35 +02:00
Lennart Poettering 1c68232ee2 core: rework how we treat specifiers in Environment= of transient units
Let's validate the data passed in after resolving specifiers, but let's
write out to the unit snippet the list without specifiers applied. This
way the pre-existing comment actually starts matching what is actually
implemented.
2017-09-14 15:17:40 +02:00
Lennart Poettering 41de9cc29e core: support specifier expansion in PassEnvironment=
I can't come up with any usecase for this, but let's add this here, to
match what we support for Environment=. It's kind surprising if we
support specifier expansion for some environment related settings, but
not for others.
2017-09-14 15:17:40 +02:00
Lennart Poettering 00819cc151 core: add new UnsetEnvironment= setting for unit files
With this setting we can explicitly unset specific variables for
processes of a unit, as last step of assembling the environment block
for them. This is useful to fix #6407.

While we are at it, greatly expand the documentation on how the
environment block for forked off processes is assembled.
2017-09-14 15:17:40 +02:00
Yu Watanabe ae978b9f6c bus-util: remove unnecessary initializations 2017-08-31 17:08:30 +09:00
Lennart Poettering 3167f78a11 core: open up LockPersonality= for transient units
Let's make "systemd-run -p LockPersonality=1 -t /bin/sh" work.
2017-08-29 15:58:13 +02:00
Topi Miettinen 78e864e5b3 seccomp: LockPersonality boolean (#6193)
Add LockPersonality boolean to allow locking down personality(2)
system call so that the execution domain can't be changed.
This may be useful to improve security because odd emulations
may be poorly tested and source of vulnerabilities, while
system services shouldn't need any weird personalities.
2017-08-29 15:54:50 +02:00
Lennart Poettering 3ed0cd26ea execute: replace command flag bools by a flags field
This way, we can extend it later on in an easier way, and can pass it
along nicely.
2017-08-10 14:44:58 +02:00
Yu Watanabe cffaed83e8 core: add missing properties in D-Bus API
Closes #6466.
2017-08-08 00:37:02 +09:00
Yu Watanabe e940c1ef1d core: fix typo 2017-08-06 13:08:37 +09:00
Yu Watanabe 3536f49e8f core: add {State,Cache,Log,Configuration}Directory= (#6384)
This introduces {State,Cache,Log,Configuration}Directory= those are
similar to RuntimeDirectory=. They create the directories under
/var/lib, /var/cache/, /var/log, or /etc, respectively, with the mode
specified in {State,Cache,Log,Configuration}DirectoryMode=.

This also fixes #6391.
2017-07-18 14:34:52 +02:00
Yu Watanabe 53f47dfc7b core: allow preserving contents of RuntimeDirectory= over process restart
This introduces RuntimeDirectoryPreserve= option which takes a boolean
argument or 'restart'.

Closes #6087.
2017-07-17 16:22:25 +09:00
Zbigniew Jędrzejewski-Szmek 23e4a234fc Merge pull request #6200 from poettering/ioprio-transient 2017-06-26 21:30:36 -04:00
Lennart Poettering 8d1dd6abe2 core: when parsing integers passed over the bus use int32_t, not int
sd-bus will return "i" integers as "int32_t", not "int". Now on all
archs we care about that's the same, but we still should do this
properly.
2017-06-26 17:43:18 +02:00
Lennart Poettering 7f452159b8 core: make IOSchedulingClass= and IOSchedulingPriority= settable for transient units
This patch is a bit more complex thant I hoped. In particular the single
IOScheduling= property exposed on the bus is split up into
IOSchedulingClass= and IOSchedulingPriority= (though compat is
retained). Otherwise the asymmetry between setting props and getting
them is a bit too nasty.

Fixes #5613
2017-06-26 17:43:18 +02:00
Zbigniew Jędrzejewski-Szmek f900f5825d core: allow specifiers to be resolved in Environment= passed over dbus (#6144)
When specifiers are included in the Environment block in StartTransientUnit,
we resolve specifiers on the PID1 side. Nevertheless we store the unresolved
version in the transient unit file, so that it'll be resolved when loading
the unit. I think this looks nicer.

I also removed the writing of the merged Environment block to the transient
file. Afaict, this resulted in variables being written multiple times, but
this needs to be tested properly.

Fixes #5699.
2017-06-26 11:26:59 +02:00
Evgeny Vereshchagin c9b0610856 core: fix values of BindPaths and BindReadOnlyPaths properties on 32-bit platforms (#5713)
$ busctl get-property \
    org.freedesktop.systemd1 \
    /org/freedesktop/systemd1/unit/run_2dr471de87550554a6dbb165501c33c5dab_2eservice \
    org.freedesktop.systemd1.Service BindReadOnlyPaths

a(ssbt) 1 "/etc" "/etc" false 9228635523571007488

The correct values are 0 and 16384
2017-04-10 13:20:17 +02:00
Lennart Poettering 915e6d1676 core: add RootImage= setting for using a specific image file as root directory for a service
This is similar to RootDirectory= but mounts the root file system from a
block device or loopback file instead of another directory.

This reuses the image dissector code now used by nspawn and
gpt-auto-discovery.
2017-02-07 12:19:42 +01:00
Lennart Poettering 20b7a0070c core: actually make "+" prefix in ReadOnlyPaths=, InaccessiblePaths=, ReadWritablePaths= work
5327c910d2 claimed to add support for "+"
for prefixing paths with the configured RootDirectory=. But actually it
only implemented it in the backend, it did not add support for it to the
configuration file parsers. Fix that now.
2017-02-07 11:22:05 +01:00
Lennart Poettering 5d997827e2 core: add a per-unit setting MountAPIVFS= for mounting /dev, /proc, /sys in conjunction with RootDirectory=
This adds a boolean unit file setting MountAPIVFS=. If set, the three
main API VFS mounts will be mounted for the service. This only has an
effect on RootDirectory=, which it makes a ton times more useful.

(This is basically the /dev + /proc + /sys mounting code posted in the
original #4727, but rebased on current git, and with the automatic logic
replaced by explicit logic controlled by a unit file setting)
2017-02-07 11:22:05 +01:00
Evgeny Vereshchagin 91d910e386 core: fix copy paste error (s/source/destination/) (#5197)
```
-bash-4.3# systemd-run --property BindPaths=/etc:tmp/hey sh -c 'ls /tmp/hey'
```
prints
`Destination path tmp/hey is not absolute.`
instead of
`Destination path /etc is not absolute.`

CID #1368239
2017-01-31 13:04:01 -05:00
Evgeny Vereshchagin 9b531f04fb core: fix memleak in bus_exec_context_set_transient_property (#5143)
Fixes:
```sh
systemd-run --property EnvironmentFile=/some/environment/file /bin/sleep 30
```
```
 23 bytes in 1 blocks are definitely lost in loss record 1 of 7
    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
    by 0x4E85488: malloc_multiply (alloc-util.h:70)
    by 0x4E85F19: strjoin_real (string-util.c:252)
    by 0x1AF741: bus_exec_context_set_transient_property (dbus-execute.c:1418)
    by 0x1A907C: bus_service_set_property (dbus-service.c:330)
    by 0x1A66DD: bus_unit_set_properties (dbus-unit.c:1456)
    by 0x19CF93: transient_unit_from_message (dbus-manager.c:892)
    by 0x19D388: method_start_transient_unit (dbus-manager.c:980)
    by 0x4F60544: method_callbacks_run (bus-objects.c:418)
    by 0x4F62D9D: object_find_and_run (bus-objects.c:1255)
    by 0x4F633CE: bus_process_object (bus-objects.c:1371)
    by 0x4F2CE1D: process_message (sd-bus.c:2563)
```
Closes: #5142
2017-01-24 21:53:50 -05:00
Zbigniew Jędrzejewski-Szmek c73838280c Modify mount_propagation_flags_from_string to return a normal int code
This means that callers can distiguish an error from flags==0,
and don't have to special-case the empty string.
2016-12-17 13:57:04 -05:00
Lennart Poettering d2d6c096f6 core: add ability to define arbitrary bind mounts for services
This adds two new settings BindPaths= and BindReadOnlyPaths=. They allow
defining arbitrary bind mounts specific to particular services. This is
particularly useful for services with RootDirectory= set as this permits making
specific bits of the host directory available to chrooted services.

The two new settings follow the concepts nspawn already possess in --bind= and
--bind-ro=, as well as the .nspawn settings Bind= and BindReadOnly= (and these
latter options should probably be renamed to BindPaths= and BindReadOnlyPaths=
too).

Fixes: #3439
2016-12-14 00:54:10 +01:00
Lennart Poettering 835552511e core: hook up MountFlags= to the transient unit logic
This makes "systemd-run -p MountFlags=shared -t /bin/sh" work, by making
MountFlags= to the list of properties that may be accessed transiently.
2016-12-13 21:22:13 +01:00
Djalal Harouni 6a8c2d5915 core: property is RestrictNamespaces with s 2016-11-15 15:04:38 +01:00
Lennart Poettering add005357d core: add new RestrictNamespaces= unit file setting
This new setting permits restricting whether namespaces may be created and
managed by processes started by a unit. It installs a seccomp filter blocking
certain invocations of unshare(), clone() and setns().

RestrictNamespaces=no is the default, and does not restrict namespaces in any
way. RestrictNamespaces=yes takes away the ability to create or manage any kind
of namspace. "RestrictNamespaces=mnt ipc" restricts the creation of namespaces
so that only mount and IPC namespaces may be created/managed, but no other
kind of namespaces.

This setting should be improve security quite a bit as in particular user
namespacing was a major source of CVEs in the kernel in the past, and is
accessible to unprivileged processes. With this setting the entire attack
surface may be removed for system services that do not make use of namespaces.
2016-11-04 07:40:13 -06:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Luca Bruno 52c239d770 core/exec: add a named-descriptor option ("fd") for streams (#4179)
This commit adds a `fd` option to `StandardInput=`,
`StandardOutput=` and `StandardError=` properties in order to
connect standard streams to externally named descriptors provided
by some socket units.

This option looks for a file descriptor named as the corresponding
stream. Custom names can be specified, separated by a colon.
If multiple name-matches exist, the first matching fd will be used.
2016-10-17 20:05:49 -04:00
Djalal Harouni 502d704e5e core:sandbox: Add ProtectKernelModules= option
This is useful to turn off explicit module load and unload operations on modular
kernels. This option removes CAP_SYS_MODULE from the capability bounding set for
the unit, and installs a system call filter to block module system calls.

This option will not prevent the kernel from loading modules using the module
auto-load feature which is a system wide operation.
2016-10-12 13:31:21 +02:00
Lennart Poettering 59eeb84ba6 core: add two new service settings ProtectKernelTunables= and ProtectControlGroups=
If enabled, these will block write access to /sys, /proc/sys and
/proc/sys/fs/cgroup.
2016-09-25 10:18:48 +02:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00
Zbigniew Jędrzejewski-Szmek d87a2ef782 Merge pull request #3884 from poettering/private-users 2016-08-06 17:04:45 -04:00
Lennart Poettering 41bf0590cc util-lib: unify parsing of nice level values
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.

No functional changes.
2016-08-05 11:18:32 +02:00