Commit graph

490 commits

Author SHA1 Message Date
Lennart Poettering a3be2849b2 seccomp: add new helper call seccomp_load_filter_set()
This allows us to unify most of the code in apply_protect_kernel_modules() and
apply_private_devices().
2016-10-24 17:32:50 +02:00
Lennart Poettering 8d7b0c8fd7 seccomp: add new seccomp_init_conservative() helper
This adds a new seccomp_init_conservative() helper call that is mostly just a
wrapper around seccomp_init(), but turns off NNP and adds in all secondary
archs, for best compatibility with everything else.

Pretty much all of our code used the very same constructs for these three
steps, hence unifying this in one small function makes things a lot shorter.

This also changes incorrect usage of the "scmp_filter_ctx" type at various
places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type
(pretty poor choice already!) that casts implicitly to and from all other
pointer types (even poorer choice: you defined a confusing type now, and don't
even gain any bit of type safety through it...). A lot of the code assumed the
type would refer to a structure, and hence aded additional "*" here and there.
Remove that.
2016-10-24 17:32:50 +02:00
Lennart Poettering 25a8d8a0cb core: rework apply_protect_kernel_modules() to use seccomp_add_syscall_filter_set()
Let's simplify this call, by making use of the new infrastructure.

This is actually more in line with Djalal's original patch but instead of
search the filter set in the array by its name we can now use the set index and
jump directly to it.
2016-10-24 17:32:50 +02:00
Lennart Poettering 8130926d32 core: rework syscall filter set handling
A variety of fixes:

- rename the SystemCallFilterSet structure to SyscallFilterSet. So far the main
  instance of it (the syscall_filter_sets[] array) used to abbreviate
  "SystemCall" as "Syscall". Let's stick to one of the two syntaxes, and not
  mix and match too wildly. Let's pick the shorter name in this case, as it is
  sufficiently well established to not confuse hackers reading this.

- Export explicit indexes into the syscall_filter_sets[] array via an enum.
  This way, code that wants to make use of a specific filter set, can index it
  directly via the enum, instead of having to search for it. This makes
  apply_private_devices() in particular a lot simpler.

- Provide two new helper calls in seccomp-util.c: syscall_filter_set_find() to
  find a set by its name, seccomp_add_syscall_filter_set() to add a set to a
  seccomp object.

- Update SystemCallFilter= parser to use extract_first_word().  Let's work on
  deprecating FOREACH_WORD_QUOTED().

- Simplify apply_private_devices() using this functionality
2016-10-24 17:32:50 +02:00
Lennart Poettering e0f3720e39 core: move misplaced comment to the right place 2016-10-24 17:29:51 +02:00
Lennart Poettering f673b62df6 core: simplify skip_seccomp_unavailable() a bit
Let's prefer early-exit over deep-indented if blocks. Not behavioural change.
2016-10-24 17:29:51 +02:00
Djalal Harouni 366ddd252e core: do not assert when sysconf(_SC_NGROUPS_MAX) fails (#4466)
Remove the assert and check the return code of sysconf(_SC_NGROUPS_MAX).

_SC_NGROUPS_MAX maps to NGROUPS_MAX which is defined in <limits.h> to
65536 these days. The value is a sysctl read-only
/proc/sys/kernel/ngroups_max and the kernel assumes that it is always
positive otherwise things may break. Follow this and support only
positive values for all other case return either -errno or -EOPNOTSUPP.

Now if there are systems that want to re-write NGROUPS_MAX then they
should not pass SupplementaryGroups= in units even if it is empty, in
this case nothing fails and we just ignore supplementary groups. However
if SupplementaryGroups= is passed even if it is empty we have to assume
that there will be groups manipulation from our side or the kernel and
since the kernel always assumes that NGROUPS_MAX is positive, then
follow that and support only positive values.
2016-10-24 13:13:06 +02:00
Djalal Harouni 8b6903ad4d core: lets move the setup of working directory before group enforce
This is minor but lets try to split and move bit by bit cgroups and
portable environment setup before applying the security context.
2016-10-23 23:27:20 +02:00
Djalal Harouni 4d885bd326 core: first lookup and cache creds then apply them after namespace setup
This fixes: https://github.com/systemd/systemd/issues/4357

Let's lookup and cache creds then apply them. We also switch from
getgroups() to getgrouplist().
2016-10-23 23:24:14 +02:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Luca Bruno 52c239d770 core/exec: add a named-descriptor option ("fd") for streams (#4179)
This commit adds a `fd` option to `StandardInput=`,
`StandardOutput=` and `StandardError=` properties in order to
connect standard streams to externally named descriptors provided
by some socket units.

This option looks for a file descriptor named as the corresponding
stream. Custom names can be specified, separated by a colon.
If multiple name-matches exist, the first matching fd will be used.
2016-10-17 20:05:49 -04:00
Zbigniew Jędrzejewski-Szmek 6b430fdb7c tree-wide: use mfree more 2016-10-16 23:35:39 -04:00
Djalal Harouni e66a2f658b core: make sure to dump ProtectKernelModules= value 2016-10-12 14:12:17 +02:00
Djalal Harouni 4084e8fc89 core: check protect_kernel_modules and private_devices in order to setup NNP 2016-10-12 14:12:07 +02:00
Djalal Harouni c575770b75 core:sandbox: lets make /lib/modules/ inaccessible on ProtectKernelModules=
Lets go further and make /lib/modules/ inaccessible for services that do
not have business with modules, this is a minor improvment but it may
help on setups with custom modules and they are limited... in regard of
kernel auto-load feature.

This change introduce NameSpaceInfo struct which we may embed later
inside ExecContext but for now lets just reduce the argument number to
setup_namespace() and merge ProtectKernelModules feature.
2016-10-12 14:11:16 +02:00
Djalal Harouni 502d704e5e core:sandbox: Add ProtectKernelModules= option
This is useful to turn off explicit module load and unload operations on modular
kernels. This option removes CAP_SYS_MODULE from the capability bounding set for
the unit, and installs a system call filter to block module system calls.

This option will not prevent the kernel from loading modules using the module
auto-load feature which is a system wide operation.
2016-10-12 13:31:21 +02:00
Lennart Poettering e0d2adfde6 core: chown() any TTY used for stdin, not just when StandardInput=tty is used (#4347)
If stdin is supplied as an fd for transient units (using the
StandardInputFileDescriptor pseudo-property for transient units), then we
should also fix up the TTY ownership, not just when we opened the TTY
ourselves.

This simply drops the explicit is_terminal_input()-based check. Note that
chown_terminal() internally does a much more appropriate isatty()-based check
anyway, hence we can drop this without replacement.

Fixes: #4260
2016-10-11 14:07:22 -04:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Lennart Poettering 97f0e76f18 user-util: rework maybe_setgroups() a bit
Let's drop the caching of the setgroups /proc field for now. While there's a
strict regime in place when it changes states, let's better not cache it since
we cannot really be sure we follow that regime correctly.

More importantly however, this is not in performance sensitive code, and
there's no indication the cache is really beneficial, hence let's drop the
caching and make things a bit simpler.

Also, while we are at it, rework the error handling a bit, and always return
negative errno-style error codes, following our usual coding style. This has
the benefit that we can sensible hanld read_one_line_file() errors, without
having to updat errno explicitly.
2016-10-06 19:04:10 +02:00
Lennart Poettering 2d6fce8d7c core: leave PAM stub process around with GIDs updated
In the process execution code of PID 1, before
096424d123 the GID settings where changed before
invoking PAM, and the UID settings after. After the change both changes are
made after the PAM session hooks are run. When invoking PAM we fork once, and
leave a stub process around which will invoke the PAM session end hooks when
the session goes away. This code previously was dropping the remaining privs
(which were precisely the UID). Fix this code to do this correctly again, by
really dropping them else (i.e. the GID as well).

While we are at it, also fix error logging of this code.

Fixes: #4238
2016-10-06 19:04:10 +02:00
Giuseppe Scrivano 36d854780c core: do not fail in a container if we can't use setgroups
It might be blocked through /proc/PID/setgroups
2016-10-06 11:49:00 +02:00
Stefan Schweter 629ff674ac tree-wide: remove consecutive duplicate words in comments 2016-10-04 17:06:25 +02:00
Djalal Harouni 8f81a5f61b core: Use @raw-io syscall group to filter I/O syscalls when PrivateDevices= is set
Instead of having a local syscall list, use the @raw-io group which
contains the same set of syscalls to filter.
2016-09-25 12:52:27 +02:00
Lennart Poettering cefc33aee2 execute: move SMACK setup code into its own function
While we are at it, move PAM code #ifdeffery into setup_pam() to simplify the
main execution logic a bit.
2016-09-25 10:52:57 +02:00
Lennart Poettering ba128bb809 execute: filter low-level I/O syscalls if PrivateDevices= is set
If device access is restricted via PrivateDevices=, let's also block the
various low-level I/O syscalls at the same time, so that we know that the
minimal set of devices in our virtualized /dev are really everything the unit
can access.
2016-09-25 10:52:57 +02:00
Lennart Poettering 096424d123 execute: drop group priviliges only after setting up namespace
If PrivateDevices=yes is set, the namespace code creates device nodes in /dev
that should be owned by the host's root, hence let's make sure we set up the
namespace before dropping group privileges.
2016-09-25 10:42:18 +02:00
Lennart Poettering 3fbe8dbe41 execute: if RuntimeDirectory= is set, it should be writable
Implicitly make all dirs set with RuntimeDirectory= writable, as the concept
otherwise makes no sense.
2016-09-25 10:19:05 +02:00
Lennart Poettering be39ccf3a0 execute: move suppression of HOME=/ and SHELL=/bin/nologin into user-util.c
This adds a new call get_user_creds_clean(), which is just like
get_user_creds() but returns NULL in the home/shell parameters if they contain
no useful information. This code previously lived in execute.c, but by
generalizing this we can reuse it in run.c.
2016-09-25 10:18:57 +02:00
Lennart Poettering 07689d5d2c execute: split out creation of runtime dirs into its own functions 2016-09-25 10:18:54 +02:00
Lennart Poettering 59eeb84ba6 core: add two new service settings ProtectKernelTunables= and ProtectControlGroups=
If enabled, these will block write access to /sys, /proc/sys and
/proc/sys/fs/cgroup.
2016-09-25 10:18:48 +02:00
Lennart Poettering 72246c2a65 core: enforce seccomp for secondary archs too, for all rules
Let's make sure that all our rules apply to all archs the local kernel
supports.
2016-09-25 10:18:44 +02:00
Felipe Sateler d347d9029c seccomp: also detect if seccomp filtering is enabled
In https://github.com/systemd/systemd/pull/4004 , a runtime detection
method for seccomp was added. However, it does not detect the case
where CONFIG_SECCOMP=y but CONFIG_SECCOMP_FILTER=n. This is possible
if the architecture does not support filtering yet.
Add a check for that case too.

While at it, change get_proc_field usage to use PR_GET_SECCOMP prctl,
as that should save a few system calls and (unnecessary) allocations.
Previously, reading of /proc/self/stat was done as recommended by
prctl(2) as safer. However, given that we need to do the prctl call
anyway, lets skip opening, reading and parsing the file.

Code for checking inspired by
https://outflux.net/teach-seccomp/autodetect.html
2016-09-06 20:25:49 -03:00
Felipe Sateler 83f12b27d1 core: do not fail at step SECCOMP if there is no kernel support (#4004)
Fixes #3882
2016-08-22 22:40:58 +03:00
Lennart Poettering fd63e712b2 core: bypass dynamic user lookups from dbus-daemon
dbus-daemon does NSS name look-ups in order to enforce its bus policy. This
might dead-lock if an NSS module use wants to use D-Bus for the look-up itself,
like our nss-systemd does. Let's work around this by bypassing bus
communication in the NSS module if we run inside of dbus-daemon. To make this
work we keep a bit of extra state in /run/systemd/dynamic-uid/ so that we don't
have to consult the bus, but can still resolve the names.

Note that the normal codepath continues to be via the bus, so that resolving
works from all mount namespaces and is subject to authentication, as before.

This is a bit dirty, but not too dirty, as dbus daemon is kinda special anyway
for PID 1.
2016-08-19 00:50:24 +02:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00
Lennart Poettering 92b25bcabb core: make use of uid_is_valid() when checking for UID validity 2016-08-18 22:49:48 +02:00
Zbigniew Jędrzejewski-Szmek d87a2ef782 Merge pull request #3884 from poettering/private-users 2016-08-06 17:04:45 -04:00
Lennart Poettering b08af3b127 core: only set the watchdog variables in ExecStart= lines 2016-08-04 23:08:05 +02:00
Lennart Poettering af9d16e10a core: use the correct APIs to determine whether a dual timestamp is initialized 2016-08-04 16:27:07 +02:00
Lennart Poettering c39f1ce24d core: turn various execution flags into a proper flags parameter
The ExecParameters structure contains a number of bit-flags, that were so far
exposed as bool:1, change this to a proper, single binary bit flag field. This
makes things a bit more expressive, and is helpful as we add more flags, since
these booleans are passed around in various callers, for example
service_spawn(), whose signature can be made much shorter now.

Not all bit booleans from ExecParameters are moved into the flags field for
now, but this can be added later.
2016-08-04 16:27:07 +02:00
Lennart Poettering d251207d55 core: add new PrivateUsers= option to service execution
This setting adds minimal user namespacing support to a service. When set the invoked
processes will run in their own user namespace. Only a trivial mapping will be
set up: the root user/group is mapped to root, and the user/group of the
service will be mapped to itself, everything else is mapped to nobody.

If this setting is used the service runs with no capabilities on the host, but
configurable capabilities within the service.

This setting is particularly useful in conjunction with RootDirectory= as the
need to synchronize /etc/passwd and /etc/group between the host and the service
OS tree is reduced, as only three UID/GIDs need to match: root, nobody and the
user of the service itself. But even outside the RootDirectory= case this
setting is useful to substantially reduce the attack surface of a service.

Example command to test this:

        systemd-run -p PrivateUsers=1 -p User=foobar -t /bin/sh

This runs a shell as user "foobar". When typing "ps" only processes owned by
"root", by "foobar", and by "nobody" should be visible.
2016-08-03 20:42:04 +02:00
Lennart Poettering 7049382803 execute: don't set $SHELL and $HOME for services, if they don't contain interesting data 2016-08-03 14:52:16 +02:00
Lennart Poettering 6af760f3b2 core: inherit TERM from PID 1 for all services started on /dev/console
This way, invoking nspawn from a shell in the best case inherits the TERM
setting all the way down into the login shell spawned in the container.

Fixes: #3697
2016-08-03 14:52:16 +02:00
Lennart Poettering 409093fe10 nss: add new "nss-systemd" NSS module for mapping dynamic users
With this NSS module all dynamic service users will be resolvable via NSS like
any real user.
2016-07-22 15:53:45 +02:00
Lennart Poettering 29206d4619 core: add a concept of "dynamic" user ids, that are allocated as long as a service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).

For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.

A simple way to test this is:

        systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22 15:53:45 +02:00
Lennart Poettering 33df919d5c execute: make sure JoinsNamespaceOf= doesn't leak ns fds to executed processes 2016-07-20 14:53:15 +02:00
Lennart Poettering 7a1ab780c4 execute: normalize connect_logger_as() parameters slightly
All other functions in execute.c that need the unit id take a Unit* parameter
as first argument. Let's change connect_logger_as() to follow a similar logic.
2016-07-20 14:53:15 +02:00
Alessandro Puccetti 2a624c36e6 doc,core: Read{Write,Only}Paths= and InaccessiblePaths=
This patch renames Read{Write,Only}Directories= and InaccessibleDirectories=
to Read{Write,Only}Paths= and InaccessiblePaths=, previous names are kept
as aliases but they are not advertised in the documentation.

Renamed variables:
`read_write_dirs` --> `read_write_paths`
`read_only_dirs` --> `read_only_paths`
`inaccessible_dirs` --> `inaccessible_paths`
2016-07-19 17:22:02 +02:00
Torstein Husebø 61233823aa treewide: fix typos and remove accidental repetition of words 2016-07-11 16:18:43 +02:00
Jouke Witteveen 84eada2f7f execute: Do not alter call-by-ref parameter on failure
Prevent free from being called on (a part of) the call-by-reference
variable env when setup_pam fails.
2016-07-08 09:42:48 +02:00
Jouke Witteveen 1280503b7e execute: Cleanup the environment early
By cleaning up before setting up PAM we maintain control of overriding
behavior in setting variables. Otherwise, pam_putenv is in control.
This also makes sure we use a cleaned up environment in replacing
variables in argv.
2016-07-07 14:15:50 +02:00
Lennart Poettering f4170c671b execute: add a new easy-to-use RestrictRealtime= option to units
It takes a boolean value. If true, access to SCHED_RR, SCHED_FIFO and
SCHED_DEADLINE is blocked, which my be used to lock up the system.
2016-06-23 01:45:45 +02:00
Lennart Poettering abd84d4d83 execute: be a little less drastic when MemoryDenyWriteExecute= hits
Let's politely refuse with EPERM rather than kill the whole thing right-away.
2016-06-23 01:35:04 +02:00
Lennart Poettering 686d9ba614 execute: set PR_SET_NO_NEW_PRIVS also in case the exec memory protection is used
This was forgotten when MemoryDenyWriteExecute= was added: we should set NNP in
all cases when we set seccomp filters.
2016-06-23 01:33:07 +02:00
Lennart Poettering 03857c43ce execute: use the return value of setrlimit_closest() properly
It's a function defined by us, hence we should look for the error in its return
value, not in "errno".
2016-06-23 01:31:24 +02:00
Lennart Poettering 7bce046bcf core: set $JOURNAL_STREAM to the dev_t/ino_t of the journal stream of executed services
This permits services to detect whether their stdout/stderr is connected to the
journal, and if so talk to the journal directly, thus permitting carrying of
metadata.

As requested by the gtk folks: #2473
2016-06-15 23:00:27 +02:00
Lennart Poettering fd1f9c89f7 execute: minor coding style improvements 2016-06-15 22:51:01 +02:00
Jouke Witteveen 2065ca699b core/execute: pass env vars to PAM session setup (#3503)
Move the merger of environment variables before setting up the PAM
session and pass the aggregate environment to PAM setup. This allows
control over the PAM session hooks through environment variables.

PAM session initiation may update the environment. On successful
initiation of a PAM session, we adopt the environment of the
PAM context.
2016-06-13 12:50:12 +02:00
Alessandro Puccetti cf677fe686 core/execute: add the magic character '!' to allow privileged execution (#3493)
This patch implements the new magic character '!'. By putting '!' in front
of a command, systemd executes it with full privileges ignoring paramters
such as User, Group, SupplementaryGroups, CapabilityBoundingSet,
AmbientCapabilities, SecureBits, SystemCallFilter, SELinuxContext,
AppArmorProfile, SmackProcessLabel, and RestrictAddressFamilies.

Fixes partially https://github.com/systemd/systemd/issues/3414
Related to https://github.com/coreos/rkt/issues/2482

Testing:
1. Create a user 'bob'
2. Create the unit file /etc/systemd/system/exec-perm.service
   (You can use the example below)
3. sudo systemctl start ext-perm.service
4. Verify that the commands starting with '!' were not executed as bob,
   4.1 Looking to the output of ls -l /tmp/exec-perm
   4.2 Each file contains the result of the id command.

`````````````````````````````````````````````````````````````````
[Unit]
Description=ext-perm

[Service]
Type=oneshot
TimeoutStartSec=0
User=bob
ExecStartPre=!/usr/bin/sh -c "/usr/bin/rm /tmp/exec-perm*" ;
    /usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-start-pre"
ExecStart=/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-start" ;
    !/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-star-2"
ExecStartPost=/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-start-post"
ExecReload=/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-reload"
ExecStop=!/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-stop"
ExecStopPost=/usr/bin/sh -c "/usr/bin/id > /tmp/exec-perm-stop-post"

[Install]
WantedBy=multi-user.target]
`````````````````````````````````````````````````````````````````
2016-06-10 18:19:54 +02:00
Lennart Poettering 1ff74fb6e3 execute: check whether the specified fd is a tty before chowning/chmoding it (#3457)
Let's add an extra safety check before we chmod/chown a TTY to the right user,
as we might end up having connected something to STDIN/STDOUT that is actually
not a TTY, even though this might have been requested, due to permissive
StandardInput= settings or transient service activation with fds passed in.

Fixes:

https://bugs.freedesktop.org/show_bug.cgi?id=85255
2016-06-09 10:01:16 +02:00
Topi Miettinen f3e4363593 core: Restrict mmap and mprotect with PAGE_WRITE|PAGE_EXEC (#3319) (#3379)
New exec boolean MemoryDenyWriteExecute, when set, installs
a seccomp filter to reject mmap(2) with PAGE_WRITE|PAGE_EXEC
and mprotect(2) with PAGE_EXEC.
2016-06-03 17:58:18 +02:00
Lennart Poettering fc2fffe770 tree-wide: introduce new SOCKADDR_UN_LEN() macro, and use it everywhere
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.

This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
2016-05-05 22:24:36 +02:00
Daniel Mack 68de79d6a4 Merge pull request #2760 from ronnychevalier/rc/core_no_new_privileges_seccompv3
core: set NoNewPrivileges for seccomp if we don't have CAP_SYS_ADMIN
2016-03-21 12:57:43 +01:00
Ronny Chevalier 19c0b0b9a5 core: set NoNewPrivileges for seccomp if we don't have CAP_SYS_ADMIN
The manpage of seccomp specify that using seccomp with
SECCOMP_SET_MODE_FILTER will return EACCES if the caller do not have
CAP_SYS_ADMIN set, or if the no_new_privileges bit is not set. Hence,
without NoNewPrivilege set, it is impossible to use a SystemCall*
directive with a User directive set in system mode.

Now, NoNewPrivileges is set if we are in user mode, or if we are in
system mode and we don't have CAP_SYS_ADMIN, and SystemCall*
directives are used.
2016-02-28 14:44:26 +01:00
Thomas Hindoe Paaboel Andersen 7f508f2c74 tree-wide: indentation fixes 2016-02-26 22:23:38 +01:00
Vito Caputo 313cefa1d9 tree-wide: make ++/-- usage consistent WRT spacing
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands.  Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
2016-02-22 20:32:04 -08:00
Lennart Poettering 479050b363 core: drop Capabilities= setting
The setting is hardly useful (since its effect is generally reduced to zero due
to file system caps), and with the advent of ambient caps an actually useful
replacement exists, hence let's get rid of this.

I am pretty sure this was unused and our man page already recommended against
its use, hence this should be a safe thing to remove.
2016-02-13 11:59:34 +01:00
Daniel Mack 9ca6ff50ab Remove kdbus custom endpoint support
This feature will not be used anytime soon, so remove a bit of cruft.

The BusPolicy= config directive will stay around as compat noop.
2016-02-11 22:12:04 +01:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering 1e22b5cda0 core: don't reset /dev/console if stdin/stdout/stderr as passed as fd in a transient service
Otherwise we might end resetting /dev/console all the time when a transient service starts or stops.

Fixes #2377
Fixes #2198
Fixes #2061
2016-01-28 16:25:39 +01:00
Lennart Poettering 7bb70b6e3d core: normalize error handling a bit, in setup_pam()
Assign errno-style errors to a variable called "r" when they happen, the same way we do this in most other calls. It's
bad enough that the error handling part of the function deals with two different error variables (pam_code and r) now,
but before this fix it was even three!
2016-01-25 17:19:19 +01:00
Zbigniew Jędrzejewski-Szmek 2a836ca970 systemd: remove dead code
We only go to fail label if pam_pid <= 0.

CID #1306746.
2016-01-20 18:55:56 -05:00
Zbigniew Jędrzejewski-Szmek b326715278 tree-wide: check if errno is greater than zero (2)
Compare errno with zero in a way that tells gcc that
(if the condition is true) errno is positive.
2016-01-13 15:10:17 -05:00
Zbigniew Jędrzejewski-Szmek f5e5c28f42 tree-wide: check if errno is greater then zero
gcc is confused by the common idiom of
  return errno ? -errno : -ESOMETHING
and thinks a positive value may be returned. Replace this condition
with errno > 0 to help gcc and avoid many spurious warnings. I filed
a gcc rfe a long time ago, but it hard to say if it will ever be
implemented [1].

Both conventions were used in the codebase, this change makes things
more consistent. This is a follow up to bcb161b023.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61846
2016-01-13 15:09:55 -05:00
Ismo Puustinen 755d4b67a4 capabilities: added support for ambient capabilities.
This patch adds support for ambient capabilities in service files. The
idea with ambient capabilities is that the execed processes can run with
non-root user and get some inherited capabilities, without having any
need to add the capabilities to the executable file.

You need at least Linux 4.3 to use ambient capabilities. SecureBit
keep-caps is automatically added when you use ambient capabilities and
wish to change the user.

An example system service file might look like this:

[Unit]
Description=Service for testing caps

[Service]
ExecStart=/usr/bin/sleep 10000
User=nobody
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_RAW

After starting the service it has these capabilities:

CapInh: 0000000000003000
CapPrm: 0000000000003000
CapEff: 0000000000003000
CapBnd: 0000003fffffffff
CapAmb: 0000000000003000
2016-01-12 12:14:50 +02:00
Ismo Puustinen a103496ca5 capabilities: keep bounding set in non-inverted format.
Change the capability bounding set parser and logic so that the bounding
set is kept as a positive set internally. This means that the set
reflects those capabilities that we want to keep instead of drop.
2016-01-12 12:14:50 +02:00
Evgeny Vereshchagin 3c11da9d80 core: dump soft limits too 2015-11-30 21:29:39 +00:00
lc85446 74129a1276 core:execute: fix fork() fail handling in exec_spawn()
If pid < 0 after fork(), 0 is always returned because r =
    exec_context_load_environment() has exited successfully.

    This will make the caller of exec_spawn() not able to handle
    the fork() error case and make systemd abort assert() possibly.
2015-11-27 14:15:33 +01:00
Evgeny Vereshchagin fdbbadbd0d core: dump rlim_cur too 2015-11-27 11:26:55 +00:00
Filipe Brandenburger b4c14404b3 execute: Add new PassEnvironment= directive
This directive allows passing environment variables from the system
manager to spawned services. Variables in the system manager can be set
inside a container by passing `--set-env=...` options to systemd-spawn.

Tested with an on-disk test.service unit. Tested using multiple variable
names on a single line, with an empty setting to clear the current list
of variables, with non-existing variables.

Tested using `systemd-run -p PassEnvironment=VARNAME` to confirm it
works with transient units.

Confirmed that `systemctl show` will display the PassEnvironment
settings.

Checked that man pages are generated correctly.

No regressions in `make check`.
2015-11-11 07:55:23 -08:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 7d50b32a12 util-lib: split out globbing related calls into glob-util.[ch] 2015-10-27 13:25:58 +01:00
Lennart Poettering 7ccbd1ae84 util-lib: split out syslog-related calls into syslog-util.[ch] 2015-10-27 13:25:57 +01:00
Lennart Poettering 430f0182b7 src/basic: rename audit.[ch] → audit-util.[ch] and capability.[ch] → capability-util.[ch]
The files are named too generically, so that they might conflict with
the upstream project headers. Hence, let's add a "-util" suffix, to
clarify that this are just our utility headers and not any official
upstream headers.
2015-10-27 13:25:57 +01:00
Lennart Poettering 8b43440b7e util-lib: move string table stuff into its own string-table.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering f4f15635ec util-lib: move a number of fs operations into fs-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 78f22b973f util-lib: split out resource limits related calls into rlimit-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering b1d4f8e154 util-lib: split out user/group/uid/gid calls into user-util.[ch] 2015-10-26 01:24:38 +01:00
Lennart Poettering c004493cde util-lib: split out IO related calls to io-util.[ch] 2015-10-26 01:24:38 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Lennart Poettering 07630cea1f util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.

This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.

Also touches a few unrelated include files.
2015-10-24 23:05:02 +02:00
Evgeny Vereshchagin c2bbd90b5a core: dump RuntimeDirectories and RuntimeDirectoryMode too 2015-10-16 18:44:47 +00:00
Lennart Poettering 3d18b16755 util: do not reset terminal in acquire_terminal()
Before, we'd always reset acquired terminals, which is not really
desired, as we expose a setting TTYReset= which is supposed to control
whether the TTY is reset or not. Previously that setting would only
enable a second resetting of the TTY, which is of course pointless...

Hence, move the implicit resetting out of acquire_terminal() and make
the callers do it if they need it.
2015-10-08 15:03:31 +02:00
Lennart Poettering a34ceba66f core: add support for setting stdin/stdout/stderr for transient services
When starting a transient service, allow setting stdin/stdout/stderr fds
for it, by passing them in via the bus.

This also simplifies some of the serialization code for units.
2015-10-08 12:55:15 +02:00
Lennart Poettering e287086b8a ask-password: add support for caching passwords in the kernel keyring
This adds support for caching harddisk passwords in the kernel keyring
if it is available, thus supporting caching without Plymouth being
around.

This is also useful for hooking up "gdm-auto-login" with the collected
boot-time harddisk password, in order to support gnome keyring
passphrase unlocking via the HDD password, if it is the same.

Any passwords added to the kernel keyring this way have a timeout of
2.5min at which time they are purged from the kernel.
2015-10-07 12:26:14 +02:00
Lennart Poettering 8dd4c05b54 core: add support for naming file descriptors passed using socket activation
This adds support for naming file descriptors passed using socket
activation. The names are passed in a new $LISTEN_FDNAMES= environment
variable, that matches the existign $LISTEN_FDS= one and contains a
colon-separated list of names.

This also adds support for naming fds submitted to the per-service fd
store using FDNAME= in the sd_notify() message.

This also adds a new FileDescriptorName= setting for socket unit files
to set the name for fds created by socket units.

This also adds a new call sd_listen_fds_with_names(), that is similar to
sd_listen_fds(), but also returns the names of the fds.

systemd-activate gained the new --fdname= switch to specify a name for
testing socket activation.

This is based on #1247 by Maciej Wereski.

Fixes #1247.
2015-10-06 11:52:48 +02:00
Lennart Poettering 5f5d8eab1f core: allow setting WorkingDirectory= to the special value ~
If set to ~ the working directory is set to the home directory of the
user configured in User=.

This change also exposes the existing switch for the working directory
that allowed making missing working directories non-fatal.

This also changes "machinectl shell" to make use of this to ensure that
the invoked shell is by default in the user's home directory.

Fixes #1268.
2015-09-29 21:55:51 +02:00
Lennart Poettering 1ed11ff659 Merge pull request #1349 from dvdhrm/sync-pam
core: make setup_pam() synchronous
2015-09-23 18:42:41 +02:00
Sangjung Woo b213e1c11d exec: move mac_smack_apply_pid() and setup_pam() to same condition block
This cleans up exec_child() function by moving mac_smack_apply_pid()
and setup_pam() to the same condition block, since both of them have
the same condition (i.e params->apply_permissions). It improves
readability without changing its operation.
2015-09-23 22:11:38 +09:00
Sangjung Woo 6bf6e43e7e exec: call setup_pam() after SMACK labeling
When 'SmackProcessLabel=' is used in user@.service file, all processes
launched in systemd user session should be labeled as the designated name
of 'SmackProcessLabel' directive. However, if systemd has its own smack
label using '--with-smack-run-label' configuration, '(sd-pam)' is
labeled as the specific name of '--with-smack-run-label'. If
'SmackProcessLabel=' is used in user@.service file without
'--with-smack-run-label' configuration, (sd-pam) is labeled as "_" since
systemd (i.e. pid=1) is labeled as "_".

This is mainly because setup_pam() function is called before applying
smack label to child process. This patch fixes it by calling setup_pam()
after setting the smack label.
2015-09-23 20:41:52 +09:00
David Herrmann 2d7c6aa20c core: make setup_pam() synchronous
If we spawn a unit with a non-empty 'PAMName=', we fork off a
child-process _inside_ the unit, known as '(sd-pam)', which watches the
session. It waits for the main-process to exit and then finishes it via
pam_close_session(3).

However, the '(sd-pam)' setup is highly asynchronous. There is no
guarantee that process gets spawned before we finish the unit setup.
Therefore, there might be a root-owned process inside of the cgroup of
the unit, thus causing cg_migrate() to error-out with EPERM.

This patch makes setup_pam() synchronous and waits for the '(sd-pam)'
setup to finish before continuing. This guarantees that setresuid(2) was
at least tried before we continue with the child setup of the real unit.
Note that if setresuid(2) fails, we already warn loudly about it. You
really must make sure that you own the passed user if using 'PAMName='.
It seems very plausible to rely on that assumption.
2015-09-23 00:51:20 +02:00
Ronny Chevalier 5bc7452b32 core: fix group ownership when Group is set
When Group is set in the unit, the runtime directories are owned by
this group and not the default group of the user (same for cgroup paths
and standard outputs)

Fix #1231
2015-09-21 18:14:44 +02:00
Lennart Poettering c7cc737fce execute: fix return type from write() 2015-09-11 18:31:49 +02:00
Lennart Poettering 54eb2300ae execute: invalidate idle pipe after use
Not strictly necessary, but makes clear the fds are invalidated. Make
sure we do the same here as in most other cases.
2015-09-11 18:31:49 +02:00
Lennart Poettering 525d3cc746 tree-wide: take benefit of the fact that hashmap_free() returns NULL
And set_free() too.

Another Coccinelle patch.
2015-09-09 23:12:07 +02:00
Lennart Poettering 6796073e33 tree-wide: make use of the fact that strv_free() returns NULL
Another Coccinelle patch.
2015-09-09 23:05:58 +02:00
Lennart Poettering 1f6b411372 tree-wide: update empty-if coccinelle script to cover empty-while and more
Let's also clean up single-line while and for blocks.
2015-09-09 14:59:51 +02:00
Lennart Poettering 76ef789d26 tree-wide: make use of log_error_errno() return value
Turns this:

        r = -errno;
        log_error_errno(errno, "foo");

into this:

        r = log_error_errno(errno, "foo");

and this:

        r = log_error_errno(errno, "foo");
        return r;

into this:

        return log_error_errno(errno, "foo");
2015-09-09 08:20:20 +02:00
Lennart Poettering 3cc2aff1ab tree-wide: don't do assignments within if checks
Turn this:

       if ((r = foo()) < 0) { ...

into this:

       r = foo();
       if (r < 0) { ...
2015-09-09 08:20:20 +02:00
Lennart Poettering a1e58e8ee1 tree-wide: use coccinelle to patch a lot of code to use mfree()
This replaces this:

        free(p);
        p = NULL;

by this:

        p = mfree(p);

Change generated using coccinelle. Semantic patch is added to the
sources.
2015-09-09 08:19:27 +02:00
Lennart Poettering 023a4f6701 core: optionally create LOGIN_PROCESS or USER_PROCESS utmp entries
When generating utmp/wtmp entries, optionally add both LOGIN_PROCESS and
INIT_PROCESS entries or even all three of LOGIN_PROCESS, INIT_PROCESS
and USER_PROCESS entries, instead of just a single INIT_PROCESS entry.

With this change systemd may be used to not only invoke a getty directly
in a SysV-compliant way but alternatively also a login(1) implementation
or even forego getty and login entirely, and invoke arbitrary shells in
a way that they appear in who(1) or w(1).

This is preparation for a later commit that adds a "machinectl shell"
operation to invoke a shell in a container, in a way that is compatible
with who(1) and w(1).
2015-08-24 22:46:45 +02:00
Lennart Poettering 6bfe5c28ca execute: don't fail if we create the runtime directory from two processes simultaneously
If a service has both ExecStart= and ExecStartPost= set with
Type=simple, then it might happen that we have two children create the
runtime directory of a service (as configured with RuntimeDirectory=) at
the same time. Previously we did this with mkdir_safe() which will
create the dir only if it is missing, but if it already exists will at
least verify the access mode and ownership to match the right values.
This is problematic in this case, since it creates and then adjusts the
settings, thus it might happen that one child creates the directory with
root owner, another one then verifies it, and only afterwards the
directory ownership is fixed by the original child, while the second
child already failed.

With this change we'll now always adjust the access mode, so that we
know that it is right. In the worst case this means we adjust the
mode/ownership even though its unnecessary, but this should have no
negative effect.

https://bugzilla.redhat.com/show_bug.cgi?id=1226509
2015-08-06 14:56:20 +03:00
Thomas Hindoe Paaboel Andersen 7d6884b65e tree-wide: fix indentation 2015-08-06 00:44:19 +02:00
WaLyong Cho e419a0e310 core: set default process label only exec label is none
When command path has access label and no SmackProcessLabel= is not
set, default process label will be set. But if the default process
label has no rule for the access label of the command path then smack
access error will be occurred.
So, if the command path has execute label then the child have to set
its label to the same of execute label of command path instead of
default process label.
2015-08-04 21:11:24 +09:00
Daniel Mack ad118bda15 tree-wide: fix write_string_file() user that should not create files
The latest consolidation cleanup of write_string_file() revealed some users
of that helper which should have used write_string_file_no_create() in the
past but didn't. Basically, all existing users that write to files in /sys
and /proc should not expect to write to a file which is not yet existant.
2015-07-06 19:27:20 -04:00
Daniel Mack 4c1fc3e404 fileio: consolidate write_string_file*()
Merge write_string_file(), write_string_file_no_create() and
write_string_file_atomic() into write_string_file() and provide a flags mask
that allows combinations of atomic writing, newline appending and automatic
file creation. Change all users accordingly.
2015-07-06 19:19:25 -04:00
WaLyong Cho e174dce271 smack: add default smack process label config
Similar to SmackProcessLabel=, if this configuration is set, systemd
executes processes with given SMACK label. If unit has
SmackProcessLabel=, this config is overwritten.
But, do NOT be confused with SMACK64EXEC of execute file. This default
execute process label(and also label which is set by
SmackProcessLabel=) is set fork-ed process SMACK subject label and
used to access the execute file.
If the execution file has also SMACK64EXEC, finally executed process
has SMACK64EXEC subject.
While if the execution file has no SMACK64EXEC, the executed process
has label of this config(or label which is set by
SmackProcessLabel=). Because if execution file has no SMACK64EXEC then
excuted process inherits label from caller process(in this case, the
caller is systemd).
2015-06-22 23:44:09 +09:00
Kay Sievers 1b09f548c7 turn kdbus support into a runtime option
./configure --enable/disable-kdbus can be used to set the default
behavior regarding kdbus.

If no kdbus kernel support is available, dbus-dameon will be used.

With --enable-kdbus, the kernel command line option "kdbus=0" can
be used to disable kdbus.

With --disable-kdbus, the kernel command line option "kdbus=1" is
required to enable kdbus support.
2015-06-17 18:01:49 +02:00
Daniel Mack d6e5f3ad1f core: execute: fix regression in pam_setup()
Commit 72c0a2c25 ("everywhere: port everything to sigprocmask_many()
and friends") reworked code tree-wide to use the new sigprocmask_many()
helper. In this, it caused a regression in pam_setup, because it
dropped a line to initialize the 'ss' signal mask which is later used
in sigwait().

While at it, move the variable declaration to an inner scope.
2015-06-17 14:43:11 +02:00
Lennart Poettering 72c0a2c255 everywhere: port everything to sigprocmask_many() and friends
This ports a lot of manual code over to sigprocmask_many() and friends.

Also, we now consistly check for sigprocmask() failures with
assert_se(), since the call cannot realistically fail unless there's a
programming error.

Also encloses a few sd_event_add_signal() calls with (void) when we
ignore the return values for it knowingly.
2015-06-15 20:13:23 +02:00
Lennart Poettering ce30c8dcb4 tree-wide: whenever we fork off a foreign child process reset signal mask/handlers
Also, when the child is potentially long-running make sure to set a
death signal.

Also, ignore the result of the reset operations explicitly by casting
them to (void).
2015-06-10 01:28:58 +02:00
Lennart Poettering 24882e06c1 util: split out signal-util.[ch] from util.[ch]
No functional changes.
2015-05-29 20:14:11 +02:00
Lennart Poettering 050f727728 util: introduce PERSONALITY_INVALID as macro for 0xffffffffLU 2015-05-21 19:48:49 +02:00
Alban Crequy ee818b89f4 core: Private*/Protect* options with RootDirectory
When a service is chrooted with the option RootDirectory=/opt/..., then
the options PrivateDevices, PrivateTmp, ProtectHome, ProtectSystem must
mount the directories under $RootDirectory/{dev,tmp,home,usr,boot}.

The test-ns tool can test setup_namespace() with and without chroot:
 $ sudo TEST_NS_PROJECTS=/home/lennart/projects ./test-ns
 $ sudo TEST_NS_CHROOT=/home/alban/debian-tree TEST_NS_PROJECTS=/home/alban/debian-tree/home/alban/Documents ./test-ns
2015-05-18 18:47:45 +02:00
Lennart Poettering 8b44a3d22c core: make exec code a bit more readable
Let's add a function that checks whether we need fs namespacing, to make
things easier to read, instead of using a humungous if expression...
2015-05-13 16:34:02 +02:00
Lennart Poettering f2341e0a87 core,network: major per-object logging rework
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.

Also contains a couple of other logging improvements:

- Drops a couple of strerror() invocations in favour of using %m.

- Not only .mount units now warn if a symlinks exist for the mount
  point already, .automount units do that too, now.

- A few invocations of log_struct() that didn't actually pass any
  additional structured data have been replaced by simpler invocations
  of log_unit_info() and friends.

- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
  that works like LOG_MESSAGE() but prefixes the message with the unit
  name. Similar, there's now LOG_LINK_MESSAGE() and
  LOG_NETDEV_MESSAGE().

- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
  LOG_NETDEV_INTERFACE() macros have been added that generate the
  necessary per object fields. The old log_unit_struct() call has been
  removed in favour of these new macros used in raw log_struct()
  invocations. In addition to removing one more function call this
  allows generated structured log messages that contain two object
  fields, as necessary for example for network interfaces that are
  joined into another network interface, and whose messages shall be
  indexed by both.

- The LOG_ERRNO() macro has been removed, in favour of
  log_struct_errno(). The latter has the benefit of ensuring that %m in
  format strings is properly resolved to the specified error number.

- A number of logging messages have been converted to use
  log_unit_info() instead of log_info()

- The client code in sysv-generator no longer #includes core code from
  src/core/.

- log_unit_full_errno() has been removed, log_unit_full() instead takes
  an errno now, too.

- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
  avoid double evaluation of their parameters
2015-05-11 22:24:45 +02:00
Lennart Poettering dd305ec9c6 core: when we cannot add PID to a scope cgroup, log about it
Also, place the scope unit in failed state.
2015-04-28 12:20:57 +02:00
Ronny Chevalier 288a74cce5 shared: add terminal-util.[ch] 2015-04-11 00:34:02 +02:00
Ronny Chevalier 0b452006de shared: add process-util.[ch] 2015-04-10 23:54:49 +02:00
Ronny Chevalier 6482f6269c shared: add formats-util.h 2015-04-10 23:54:48 +02:00
Lennart Poettering c687863750 util: rework rm_rf() logic
- Move to its own file rm-rf.c

- Change parameters into a single flags parameter

- Remove "honour sticky" logic, it's unused these days
2015-04-06 10:57:53 +02:00
Zbigniew Jędrzejewski-Szmek cd972d6911 core: don't wait for reply if writing to pipe fails
This shouldn't really happen, but it's seems cleaner to
continue on error.

CID #1237552.
2015-03-13 23:42:17 -04:00
Thomas Hindoe Paaboel Andersen 2eec67acbb remove unused includes
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
2015-02-23 23:53:42 +01:00
Zbigniew Jędrzejewski-Szmek ca905b2fce core: make RuntimeDirectory honor SELinux labels
https://bugzilla.redhat.com/show_bug.cgi?id=1192726
2015-02-14 14:12:19 -05:00
Thomas Hindoe Paaboel Andersen 0a6f50c0af include <poll.h> instead of <sys/poll.h>
include-what-you-use automatically does this and it makes finding
unnecessary harder to spot. The only content of poll.h is a include
of sys/poll.h so should be harmless.
2015-02-12 20:47:38 +01:00
Lennart Poettering cf1d0302ae exec: also evaluate working_directory_missing_ok when not applying chroots 2015-02-12 18:58:43 +01:00
Lennart Poettering 4c08c8242a core: don't fail to run services in --user instances if $HOME is missing
Otherwise we cannot even invoke systemd-exit.service anymore, thus not
even exit.

https://bugs.freedesktop.org/show_bug.cgi?id=83100
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=759320
2015-02-12 12:21:16 +01:00
Cristian Rodríguez e89fe484df core: remove unneeded <libgen.h> include
execute.c only uses basename (the GNU version in <string.h>)
2015-02-11 18:31:29 +01:00
Lennart Poettering 63c372cb9d util: rework strappenda(), and rename it strjoina()
After all it is now much more like strjoin() than strappend(). At the
same time, add support for NULL sentinels, even if they are normally not
necessary.
2015-02-03 02:05:59 +01:00
Martin Pitt 6cb7fa17b3 core: Fix EACCES check for OOM adjustments
Commit 3bd5c3 added a check for EACCES, but missed the minus sign.
2015-01-13 07:06:31 +01:00
Lennart Poettering ff0af2a166 core: modernize execution code a bit
Among other things, avoid log_struct() unless we really need it.

Also, use "r" as variable to store function errors in, instead of "err".
"r" is pretty much what we use everywhere else, hence using the same
here make sense.

FInally, in the child, when we want to log, make sure to open the
logging framework first, since it is explicitly closed in preparation
for the exec().
2015-01-09 18:35:36 +01:00
Lennart Poettering 3bd5c32d93 core: check both EPERM and EACCES for OOM adjustments 2015-01-08 23:21:01 +01:00
Lennart Poettering d5243d6286 core: make EPERM errors when applying OOM adjustment for forked processes non-fatal
This should be useful for user namespaces.
2015-01-08 23:13:45 +01:00
Zbigniew Jędrzejewski-Szmek 553d2243e2 tree-wide: remove unnecessary LOG_PRI
LOG_DEBUG is already a log level, there is no need to use LOG_PRI which
is for filtering out the facility.
2015-01-06 00:29:40 -05:00
Zbigniew Jędrzejewski-Szmek 524daa8c3e journal: call connect() with dropped privileges
When systemd starts a service, it first opened /run/systemd/journal/stdout
socket, and only later switched to the right user.group (if they are
specified). Later on, journald looked at the credentials, and saw
root.root, because credentials are stored at the time the socket is
opened. As a result, all messages passed over _TRANSPORT=stdout were
logged with _UID=0, _GID=0.

Drop real uid and gid temporarily to fix the issue.
2015-01-05 14:12:58 -05:00
Lennart Poettering 039f0e70a0 env-util: don't include files from src/core/ 2014-12-23 19:15:27 +01:00
Lennart Poettering 3c70e3bb02 core: rearrange code so that libsystemd/sd-bus/ does not include header files from core
Stuff in src/shared or src/libsystemd should *never* include code from
src/core or any of the tools, so don't do that here either. It's not OK!
2014-12-23 19:15:27 +01:00
Lennart Poettering 24167f3db8 execute: the runtime directory can only be on tmpfs, hence don't use rm_rf_dangerous() needlessly 2014-12-19 02:07:42 +01:00
Zbigniew Jędrzejewski-Szmek f1acf85a36 core: make exec_command_free_list return NULL 2014-12-18 19:26:21 -05:00
Torstein Husebø ee33e53a70 core: correct spacing near eol in code comments 2014-12-11 15:09:51 +01:00
Lennart Poettering 7b3fd6313c scope: make attachment of initial PIDs a bit more robust 2014-12-10 22:06:44 +01:00
Lennart Poettering 2822da4fb7 util: introduce our own gperf based capability list
This way, we can ensure we have a more complete, up-to-date list of
capabilities around, always.
2014-12-10 03:21:07 +01:00
Michal Sekletar 9008e1ac92 selinux: figure out selinux context applied on exec() before closing all fds
We need original socket_fd around otherwise mac_selinux_get_child_mls_label
fails with -EINVAL return code. Also don't call setexeccon twice but rather pass
context value of SELinuxContext option as an extra argument.
2014-12-04 03:00:01 +01:00
Lennart Poettering fed1e721fd treewide: introduce UID_INVALID (and friends) as macro for (uid_t) -1 2014-11-28 20:55:04 +01:00
Michal Schmidt 56f64d9576 treewide: use log_*_errno whenever %m is in the format string
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.

Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'

Plus some whitespace, linewrap, and indent adjustments.
2014-11-28 19:49:27 +01:00
Michal Schmidt 31938a8560 core: convert log_unit_*() to log_unit_*_errno()
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_unit_(debug|info|notice|warning|error|emergency)\(([^"]+), "(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_unit_\1_errno(\2, \5, "\3%m"\4);/'
2014-11-28 13:29:21 +01:00
Michal Schmidt da927ba997 treewide: no need to negate errno for log_*_errno()
It corrrectly handles both positive and negative errno values.
2014-11-28 13:29:21 +01:00
Michal Schmidt 0a1beeb642 treewide: auto-convert the simple cases to log_*_errno()
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:

find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'

Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
2014-11-28 12:04:41 +01:00
Lennart Poettering e2cc6eca73 log: fix order of log_unit_struct() to match other logging calls
Also, while we are at it, introduce some syntactic sugar for creating
ERRNO= and MESSAGE= structured logging fields.
2014-11-28 02:18:46 +01:00
Lennart Poettering 79008bddf6 log: rearrange log function naming
- Rename log_meta() → log_internal(), to follow naming scheme of most
  other log functions that are usually invoked through macros, but never
  directly.

- Rename log_info_object() to log_object_info(), simply because the
  object should be before any other parameters, to follow OO-style
  programming style.
2014-11-27 22:05:24 +01:00
WaLyong Cho 2ca620c4ed smack: introduce new SmackProcessLabel option
In service file, if the file has some of special SMACK label in
ExecStart= and systemd has no permission for the special SMACK label
then permission error will occurred. To resolve this, systemd should
be able to set its SMACK label to something accessible of ExecStart=.
So introduce new SmackProcessLabel. If label is specified with
SmackProcessLabel= then the child systemd will set its label to
that. To successfully execute the ExecStart=, accessible label should
be specified with SmackProcessLabel=.
Additionally, by SMACK policy, if the file in ExecStart= has no
SMACK64EXEC then the executed process will have given label by
SmackProcessLabel=. But if the file has SMACK64EXEC then the
SMACK64EXEC label will be overridden.

[zj: reword man page]
2014-11-24 10:20:53 -05:00
Lennart Poettering a931ad47a8 core: introduce new Delegate=yes/no property controlling creation of cgroup subhierarchies
For priviliged units this resource control property ensures that the
processes have all controllers systemd manages enabled.

For unpriviliged services (those with User= set) this ensures that
access rights to the service cgroup is granted to the user in question,
to create further subgroups. Note that this only applies to the
name=systemd hierarchy though, as access to other controllers is not
safe for unpriviliged processes.

Delegate=yes should be set for container scopes where a systemd instance
inside the container shall manage the hierarchies below its own cgroup
and have access to all controllers.

Delegate=yes should also be set for user@.service, so that systemd
--user can run, controlling its own cgroup tree.

This commit changes machined, systemd-nspawn@.service and user@.service
to set this boolean, in order to ensure that container management will
just work, and the user systemd instance can run fine.
2014-11-05 18:49:14 +01:00
Lennart Poettering 6baa7db008 mac: also rename use_{smack,selinux,apparmor}() calls so that they share the new mac_{smack,selinux,apparmor}_xyz() convention 2014-10-23 17:34:30 +02:00
WaLyong Cho cc56fafeeb mac: rename apis with mac_{selinux/smack}_ prefix 2014-10-23 17:13:15 +02:00
Lukas Nykryn 7491ccf2cb environment: append unit_id to error messages regarding EnvironmentFile 2014-10-17 16:05:57 +02:00
Lennart Poettering 8fa6cbe1a9 execute: downgrade namespace error to "warning"
Also, extend the printed warning a bit, explaining the situation more
verbosely.
2014-10-17 13:54:27 +02:00
Michal Sekletar 0015ebf3fa execute: don't fail child when we don't have privileges to setup namespaces
If we don't have privileges to setup the namespaces then we are most likely
running inside some sort of unprivileged container, hence not being able to
create namespace is not a problem because spawned service can't access host
system anyway.
2014-10-17 11:51:46 +02:00
Michael Scherer 5482192e57 Report aa_change_onexec error code
Since aa_change_onexec return the error code in errno, and return
-1, the current code do not give any useful information when
something fail. This make apparmor easier to debug, as seen on
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=760526
2014-10-11 12:04:47 +02:00
Tom Gundersen e63ff941ea core: execute - don't leak strv 2014-09-30 11:34:01 +02:00
Jan Synacek 86b23b07c9 swap: introduce Discard property
Process possible "discard" values from /etc/fstab.
2014-09-29 11:08:12 -04:00
Michal Sekletar 16115b0a7b socket: introduce SELinuxContextFromNet option
This makes possible to spawn service instances triggered by socket with
MLS/MCS SELinux labels which are created based on information provided by
connected peer.

Implementation of label_get_child_mls_label derived from xinetd.

Reviewed-by: Paul Moore <pmoore@redhat.com>
2014-09-19 12:32:06 +02:00
Thomas Hindoe Paaboel Andersen 822a59607c execute: silence warnings
Mark two function parameters as const
2014-09-08 22:10:36 +02:00
Daniel Mack e44da745d1 service: hook up custom endpoint logic
If BusPolicy= was passed, the parser function will have created
an ExecContext->bus_endpoint object, along with policy information.

In that case, create a kdbus endpoint, and pass its path name to the
namespace logic, to it will be mounted over the actual 'bus' node.

At endpoint creation time, no policy is updloaded. That is done after
fork(), through a separate call. This is necessary because we don't
know the real uid of the process earlier than that.
2014-09-08 14:15:02 +02:00
Daniel Mack a610cc4f18 namespace: add support for custom kdbus endpoint
If a path to a previously created custom kdbus endpoint is passed in,
bind-mount a new devtmpfs that contains a 'bus' node, which in turn in
bind-mounted with the custom endpoint. This tmpfs then mounted over the
kdbus subtree that refers to the current bus.

This way, we can fake the bus node in order to lock down services with
a kdbus custom endpoint policy.
2014-09-08 14:12:56 +02:00
Daniel Mack bb7dd0b04a bus: add kdbus endpoint types
Add types to describe endpoints and associated policy entries,
and add a BusEndpoint instace to ExecContext.
2014-09-08 11:06:45 +02:00
Daniel Mack d35fbf6bdf exec: move code executed after fork into exec_child()
This factors out one conditional branch that has grown way too big, and
makes the code more readable by using return statements rather than jump
labels.
2014-09-05 12:19:39 +02:00
Daniel Mack 9fa95f8539 exec: factor out most function arguments of exec_spawn() to ExecParameters
This way, the list of arguments to that function gets more comprehensive,
and we can get around passing lots of NULL and 0 arguments from socket.c,
swap.c and mount.c.

It also allows for splitting up the code in exec_spawn().

While at it, make ExecContext const in execute.c.
2014-09-05 12:18:57 +02:00
Lennart Poettering 1b6d7fa742 util: make use of newly added reset_signal_mask() call wherever appropriate 2014-08-26 21:12:54 +02:00
Lennart Poettering f461c8073d execute: explain in a comment, why close_all_fds() is invoked the second time differently 2014-08-21 17:35:19 +02:00
Lennart Poettering 4c94096027 core: unify how we generate the prefix string when dumping unit state 2014-08-21 17:24:21 +02:00
Lennart Poettering 3bb07b7680 Revert "socket: introduce SELinuxLabelViaNet option"
This reverts commit cf8bd44339.

Needs more discussion on the mailing list.
2014-08-19 19:16:08 +02:00
Michal Sekletar cf8bd44339 socket: introduce SELinuxLabelViaNet option
This makes possible to spawn service instances triggered by socket with
MLS/MCS SELinux labels which are created based on information provided by
connected peer.

Implementation of label_get_child_label derived from xinetd.

Reviewed-by: Paul Moore <pmoore@redhat.com>
2014-08-19 18:57:12 +02:00
Kay Sievers 3a43da2832 time-util: add and use USEC/NSEC_INFINIY 2014-07-29 13:20:20 +02:00
Lennart Poettering 418b9be500 firstboot: add new component to query basic system settings on first boot, or when creating OS images offline
A new tool "systemd-firstboot" can be used either interactively on boot,
where it will query basic locale, timezone, hostname, root password
information and set it. Or it can be used non-interactively from the
command line when prepareing disk images for booting. When used
non-inertactively the tool can either copy settings from the host, or
take settings on the command line.

$ systemd-firstboot --root=/path/to/my/new/root --copy-locale --copy-root-password --hostname=waldi

The tool will be automatically invoked (interactively) now on first boot
if /etc is found unpopulated.

This also creates the infrastructure for generators to be notified via
an environment variable whether they are running on the first boot, or
not.
2014-07-07 15:25:55 +02:00
Lennart Poettering 717603e391 machinectl: show /etc/os-release information of container in status output 2014-07-03 17:54:24 +02:00
Ronny Chevalier e1d758033d use more _cleanup_ macro 2014-06-24 19:09:57 +02:00
Lennart Poettering 1b8689f949 core: rename ReadOnlySystem= to ProtectSystem= and add a third value for also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.

With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
2014-06-04 18:12:55 +02:00
Lennart Poettering 417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Zbigniew Jędrzejewski-Szmek de0671ee7f Remove unnecessary casts in printfs
No functional change expected :)
2014-05-15 15:29:58 +02:00
Lennart Poettering 7f8aa67131 core: remove tcpwrap support
tcpwrap is legacy code, that is barely maintained upstream. It's APIs
are awful, and the feature set it exposes (such as DNS and IDENT
access control) questionnable. We should not support this natively in
systemd.

Hence, let's remove the code. If people want to continue making use of
this, they can do so by plugging in "tcpd" for the processes they start.
With that scheme things are as well or badly supported as they were from
traditional inetd, hence no functionality is really lost.
2014-03-24 20:07:42 +01:00
Lennart Poettering 3d94f76c99 util: replace close_pipe() with new safe_close_pair()
safe_close_pair() is more like safe_close(), except that it handles
pairs of fds, and doesn't make and misleading allusion, as it works
similarly well for socketpairs() as for pipe()s...
2014-03-24 03:22:44 +01:00
Lennart Poettering 03e334a1c7 util: replace close_nointr_nofail() by a more useful safe_close()
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:

        fd = safe_close(fd);

Which will close an fd if it is open, and reset the fd variable
correctly.

By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
2014-03-18 19:31:34 +01:00
Lennart Poettering 517d56b1d0 missing: if RLIMIT_RTTIME is not defined by the libc, then we need a new define for the max number of rlimits, too 2014-03-05 02:31:09 +01:00
Lennart Poettering e66cf1a3f9 core: introduce new RuntimeDirectory= and RuntimeDirectoryMode= unit settings
As discussed on the ML these are useful to manage runtime directories
below /run for services.
2014-03-03 17:55:32 +01:00
Lennart Poettering 98b47d54ce execute: free directory path if we fail to remove it because we cannot allocate a thread 2014-03-03 17:55:32 +01:00
Lennart Poettering f513e420c8 exec: imply NoNewPriviliges= only when seccomp filters are used in user mode 2014-02-26 02:28:52 +01:00
Lennart Poettering 4298d0b512 core: add new RestrictAddressFamilies= switch
This new unit settings allows restricting which address families are
available to processes. This is an effective way to minimize the attack
surface of services, by turning off entire network stacks for them.

This is based on seccomp, and does not work on x86-32, since seccomp
cannot filter socketcall() syscalls on that platform.
2014-02-26 02:19:28 +01:00
Lennart Poettering 7c66bae2ff seccomp: we should control NO_NEW_PRIVS on our own, not let seccomp do this for us 2014-02-26 02:19:28 +01:00
Michael Scherer eef65bf3ee core: Add AppArmor profile switching
This permit to switch to a specific apparmor profile when starting a daemon. This
will result in a non operation if apparmor is disabled.
It also add a new build requirement on libapparmor for using this feature.
2014-02-21 03:44:20 +01:00
Lennart Poettering 1756a0118e execute: modernizations 2014-02-19 17:53:50 +01:00
Lennart Poettering ac45f971a1 core: add Personality= option for units to set the personality for spawned processes 2014-02-19 03:27:03 +01:00
Lennart Poettering e9642be2cc seccomp: add helper call to add all secondary archs to a seccomp filter
And make use of it where appropriate for executing services and for
nspawn.
2014-02-18 22:14:00 +01:00
Lennart Poettering 5f8640fb62 core: store and expose SELinuxContext field normalized as bool + string 2014-02-17 16:52:52 +01:00
Lennart Poettering 57183d117a core: add SystemCallArchitectures= unit setting to allow disabling of non-native
architecture support for system calls

Also, turn system call filter bus properties into complex types instead
of concatenated strings.
2014-02-13 00:24:00 +01:00
Lennart Poettering 351a19b17d core: fix build without libseccomp 2014-02-12 18:44:40 +01:00
Lennart Poettering 17df7223be core: rework syscall filter
- Allow configuration of an errno error to return from blacklisted
  syscalls, instead of immediately terminating a process.

- Fix parsing logic when libseccomp support is turned off

- Only keep the actual syscall set in the ExecContext, and generate the
  string version only on demand.
2014-02-12 18:30:36 +01:00
Ronny Chevalier c0467cf387 syscallfilter: port to libseccomp 2014-02-12 18:30:36 +01:00
Lennart Poettering 82adf6af7c nspawn,man: use a common vocabulary when referring to selinux security contexts
Let's always call the security labels the same way:

  SMACK: "Smack Label"
  SELINUX: "SELinux Security Context"

And the low-level encapsulation is called "seclabel". Now let's hope we
stick to this vocabulary in future, too, and don't mix "label"s and
"security contexts" and so on wildly.
2014-02-10 13:18:16 +01:00
Michael Scherer 0d3f7bb3a5 exec: Add support for ignoring errors on SELinuxContext by prefixing it with -, like for others settings.
Also remove call to security_check_context, as this doesn't serve anything, since
setexeccon will fail anyway.
2014-02-10 13:18:16 +01:00
Michael Scherer 5c56a259e0 exec: Ignore the setting SELinuxContext if selinux is not enabled 2014-02-10 13:18:16 +01:00
Michael Scherer 7b52a628f8 exec: Add SELinuxContext configuration item
This permit to let system administrators decide of the domain of a service.
This can be used with templated units to have each service in a différent
domain ( for example, a per customer database, using MLS or anything ),
or can be used to force a non selinux enabled system (jvm, erlang, etc)
to start in a different domain for each service.
2014-02-10 13:18:16 +01:00
Lennart Poettering 7f112f50fe exec: introduce PrivateDevices= switch to provide services with a private /dev
Similar to PrivateNetwork=, PrivateTmp= introduce PrivateDevices= that
sets up a private /dev with only the API pseudo-devices like /dev/null,
/dev/zero, /dev/random, but not any physical devices in them.
2014-01-20 21:28:37 +01:00
Zbigniew Jędrzejewski-Szmek 5ce70e5bcd Introduce cleanup functions for cap_free
Unfortunately a different cleanup function is necessary per type,
because cap_t** and char** are incompatible with void**.
2014-01-02 19:45:48 -05:00
Zbigniew Jędrzejewski-Szmek ccd06097c7 Use format patterns for usec_t, pid_t, nsec_t, usec_t
It is nicer to predefine patterns using configure time check instead of
using casts everywhere.

Since we do not need to use any flags, include "%" in the format instead
of excluding it like PRI* macros.
2014-01-02 19:45:47 -05:00
Stefan Beller f146f5e159 core: Forgot to dereference pointer when checking for NULL
Actually we already checked for !rt before, now we'd like to examine
the return value of the memory allocation.
2013-12-30 16:23:13 +01:00
Lennart Poettering 43a99a7afe build-sys: minor fixes found with cppcheck 2013-12-25 19:00:38 +01:00
Lennart Poettering 09812eb764 sd-daemon: introduce sd_watchdog_enabled() for parsing $WATCHDOG_USEC
Also, introduce a new environment variable named $WATCHDOG_PID which
cotnains the PID of the process that is supposed to send the keep-alive
events. This is similar how $LISTEN_FDS and $LISTEN_PID work together,
and protects against confusing processes further down the process tree
due to inherited environment.
2013-12-22 22:19:03 +01:00
Lennart Poettering 7cae38c4fa execute: set TERM even if we don't open the tty on our own
This way, when a tty path is configured TERM is set, which is nice to
set a useful term for gettys.
2013-12-18 18:21:28 +01:00
Lennart Poettering 531dca789e execute: also set SO_SNDBUF when spawning a service with stdout/stderr connected to journald 2013-12-16 20:34:46 +01:00
Zbigniew Jędrzejewski-Szmek 2b6bf07dd2 Get rid of our reimplementation of basename
The only problem is that libgen.h #defines basename to point to it's
own broken implementation instead of the GNU one. This can be fixed
by #undefining basename.
2013-12-06 21:29:55 -05:00
Lennart Poettering 613b411c94 service: add the ability for units to join other unit's PrivateNetwork= and PrivateTmp= namespaces 2013-11-27 20:28:48 +01:00
Lennart Poettering 9588bc3209 Remove dead code and unexport some calls
"make check-api-unused" informs us about code that is not used anymore
or that is exported but only used internally. Fix these all over the
place.
2013-11-08 18:12:45 +01:00
Lennart Poettering 250a918dc4 strv: introduce new strv_from_stdarg_alloca() macro to generate a string array from stdarg function parameters
This allows us to turn lists of strings passed in easily into string
arrays without having to allocate memory.
2013-10-29 19:53:43 +01:00
Zbigniew Jędrzejewski-Szmek d8c9d3a468 systemd: use unit name in PrivateTmp directories
Unit name is used whole in the directory name, so that the unit name
can be easily extracted from it, e.g. "/tmp/systemd-abcd.service-DEDBIF1".

https://bugzilla.redhat.com/show_bug.cgi?id=957439
2013-10-22 22:54:09 -04:00
Lennart Poettering 44b601bc79 macro: clean up usage of gcc attributes
Always use our own macros, and name all our own macros the same style.
2013-10-16 06:14:59 +02:00
Lennart Poettering 71fda00f32 list: make our list macros a bit easier to use by not requring type spec on each invocation
We can determine the list entry type via the typeof() gcc construct, and
so we should to make the macros much shorter to use.
2013-10-14 06:11:19 +02:00
Zbigniew Jędrzejewski-Szmek 970edce6ef execute: more debugging messages 2013-10-03 22:12:32 -04:00
Zbigniew Jędrzejewski-Szmek 59fccd8211 execute.c: always set $SHELL
In e6dca81 $SHELL was added to user@.service. Let's
instead provide it to all units which have a user.
2013-10-02 22:23:56 +02:00
Zbigniew Jędrzejewski-Szmek f546241b6d execute.c: little modernization 2013-09-26 11:12:03 +02:00
Lennart Poettering 13b84ec7df cgroup: if we do a cgroup operation then do something on all supported controllers
Previously we did operations like attach, trim or migrate only on the
controllers that were enabled for a specific unit. With this changes we
will now do them for all supproted controllers, and fall back to all
possible prefix paths if the specified paths do not exist.

This fixes issues if a controller is being disabled for a unit where it
was previously enabled, and makes sure that all processes stay as "far
down" the tree as groups exist.
2013-09-25 03:38:17 +02:00
Zbigniew Jędrzejewski-Szmek f485606bf8 Make tmpdir removal asynchronous
https://bugs.freedesktop.org/show_bug.cgi?id=68232
2013-09-17 10:26:30 -05:00
Zbigniew Jędrzejewski-Szmek 31a7eb86f1 systemd: do not output status messages once gettys are running
Make Type=idle communication bidirectional: when bootup is finished,
the manager, as before, signals idling Type=idle jobs to continue.
However, if the boot takes too long, idling jobs signal the manager
that they have had enough, wait a tiny bit more, and continue, taking
ownership of the console. The manager, when signalled that Type=idle
jobs are done, makes a note and will not write to the console anymore.

This is a cosmetic issue, but quite noticable, so let's just fix it.

Based on Harald Hoyer's patch.

https://bugs.freedesktop.org/show_bug.cgi?id=54247
http://unix.stackexchange.com/questions/51805/systemd-messages-after-starting-login/
2013-07-15 21:34:57 -04:00
Lennart Poettering 8aa7519366 core: grant user@.service instances write access to their own cgroup 2013-07-11 01:56:12 +02:00
Jason St. John e9c1ea9de8 replace tabs with spaces in various files
The affected files in this patch had inconsistent use of tabs vs. spaces
for indentation, and this patch eliminates the stray tabs.

Also, the opening brace of sigchld_hdl() in activate.c was moved so the
opening braces are consistent throughout the file.
2013-07-02 23:06:22 -04:00
Lennart Poettering 4ad490007b core: general cgroup rework
Replace the very generic cgroup hookup with a much simpler one. With
this change only the high-level cgroup settings remain, the ability to
set arbitrary cgroup attributes is removed, so is support for adding
units to arbitrary cgroup controllers or setting arbitrary paths for
them (especially paths that are different for the various controllers).

This also introduces a new -.slice root slice, that is the parent of
system.slice and friends. This enables easy admin configuration of
root-level cgrouo properties.

This replaces DeviceDeny= by DevicePolicy=, and implicitly adds in
/dev/null, /dev/zero and friends if DeviceAllow= is used (unless this is
turned off by DevicePolicy=).
2013-06-27 04:17:34 +02:00
Zbigniew Jędrzejewski-Szmek 507f22bd01 Use stdint.h macros instead of casts to print uint64_t values
Casts are visually heavy, and can obscure unwanted truncations.
2013-06-10 10:10:07 -04:00
Zbigniew Jędrzejewski-Szmek 44a6b1b680 Add __attribute__((const, pure, format)) in various places
I'm assuming that it's fine if a _const_ or _pure_ function
calls assert. It is assumed that the assert won't trigger,
and even if it does, it can only trigger on the first call
with a given set of parameters, and we don't care if the
compiler moves the order of calls.
2013-05-02 22:52:09 -04:00
Harald Hoyer 7fd1b19bc9 move _cleanup_ attribute in front of the type
http://lists.freedesktop.org/archives/systemd-devel/2013-April/010510.html
2013-04-18 09:11:22 +02:00
Harald Hoyer 5521d69856 core/execute: only clean the environment, if we have one 2013-04-18 07:16:01 +02:00
Lennart Poettering 099a804b11 execute: always add in EXECUTABLE= field when we log something about execution 2013-04-18 02:18:33 +02:00
Harald Hoyer ebc05a09ad core/execute: report invalid environment variables from files
Because "export key=val" is not supported by systemd, an error is logged
where the invalid assignment is coming from.

Introduce strv_env_clean_log() to log invalid environment assignments,
where logging is possible and allowed.

parse_env_file_internal() is modified to allow WHITESPACE in keys, to
report the issues later on.
2013-04-17 15:31:45 +02:00
Harald Hoyer 03bb799e0b core/execute.c: debug log final execve() with argv[]
https://bugzilla.redhat.com/show_bug.cgi?id=772073
2013-04-17 09:54:24 +02:00
Lennart Poettering 2d76d14ea3 cgroup: clean-ups 2013-04-08 20:35:25 +02:00
Lennart Poettering 974efc4658 cgroup: always keep access mode of 'tasks' and 'cgroup.procs' files in cgroup directories in sync 2013-04-08 18:22:47 +02:00
Zbigniew Jędrzejewski-Szmek c84a948831 Add _cleanup_globfree_
Fixes a memleak in error path in exec_context_load_environment.
2013-04-05 19:58:53 -04:00
Zbigniew Jędrzejewski-Szmek b92bea5d2a Use initalization instead of explicit zeroing
Before, we would initialize many fields twice: first
by filling the structure with zeros, and then a second
time with the real values. We can let the compiler do
the job for us, avoiding one copy.

A downside of this patch is that text gets slightly
bigger. This is because all zero() calls are effectively
inlined:

$ size build/.libs/systemd
         text    data     bss     dec     hex filename
before 897737  107300    2560 1007597   f5fed build/.libs/systemd
after  897873  107300    2560 1007733   f6075 build/.libs/systemd

… actually less than 1‰.

A few asserts that the parameter is not null had to be removed. I
don't think this changes much, because first, it is quite unlikely
for the assert to fail, and second, an immediate SEGV is almost as
good as an assert.
2013-04-05 19:50:57 -04:00
Lennart Poettering f73141d765 shared: rework env file reader
Implement this with a proper state machine, so that newlines and
escaped chars can appear in string assignments. This should bring the
parser much closer to shell.
2013-04-03 20:12:57 +02:00
Lennart Poettering 574d5f2dfc util: rename write_one_line_file() to write_string_file()
You can write much more than just one line with this call (and we
frequently do), so let's correct the naming.
2013-04-03 20:12:56 +02:00
Zbigniew Jędrzejewski-Szmek e62d8c3944 Modernization
Use _cleanup_ and wrap lines to ~80 chars and such.
2013-03-31 14:36:12 -04:00