Commit graph

146 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek b744e8937c Merge pull request #4067 from poettering/invocation-id
Add an "invocation ID" concept to the service manager
2016-10-11 13:40:50 -04:00
Lennart Poettering 1f0958f640 core: when determining whether a process exit status is clean, consider whether it is a command or a daemon
SIGTERM should be considered a clean exit code for daemons (i.e. long-running
processes, as a daemon without SIGTERM handler may be shut down without issues
via SIGTERM still) while it should not be considered a clean exit code for
commands (i.e. short-running processes).

Let's add two different clean checking modes for this, and use the right one at
the appropriate places.

Fixes: #4275
2016-10-10 22:57:01 +02:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00
Lennart Poettering a0fef983ab core: remember first unit failure, not last unit failure
Previously, the result value of a unit was overriden with each failure that
took place, so that the result always reported the last failure that took
place.

With this commit this is changed, so that the first failure taking place is
stored instead. This should normally not matter much as multiple failures are
sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT
to a service due to a watchdog timeout, then this currently would be reported
as "coredump" failure, rather than the "watchodg" failure it really is. Hence,
in order to report information about the type of the failure, and not about
the effect of it, let's change this from all unit type to store the first, not
the last failure.

This addresses the issue pointed out here:

https://github.com/systemd/systemd/pull/3818#discussion_r73433520
2016-08-04 23:08:05 +02:00
Lennart Poettering c39f1ce24d core: turn various execution flags into a proper flags parameter
The ExecParameters structure contains a number of bit-flags, that were so far
exposed as bool:1, change this to a proper, single binary bit flag field. This
makes things a bit more expressive, and is helpful as we add more flags, since
these booleans are passed around in various callers, for example
service_spawn(), whose signature can be made much shorter now.

Not all bit booleans from ExecParameters are moved into the flags field for
now, but this can be added later.
2016-08-04 16:27:07 +02:00
Lennart Poettering 29206d4619 core: add a concept of "dynamic" user ids, that are allocated as long as a service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).

For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.

A simple way to test this is:

        systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22 15:53:45 +02:00
Lennart Poettering 1ed7ebcfca Merge pull request #3170 from poettering/v230-preparation-fixes
make virtualization detection quieter, rework unit start limit logic, detect unit file drop-in changes correctly, fix autofs state propagation
2016-05-04 10:46:13 +02:00
Lennart Poettering 072993504e core: move enforcement of the start limit into per-unit-type code again
Let's move the enforcement of the per-unit start limit from unit.c into the
type-specific files again. For unit types that know a concept of "result" codes
this allows us to hook up the start limit condition to it with an explicit
result code. Also, this makes sure that the state checks in clal like
service_start() may be done before the start limit is checked, as the start
limit really should be checked last, right before everything has been verified
to be in order.

The generic start limit logic is left in unit.c, but the invocation of it is
moved into the per-type files, in the various xyz_start() functions, so that
they may place the check at the right location.

Note that this change drops the enforcement entirely from device, slice, target
and scope units, since these unit types generally may not fail activation, or
may only be activated a single time. This is also documented now.

Note that restores the "start-limit-hit" result code that existed before
6bf0f408e4 already in the service code. However,
it's not introduced for all units that have a result code concept.

Fixes #3166.
2016-05-02 13:08:00 +02:00
Zbigniew Jędrzejewski-Szmek ce99c68a33 Move no_instances information to shared/
This way it can be used in install.c in subsequent commit.
2016-05-01 19:58:59 -04:00
Zbigniew Jędrzejewski-Szmek 8a993b61d1 Move no_alias information to shared/
This way it can be used in install.c in subsequent commit.
2016-05-01 19:40:51 -04:00
Lennart Poettering 291d565a04 core,systemctl: add bus API to retrieve processes of a unit
This adds a new GetProcesses() bus call to the Unit object which returns an
array consisting of all PIDs, their process names, as well as their full cgroup
paths. This is then used by "systemctl status" to show the per-unit process
tree.

This has the benefit that the client-side no longer needs to access the
cgroupfs directly to show the process tree of a unit. Instead, it now uses this
new API, which means it also works if -H or -M are used correctly, as the
information from the specific host is used, and not the one from the local
system.

Fixes: #2945
2016-04-22 16:06:20 +02:00
Lennart Poettering 463d0d1569 core: remove ManagerRunningAs enum
Previously, we had two enums ManagerRunningAs and UnitFileScope, that were
mostly identical and converted from one to the other all the time. The latter
had one more value UNIT_FILE_GLOBAL however.

Let's simplify things, and remove ManagerRunningAs and replace it by
UnitFileScope everywhere, thus making the translation unnecessary. Introduce
two new macros MANAGER_IS_SYSTEM() and MANAGER_IS_USER() to simplify checking
if we are running in one or the user context.
2016-04-12 13:43:30 +02:00
Daniel Mack 9ca6ff50ab Remove kdbus custom endpoint support
This feature will not be used anytime soon, so remove a bit of cruft.

The BusPolicy= config directive will stay around as compat noop.
2016-02-11 22:12:04 +01:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering 7a7821c878 core: rework job_get_timeout() to use usec_t and handle USEC_INFINITY time events correctly 2016-02-04 00:35:43 +01:00
Lennart Poettering 36c16a7cdd core: rework unit timeout handling, and add new setting RuntimeMaxSec=
This clean-ups timeout handling in PID 1. Specifically, instead of storing 0 in internal timeout variables as
indication for a disabled timeout, use USEC_INFINITY which is in-line with how we do this in the rest of our code
(following the logic that 0 means "no", and USEC_INFINITY means "never").

This also replace all usec_t additions with invocations to usec_add(), so that USEC_INFINITY is properly propagated,
and sd-event considers it has indication for turning off the event source.

This also alters the deserialization of the units to restart timeouts from the time they were originally started from.
Before this patch timeouts would be restarted beginning with the time of the deserialization, which could lead to
artificially prolonged timeouts if a daemon reload took place.

Finally, a new RuntimeMaxSec= setting is introduced for service units, that specifies a maximum runtime after which a
specific service is forcibly terminated. This is useful to put time limits on time-intensive processing jobs.

This also simplifies the various xyz_spawn() calls of the various types in that explicit distruction of the timers is
removed, as that is done anyway by the state change handlers, and a state change is always done when the xyz_spawn()
calls fail.

Fixes: #2249
2016-02-01 22:18:16 +01:00
Harald Hoyer 9d06297e26 core: Do not bind a mount unit to a device, if it was from mountinfo
If a mount unit is bound to a device, systemd tries to umount the
mount point, if it thinks the device has gone away.

Due to the uevent queue and inotify of /proc/self/mountinfo being two
different sources, systemd can never get the ordering reliably correct.

It can happen, that in the uevent queue ADD,REMOVE,ADD is queued
and an inotify of mountinfo (or libmount event) happend with the
device in question.

systemd cannot know, at which point of time the mount happend in the
ADD,REMOVE,ADD sequence.

The real ordering might have been ADD,REMOVE,ADD,mount
and systemd might think ADD,mount,REMOVE,ADD and would umount the
mountpoint.

A test script which triggered this behaviour is:
rm -f test-efi-disk.img
dd if=/dev/null of=test-efi-disk.img bs=1M seek=512 count=1
parted --script test-efi-disk.img \
  "mklabel gpt" \
  "mkpart ESP fat32 1MiB 511MiB" \
  "set 1 boot on"
LOOP=$(losetup --show -f -P test-efi-disk.img)
udevadm settle
mkfs.vfat -F32 ${LOOP}p1
mkdir -p mnt
mount ${LOOP}p1 mnt
... <dostuffwith mnt>

Without the "udevadm settle" systemd unmounted mnt while the script was
operating on mnt.

Of course the question is, why there was a REMOVE in the first place,
but this is not part of this patch.
2015-11-24 14:08:50 +01:00
Franck Bui 8bf23dc757 make sure all swap units are ordered before the swap target
When shutting down the system, the swap devices can be disabled long
time before the swap target is stopped. They're actually the first
units systemd turns off on my system.

This is incorrect and due to swap devices having multiple associated
swap unit files. The main one is usually created by the fstab
generator and is used to start the swap device.

Once done, systemd creates some 'alias' units for the same swap
device, one for each swap dev link. But those units are missing an
ordering dependencies which was created by the fstab generator for the
main swap unit.

Therefore during shutdown those 'alias' units can be stopped at
anytime before unmount.target target.

This patch makes sure that all swap units are stopped after the
swap.target target.
2015-11-23 11:23:44 +01:00
Thomas Hindoe Paaboel Andersen b4bbcaa9c4 tree-wide: group include of libudev.h with sd-* 2015-11-17 07:06:08 +01:00
Thomas Hindoe Paaboel Andersen cf0fbc49e6 tree-wide: sort includes
Sort the includes accoding to the new coding style.
2015-11-16 22:09:36 +01:00
Lennart Poettering 4c9ea260ae core: simplify things a bit by checking default_dependencies boolean in callee, not caller
It's nicer to hide the check away in the various
xyz_add_default_dependencies() calls, rather than making it explicit in
the caller, and thus require deeper nesing.
2015-11-11 20:42:39 +01:00
Lennart Poettering ba64af90ec core: change return value of the unit's enumerate() call to void
We cannot handle enumeration failures in a sensible way, hence let's try
hard to continue without making such failures fatal, and log about it
with precise error messages.
2015-11-10 21:03:49 +01:00
Jan Engelhardt b938cb902c doc: correct punctuation and improve typography in documentation 2015-11-06 13:00:02 +01:00
Lennart Poettering c386f58864 core: all unit types that watch control PIDs should use the same logic
When coldplugging the unit state, make sure to follow the same basic
logic for all unit types: always verify whether the control PID is still
a waitable process before proceeding.
2015-10-27 17:59:45 +01:00
Lennart Poettering 7b3e062cb6 process-util: move a couple of process-related calls over 2015-10-27 14:24:58 +01:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 8b43440b7e util-lib: move string table stuff into its own string-table.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Lennart Poettering 07630cea1f util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.

This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.

Also touches a few unrelated include files.
2015-10-24 23:05:02 +02:00
Lennart Poettering 4f5dd3943b util: split out escaping code into escape.[ch]
This really deserves its own file, given how much code this is now.
2015-10-24 23:04:42 +02:00
Lennart Poettering a34ceba66f core: add support for setting stdin/stdout/stderr for transient services
When starting a transient service, allow setting stdin/stdout/stderr fds
for it, by passing them in via the bus.

This also simplifies some of the serialization code for units.
2015-10-08 12:55:15 +02:00
Zbigniew Jędrzejewski-Szmek 7e55de3b96 Move all unit states to basic/ and extend systemctl --state=help 2015-09-28 15:09:34 -04:00
Lennart Poettering 525d3cc746 tree-wide: take benefit of the fact that hashmap_free() returns NULL
And set_free() too.

Another Coccinelle patch.
2015-09-09 23:12:07 +02:00
Lennart Poettering 74ca738f6a util: introduce safe_fclose() and port everything over to it
Adds a coccinelle script to port things over automatically.
2015-09-09 15:26:11 +02:00
Lennart Poettering a1e58e8ee1 tree-wide: use coccinelle to patch a lot of code to use mfree()
This replaces this:

        free(p);
        p = NULL;

by this:

        p = mfree(p);

Change generated using coccinelle. Semantic patch is added to the
sources.
2015-09-09 08:19:27 +02:00
Lennart Poettering 75f86906c5 basic: rework virtualization detection API
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
2015-09-07 13:42:47 +02:00
Lennart Poettering d79200e26e unit: unify how we assing slices to units
This adds a new call unit_set_slice(), and simplifies
unit_add_default_slice(). THis should make our code a bit more robust
and simpler.
2015-08-31 13:20:43 +02:00
Lennart Poettering 21b735e798 core: add unit_dbus_interface_from_type() to unit-name.h
Let's add a way to get the type-specific D-Bus interface of a unit from
either its type or name to src/basic/unit-name.[ch]. That way we can
share it with the client side, where it is useful in tools like cgls or
machinectl.

Also ports over machinectl to make use of this.
2015-08-28 02:10:10 +02:00
Michal Schmidt c382d69e3d core: remove generic job completion messages from unit vtables
These units' message format strings are identical to the generic
strings. Since we can always rely on the fallback, these are now
redundant.
2015-07-21 19:24:20 +02:00
Karel Zak bf1d7ba70a swap: use swapon -o
This patch simplify swapon usage in systemd. The command swapon(8)
since util-linux v2.26 supports "-o <list>". The idea is exactly the
same like for mount(8). The -o specifies options in fstab-compatible
way. For systemd it means that it does not have to care about things
like "discard" or another swapon specific options.

        swapon -o <options-from-fstab>

For backward compatibility the code cares about "Priority:" swap unit
field (for a case when Priority: is set, but pri= in the Options: is
missing).

References: http://lists.freedesktop.org/archives/systemd-devel/2014-October/023576.html
2015-05-25 17:06:15 +02:00
Lennart Poettering b2c23da8ce core: rename SystemdRunningAs to ManagerRunningAs
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's to
stick to this here too.
2015-05-11 22:51:49 +02:00
Lennart Poettering f2341e0a87 core,network: major per-object logging rework
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.

Also contains a couple of other logging improvements:

- Drops a couple of strerror() invocations in favour of using %m.

- Not only .mount units now warn if a symlinks exist for the mount
  point already, .automount units do that too, now.

- A few invocations of log_struct() that didn't actually pass any
  additional structured data have been replaced by simpler invocations
  of log_unit_info() and friends.

- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
  that works like LOG_MESSAGE() but prefixes the message with the unit
  name. Similar, there's now LOG_LINK_MESSAGE() and
  LOG_NETDEV_MESSAGE().

- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
  LOG_NETDEV_INTERFACE() macros have been added that generate the
  necessary per object fields. The old log_unit_struct() call has been
  removed in favour of these new macros used in raw log_struct()
  invocations. In addition to removing one more function call this
  allows generated structured log messages that contain two object
  fields, as necessary for example for network interfaces that are
  joined into another network interface, and whose messages shall be
  indexed by both.

- The LOG_ERRNO() macro has been removed, in favour of
  log_struct_errno(). The latter has the benefit of ensuring that %m in
  format strings is properly resolved to the specified error number.

- A number of logging messages have been converted to use
  log_unit_info() instead of log_info()

- The client code in sysv-generator no longer #includes core code from
  src/core/.

- log_unit_full_errno() has been removed, log_unit_full() instead takes
  an errno now, too.

- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
  avoid double evaluation of their parameters
2015-05-11 22:24:45 +02:00
Lennart Poettering 6f856a0992 core: always initialize ExecParamters.bus_endpoint_fd to -1
Otherwise it might be passed in as 0, which is a valid fd, but usually
does not refer to a real endpoint.
2015-05-11 22:18:21 +02:00
Thomas Hindoe Paaboel Andersen 12322d4100 swap: avoid null pointer dereference
CID#1297436
2015-05-11 21:03:50 +02:00
Lennart Poettering 7410616cd9 core: rework unit name validation and manipulation logic
A variety of changes:

- Make sure all our calls distuingish OOM from other errors if OOM is
  not the only error possible.

- Be much stricter when parsing escaped paths, do not accept trailing or
  leading escaped slashes.

- Change unit validation to take a bit mask for allowing plain names,
  instance names or template names or an combination thereof.

- Refuse manipulating invalid unit name
2015-05-05 15:06:42 -07:00
Lennart Poettering 1c2e9646e4 core: simplify unit type detection logic
Introduce a new call unit_type_supported() and make use of it
everywhere.

Also, drop Manager parameter from per-type supported method prototype.
2015-04-30 01:29:00 +02:00
Tom Gundersen 7dfbe2e3fc core: annotate event sources 2015-04-29 17:08:31 +02:00
Lennart Poettering be847e82cf Revert "core: do not spawn jobs or touch other units during coldplugging"
This reverts commit 6e392c9c45.

We really shouldn't invent external state keeping hashmaps, if we can
keep this state in the units themselves.
2015-04-24 15:51:10 +02:00