Commit graph

236 commits

Author SHA1 Message Date
Lennart Poettering 1c876927e4 copy: change the various copy_xyz() calls to take a unified flags parameter
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.
2017-02-17 10:22:28 +01:00
Zbigniew Jędrzejewski-Szmek ec251fe7d5 tree-wide: adjust fall through comments so that gcc is happy
gcc 7 adds -Wimplicit-fallthrough=3 to -Wextra. There are a few ways
we could deal with that. After we take into account the need to stay compatible
with older versions of the compiler (and other compilers), I don't think adding
__attribute__((fallthrough)), even as a macro, is worth the trouble. It sticks
out too much, a comment is just as good. But gcc has some very specific
requiremnts how the comment should look. Adjust it the specific form that it
likes. I don't think the extra stuff we had in those comments was adding much
value.

(Note: the documentation seems to be wrong, and seems to describe a different
pattern from the one that is actually used. I guess either the docs or the code
will have to change before gcc 7 is finalized.)
2017-01-31 14:04:55 -05:00
Zbigniew Jędrzejewski-Szmek da3bddc993 core: add missing "=" in message
For consistency. Also drop "e.g." because it's somewhat redundant with the
ellipsis and the message is pretty long already.

Follow-up for 4d1fe20a58.
2017-01-11 16:37:34 -05:00
Stefan Hajnoczi 359a5bcf78 core: add AF_VSOCK support to socket units
Accept AF_VSOCK listen addresses in socket unit files.  Both guest and
host can now take advantage of socket activation.

The QEMU guest agent has recently been modified to support socket
activation and can run over AF_VSOCK with this patch.
2017-01-10 15:29:04 +00:00
Lennart Poettering 41733ae1e0 core: fix sockaddr length calculation for sockaddr_pretty() (#4966)
Let's simply store the socket address length in the SocketPeer object so
that we can use it when invoking sockaddr_pretty():

This fixes the issue described in #4943, but avoids calling
getpeername() twice.
2016-12-29 11:21:37 +01:00
Lennart Poettering 4d1fe20a58 core: improve log message about missing Listen setting (#4988)
Fixes: #4987
2016-12-29 10:39:30 +01:00
Stefan Hajnoczi b9495e8d58 core: prevent invalid socket symlink target dereference (#4895)
socket_find_symlink_target() returns a pointer to
p->address.sockaddr.un.sun_path when the first byte is non-zero without
checking that this is AF_UNIX socket.  Since sockaddr is a union this
byte could be non-zero for AF_INET sockets.

Existing callers happen to be safe but is an accident waiting to happen.
Use socket_address_get_path() since it checks for AF_UNIX.
2016-12-16 11:20:27 +01:00
Lennart Poettering 5125e76243 core: move specifier expansion out of service.c/socket.c
This monopolizes unit file specifier expansion in load-fragment.c, and removes
it from socket.c + service.c. This way expansion becomes an operation done exclusively at time of loading unit files.

Previously specifiers were resolved for all settings during loading of unit
files with the exception of ExecStart= and friends which were resolved in
socket.c and service.c. With this change the latter is also moved to the
loading of unit files.

Fixes: #3061
2016-12-07 18:47:32 +01:00
Franck Bui 7d5ceb6416 core: allow to redirect confirmation messages to a different console
It's rather hard to parse the confirmation messages (enabled with
systemd.confirm_spawn=true) amongst the status messages and the kernel
ones (if enabled).

This patch gives the possibility to the user to redirect the confirmation
message to a different virtual console, either by giving its name or its path,
so those messages are separated from the other ones and easier to read.
2016-11-17 18:16:16 +01:00
Zbigniew Jędrzejewski-Szmek f97b34a629 Rename formats-util.h to format-util.h
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
2016-11-07 10:15:08 -05:00
Zbigniew Jędrzejewski-Szmek b744e8937c Merge pull request #4067 from poettering/invocation-id
Add an "invocation ID" concept to the service manager
2016-10-11 13:40:50 -04:00
Lennart Poettering 1f0958f640 core: when determining whether a process exit status is clean, consider whether it is a command or a daemon
SIGTERM should be considered a clean exit code for daemons (i.e. long-running
processes, as a daemon without SIGTERM handler may be shut down without issues
via SIGTERM still) while it should not be considered a clean exit code for
commands (i.e. short-running processes).

Let's add two different clean checking modes for this, and use the right one at
the appropriate places.

Fixes: #4275
2016-10-10 22:57:01 +02:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Paweł Szewczyk 00bb64ecfa core: Fix USB functionfs activation and clarify its documentation (#4188)
There was no certainty about how the path in service file should look
like for usb functionfs activation. Because of this it was treated
differently in different places, which made this feature unusable.

This patch fixes the path to be the *mount directory* of functionfs, not
ep0 file path and clarifies in the documentation that ListenUSBFunction should be
the location of functionfs mount point, not ep0 file itself.
2016-09-26 18:45:47 +02:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00
Zbigniew Jędrzejewski-Szmek 3bb81a80bd Merge pull request #3818 from poettering/exit-status-env
beef up /var/tmp and /tmp handling; set $SERVICE_RESULT/$EXIT_CODE/$EXIT_STATUS on ExecStop= and make sure root/nobody are always resolvable
2016-08-05 20:55:08 -04:00
Zbigniew Jędrzejewski-Szmek 80a58668d9 socket: add helper function to remove code duplication 2016-08-05 08:24:00 -04:00
Zbigniew Jędrzejewski-Szmek ea8f50f808 core/socket: include remote address in the message when dropping connection
Without the address the message is not very useful.

Aug 04 23:52:21 rawhide systemd[1]: testlimit.socket: Too many incoming connections (4) from source ::1, dropping connection.
2016-08-05 08:16:31 -04:00
Zbigniew Jędrzejewski-Szmek 3ebcd323bd systemd: do not serialize peer, bump count when deserializing socket instead 2016-08-05 08:16:31 -04:00
Zbigniew Jędrzejewski-Szmek 166cf510c2 core/socket: rework SocketPeer refcounting
Make functions and definitions that don't need to be shared local to
socket.c.
2016-08-05 08:12:31 -04:00
Zbigniew Jędrzejewski-Szmek 9a73653c3e systemd: convert peers_by_address to a set 2016-08-04 23:53:07 -04:00
Lennart Poettering a0fef983ab core: remember first unit failure, not last unit failure
Previously, the result value of a unit was overriden with each failure that
took place, so that the result always reported the last failure that took
place.

With this commit this is changed, so that the first failure taking place is
stored instead. This should normally not matter much as multiple failures are
sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT
to a service due to a watchdog timeout, then this currently would be reported
as "coredump" failure, rather than the "watchodg" failure it really is. Hence,
in order to report information about the type of the failure, and not about
the effect of it, let's change this from all unit type to store the first, not
the last failure.

This addresses the issue pointed out here:

https://github.com/systemd/systemd/pull/3818#discussion_r73433520
2016-08-04 23:08:05 +02:00
Lennart Poettering c39f1ce24d core: turn various execution flags into a proper flags parameter
The ExecParameters structure contains a number of bit-flags, that were so far
exposed as bool:1, change this to a proper, single binary bit flag field. This
makes things a bit more expressive, and is helpful as we add more flags, since
these booleans are passed around in various callers, for example
service_spawn(), whose signature can be made much shorter now.

Not all bit booleans from ExecParameters are moved into the flags field for
now, but this can be added later.
2016-08-04 16:27:07 +02:00
Susant Sahani 9d56542764 socket: add support to control no. of connections from one source (#3607)
Introduce MaxConnectionsPerSource= that is number of concurrent
connections allowed per IP.

RFE: 1939
2016-08-02 13:48:23 -04:00
Lennart Poettering 29206d4619 core: add a concept of "dynamic" user ids, that are allocated as long as a service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).

For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.

A simple way to test this is:

        systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22 15:53:45 +02:00
Lennart Poettering 8e38570ebe tree-wide: htonl() is weird, let's use htobe32() instead (#3538)
Super-important change, yeah!
2016-06-15 01:26:01 +02:00
Martin Pitt d75103d4c6 Merge pull request #3202 from poettering/socket-fixes
don't reopen socket fds when reloading the daemon
2016-05-08 21:09:35 +02:00
Evgeny Vereshchagin 1745fa70e7 core: dump TriggerLimitIntervalSec and TriggerLimitBurst too 2016-05-06 21:03:16 +00:00
Lennart Poettering 60d9771c59 core: rework how we flush incoming traffic when a socket unit goes down
Previously, we'd simply close and reopen the socket file descriptors. This is
problematic however, as we won't transition through the SOCKET_CHOWN state
then, and thus the file ownership won't be correct for the sockets.

Rework the flushing logic, and actually read any queued data from the sockets
for flushing, and accept any queued messages and disconnect them.
2016-05-06 13:29:26 +02:00
Lennart Poettering 01a8b46757 core: don't implicit open missing socket fds on daemon reload
Previously, when the daemon was reloaded and the configuration of a socket unit
file was changed so that a different set of socket ports was defined for the
socket we'd simply reopen the socket fds not yet open. This is problematic
however, as this means the SOCKET_CHOWN state is not run for them, and thus
their UID/GID is not corrected.

With this change, don't open the missing file descriptors, but log about this
issue, and ask the user to restart the socket explicit, to make sure all
missing fds are opened.

Fixes: #3171
2016-05-06 13:01:17 +02:00
Lennart Poettering d24e561d96 core: split out selinux label retrieval logic into a function of its own
This should bring no behavioural change.
2016-05-06 12:16:58 +02:00
Lennart Poettering d2a50e3b52 core: fix owner user/group output in socket dump
The unit file settings are called SocketUser= and SocketGroup= hence name these
fields that way in the "systemd-analyze dump" output too.

https://github.com/systemd/systemd/issues/3171#issuecomment-216216995
2016-05-05 22:34:47 +02:00
Lennart Poettering 1f15ce2846 core: change default trigger limits for socket units
Let's lower the default values a bit, and pick different defaults for
Accept=yes and Accept=no sockets.

Fixes: #3167
2016-05-05 22:34:47 +02:00
Lennart Poettering d14e3a0de9 core: don't propagate service state to sockets as long as there's still a job for the service queued 2016-05-02 16:41:41 +02:00
Lennart Poettering 072993504e core: move enforcement of the start limit into per-unit-type code again
Let's move the enforcement of the per-unit start limit from unit.c into the
type-specific files again. For unit types that know a concept of "result" codes
this allows us to hook up the start limit condition to it with an explicit
result code. Also, this makes sure that the state checks in clal like
service_start() may be done before the start limit is checked, as the start
limit really should be checked last, right before everything has been verified
to be in order.

The generic start limit logic is left in unit.c, but the invocation of it is
moved into the per-type files, in the various xyz_start() functions, so that
they may place the check at the right location.

Note that this change drops the enforcement entirely from device, slice, target
and scope units, since these unit types generally may not fail activation, or
may only be activated a single time. This is also documented now.

Note that restores the "start-limit-hit" result code that existed before
6bf0f408e4 already in the service code. However,
it's not introduced for all units that have a result code concept.

Fixes #3166.
2016-05-02 13:08:00 +02:00
Lennart Poettering e4f673174e core: rework socket/service GC logic
There's no need to set the no_gc bit for service units that socket units
prepare, as we always keep a proper reference (as maintained by unit_ref_set())
on them, and such references are honoured by the GC logic anyway. Moreover,
explicitly setting the no_gc bit is problematic if the socket gets GC'ed for a
reason, as the service might then leak with the bit set.
2016-04-29 16:27:49 +02:00
Lennart Poettering 18e854f8e5 socket: really always close auxiliary fds when closing socket fds 2016-04-29 16:27:49 +02:00
Lennart Poettering 3e7a1f50e4 core: make sure to close connection fd when we fail to activate a per-connection service
Fixes: #2993 #2691
2016-04-29 16:27:48 +02:00
Lennart Poettering 8b26cdbd2a core: introduce activation rate limiting for socket units
This adds two new settings TriggerLimitIntervalSec= and TriggerLimitBurst= that
define a rate limit for activation of socket units. When the limit is hit, the
socket is is put into a failure mode. This is an alternative fix for #2467,
since the original fix resulted in issue #2684.

In a later commit the StartLimitInterval=/StartLimitBurst= rate limiter will be
changed to be applied after any start conditions checks are made. This way,
there are two separate rate limiters enforced: one at triggering time, before
any jobs are queued with this patch, as well as the start limit that is moved
again to be run immediately before the unit is activated. Condition checks are
done in between the two, and thus no longer affect the start limit.
2016-04-29 16:27:48 +02:00
Lennart Poettering 291d565a04 core,systemctl: add bus API to retrieve processes of a unit
This adds a new GetProcesses() bus call to the Unit object which returns an
array consisting of all PIDs, their process names, as well as their full cgroup
paths. This is then used by "systemctl status" to show the per-unit process
tree.

This has the benefit that the client-side no longer needs to access the
cgroupfs directly to show the process tree of a unit. Instead, it now uses this
new API, which means it also works if -H or -M are used correctly, as the
information from the specific host is used, and not the one from the local
system.

Fixes: #2945
2016-04-22 16:06:20 +02:00
Lennart Poettering 463d0d1569 core: remove ManagerRunningAs enum
Previously, we had two enums ManagerRunningAs and UnitFileScope, that were
mostly identical and converted from one to the other all the time. The latter
had one more value UNIT_FILE_GLOBAL however.

Let's simplify things, and remove ManagerRunningAs and replace it by
UnitFileScope everywhere, thus making the translation unnecessary. Introduce
two new macros MANAGER_IS_SYSTEM() and MANAGER_IS_USER() to simplify checking
if we are running in one or the user context.
2016-04-12 13:43:30 +02:00
Georgia Brikis 27a6ea9163 core: Fix path for opening ffs endpoint ep0
usbffs_address_create() expects an absolute path to the file that is
supposed to be opened. The path specified only leads to the directory
containing the endpoint ep0 not the endpoint itself. This commit adds
the endpoints name to the path.
2016-03-23 17:47:45 +01:00
Vito Caputo 313cefa1d9 tree-wide: make ++/-- usage consistent WRT spacing
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands.  Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
2016-02-22 20:32:04 -08:00
Daniel Mack 9ca6ff50ab Remove kdbus custom endpoint support
This feature will not be used anytime soon, so remove a bit of cruft.

The BusPolicy= config directive will stay around as compat noop.
2016-02-11 22:12:04 +01:00
Martin Pitt 16a798deb3 Merge pull request #2569 from zonque/removals
Remove some old cruft
2016-02-10 14:01:46 +01:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering 6bf0f408e4 core: make the StartLimitXYZ= settings generic and apply to any kind of unit, not just services
This moves the StartLimitBurst=, StartLimitInterval=, StartLimitAction=, RebootArgument= from the [Service] section
into the [Unit] section of unit files, and thus support it in all unit types, not just in services.

This way we can enforce the start limit much earlier, in particular before testing the unit conditions, so that
repeated start-up failure due to failed conditions is also considered for the start limit logic.

For compatibility the four options may also be configured in the [Service] section still, but we only document them in
their new section [Unit].

This also renamed the socket unit failure code "service-failed-permanent" into "service-start-limit-hit" to express
more clearly what it is about, after all it's only triggered through the start limit being hit.

Finally, the code in busname_trigger_notify() and socket_trigger_notify() is altered to become more alike.

Fixes: #2467
2016-02-10 13:26:56 +01:00
Lennart Poettering 7a7821c878 core: rework job_get_timeout() to use usec_t and handle USEC_INFINITY time events correctly 2016-02-04 00:35:43 +01:00
Lennart Poettering 36c16a7cdd core: rework unit timeout handling, and add new setting RuntimeMaxSec=
This clean-ups timeout handling in PID 1. Specifically, instead of storing 0 in internal timeout variables as
indication for a disabled timeout, use USEC_INFINITY which is in-line with how we do this in the rest of our code
(following the logic that 0 means "no", and USEC_INFINITY means "never").

This also replace all usec_t additions with invocations to usec_add(), so that USEC_INFINITY is properly propagated,
and sd-event considers it has indication for turning off the event source.

This also alters the deserialization of the units to restart timeouts from the time they were originally started from.
Before this patch timeouts would be restarted beginning with the time of the deserialization, which could lead to
artificially prolonged timeouts if a daemon reload took place.

Finally, a new RuntimeMaxSec= setting is introduced for service units, that specifies a maximum runtime after which a
specific service is forcibly terminated. This is useful to put time limits on time-intensive processing jobs.

This also simplifies the various xyz_spawn() calls of the various types in that explicit distruction of the timers is
removed, as that is done anyway by the state change handlers, and a state change is always done when the xyz_spawn()
calls fail.

Fixes: #2249
2016-02-01 22:18:16 +01:00
Susant Sahani 62bc4efc7a core: socket options fix SCTP_NODELAY
SCTP_NODELAY is diffrent to TCP_NODELAY.
Apply proper options in case of SCTP.
2015-12-31 12:05:57 +05:30