Commit Graph

554 Commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 2aed63f427 tree-wide: fix spelling of "fallback"
Similarly to "setup" vs. "set up", "fallback" is a noun, and "fall back"
is the verb. (This is pretty clear when we construct a sentence in the
present continous: "we are falling back" not "we are fallbacking").
2020-08-20 17:45:32 +02:00
Lennart Poettering c3f8a065e9 execute: take ownership of more fields in ExecParameters
Let's simplify things a bit, and take ownership of more fields in
ExecParameters, so that they are automatically freed when the structure
is released.
2020-07-23 08:37:21 +02:00
Zbigniew Jędrzejewski-Szmek cbc056c819 core: wrap some long lines and other formatting changes 2020-07-08 16:37:23 +02:00
Anita Zhang bb9244781c core: don't consider SERVICE_SKIP_CONDITION for abnormal or failure restarts
Fixes: #16115
2020-06-10 17:12:55 +02:00
Jan Klötzke bf76080180 core: let user define start-/stop-timeout behaviour
The usual behaviour when a timeout expires is to terminate/kill the
service. This is what user usually want in production systems. To debug
services that fail to start/stop (especially sporadic failures) it
might be necessary to trigger the watchdog machinery and write core
dumps, though. Likewise, it is usually just a waste of time to
gracefully stop a stuck service. Instead it might save time to go
directly into kill mode.

This commit adds two new options to services: TimeoutStartFailureMode=
and TimeoutStopFailureMode=. Both take the same values and tweak the
behavior of systemd when a start/stop timeout expires:

 * 'terminate': is the default behaviour as it has always been,
 * 'abort': triggers the watchdog machinery and will send SIGABRT
   (unless WatchdogSignal was changed) and
 * 'kill' will directly send SIGKILL.

To handle the stop failure mode in stop-post state too a new
final-watchdog state needs to be introduced.
2020-06-09 10:04:57 +02:00
Lennart Poettering 6d02412d51
Merge pull request #15928 from poettering/kill-mode-warnings
warn on KillMode=none, inform about left-over processes on stop and warn about sysv services
2020-05-27 15:05:41 +02:00
Chris Down 4793c31083 service: Display updated WatchdogUSec from sd_notify
Suppose a service has WatchdogSec set to 2 seconds in its unit file. I
then start the service and WatchdogUSec is set correctly:

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=2s

Now I call `sd_notify(0, "WATCHDOG_USEC=10000000")`. The new timer seems
to have taken effect, since I only send `WATCHDOG=1` every 4 seconds,
and systemd isn't triggering the watchdog handler. However, `systemctl
show` still shows WatchdogUSec as 2s:

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=2s

This seems surprising, since this "original" watchdog timer isn't the
one taking effect any more. This patch makes it so that we instead
display the new watchdog timer after sd_notify(WATCHDOG_USEC):

    % systemctl --user show psi-notify -p WatchdogUSec
    WatchdogUSec=10s

Fixes #15726.
2020-05-27 09:09:40 +02:00
Lennart Poettering b0cea477d4 core: some more structured initialization 2020-05-26 23:52:18 +02:00
Lennart Poettering 4c42543429 core: also log about left-over processes during unit stop
Only log at LOG_INFO level, i.e. make this informational. During start
let's leave it at LOG_WARNING though.

Of course, it's ugly leaving processes around like that either in start
or in stop, but at start its more dangerous than on stop, so be tougher
there.
2020-05-26 23:52:13 +02:00
Benjamin Robin fcee2755ec core: Update prototype of notify_message, tags list is read only
Indicates that the tags list cannot be modified by notify_message function.
Since the tags list is created only once for multiple call to
notify_message functions.
2020-05-10 18:58:03 +02:00
Zbigniew Jędrzejewski-Szmek f6e9aa9e45 pid1: convert to the new scheme
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
2020-05-05 22:40:37 +02:00
Zbigniew Jędrzejewski-Szmek b76ef59756
Merge pull request #13512 from msekletar/freezer
core: introduce support for cgroup freezer
2020-05-01 07:52:29 +02:00
Kenny Levinsen 3052049260 core: (De-)Serialize poll flag for fds in fdstore
This replaces manual string splitting and unescaping with
extract_first_word.
2020-04-30 19:42:53 +02:00
Kenny Levinsen cb5a46b845 core: Add optional FDPOLL=0 argument to fdstore
A service can specify FDSTORE=1 FDPOLL=0 to request that PID1 does not
poll the fd to remove them on error. If set, fds will only be removed on
FDSTOREREMOVE=1 or when the service is done.

Fixes: #12086
2020-04-30 19:42:26 +02:00
Michal Sekletár d9e45bc3ab core: introduce support for cgroup freezer
With cgroup v2 the cgroup freezer is implemented as a cgroup
attribute called cgroup.freeze. cgroup can be frozen by writing "1"
to the file and kernel will send us a notification through
"cgroup.events" after the operation is finished and processes in the
cgroup entered quiescent state, i.e. they are not scheduled to
run. Writing "0" to the attribute file does the inverse and process
execution is resumed.

This commit exposes above low-level functionality through systemd's DBus
API. Each unit type must provide specialized implementation for these
methods, otherwise, we return an error. So far only service, scope, and
slice unit types provide the support. It is possible to check if a
given unit has the support using CanFreeze() DBus property.

Note that DBus API has a synchronous behavior and we dispatch the reply
to freeze/thaw requests only after the kernel has notified us that
requested operation was completed.
2020-04-30 19:02:51 +02:00
Lennart Poettering e9da62b18a core: make sure to restore the control command id, too
Fixes: #15356
2020-04-22 20:34:02 +02:00
Lennart Poettering 5b99bd5fd4 core: some minor clean-ups/modernizations 2020-04-22 20:33:57 +02:00
Zbigniew Jędrzejewski-Szmek 162392b75a tree-wide: spellcheck using codespell
Fixes #15436.
2020-04-16 18:00:40 +02:00
Anita Zhang c1566ef0d2 core: transition to FINAL_SIGTERM state after ExecStopPost=
Fixes #14566
2020-03-11 10:15:33 +01:00
Zbigniew Jędrzejewski-Szmek eda0cbf071
Use Finished instead of Started for Type=oneshot services (#14851)
UnitStatusMessageFormats.finished_job, if present,
will be called with the same arguments as
job_get_done_status_message_format() to provide a format string
appropriate for the context

This commit replaces "Started" with "Finished" for started oneshot
units, as mentioned in the referenced issue

Closes #2458.
2020-03-05 17:24:19 +01:00
Lennart Poettering 3b7f79dc9f core: make sure StandardInput=file: doesn't get dup'ed to stdout/stderr by default
Fixes: #14560
2020-01-20 17:19:42 +01:00
Lennart Poettering eea45a3399
Merge pull request #14424 from poettering/watch-bus-name-rework
pid1: simplify drastically how we watch bus names for service's BusName= setting
2020-01-15 11:46:11 +01:00
Lennart Poettering c80a9a33d0 core: clearly refuse OnFailure= deps on units that can't fail
Similar, refuse triggering deps on units that cannot trigger.

And rework how we ignore After= dependencies on device units, to work
the same way.

See: #14142
2020-01-09 11:03:53 +01:00
Lennart Poettering fc67a943d9 core: drop initial ListNames() bus call from PID 1
Previously, when first connecting to the bus after connecting to it we'd
issue a ListNames() bus call to the driver to figure out which bus names
are currently active. This information was then used to initialize the
initial state for services that use BusName=.

This change removes the whole code for this and replaces it with
something vastly simpler.

First of all, the ListNames() call was issues synchronosuly, which meant
if dbus was for some reason synchronously calling into PID1 for some
reason we'd deadlock. As it turns out there's now a good chance it does:
the nss-systemd userdb hookup means that any user dbus-daemon resolves
might result in a varlink call into PID 1, and dbus resolves quite a lot
of users while parsing its policy. My original goal was to fix this
deadlock.

But as it turns out we don't need the ListNames() call at all anymore,
since #12957 has been merged. That PR was supposed to fix a race where
asynchronous installation of bus matches would cause us missing the
initial owner of a bus name when a service is first started. It fixed it
(correctly) by enquiring with GetOwnerName() who currently owns the
name, right after installing the match. But this means whenever we start watching a bus name we anyway
issue a GetOwnerName() for it, and that means also when first connecting
to the bus we don't need to issue ListNames() anymore since that just
tells us the same info: which names are currently owned.

hence, let's drop ListNames() and instead make better use of the
GetOwnerName() result: if it failed the name is not owned.

Also, while we are at it, let's simplify the unit's owner_name_changed()
callback(): let's drop the "old_owner" argument. We never used that
besides logging, and it's hard to synthesize from just the return of a
GetOwnerName(), hence don't bother.
2020-01-06 15:21:47 +01:00
Zbigniew Jędrzejewski-Szmek 868f7d36cc core/service: downgrade "scheduling restart" message to debug
I see we log this during every boot, even though it is a routine expected event:
Nov 12 14:50:01 krowka systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart.
(and for other services too). Let's downgrade this to debug level.

https://bugzilla.redhat.com/show_bug.cgi?id=1614871
2019-11-22 14:19:51 +01:00
Anita Zhang 3e1db806b0 core: change top-level drop-in from -.service.d to service.d
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.

Updating to this format also extends the top-level dropin to
unit types.
2019-11-07 08:34:53 +01:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Lennart Poettering a93503e86f
Merge pull request #13866 from keszybz/nspawn-restarts
Make 'machinectl reboot' functional
2019-10-30 10:53:28 +01:00
Zbigniew Jędrzejewski-Szmek ebc57b89c6 pid1: log the reason why restart will or will not happen
I was trying to figure out why the restart was not happening, and it wasn't
at all obvious. Let's add a nice debug message.
2019-10-29 10:54:45 +01:00
Zbigniew Jędrzejewski-Szmek a5648b8094 basic/fs-util: change CHASE_OPEN flag into a separate output parameter
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.

(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
2019-10-24 22:44:24 +09:00
Chris Down 959daf9bfc
Merge pull request #13743 from anitazha/dropin_all_the_things
core: support top level drop-ins through -.service.d for service units
2019-10-16 23:10:05 -04:00
Anita Zhang d727acb650
Merge pull request #13754 from claudiozz/master
Allow restart for oneshot units
2019-10-16 14:21:59 -07:00
Claudio Zumbo 10e72727ee Allow restart for oneshot units
Picked up from https://github.com/systemd/systemd/pull/7474 , so
coauthored by @robermorales.
2019-10-16 09:44:20 -07:00
Anita Zhang d272467882 shared/dropin: support -.service.d/ top level drop-in for service units
Closes #12830
2019-10-15 11:14:54 -07:00
Zbigniew Jędrzejewski-Szmek e0cfed4c59 core/service: use common implementation of unit_load_fragment_and_dropin()
There is a slight functional change when load_state == UNIT_MERGED. Before,
we would not call unit_load_dropin(), but now we do. I'm not sure if this
causes an actual difference in behaviour, but since all other unit types do
this, I think it's better to do the same thing here too.
2019-10-11 11:25:04 +02:00
Anita Zhang e23d911664 core: disallow using '-.service' as a service name
-.service.d will become a special top level drop in so don't let it be a
usable service name (otherwise the interaction gets complicated).
2019-10-07 12:02:12 -07:00
Chris Down bc0623df16 cgroup: analyze: Report memory configurations that deviate from systemd
This is the most basic consumer of the new systemd-vs-kernel checker,
both acting as a reasonable standalone exerciser of the code, and also
as a way for easy inspection of deviations from systemd internal state.
2019-10-03 15:06:25 +01:00
Zbigniew Jędrzejewski-Szmek a232ebcc2c core: add support for RestartKillSignal= to override signal used for restart jobs
v2:
- if RestartKillSignal= is not specified, fall back to KillSignal=. This is necessary
  to preserve backwards compatibility (and keep KillSignal= generally useful).
2019-10-02 14:01:25 +02:00
Zbigniew Jędrzejewski-Szmek 28a2dfe801 core: add helper function to check job status
Since job.h includes unit.h, and unit.h includes job.h, imports need to
be adjusted to make sure unit.h is included first if the helper is used.
2019-10-01 15:05:27 +02:00
Anita Zhang 898fc00e79 core: add ExecXYZEx= bus hook ups to all exec command properties
The "Ex" variant was originally only added for ExecStartXYZ= but it makes
sense to have feature parity for the rest of the exec command properties
as well (e.g. ExecReload=, ExecStop=, etc).
2019-09-17 15:48:44 +00:00
Zbigniew Jędrzejewski-Szmek de5ae832f2
Merge pull request #13439 from yuwata/core-support-systemctl-clean-more
core: support systemctl clean more
2019-09-13 16:15:02 +02:00
Zbigniew Jędrzejewski-Szmek 6b4f7fb08c
Merge pull request #13385 from yuwata/core-remove-private-directories-13355
core: also remove private directories by systemctl clean
2019-08-31 09:28:39 +02:00
Yu Watanabe 12213aed12 core: move timeout_clean_usec from Service to ExecContext 2019-08-28 23:09:54 +09:00
Yu Watanabe 810ef3180e core: introduce unit_fork_and_watch_rm_rf() 2019-08-28 23:09:54 +09:00
Yu Watanabe 52a12341f9 core: make RuntimeDirectoryPreserve= works with non-service units 2019-08-23 00:08:16 +09:00
Yu Watanabe 95939aed21 core: introduce unit_destroy_runtime_directory()
Currently `unit_will_restart()` can return true only when the unit is
service. Hence, should not change anything.
2019-08-22 23:50:52 +09:00
Zbigniew Jędrzejewski-Szmek 5cc2cd1cd8 pid1: always log successfull process termination quietly
Fixes #13372.
2019-08-22 09:09:45 +02:00
Mattias Jernberg a5a8776ae5 core: Avoid race when starting dbus services
In high load scenarios it is possible for services to be started
before the NameOwnerChanged signal is properly installed.

Emulate a callback by also queuing a GetNameOwner when the match is
installed.

Fixes: #12956
2019-08-14 16:12:31 +02:00
Lennart Poettering d611cfa748 core: never propagate reload failure to service result
Fixes: #11238
2019-07-18 10:14:02 +09:00
Anita Zhang 31cd5f63ce core: ExecCondition= for services
Closes #10596
2019-07-17 11:35:02 +02:00