Commit graph

577 commits

Author SHA1 Message Date
Lennart Poettering b82f71c7ff tree-wide: constify a few static string tables 2019-03-25 14:04:34 +01:00
Yu Watanabe f9f88198ce core/unit: use condition_test_list() 2019-03-21 23:37:39 +09:00
Franck Bui f75f613d25 core: reduce the number of stalled PIDs from the watched processes list when possible
Some PIDs can remain in the watched list even though their processes have
exited a long time ago. This can easily happen if the main process of a forking
service manages to spawn a child before the control process exits, for example.

However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stalled PIDs from
the watched process list.

If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
2019-03-20 10:51:49 +01:00
Lennart Poettering 9adb695987 core: split error list in comment for unit_start() in two 2019-03-18 16:06:36 +01:00
Lennart Poettering 36c4dc089e core: change emergency_action() to return void
The function so far always returned -ECANCELED, which is ignored in all
cases the function is invoked, except one: in unit_test_start_limit()
where -ECANCELED is returned when the start limit is hit, which is part
of unit_start()'s protocol of return values.

Since the emergency_action() logic should be relatively generic and is
used in many places, let's drop the return value from it, since it's
constant anyway, and in all cases useless. Instead, let's return it in
unit_test_start_limit(), where it's part of the protocol.

No change in behaviour.
2019-03-18 16:06:36 +01:00
Lennart Poettering 2de9b9793b core: check start limit on condition checks too
Let's add a safety precaution: if the start condition checks for a unit
are tested too often and fail each time, let's rate limit this too.

This should add extra safety in case people define .path, .timer or
.automount units that trigger a service that has a condition that always
fails.
2019-03-18 16:06:36 +01:00
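The precaution described in the commit above amounts to counting events inside a time window and refusing once a burst limit is reached. The following is a minimal sketch of that idea only; the names `struct start_limit` and `start_limit_allow()` and the numbers used are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interval/burst limiter: at most `burst` events per
 * `interval_usec`. Not the data structure used by the real code. */
struct start_limit {
    uint64_t interval_usec; /* length of one accounting window */
    unsigned burst;         /* events allowed per window */
    uint64_t window_begin;  /* start of the current window */
    unsigned count;         /* events seen in the current window */
};

/* Returns true if an event at time `now_usec` is still within the limit. */
static bool start_limit_allow(struct start_limit *l, uint64_t now_usec) {
    assert(l);

    /* Open a new window on first use or once the old one has expired. */
    if (l->count == 0 || now_usec - l->window_begin >= l->interval_usec) {
        l->window_begin = now_usec;
        l->count = 0;
    }

    if (l->count >= l->burst)
        return false; /* limit hit: the caller would refuse the start */

    l->count++;
    return true;
}
```

In this sketch a failing condition check would be counted the same way as a real start attempt, so a trigger unit that fires in a tight loop runs into the limit instead of spinning forever.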
Lennart Poettering 5766aca8d2 core: modernize unit_start() a bit
No change in behaviour, just a re-line-breaking of the various comments
to our current coding style, and some use of SYNTHETIC_ERRNO().
2019-03-18 16:06:36 +01:00
Lennart Poettering a4191c9fb5 core: unify code for checking whether unit to trigger is loaded 2019-03-18 16:06:36 +01:00
Lennart Poettering 97a3f4ee05 core: rename unit_{start_limit|condition|assert}_test() to unit_test_xyz()
Just some renaming, no change in behaviour.

Background: I'd like to add more functions unit_test_xyz() that test
various things, hence let's streamline the naming a bit.
2019-03-18 16:06:36 +01:00
Lennart Poettering 9e30cf74ce core: add comment explaining ECOMM return value of unit_start()
We explain all other return values, so explain this one too.
2019-03-18 16:06:36 +01:00
Stephane Chazelas 106bf8e445 remove "." path components from required mount paths
unit_require_mounts_for may be passed path arguments that contain "."
components like for user's home directories where "." is sometimes used
to specify some form of anchor point.

This change stops considering such path as an error and removes the "."
components instead.

Closes: #11910
2019-03-07 10:12:03 +01:00
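To illustrate what removing "." components from a path means in practice, here is a small in-place sketch. The helper name `path_drop_dot_components()` is hypothetical and this is not the function the commit touches; it also collapses duplicate slashes and leaves ".." components alone, which are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: rewrites `path` in place, dropping "." components
 * and duplicate slashes. ".." components are left untouched. */
static void path_drop_dot_components(char *path) {
    assert(path);

    if (!*path)
        return; /* nothing to do for an empty string */

    const char *in = path;
    char *out = path;
    bool first = true;

    /* Preserve the leading slash of an absolute path. */
    if (*in == '/')
        *out++ = '/';

    while (*in) {
        /* Skip any run of slashes. */
        while (*in == '/')
            in++;

        /* Measure the next component. */
        size_t n = strcspn(in, "/");
        if (n == 0)
            break;

        /* Copy every component except a lone ".". */
        if (!(n == 1 && in[0] == '.')) {
            if (!first)
                *out++ = '/';
            memmove(out, in, n);
            out += n;
            first = false;
        }
        in += n;
    }

    /* A relative path made only of "." components collapses to ".". */
    if (out == path)
        *out++ = '.';
    *out = '\0';
}
```

The output is never longer than the input, which is why the rewrite can safely happen in the caller's buffer.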
Lennart Poettering 5bcffb4b54
Merge pull request #11457 from grooverdan/sendsigkill_no
service: killmode=cgroup|mixed, SendSIGKILL=no services are not multiprocess
2019-02-18 13:41:52 +01:00
Daniel Black c53d2d54bd service: make killmode=cgroup|mixed, SendSIGKILL=no services singletons
KillMode=mixed and KillMode=control-group are used to indicate that all
processes should be killed off. SendSIGKILL=no is used for services
that require a clean shutdown. These are typically database
services where a SIGKILLed process would result in a lengthy
recovery and whose shutdown or startup time is quite variable
(so Timeout settings aren't of use).

Here we take these two factors and refuse to start a service if
there are existing processes within a control group. Databases
generally have some protection against multiple instances
running, but let's not stress the rigor of these. Also, the ExecStartPre
parts of the service aren't as rigorously written to protect
against multiple use.

closes #8630
2019-01-29 15:35:59 +11:00
Jonathon Kowalski 6255af75d7 Return -EAGAIN instead of -EALREADY from unit_reload
Fixes: #11499

Let's return -EAGAIN so that on state change, unit_process_job tries to
add our job to run_queue again so that all the reloads that coalesced
into the installed reload (which itself merged into a running one)
initially run at least *once*. This should ensure the service picks up all
config changes reliably.

See the issue being fixed for a detailed explanation.
2019-01-20 22:12:24 +00:00
Lennart Poettering 2d41e9b7a0
Merge pull request #11143 from keszybz/enable-symlink
Runtime mask symlink confusion fix
2018-12-16 12:37:07 +01:00
Zbigniew Jędrzejewski-Szmek 58d9d89b4b pid1: fix free of uninitialized pointer in unit_fail_if_noncanonical()
https://bugzilla.redhat.com/show_bug.cgi?id=1653068
2018-12-14 11:21:16 +01:00
Zbigniew Jędrzejewski-Szmek 303ee60151 Mark *data and *userdata params to specifier_printf() as const
It would be very wrong if any of the specifier printf calls modified
any of the objects or data being printed. Let's mark all arguments as const
(primarily to make it easier for the reader to see where modifications cannot
occur).
2018-12-12 16:45:33 +01:00
Lennart Poettering a1c7334b61 core: when a unit state changes only propagate to jobs after reloading is complete
Previously, we'd immediately propagate unit state changes into any jobs
pending for them, always. With this we only do this if the manager is
out of the "reload" state. This fixes the problem #8803 tried to
address, by simply not completing jobs until after the reload (and thus
reestablishment of the dbus connection) is complete.

Note that there's no need to later on explicitly catch up with the
missed job state changes (i.e. there's no need to call
unit_process_job() later on explicitly). That's because for jobs in
JOB_WAITING state on deserialization all jobs are requeued into the run
queue anyway, and thus checked again if they can complete now. And for
JOB_RUNNING jobs the unit_catchup() phase is going to trigger missed out
state changes *after* the reload completes anyway (after all that's what
distinguishes it from unit_coldplug()).

Replaces: #8803
2018-12-12 11:15:07 +01:00
Lennart Poettering 16c74914d2 core: split out all logic that updates a Job on a unit's unit_notify() invocation
Just some refactoring, no change in behaviour.
2018-12-12 11:15:07 +01:00
Lennart Poettering b17c9620c8 core: rework how we deserialize jobs
Let's add a helper call unit_deserialize_job() for this purpose, and
let's move registration in the global jobs hash table into
job_install_deserialized() so that it is done after all superficial
checks are done, and before transitioning into installed states, so that
rollback code is not necessary anymore.
2018-12-12 11:15:07 +01:00
Zbigniew Jędrzejewski-Szmek 4cb06c5949 Use VLA instead of alloca
The test is the same, but an array is more readable.
2018-12-10 11:57:26 +01:00
Zbigniew Jędrzejewski-Szmek 2d479ff1cc
Merge pull request #10963 from poettering/bus-force-state-change-signal
force PropertiesChanged bus signal on all unit state changes
2018-12-06 16:42:21 +01:00
Lennart Poettering e4de72876e util-lib: split out all temporary file related calls into tmpfiles-util.c
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.

No code changes, just some rearranging of source files.
2018-12-02 13:22:29 +01:00
Lennart Poettering ee228be10c util-lib: don't include fileio.h from fileio-label.h
There's no reason for doing that, hence simply don't.
2018-12-02 13:22:29 +01:00
Lennart Poettering 3c4832ada4 core: enqueue unit earlier when state changes
Previously, we'd enqueue a unit to the dbus queue whenever the state
changed, after we processed the state change fully. This commit moves that to the
beginning of the state change. This has the benefit that when the state
change causes a job to complete the unit is already in the dbus queue,
and thus we get the guarantee that any unit change can be sent out to
clients before the job change.
2018-12-01 12:53:26 +01:00
Lennart Poettering af92c603bb core: send out unit change events when a new invocation ID is acquired
It's free, as this generally coincides with unit_start(), but let's make
this clean and explicit.
2018-12-01 12:53:26 +01:00
Lennart Poettering e18f8852f3 core: invalidate individual Assert/Condition properties when sending out change messages
Let's inform the clients about assert/condition property changes as they
happen, it's basically for free because assert/condition property
changes generally coincide with other unit state changes (after all
these checks are done on unit_start())
2018-12-01 12:53:26 +01:00
Lennart Poettering 37d0b962ef core: when we manage to resolve a user, only enqueue dbus event, don't send out message right-away
Let's only enqueue the dbus signal generation, let's not do it
right-away, after all we want coalescing to take effect here.
2018-12-01 12:53:26 +01:00
Zbigniew Jędrzejewski-Szmek 8b4e51a60e
Merge pull request #10797 from poettering/run-generator
add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch
2018-11-28 22:40:55 +01:00
Lennart Poettering 7af67e9a8b core: allow to set exit status when using SuccessAction=/FailureAction=exit in units
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate when
SuccessAction=exit or FailureAction=exit is used.

When not specified let's also propagate the exit status of the main
process we fork off for the unit.
2018-11-27 09:44:40 +01:00
Lennart Poettering 5b262f74e4 unit: tweak status output a bit
Let's highlight the unit description string in the status updates, to
separate them a bit more from the English sentence they are part of, and thus
make the different casing less surprising.
2018-11-26 18:24:12 +01:00
Lennart Poettering b8b6f32104 cgroup: when we unload a unit, also update all its parent's members mask
This way we can correctly ensure that when a unit that requires some
controller goes away, we propagate the removal of it all the way up, so
that the controller is turned off in all the parents too.
2018-11-23 13:41:37 +01:00
Lennart Poettering 5af8805872 cgroup: drastically simplify caching of cgroups members mask
Previously we tried to be smart: when a new unit appeared and it only
added controllers to the cgroup mask we'd update the cached members mask
in all parents by ORing in the controller flags in their cached values.
Unfortunately this was quite broken, as we missed some conditions when
this cache had to be reset (for example, when a unit got unloaded),
moreover the optimization doesn't work when a controller is removed
anyway (as in that case there's no other way for the parent than to iterate
through all children to see if any other, remaining child unit still needs it).
Hence, let's simplify the logic substantially: instead of updating the
cache on the right events (which we didn't get right), let's simply
invalidate the cache, and generate it lazily when we encounter it later.
This should actually result in better behaviour as we don't have to
calculate the new members mask for a whole subtree whenever we have the
suspicion something changed, but can delay it to the point where we
actually need the members mask.

This allows us to simplify things quite a bit, which is good, since
validating this cache for correctness is hard enough.

Fixes: #9512
2018-11-23 13:41:37 +01:00
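The "invalidate now, regenerate lazily" approach described in the commit above can be sketched with a small tree. Everything here is a hypothetical stand-in (`struct node`, the `node_*()` functions, the fixed child array); the real unit and cgroup structures are different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 8 /* arbitrary bound for this sketch */

/* Hypothetical stand-in for a unit in the cgroup tree. */
struct node {
    struct node *parent;
    struct node *children[MAX_CHILDREN];
    size_t n_children;
    uint32_t own_mask;       /* controllers this node itself needs */
    uint32_t members_mask;   /* cached OR over all children's subtrees */
    bool members_mask_valid; /* false: recompute on next access */
};

/* Mark the cache of `n` and of all its ancestors as stale. */
static void node_invalidate_members_mask(struct node *n) {
    for (; n; n = n->parent)
        n->members_mask_valid = false;
}

/* Return the members mask, recomputing it only if the cache is stale. */
static uint32_t node_get_members_mask(struct node *n) {
    assert(n);

    if (n->members_mask_valid)
        return n->members_mask;

    uint32_t mask = 0;
    for (size_t i = 0; i < n->n_children; i++) {
        struct node *c = n->children[i];
        mask |= c->own_mask | node_get_members_mask(c);
    }

    n->members_mask = mask;
    n->members_mask_valid = true;
    return mask;
}

/* Adding a child only invalidates; nothing is recomputed here. */
static void node_add_child(struct node *parent, struct node *child) {
    assert(parent->n_children < MAX_CHILDREN);
    parent->children[parent->n_children++] = child;
    child->parent = parent;
    node_invalidate_members_mask(parent);
}

/* Removing a child (think: unit unloaded) invalidates the same way. */
static void node_remove_child(struct node *parent, struct node *child) {
    for (size_t i = 0; i < parent->n_children; i++)
        if (parent->children[i] == child) {
            parent->children[i] = parent->children[--parent->n_children];
            child->parent = NULL;
            break;
        }
    node_invalidate_members_mask(parent);
}
```

Adding and removing take the same code path, so the case the old ORing optimization could not handle (a controller going away) needs no special treatment here.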
Lennart Poettering 0adf88b68c cgroup: dump delegation mask too 2018-11-23 12:24:37 +01:00
Lennart Poettering 00e7b3c8e5 unit: minor optimization, use stack over heap, when we can 2018-11-23 00:46:56 +01:00
Lennart Poettering 66fa4bdd70 core: add two minor comments (#10890) 2018-11-23 06:25:27 +09:00
Lennart Poettering 6e64994d69 core: make unit_start() return a distinguishable error code in case conditions didn't hold
Ideally we'd even propagate this all the way to the client, by having a
separate JobType enum value for this. But it's hard to add this without
breaking compat, hence for now let's at least internally propagate this
case differently from the case "already on it".

This is then used to call job_finish_and_invalidate() slightly
differently, with the already= parameter false, as in the failed
condition case no message was likely produced so far.
2018-11-16 15:22:48 +01:00
Lennart Poettering 523ee2d414 core: log a recognizable message when a unit succeeds, too
We already are doing it on failure, let's do it on success, too.

Fixes: #10265
2018-11-16 15:22:48 +01:00
Lennart Poettering 91bbd9b796 core: make log messages about unit processes exiting recognizable 2018-11-16 15:22:48 +01:00
Lennart Poettering 7c047d7443 core: make log messages about units entering a 'failed' state recognizable
Let's make this recognizable, and carry result information in a
structured fashion.
2018-11-16 15:22:48 +01:00
Lennart Poettering 33a3fdd978 core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c
This call is only used by job.c and very specific to job handling.
Moreover the very similar logic of job_emit_status_message() is already
in job.c.

Hence, let's clean this up, and move both sets of functions to job.c,
and rename them a bit so that they express precisely what they do:

1. unit_status_emit_starting_stopping_reloading() →
   job_emit_begin_status_message()
2. job_emit_status_message() → job_emit_done_status_message()

The first call is after all what we call when we begin with the
execution of a job, and the second call what we call when we are done
with it.

Just some moving and renaming, no other changes, and hence no change in
behaviour.
2018-11-16 15:22:48 +01:00
Lennart Poettering 8204470252 unit: don't claim there was no IP traffic generated by a unit when we don't know
Only if we have some IP traffic accounting at all we should claim that.
2018-11-14 09:53:50 +01:00
Lennart Poettering 6eb65e7ca4 core: split out audit message generation from unit_notify()
Just some refactoring, no change in behaviour.
2018-11-14 09:51:47 +01:00
INSUN PYO 8724defeae core: use local variable m instead of u->manager 2018-11-13 10:39:35 +01:00
Tommi Rantala 429926e9cc core: include unit name in emergency_action() reason message
Add unit name in StartLimitAction=, FailureAction= and SuccessAction=
emergency_action() reason messages, so that the problematic unit is
easily visible, for example:

    "unit dbus.service failed"
2018-11-12 16:36:03 +01:00
Lennart Poettering 1ad6e8b302 core: split environment block maintained by PID 1's Manager object in two
This splits the "environment" field of Manager into two:
transient_environment and client_environment. The former is generated
from configuration file, kernel cmdline, environment generators. The
latter is the one the user can control with "systemctl set-environment"
and similar.

Both sets are merged transparently whenever needed. Separating the two
sets has the benefit that we can safely flush out the former while
keeping the latter during daemon reload cycles, so that env var settings
from env generators or configuration files do not accumulate, but
dynamic API changes are kept around.

Note that this change is not entirely transparent to users: if the user
first uses "set-environment" to override a transient variable, and then
uses "unset-environment" to unset it again things will revert to the
original transient variable now, while previously the variable was fully
removed. This change in behaviour should not matter too much though I
figure.

Fixes: #9972
2018-10-31 18:00:53 +01:00
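The behaviour described in the commit above (two sets, merged on lookup, only one of them flushed on reload) can be sketched as follows. The field names transient_environment and client_environment come from the commit message; the types, the fixed-size arrays and all function names are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENV_MAX 16 /* arbitrary bound for this sketch */

struct env_entry {
    const char *name;
    const char *value;
};

struct env_set {
    struct env_entry entries[ENV_MAX];
    size_t n;
};

/* Hypothetical manager holding the two sets named in the commit message. */
struct manager_env {
    struct env_set transient_environment; /* config files, kernel cmdline, generators */
    struct env_set client_environment;    /* set by the user at runtime */
};

static const char *env_set_get(const struct env_set *s, const char *name) {
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->entries[i].name, name) == 0)
            return s->entries[i].value;
    return NULL;
}

static void env_set_put(struct env_set *s, const char *name, const char *value) {
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->entries[i].name, name) == 0) {
            s->entries[i].value = value;
            return;
        }
    assert(s->n < ENV_MAX);
    s->entries[s->n++] = (struct env_entry) { name, value };
}

static void env_set_remove(struct env_set *s, const char *name) {
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->entries[i].name, name) == 0) {
            s->entries[i] = s->entries[--s->n];
            return;
        }
}

/* Merged view: a client setting overrides a transient one. */
static const char *manager_env_get(const struct manager_env *m, const char *name) {
    const char *v = env_set_get(&m->client_environment, name);
    return v ? v : env_set_get(&m->transient_environment, name);
}

/* On reload only the transient set is flushed; client settings survive. */
static void manager_env_reload(struct manager_env *m) {
    m->transient_environment.n = 0;
}
```

Removing a name from the client set and looking it up again falls through to the transient value, which is the user-visible change the commit message points out.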
Lennart Poettering d68c645bd3 core: rework serialization
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.

In order to implement this all serialization functions are moved to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.

While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end).
2018-10-26 10:52:41 +02:00
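A rough sketch of the "skip the line rather than write an overly long one" rule from the commit above. The helper name `sketch_serialize_item()`, the return convention and the tiny `SKETCH_LINE_MAX` limit are assumptions made for this example; the real limit is LONG_LINE_MAX and the real helpers live in serialize.[ch].

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumed limit for this sketch only; it stands in for LONG_LINE_MAX. */
#define SKETCH_LINE_MAX 64

/* Hypothetical helper: writes "key=value\n", or skips the line if it would
 * exceed the limit. Returns 1 if written, 0 if skipped, negative errno on
 * write failure. */
static int sketch_serialize_item(FILE *f, const char *key, const char *value) {
    assert(f);
    assert(key);

    if (!value)
        return 0; /* nothing to serialize */

    /* key + '=' + value + '\n' must fit into what the reader accepts. */
    if (strlen(key) + 1 + strlen(value) + 1 > SKETCH_LINE_MAX) {
        fprintf(stderr, "Skipping overly long serialization item %s.\n", key);
        return 0;
    }

    if (fprintf(f, "%s=%s\n", key, value) < 0)
        return -EIO;

    return 1;
}
```

Because the writer enforces the same bound the reader uses, a line that was written can always be read back whole, and a value that does not fit is dropped loudly instead of being truncated silently.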
Lennart Poettering 8948b3415d core: when deserializing state always use read_line(…, LONG_LINE_MAX, …)
This should be much better than fgets(), as we can read substantially
longer lines and overly long lines result in proper errors.

Fixes a vulnerability discovered by Jann Horn at Google.

CVE-2018-15686
LP: #1796402
https://bugzilla.redhat.com/show_bug.cgi?id=1639071
2018-10-26 10:40:01 +02:00
Martin Wilck e1e74614aa core: don't create Requires for workdir if "missing ok"
Don't add an implicit RequiresMountsFor dependency for the
WorkingDirectory of a unit if the "-" character was used to
indicate that "a missing working directory is not considered fatal"
(see systemd.exec(5)). Otherwise systemd might fail the unit
because of missing dependencies.
2018-10-25 11:35:59 +02:00
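The decision described in the commit above can be reduced to a small predicate. This sketch assumes a simplified parser that only understands the "-" prefix; `struct workdir`, `workdir_parse()` and `workdir_wants_mount_dependency()` are hypothetical names, not functions from the code base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parsed form of a WorkingDirectory= value. */
struct workdir {
    const char *path; /* points into the original string, after any "-" */
    bool missing_ok;  /* true if the value was prefixed with "-" */
};

/* Split off the optional "-" prefix; no other syntax is handled here. */
static struct workdir workdir_parse(const char *value) {
    assert(value);

    struct workdir w = { .path = value, .missing_ok = false };

    if (value[0] == '-') {
        w.missing_ok = true;
        w.path = value + 1;
    }

    return w;
}

/* Only a directory that must exist justifies an implicit mount dependency. */
static bool workdir_wants_mount_dependency(const struct workdir *w) {
    assert(w);
    return w->path[0] != '\0' && !w->missing_ok;
}
```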
Yu Watanabe ec9d636b37 core: use ascii_toupper() instead of every time judging whether it is the first message 2018-10-24 04:58:08 +09:00