Commit Graph

207 Commits

Author SHA1 Message Date
Kevin Kuehler 1f0f9f21c1 core: Add triggering job mode
When used with systemctl stop, follows TRIGGERED_BY dependencies and
adds them to the same transaction.

Fixes: #3043
2019-11-05 11:17:38 -08:00
Lennart Poettering 735a8b6d38 job: fix coverity issue
Fixes coverity issue 1403550
2019-07-31 09:45:03 +02:00
Michael Olbrich da8e178296 job: make the run queue order deterministic
Jobs are added to the run queue in random order. This happens because most
jobs are added by iterating over the transaction or dependency hash maps.

As a result, jobs that can be executed at the same time are started in a
different order each time.
On small embedded devices this can cause a measurable jitter for the point
in time when a job starts (~100ms jitter for 10 units that are started in
random order).
This results is a similar jitter for the boot time. This is undesirable in
general and make optimizing the boot time a lot harder.
Also, jobs that should have a higher priority because the unit has a higher
CPU weight might get executed later than others.

Fix this by turning the job run_queue into a Prioq and sort by the
following criteria (use the next if the values are equal):
- CPU weight
- nice level
- unit type
- unit name

The last one is just there for deterministic sorting to avoid any jitter.
2019-07-18 10:28:39 +02:00
Anita Zhang 31cd5f63ce core: ExecCondition= for services
Closes #10596
2019-07-17 11:35:02 +02:00
Yu Watanabe 8cec0a5c32 tree-wide: drop duplicated blank lines
```
$ for i in */*.[ch] */*/*.[ch]; do sed -e '/^$/ {N; s/\n$//g}' -i $i; done
$ git checkout HEAD -- basic/linux shared/linux
```
2019-07-15 18:41:27 +02:00
Lennart Poettering 2e8e1a1ab6
Merge pull request #12461 from Werkov/fix-job-ordering
Refactor job ordering implementation (and fix cycle detection)
2019-07-11 16:43:58 +02:00
Zbigniew Jędrzejewski-Szmek 2a8f53c67b Use unit->id instead of description in messages
v2:
- rename unit_identifier to unit_status_string
2019-07-10 13:35:26 +02:00
Zbigniew Jędrzejewski-Szmek 1f65fd4926 basic/time-util: add helper function to check if timestamp is set
No functional change.
2019-07-04 19:12:47 +02:00
Michal Koutný e602f15282 core: Extract job ordering logic
The job ordering logic is spread at multiple places of the code, making
it hard to maintain and also a bit to understand. The actual execution
order of two jobs always depends on their types and the ordering
contraint between their units. Extract this logic to a new function
job_compare. The second change is simplification of the order
evaluation, JOB_STOP takes always precedence (as documented), unless two
units are both stopping, then the ordering constraint is taken into
account.
2019-06-26 00:00:43 +02:00
Lennart Poettering 760877e90c util: split out sorting related calls to new sort-util.[ch] 2019-03-13 12:16:43 +01:00
Jonathon Kowalski 791cd15993 Fail RequisiteOf units with oneshots
Fixes: #11422

Oneshots going to inactive directly without ever entering UNIT_ACTIVE is
considered success. This however means that if something both Requires=
and Requisites= a unit of such nature, the verify-active job getting
merged into the start job makes it lose this property of failing the
depending jobs, as there, the start job has the result JOB_DONE on
success, so we never walk over RequisiteOf units.

This change makes sure that such units always go down. It is also only
meaningful with After=, but so is Requisite= itself. Also, we also catch
cases like a oneshot having RemainAfterExit= true making us start up
properly in such a setting, but then removing it, reloading the unit,
and restarting it. In such a case, we go down due to restart propagation
before them, and our start job waits on theirs, properly failing with
the JOB_DEPENDENCY result.

This covers cases where ConditionXYZ= creates a similar situation as
well.
2019-02-15 13:42:54 +01:00
Alberts Muktupāvels 52c6c9eaec core: when we uninstall a job, add unit to dbus queue
Commit e6d05912cb added unit to dbus
queue on job install. Do same on job uninstall to make sure we get
PropertiesChanged signal.
2019-02-12 16:55:45 +01:00
Lennart Poettering 92e29d82e6 tree-wide: fix some trailing whitespace
@bl33pbl0p, please fix your editor

(Apparently you never configured the source tree? If you did, then the
git pre-commit hook would have been enabled which doesn't allow
commiting non-whitespace clean stuff...)
2019-01-17 20:06:28 +01:00
bl33pbl0p 28d78d0726 Log the job being merged
Makes it easier to understand what was merged (and easier to realize why).

Example is a start job running, and another unit triggering a verify-active job. It is not clear what job was it that from baz.service that merged into the installed job for bar.service in the debug logs. This makes it useful when debugging issues.

Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Trying to enqueue job baz.service/start/replace
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Installed new job baz.service/start as 498
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged into installed job bar.service/start as 497
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Enqueued job baz.service/start as 498

It becomes:
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged bar.service/verify-active into installed job bar.service/start as 497
2019-01-16 12:34:54 +01:00
Lennart Poettering b17c9620c8 core: rework how we deserialize jobs
Let's add a helper call unit_deserialize_job() for this purpose, and
let's move registration in the global jobs hash table into
job_install_deserialized() so that it it is done after all superficial
checks are done, and before transitioning into installed states, so that
rollback code is not necessary anymore.
2018-12-12 11:15:07 +01:00
Lennart Poettering 48235ad6b7 job: be more careful when removing job object from jobs hash table
Let's validate that the ID is actually allocated to us before remove a
job.

This is relevant as various bits of code will call job_free() on
partially set up Job objects, and we really shouldn't remove another job
object accidentally from the hash table, when the set up didn't
complete.
2018-12-12 11:15:07 +01:00
Lennart Poettering 4a53080be6 core: don't track jobs-finishing-during-reload explicitly
Memory management is borked for this, and moreover this is unnecessary
since f0831ed2a0, i.e. since coldplug() and catchup() are two different
concepts: the former restoring the state from before a reload, the
latter than adjusting it again to the actual status in effect after the
reload.

Fixes: #10716
Mostly reverts: #8803
2018-12-12 11:15:06 +01:00
Lennart Poettering 728ba51e98 job: update job_free() to follow our usual return-NULL style 2018-12-12 11:14:26 +01:00
Zbigniew Jędrzejewski-Szmek 2d479ff1cc
Merge pull request #10963 from poettering/bus-force-state-change-signal
force PropertiesChanged bus signal on all unit state changes
2018-12-06 16:42:21 +01:00
Yu Watanabe f2a3de0116 tree-wide: add whitespace between type and variable name 2018-12-04 09:29:54 +01:00
Lennart Poettering e6d05912cb core: when we install a job, announce this via the bus
Whenever we enqueue a job, we should announce this on the bus, hence add
both the job and the unit to the dbus queues. (Why both? The former
should be obvious, the latter because we send out Job properties).

In most cases adding these to the queue is not necessary, as
other properties tend to change at the same time and result in a change
being sent out. However, let's clean this up and make it explicit.
2018-12-01 12:53:26 +01:00
Lennart Poettering 7af67e9a8b core: allow to set exit status when using SuccessAction=/FailureAction=exit in units
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate in when
SuccessAction=exit or FailureAction=exit is used.

When not specified let's also propagate the exit status of the main
process we fork off for the unit.
2018-11-27 09:44:40 +01:00
Zbigniew Jędrzejewski-Szmek baaa35ad70 coccinelle: make use of SYNTHETIC_ERRNO
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.

I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
2018-11-22 10:54:38 +01:00
Lennart Poettering ff30a86bd4 job: simplify status message extraction
As @keszybz points out these fields are always here, there's no point in
checking if they are NULL or not.
2018-11-16 15:30:36 +01:00
Lennart Poettering 9a80f2f453 job: when a job was skipped due to a failed condition, log about it
Previously we'd neither show console status output nor log output. Let's
fix that, and still log something.
2018-11-16 15:30:36 +01:00
Lennart Poettering 6e64994d69 core: make unit_start() return a distinguishable error code in case conditions didn't hold
Ideally we'd even propagate this all the way to the client, by having a
separate JobType enum value for this. But it's hard to add this without
breaking compat, hence for now let's at least internally propagate this
case differently from the case "already on it".

This is then used to call job_finish_and_invalidate() slightly
differently, with the already= parameter false, as in the failed
condition case no message was likely produced so far.
2018-11-16 15:22:48 +01:00
Lennart Poettering 0e2b4a822e job: add two explanatory comments 2018-11-16 15:22:48 +01:00
Lennart Poettering a69b3872ac job: let's remove one comparison and reduce indentation level by one 2018-11-16 15:22:48 +01:00
Lennart Poettering b344b363ce job: also include job ID in log messages when we begin with a job 2018-11-16 15:22:48 +01:00
Lennart Poettering 33a3fdd978 core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c
This call is only used by job.c and very specific to job handling.
Moreover the very similar logic of job_emit_status_message() is already
in job.c.

Hence, let's clean this up, and move both sets of functions to job.c,
and rename them a bit so that they express precisely what they do:

1. unit_status_emit_starting_stopping_reloading() →
   job_emit_begin_status_message()
2. job_emit_status_message() → job_emit_done_status_message()

The first call is after all what we call when we begin with the
execution of a job, and the second call what we call when we are done
wiht it.

Just some moving and renaming, not other changes, and hence no change in
behaviour.
2018-11-16 15:22:48 +01:00
Lennart Poettering f8c34706f5 job: add log message when we can't enable the job run event source 2018-11-16 15:22:48 +01:00
Lennart Poettering 8ebd9175db job: add comment for EAGAIN job run case 2018-11-16 15:22:48 +01:00
Lennart Poettering ea2c0e4526 job: minor coding style tweaks 2018-11-16 15:22:48 +01:00
Lennart Poettering 1cd81629e1 job: include JOB_ID field in log message about jobs 2018-11-16 15:22:48 +01:00
Lennart Poettering d68c645bd3 core: rework serialization
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.

In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.

While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
2018-10-26 10:52:41 +02:00
Lennart Poettering 8948b3415d core: when deserializing state always use read_line(…, LONG_LINE_MAX, …)
This should be much better than fgets(), as we can read substantially
longer lines and overly long lines result in proper errors.

Fixes a vulnerability discovered by Jann Horn at Google.

CVE-2018-15686
LP: #1796402
https://bugzilla.redhat.com/show_bug.cgi?id=1639071
2018-10-26 10:40:01 +02:00
Lennart Poettering 108e8de655
Merge pull request #10439 from poettering/job-struct-init
three trivial simplifications/clean-ups
2018-10-17 22:55:00 +02:00
Lennart Poettering 15ec102145 job: add lots of colons to log messages 2018-10-17 21:18:09 +02:00
Lennart Poettering 08ac00f279 job: use structured initialization 2018-10-17 21:18:02 +02:00
Zbigniew Jędrzejewski-Szmek c7adcb1af9 core: do not "warn" about mundane emergency actions
For example in a container we'd log:
Oct 17 17:01:10 rawhide systemd[1]: Started Power-Off.
Oct 17 17:01:10 rawhide systemd[1]: Forcibly powering off: unit succeeded
Oct 17 17:01:10 rawhide systemd[1]: Reached target Power-Off.
Oct 17 17:01:10 rawhide systemd[1]: Shutting down.
and on the console we'd write (in red)
[  !!  ] Forcibly powering off: unit succeeded

This is not useful in any way, and the fact that we're calling an "emergency action"
is an internal implementation detail. Let's log about c-a-d and the watchdog actions
only.
2018-10-17 19:32:09 +02:00
Zbigniew Jędrzejewski-Szmek 1710d4beff core: limit service-watchdogs=no to actual "watchdog" commands
The setting is now only looked at when considering an action for a job timeout
or unit start limit. It is ignored for ctrl-alt-del, SuccessAction, SuccessFailure.

v2: turn the parameter into a flag field
v3: rename Options to Flags
2018-10-17 19:31:50 +02:00
Yu Watanabe 93bab28895 tree-wide: use typesafe_qsort() 2018-09-19 08:02:52 +09:00
Yu Watanabe f330408d62 tree-wide: drop empty lines in comments 2018-07-23 08:44:24 +02:00
Filipe Brandenburger 53a2383b8b core/job: add check for return of job_type_merge_and_collapse()
Using an assertion is fine, since calls to job_merge_into_installed()
are protected by a check for job_type_is_conflicting().

Uncovered by Coverity, fixes CID 996307.
2018-06-25 13:16:38 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek a1230ff972 basic/log: add the log_struct terminator to macro
This way all callers do not need to specify it.
Exhaustively tested by running test-log under valgrind ;)
2018-06-04 13:46:03 +02:00
Zbigniew Jędrzejewski-Szmek 509ad7897c core/job: shortening
Follow-up for a7a7163df7.
2018-05-20 23:25:04 +09:00
David Tardon a7a7163df7 fix race between daemon-reload and other commands
When "systemctl daemon-reload" is run at the same time as "systemctl
start foo", the latter might hang. That's because commands like start
wait for JobRemoved signal to know when the job is finished. But if the
job is finished during reloading, the signal is never sent.

The hang can be easily reproduced by running

    # for ((N=1; N>0; N++)) ; do echo $N ; systemctl daemon-reload ; done
    # for ((N=1; N>0; N++)) ; do echo $N ; systemctl start systemd-coredump.socket ; done

in two different terminals. The start command will hang after 1-2
iterations.

This keeps track of jobs that were started before reload and finished
during it and sends JobRemoved after the reload has finished.
2018-05-19 11:37:00 +02:00
Lennart Poettering d4fd1cf208 core: enforce that scope units can be started only once
Scope units are populated from PIDs specified by the bus client. We do
that when a scope is started. We really shouldn't allow scopes to be
started multiple times, as the PIDs then might be heavily out of date.
Moreover, clients should have the guarantee that any scope they allocate
has a clear runtime cycle which is not repetitive.
2018-04-27 21:52:45 +02:00