Commit Graph

740 Commits

Author SHA1 Message Date
Lennart Poettering 3ad228ce75 core: use structure initialization for Manager
No changes in behaviour, just a nicer way to fill in the Manager
initially.
2018-10-09 19:43:43 +02:00
Lennart Poettering af41e5086d core: rename ManagerExitCode → ManagerObjective
"ExitCode" is a bit of a misnomer in two ways: it suggests this was
about the "exit code" concept that exit()/waitid() deal with, but really
isn't. Moreover, it's not event just about exiting either, but more
often about reloading/reexecing or rebooting. Let's hence pick a new
name for this that is a bit more correct.

I initially thought about naming this the "state", but that'd be a
misnomer too, as the value really encodes a "goal" more than a current
state. Also we already have the externally visible ManagerState.

No actual changes in behaviour, just the rename.
2018-10-09 19:43:43 +02:00
Zbigniew Jędrzejewski-Szmek 7c3733d5de pid1: remove unnecessary error reassignment
LGTM was complaining:
> Comparison is always true because r >= 0.
2018-10-02 15:36:24 +02:00
Dimitri John Ledkov a3156a8ee4 core: execute generators with manager's environmnet 2018-09-24 13:40:50 +01:00
Dimitri John Ledkov ea368f0bd2 core: execute environment_generators with manager's environment 2018-09-24 13:40:10 +01:00
Dimitri John Ledkov 78ec1bb436 exec-util: in execute_directories, support initial exec environment 2018-09-24 13:40:10 +01:00
Yu Watanabe 4366e598ae core: replace udev_device by sd_device 2018-08-23 04:57:39 +09:00
Lennart Poettering a3c1168ac2 core: rework StopWhenUnneeded= logic
Previously, we'd act immediately on StopWhenUnneeded= when a unit state
changes. With this rework we'll maintain a queue instead: whenever
there's the chance that StopWhenUneeded= might have an effect we enqueue
the unit, and process it later when we have nothing better to do.

This should make the implementation a bit more reliable, as the unit notify event
cannot immediately enqueue tons of side-effect jobs that might
contradict each other, but we do so only in a strictly ordered fashion,
from the main event loop.

This slightly changes the check when to consider a unit "unneeded".
Previously, we'd assume that a unit in "deactivating" state could also
be cleaned up. With this new logic we'll only consider units unneeded
that are fully up and have no job queued. This means that whenever
there's something pending for a unit we won't clean it up.
2018-08-10 16:19:01 +02:00
Lennart Poettering ae0db6f132
Merge pull request #9687 from yuwata/rfe-9662
analyze: several systemd-analyze plot improvements
2018-07-24 09:43:57 +02:00
Yu Watanabe d4ee7bd849 core: serialize/deserialize several timestamps on initrd in different names 2018-07-24 03:45:51 +09:00
Yu Watanabe bee38b5cf8 core: serialize and deserialize current ShowStatus
Fixes #9663.
2018-07-23 23:42:48 +09:00
Yu Watanabe 7a293242e0 core: normalize ShowStatus 2018-07-23 21:55:26 +09:00
Alan Jenkins d3243f55ca core: remove support for API bus "started outside our own logic"
Looking at a recent Bad Day, my log contains over 100 lines of

    systemd[23895]: Failed to connect to API bus: Connection refused

It is due to "systemd --user" retrying to connect to an API bus.[*]  I
would prefer to avoid spamming the logs.  I don't think it is good for us
to retry so much like this.

systemd was mislead by something setting DBUS_SESSION_BUS_ADDRESS.  My best
guess is an unfortunate series of events caused gdm to set this.  gdm has
code to start a session dbus if there is not a bus available already (and
in this case it exports the environment variable).  I believe it does not
normally do this when running under systemd, because "systemd --user" and
hence "dbus.service" would already have been started by pam_systemd.

I see two possibilities

1. Rip out the check for DBUS_SESSION_BUS_ADDRESS entirely.
2. Only check for DBUS_SESSION_BUS_ADDRESS on startup.  Not in the
   "recheck" logic.

The justification for 2), is that the recheck is called from unit_notify(),
this is used to check whether the service just started (or stopped) was
"dbus.service".  This reason for rechecking does not apply if we think
the session bus was started outside our logic.

But I think we can justify 1).  dbus-daemon ships a statically-enabled
/usr/lib/systemd/user/dbus.service, which would conflict with an attempt to
use an external dbus.  Also "systemd --user" is started from user@.service;
if you try to start it manually so that it inherits an environment
variable, it will conflict if user@.service was started by pam_systemd
(or loginctl enable-linger).
2018-07-20 15:25:55 +02:00
Michal Koutný 1e75824cb0 core/manager: Make sure jobs statistics are not double after daemon-reload
We add n_installed_jobs and n_failed_jobs to our inner state after
deserialization. This is fine during daemon-reexec when we start with clear
Manager (and some jobs possibly queued before deserialization), however,
daemon-reload works with the same manager and adding the values would
effectively double the counters. Reset the counters before we deserialize and
add their values again.
2018-06-25 11:18:44 +02:00
Lennart Poettering 01d32c0030 core: use fflush_and_check() where appropriate 2018-06-20 23:59:29 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Franck Bui bda7d78ba1 pid1: preserve current value of log target across re-{load,execution}
To make debugging easier, this patches allows one to change the log target and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.

Indeed if one changed the log target at runtime (via the bus or via signals),
the change was lost on the next reload/reexecution.

In order to restore back the default value (set via system.conf, environment
variables or any other means ), the empty string in the "LogTarget" property is
now supported as well as sending SIGTRMIN+26 signal.
2018-06-13 18:52:27 +02:00
Franck Bui a6ecbf836c pid1: preserve current value of log level across re-{load,execution}
To make debugging easier, this patches allows one to change the log level and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.

Indeed if one changed the log max level at runtime (via the bus or via
signals), the change was lost on the next daemon reload/reexecution.

In order to restore the original value back (set via system.conf, environment
variables or any other means), the empty string in the "LogLevel" property is
now supported as well as sending SIGRTMIN+23 signal.
2018-06-13 18:52:27 +02:00
Lennart Poettering 8ace1db703 core: rework manager_load_startable_unit_or_warn() on top of unit_validate_load_state()
These functions do very similar work, let's unify common code.
2018-06-11 12:53:12 +02:00
Lennart Poettering 04eb582acc core: enumerate perpetual units in a separate per-unit-type method
Previously the enumerate() callback defined for each unit type would do
two things:

1. It would create perpetual units (i.e. -.slice, system.slice, -.mount and
   init.scope)

2. It would enumerate units from /proc/self/mountinfo, /proc/swaps and
   the udev database

With this change these two parts are split into two seperate methods:
enumerate() now only does #2, while enumerate_perpetual() is responsible
for #1. Why make this change? Well, perpetual units should have a
slightly different effect that those found through enumeration: as
perpetual units should be up unconditionally, perpetually and thus never
change state, they should also not pull in deps by their state changing,
not even when the state is first set to active. Thus, their state is
generally initialized through the per-device coldplug() method in
similar  fashion to the deserialized state from a previous run would be
put into place. OTOH units found through regular enumeration should
result in state changes (and thus pull in deps due to state changes),
hence their state should be put in effect in the catchup() method
instead. Hence, given this difference, let's also separate the
functions, so that the rule is:

1. What is created in enumerate_perpetual() should be started in
   coldplug()

2. What is created in enumerate() should be started in catchup().
2018-06-07 15:29:17 +02:00
Lennart Poettering f0831ed2a0 core: add a new unit method "catchup()"
This is very similar to the existing unit method coldplug() but is
called a bit later. The idea is that that coldplug() restores the unit
state from before any prior reload/restart, i.e. puts the deserialized
state in effect. The catchup() call is then called a bit later, to
catch up with the system state for which we missed notifications while
we were reloading. This is only really useful for mount, swap and device
mount points were we should be careful to generate all missing unit
state change events (i.e. call unit_notify() appropriately) for
everything that happened while we were reloading.
2018-06-07 15:28:50 +02:00
Lennart Poettering 62b0cbb358 core: use safe_fclose() where appropriate 2018-06-07 13:36:19 +02:00
Lennart Poettering 159f1e7666 core: split out early-boot preset logic into a function of its own 2018-06-07 13:36:19 +02:00
Lennart Poettering 9d4c195c64 core: split out bus initialization from manager_setup() 2018-06-07 13:36:19 +02:00
Lennart Poettering a5cc7e5ac1 core: schedule time and timezone change events a bit before .timer elapsation events
We really should make sure that .timer units are dispatched while taking
the newest time/timezone data into account.
2018-06-06 10:55:45 +02:00
Lennart Poettering 4f811d27d6 time-util: introduce common implementation of TFD_TIMER_CANCEL_ON_SET client code
We now use pretty much the same code at three places, let's unify that.
2018-06-06 10:55:45 +02:00
Lennart Poettering bbf5fd8e41 core: subscribe to /etc/localtime timezone changes and update timer elapsation accordingly
Fixes: #8233

This is our first real-life usecase for the new sd_event_add_inotify()
calls we just added.
2018-06-06 10:53:56 +02:00
Lennart Poettering 7feedd18fa core: move destruction of old time event sources to manager_setup_time_change()
It's a bit prettier that day as the function won't silently overwrite
any possibly pre-initialized field, and destroy it right before we
allocate a new event source.
2018-06-06 10:53:56 +02:00
Zbigniew Jędrzejewski-Szmek a1230ff972 basic/log: add the log_struct terminator to macro
This way all callers do not need to specify it.
Exhaustively tested by running test-log under valgrind ;)
2018-06-04 13:46:03 +02:00
Yu Watanabe 858d36c1ec path-util: introduce path_simplify()
The function is similar to path_kill_slashes() but also removes
initial './', trailing '/.', and '/./' in the path.
When the second argument of path_simplify() is false, then it
behaves as the same as path_kill_slashes(). Hence, this also
replaces path_kill_slashes() with path_simplify().
2018-06-03 23:39:26 +09:00
David Tardon a7a7163df7 fix race between daemon-reload and other commands
When "systemctl daemon-reload" is run at the same time as "systemctl
start foo", the latter might hang. That's because commands like start
wait for JobRemoved signal to know when the job is finished. But if the
job is finished during reloading, the signal is never sent.

The hang can be easily reproduced by running

    # for ((N=1; N>0; N++)) ; do echo $N ; systemctl daemon-reload ; done
    # for ((N=1; N>0; N++)) ; do echo $N ; systemctl start systemd-coredump.socket ; done

in two different terminals. The start command will hang after 1-2
iterations.

This keeps track of jobs that were started before reload and finished
during it and sends JobRemoved after the reload has finished.
2018-05-19 11:37:00 +02:00
Lennart Poettering 6f8fa29465
Merge pull request #8981 from keszybz/ratelimit-and-dbus
Ratelimit renaming and dbus error message fix
2018-05-18 21:38:30 +02:00
Lennart Poettering 31ce987c2b rlimit-util: add a common destructor call for arrays of struct rlimit 2018-05-17 20:36:52 +02:00
Felipe Sateler 57b7a260c2 core: undo the dependency inversion between unit.h and all unit types 2018-05-15 14:24:34 -04:00
Zbigniew Jędrzejewski-Szmek 930c124c3f pid1: do not write invalid utf-8 in error message
We'd write a sequence that was invalid unicode and this caused the d-bus
connection to be terminated:

$ busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/dbus_2esocket org.freedesktop.systemd1.Unit SubState
s "running"
$ busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/dbus_e2socket org.freedesktop.systemd1.Unit SubState
Remote peer disconnected
$ busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/dbus_e2socket org.freedesktop.systemd1.Unit SubState
(hangs)

Fixes #8978.
2018-05-13 22:08:30 +02:00
Zbigniew Jędrzejewski-Szmek 7994ac1d85 Rename ratelimit_test to ratelimit_below
When I see "test", I have to think three times what the return value
means. With "below" this is immediately clear. ratelimit_below(&limit)
sounds almost like English and is imho immediately obvious.

(I also considered ratelimit_ok, but this strongly implies that being under the
limit is somehow better. Most of the times this is true, but then we use the
ratelimit to detect triple-c-a-d, and "ok" doesn't fit so well there.)

C.f. a1bcaa07.
2018-05-13 22:08:30 +02:00
Lennart Poettering da6053d0a7 tree-wide: be more careful with the type of array sizes
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.

Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.

So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.

This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:

1. strv_length()' return type becomes size_t

2. the unit file changes array size becomes size_t

3. DNS answer and query array sizes become size_t

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
2018-04-27 14:29:06 +02:00
Zbigniew Jędrzejewski-Szmek a1113e0865 core/manager: make manager_enumerate() static 2018-04-24 11:44:19 +02:00
Zbigniew Jędrzejewski-Szmek 94b01dae47 core/manager: trivial simplification 2018-04-24 11:44:19 +02:00
Lennart Poettering 2cb36f7c1e
Merge pull request #8575 from keszybz/non-absolute-paths
Do not require absolute paths in ExecStart and friends
2018-04-17 15:54:10 +02:00
Zbigniew Jędrzejewski-Szmek 4109ede778 core/manager: split out function to verify that unit is loaded and not masked
No functional change.
2018-04-16 16:07:27 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Yu Watanabe 1cc6c93a95 tree-wide: use TAKE_PTR() and TAKE_FD() macros 2018-04-05 14:26:26 +09:00
Michal Sekletar 19496554e2 core: delay adding target dependencies until all units are loaded and aliases resolved (#8381)
Currently we add target dependencies while we are loading units. This
can create ordering loops even if configuration doesn't contain any
loop. Take for example following configuration,

$ systemctl get-default
multi-user.target

$ cat /etc/systemd/system/test.service
[Unit]
After=default.target

[Service]
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target

If we encounter such unit file early during manager start-up (e.g. load
queue is dispatched while enumerating devices due to SYSTEMD_WANTS in
udev rules) we would add stub unit default.target and we order it Before
test.service. At the same time we add implicit Before to
multi-user.target. Later we merge two units and we create ordering cycle
in the process.

To fix the issue we will now never add any target dependencies until we
loaded all the unit files and resolved all the aliases.
2018-03-23 15:28:06 +01:00
Lennart Poettering c10d6bdb89 macro: introduce new TAKE_FD() macro
This is similar to TAKE_PTR() but operates on file descriptors, and thus
assigns -1 to the fd parameter after returning it.

Removes 60 lines from our codebase. Pretty good too I think.
2018-03-22 20:30:40 +01:00
Lennart Poettering 31dc1ca3bf move MANAGER_IS_RELOADING() check into manager_recheck_{dbus|journal}() (#8510)
Let's better check this inside of the call than before it, so that we
never issue this while reloading, even should these calls be called due
to other reasons than just the unit notify.

This makes sure the reload state is unset a bit earlier in
manager_reload() so that we can safely call this function from there and
they do the right thing.

Follow-up for e63ebf71ed.
2018-03-21 12:03:45 +01:00
Yu Watanabe bcabcde5d2
Merge pull request #8408 from keszybz/ln-relative
bugs.fd.o bug archelogy
2018-03-19 18:32:30 +09:00
Zbigniew Jędrzejewski-Szmek ba0c7754d8 core/manager: move some comments to a better place 2018-03-16 23:15:54 +01:00
Zbigniew Jędrzejewski-Szmek e8112e67e4 Make MANAGER_TEST_RUN_MINIMAL just allocate data structures
When running tests like test-unit-name, there is not point in setting
up the cgroup and signals and interacting with the environment. Similarly
when running fuzz testing of the parser.

Add new MANAGER_TEST_RUN_BASIC which takes the role of MANAGER_TEST_RUN_MINIMAL,
and redefine MANAGER_TEST_RUN_MINIMAL to just create the basic data structures.
2018-03-11 16:33:59 +01:00
Zbigniew Jędrzejewski-Szmek dc409696cf Introduce _cleanup_(unit_freep) 2018-03-11 16:33:58 +01:00
Zbigniew Jędrzejewski-Szmek c70cac548a Introduce _cleanup_(manager_freep) 2018-03-11 16:33:57 +01:00
Zbigniew Jędrzejewski-Szmek 8750ac0238 pid1: make use of high rt signals on hppa with newer kernels
Back in 4dffec1459 we stopped using SIGRTMIN+26
and higher on hppa because they were not available. Then they became available
in linux 3.18:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f25df2eff5b25f52c139d3ff31bc883eee9a0ab

Instead of hard-coding the list based on architecture, let's use a runtime
check like signal(7) says.

(A note about implementation: RTSIG_IF_AVAILABLE is defined to take the full
signal and not just an offset from SIGRTMIN so that it's still possible to
grep for SIGRTMIN\+.)

Add a simple "test" to print the signal values.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=84931.
2018-03-09 10:35:33 +01:00
Yu Watanabe a1d32bac2a
Revert "core: don't setup init.scope in test mode (#8380)" (#8390)
This reverts commit a9e8ecf037,
as it breaks test-path.

Fixes #8389.
2018-03-08 15:29:19 +09:00
Michal Sekletar a9e8ecf037 core: don't setup init.scope in test mode (#8380)
Reproducer:

$ meson build && cd build
$ ninja
$ sudo useradd test
$ sudo su test
$ ./systemd --system --test
...
Failed to create /user.slice/user-1000.slice/session-6.scope/init.scope control group: Permission denied
Failed to allocate manager object: Permission denied

Above error message is caused by the fact that user test didn't have its
own session and we tried to set up init.scope already running as user
test in the directory owned by different user.

Let's skip setting up init.scope altogether since we won't be launching
processes anyway.
2018-03-07 16:41:41 +01:00
Lennart Poettering e0a085811d core: don't process dbus unit and job queue when there are already too many messages pending
We maintain a queue of units and jobs that we are supposed to generate
change/new notifications for because they were either just created or
some of their property has changed. Let's throttle processing of this
queue a bit: as soon as > 1K of bus messages are queued for writing
let's skip processing the queue, and then recheck on the next
iteration again.

Moreover, never process more than 100 units in one go, return to the
event loop after that. Both limits together should put effective limits
on both space and time usage of the function, delaying further
operations until a later moment, when the queue is empty or the the
event loop is sufficiently idle again.

This should keep the number of generated messages much lower than
before on busy systems or where some client is hanging.

Note that this also means a bad client can slow down message dispatching
substantially for up to 90s if it likes to, for all clients. But that
should be acceptable as we only allow trusted bus clients, anyway.

Fixes: #8166
2018-02-27 19:54:29 +01:00
Lennart Poettering 30663b6c25
Merge pull request #8199 from keszybz/small-things
Sundry small cleanups
2018-02-19 16:55:10 +01:00
Zbigniew Jędrzejewski-Szmek 9ecdba8cb7 Move config_parse_join_controllers to shared, add test
config_parse_join_controllers would free the destination argument on failure,
which is contrary to our normal style, where failed parsing has no effect.
Moving it to shared also allows a test to be added.
2018-02-19 15:02:13 +01:00
Lennart Poettering a94ab7acfd
Merge pull request #8175 from keszybz/gc-cleanup
Garbage collection cleanup
2018-02-15 17:47:37 +01:00
Lennart Poettering 476a8618fc
Merge pull request #8150 from poettering/memory-accounting-by-default
pid1: turn memory accounting on by default now
2018-02-15 17:22:36 +01:00
Zbigniew Jędrzejewski-Szmek 648461c07d Merge pull request #8125 from poettering/cgroups-migrate
Trivial merge conflict resolved locally.
2018-02-15 16:15:45 +01:00
Zbigniew Jędrzejewski-Szmek 2ab3050f6e pid1: rename job_check_gc to job_may_gc
The reasoning is the same as for unit_can_gc.

v2:
- rename can_gc to may_gc
2018-02-15 14:09:40 +01:00
Zbigniew Jędrzejewski-Szmek 2641f02e23 pid1: fix collection of cycles of units which reference one another
A .socket will reference a .service unit, by registering a UnitRef with the
.service unit. If this .service unit has the .socket unit listed in Wants or
Sockets or such, a cycle will be created. We would not free this cycle
properly, because we treated any unit with non-empty refs as uncollectable. To
solve this issue, treats refs with UnitRef in u->refs_by_target similarly to
the refs in u->dependencies, and check if the "other" unit is known to be
needed. If it is not needed, do not treat the reference from it as preventing
the unit we are looking at from being freed.
2018-02-15 13:32:53 +01:00
Zbigniew Jędrzejewski-Szmek f2f725e5cc pid1: rename unit_check_gc to unit_may_gc
"check" is unclear: what is true, what is false? Let's rename to "can_gc" and
revert the return value ("positive" values are easier to grok).

v2:
- rename from unit_can_gc to unit_may_gc
2018-02-15 13:04:12 +01:00
Zbigniew Jędrzejewski-Szmek 444d586333 meson: add -Dmemory-accounting-default=true|false
This makes it easy to set the default for distributions and users which want to
default to off because they primarily use older kernels.
2018-02-15 12:02:41 +01:00
Zbigniew Jędrzejewski-Szmek 04a5236233
Merge pull request #8144 from poettering/journal-inotify-fixes
various journal fixes
2018-02-14 13:52:17 +01:00
Alan Jenkins 8afabc5090 manager: avoid infinite loop for unexpected waitid() error (#8168)
I think if we log the error as being _ignored_, we should also consider
the event as handled and clear it.  This was the behaviour prior to
575b300b (PR #7968).

I don't think we particularly wanted to change behaviour and keep retrying.
Sometimes that's useful, other times you cause more problems by filling the
logs.

Plus a nearby typo fix.
2018-02-13 19:04:31 +01:00
Lennart Poettering 5f109056d5 core: delay bus name synchronization after reload/reexec into a later event loop iteration
Previously, we'd synchronize bus names immediately when we succeeded
connecting to the bus, potentially even before coldplugging the units.
This was problematic, as synchronizing bus names meant invoking the
per-unit name change handler function which might change the unit's
state — which will result in consistency when done before we coldplug
things.

With this change we instead enqueue a job for the event loop to resync
the names in a later loop iteration, i.e. at a point where we know
coldplugging has finished.
2018-02-12 11:34:00 +01:00
Lennart Poettering cedf508886 core: simplify manager_recheck_journal() a bit
No need for an if check if we just pass along a bool anyway.
2018-02-12 11:34:00 +01:00
Lennart Poettering 217677abb0 core: tweak manager_journal_is_running() a bit more
Let's also use the journal if it is currently reloading. In that state
it should also be able to process our requests. Moreover, we might
otherwise end up disconnecting/reconnecting from the journal without
really any need to hence, relax the check accordingly.
2018-02-12 11:34:00 +01:00
Lennart Poettering 7d814a197a manager: tweak manager_journal_is_running() a bit regarding test mode
In test mode, let's not consider the journal to be up ever: we want all
output to go to stderr.
2018-02-12 11:34:00 +01:00
Lennart Poettering 8559b3b75c core: rework how we connect to the bus
This removes the current bus_init() call, as it had multiple problems:
it munged  handling of the three bus connections we care about (private,
"api" and system) into one, even though the conditions when which was
ready are very different. It also added redundant logging, as the
individual calls it called all logged on their own anyway.

The three calls bus_init_api(), bus_init_private() and bus_init_system()
are now made public. A new call manager_dbus_is_running() is added that
works much like manager_journal_is_running() and is a lot more careful
when checking whether dbus is around. Optionally it checks the unit's
deserialized_state rather than state, in order to accomodate for cases
where we cant to connect to the bus before deserializing the
"subscribed" list, before coldplugging the units.

manager_recheck_dbus() is added, that works a lot like
manager_recheck_journal() and is invoked in unit_notify(), i.e. when
units change state.

All in all this should make handling a bit more alike to journal
handling, and it also fixes one major bug: when running in user mode
we'll now connect to the system bus early on, without conditionalizing
this in anyway.
2018-02-12 11:34:00 +01:00
Lennart Poettering 004c7f169e core: fold manager_set_exec_params() into unit_set_exec_params()
Let's simplify things a bit: we so far called both functions every
single time, let's just merge one into the other, so that we have fewer
functions to call.
2018-02-12 11:34:00 +01:00
Lennart Poettering 548f69375e tree-wide: use path_hash_ops instead of string_hash_ops whenever we key by a path
Let's make use of our new hash_ops!
2018-02-12 11:07:55 +01:00
Lennart Poettering e0c46a7364 pid1: turn memory accounting on by default now
After discussions with @htejun it appears it's OK now to enable memory
accounting by default for all units without affecting system performance
too badly. facebook has made good experiences with deploying memory
accounting across their infrastructure.

This hence turns MemoryAccounting= from opt-in to opt-out, similar to
how TasksAccounting= is already handled. The other accounting options
remain off, their performance impact is too big still.
2018-02-09 20:06:33 +01:00
Yu Watanabe e8a565cb66 core: make ExecRuntime be manager managed object
Before this, each ExecRuntime object is owned by a unit. However,
it may be shared with other units which enable JoinsNamespaceOf=.
Thus, by the serialization/deserialization process, its sharing
information, more specifically, reference counter is lost, and
causes issue #7790.

This makes ExecRuntime objects be managed by manager, and changes
the serialization/deserialization process.

Fixes #7790.
2018-02-06 16:00:34 +09:00
Alan Jenkins cc2b9e6b20 rationalize interface for opening/closing logging
log_open_console() did not switch from stderr to /dev/console, when
"always_reopen_console" was set.  It was necessary to call
log_close_console() first.

By contrast, log_open() did switch between e.g. journald and kmsg according
to the value of "prohibit_ipc".

Let's fix log_open() to respect the values of all the log options, and we
can make log_close_*() private.

Also log_close_console() is changed.  There was some precaution, avoiding
closing the console fd if we are not PID 1.  I think commit 48a601fe made
a little mistake in leaving this in, and it only served to confuse
readers :).

Also I changed systemd-shutdown. Now we have log_set_prohibit_ipc(), let's
use it to clarify that systemd-shutdown is not expected to try and log via
journald (which it is about to kill).  We avoided ever asking it to, but
it's more convenient for the reader if they don't have to think about that.
In that sense, it's similar to using assert() to validate a function's
arguments.
2018-01-27 18:01:51 +00:00
Alan Jenkins ba30753899 pid1: when we can't log to journal, remember our fallback log target
If we have to force the logging to close the journal fd, then we can open
any fallback log target.  E.g. kmsg, if the target was the default
JOURNAL_OR_KMSG.

This is the behaviour I would expect from the documentation.  I couldn't
find any justification in the code, for why we would want to start dropping
log messages instead of sending them to the fallback target.

This means we will match the behaviour of processes which we fork and which
set `open_when_needed`, and with generators - which use
log_set_prohibit_ipc(true) - which we fork+exec during a reload.

IMO this illustrates that the log_open/log_close interface is too clunky.
So with the behaviour settled, I will refactor the interface in the next
commit :).
2018-01-26 22:47:16 +00:00
Zbigniew Jędrzejewski-Szmek dc3c9f5e36 core: initalize buffer 2018-01-26 00:59:23 +09:00
Yu Watanabe dd1db3c288 core: manager logs firmware and loader time when startup finished 2018-01-26 00:59:20 +09:00
Zbigniew Jędrzejewski-Szmek 5eb83fa645
Merge pull request #7991 from poettering/n-on-console
a comprehensive fix for the n_on_console miscounting issue
2018-01-25 13:48:08 +03:00
Lennart Poettering adefcf2821 core: rework how we count the n_on_console counter
Let's add a per-unit boolean that tells us whether our unit is currently
counted or not. This way it's unlikely we get out of sync again and
things are generally more robust.

This also allows us to remove the counting logic specific to service
units (which was in fact mostly a copy from the generic implementation),
in favour of fully generic code.

Replaces: #7824
2018-01-24 20:14:51 +01:00
Lennart Poettering 46fb617bf9 manager: minor manager_get_show_status() simplification
Since the the whole function ultimately is just a fancy getter for the
show_status field, let's actually return it as last step literally
without an extra needless "if".
2018-01-24 19:52:29 +01:00
Lennart Poettering 5a69973ff2 manager: add some explanatory comments to manager_dispatch_idle_pipe_fd() 2018-01-24 19:52:14 +01:00
Lennart Poettering d075092f14 pid1: make use of new "prohibit_ipc" logging flag in PID 1
Let's set it initially, and then toggle it only when we know its safe.
2018-01-24 18:22:56 +01:00
Lennart Poettering 62a769136d core: rework how we track which PIDs to watch for a unit
Previously, we'd maintain two hashmaps keyed by PIDs, pointing to Unit
interested in SIGCHLD events for them. This scheme allowed a specific
PID to be watched by exactly 0, 1 or 2 units.

With this rework this is replaced by a single hashmap which is primarily
keyed by the PID and points to a Unit interested in it. However, it
optionally also keyed by the negated PID, in which case it points to a
NULL terminated array of additional Unit objects also interested. This
scheme means arbitrary numbers of Units may now watch the same PID.

Runtime and memory behaviour should not be impact by this change, as for
the common case (i.e. each PID only watched by a single unit) behaviour
stays the same, but for the uncommon case (a PID watched by more than
one unit) we only pay with a single additional memory allocation for the
array.

Why this all? Primarily, because allowing exactly two units to watch a
specific PID is not sufficient for some niche cases, as processes can
belong to more than one unit these days:

1. sd_notify() with MAINPID= can be used to attach a process from a
   different cgroup to multiple units.

2. Similar, the PIDFile= setting in unit files can be used for similar
   setups,

3. By creating a scope unit a main process of a service may join a
   different unit, too.

4. On cgroupsv1 we frequently end up watching all processes remaining in
   a scope, and if a process opens lots of scopes one after the other it
   might thus end up being watch by many of them.

This patch hence removes the 2-unit-per-PID limit. It also makes a
couple of other changes, some of them quite relevant:

- manager_get_unit_by_pid() (and the bus call wrapping it) when there's
  ambiguity will prefer returning the Unit the process belongs to based on
  cgroup membership, and only check the watch-pids hashmap if that
  fails. This change in logic is probably more in line with what people
  expect and makes things more stable as each process can belong to
  exactly one cgroup only.

- Every SIGCHLD event is now dispatched to all units interested in its
  PID. Previously, there was some magic conditionalization: the SIGCHLD
  would only be dispatched to the unit if it was only interested in a
  single PID only, or the PID belonged to the control or main PID or we
  didn't dispatch a signle SIGCHLD to the unit in the current event loop
  iteration yet. These rules were quite arbitrary and also redundant as
  the the per-unit handlers would filter the PIDs anyway a second time.
  With this change we'll hence relax the rules: all we do now is
  dispatch every SIGCHLD event exactly once to each unit interested in
  it, and it's up to the unit to then use or ignore this. We use a
  generation counter in the unit to ensure that we only invoke the unit
  handler once for each event, protecting us from confusion if a unit is
  both associated with a specific PID through cgroup membership and
  through the "watch_pids" logic. It also protects us from being
  confused if the "watch_pids" hashmap is altered while we are
  dispatching to it (which is a very likely case).

- sd_notify() message dispatching has been reworked to be very similar
  to SIGCHLD handling now. A generation counter is used for dispatching
  as well.

This also adds a new test that validates that "watch_pid" registration
and unregstration works correctly.
2018-01-23 21:29:31 +01:00
Lennart Poettering 575b300b79 pid1: rework how we dispatch SIGCHLD and other signals
This fundamentally makes one change: we never process more than one
signal or more than one waitid() event per event loop. We'll never tight
loop around waitid() or around read() on our signalfd instead, but
always return to the main event loop after processing one event.

By doing this we put the event priorization handling into full power
again, as we'll always check for higher priority events before looking
at the next signal or waitid() again.

This introduces a new "defer" event source "sigchld_event". It's enabled
as soon as we see SIGCHLD, and disabled as soon as waitid() reported no
further children pending. It's running at a relatively high priority,
one step higher than signal handling itself, but lower than
/proc/self/mountinfo event handling, so that the latter always takes
precedence.

Since we want to process sd_notify() events at an even higher priority
than SIGCHLD (as before) it is moved one priority step up, too.

Fixes: #7932
Possibly fixes: #7966
2018-01-23 18:41:40 +01:00
Lennart Poettering 67ae4e8d59 core: move user lookup event priority to -11
This is internal stuff, us talking to ourselves and relatively
independent of everything else, let's put this at
highest priority hence.
2018-01-23 18:15:16 +01:00
Lennart Poettering 4259d20215 manager: add MANAGER_IS_RUNNING() for checking whether the manager is running
This macro is useful as the check is not obvious, and we better abstract
this away.
2018-01-23 16:43:56 +01:00
Lennart Poettering 4adf314b77 manager: split out send_ready and basic.target checking into functions of their own
Let's shorten manager_check_finished() a bit by splitting out checking
of basic.target and the two things we do when we reach it.

This should not change behaviour, except for one thing: we now check
basic.target's actual state for figuring out whether it is up, instead
of generically checking whether it has any job queued. This is arguably
more correct, and is what other code does too for similar purposes, for
example manager_state()
2018-01-23 16:39:12 +01:00
Jan Klötzke 2a12e32efa pid1: add option to disable service watchdogs
Add a "systemd.service_watchdogs=" option to the command line which
disables all service runtime watchdogs and emergency actions.
2018-01-22 18:10:03 +01:00
Zbigniew Jędrzejewski-Szmek d8eb10d61a core: delay logging the taint string until after basic.target is reached (#7935)
This happens to be almost the same moment as when we send READY=1 in the user
instance, but the logic is slightly different, since we log taint when
basic.target is reached in the system manager, but we send the notification
only in the user manager. So add a separate flag for this and propagate it
across reloads.

Fixes #7683.
2018-01-21 21:17:54 +09:00
Lennart Poettering db256aab13 core: be stricter when handling PID files and MAINPID sd_notify() messages
Let's be more restrictive when validating PID files and MAINPID=
messages: don't accept PIDs that make no sense, and if the configuration
source is not trusted, don't accept out-of-cgroup PIDs. A configuratin
source is considered trusted when the PID file is owned by root, or the
message was received from root.

This should lock things down a bit, in case service authors write out
PID files from unprivileged code or use NotifyAccess=all with
unprivileged code. Note that doing so was always problematic, just now
it's a bit less problematic.

When we open the PID file we'll now use the CHASE_SAFE chase_symlinks()
logic, to ensure that we won't follow an unpriviled-owned symlink to a
privileged-owned file thinking this was a valid privileged PID file,
even though it really isn't.

Fixes: #6632
2018-01-11 15:12:16 +01:00
Lennart Poettering 15e23e8cdf manager: make use of pid_is_valid() where appropriate 2018-01-11 15:12:16 +01:00
Lennart Poettering 007e4b5490 manager: make use of NEWLINE macro where appropriate 2018-01-11 15:12:16 +01:00
Lennart Poettering da5fb86100 manager: swap order in which we ellipsize/escape sd_notify() messages for debugging
If we have to chose between truncated escape sequences and strings
exploded to 4 times the desried length by fully escaping, prefer the
latter.

It's for debug only, hence doesn't really matter much.
2018-01-11 15:12:16 +01:00
Lennart Poettering 47cf8ff206 manager: rework manager_clean_environment()
Let's rename it manager_sanitize_environment() which is a more precise
name. Moreover, sort the environment implicitly inside it, as all our
callers do that anyway afterwards and we can save some code this way.

Also, update the list of env vars to drop, i.e. the env vars we manage
ourselves and don't want user code to interfear with. Also sort this
list to make it easier to update later on.
2018-01-10 18:30:06 +01:00
Lennart Poettering 665dfe9318 io-util: make flush_fd() return how many bytes where flushed
This is useful so that callers know whether anything at all and how much
was flushed.

This patches through users of this functions to ensure that the return
values > 0 which may be returned now are not propagated in public APIs.

Also, users that ignore the return value are changed to do so explicitly
now.
2018-01-05 13:55:08 +01:00
Lennart Poettering f1d34068ef tree-wide: add DEBUG_LOGGING macro that checks whether debug logging is on (#7645)
This makes things a bit easier to read I think, and also makes sure we
always use the _unlikely_ wrapper around it, which so far we used
sometimes and other times we didn't. Let's clean that up.
2017-12-15 11:09:00 +01:00
Lennart Poettering e3140015a7
Merge pull request #7640 from keszybz/tainting-updates
Tainting updates
2017-12-14 22:57:17 +01:00
Zbigniew Jędrzejewski-Szmek 198ce93248 core: drop taints for nobody user/group names
We have a check and warning at compile time. The user cannot do anything about
this at runtime, and all other taints are about checks that happen at runtime
and are specific to that system (and at least potentially correctable).

(The logic in the compilation-time check was updated to treat "nogroup" as OK,
but not the runtime check. But I think it's better to remove the runtime check
for this altogether, so this becomes moot.)
2017-12-14 22:14:38 +01:00
Lennart Poettering fbd0b64f44
tree-wide: make use of new STRLEN() macro everywhere (#7639)
Let's employ coccinelle to do this for us.

Follow-up for #7625.
2017-12-14 19:02:29 +01:00
Lennart Poettering 0d53667334 tree-wide: use __fsetlocking() instead of fxyz_unlocked()
Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f,
FSETLOCKING_BYCALLER). This turns off locking for the entire FILE*,
instead of doing individual per-call decision whether to use normal
calls or _unlocked() calls.

This has various benefits:

1. It's easier to read and easier not to forget

2. It's more comprehensive, as fprintf() and friends are covered too
   (as these functions have no _unlocked() counterpart)

3. Philosophically, it's a bit more correct, because it's more a
   property of the file handle really whether we ever pass it on to another
   thread, not of the operations we then apply to it.

This patch reworks all pieces of codes that so far used fxyz_unlocked()
calls to use __fsetlocking() instead. It also reworks all places that
use open_memstream(), i.e. use stdio FILE* for string manipulations.

Note that this in some way a revert of 4b61c87511.
2017-12-14 10:42:25 +01:00
Alan Jenkins 0fd402b012 core: fix undefined behaviour due to uninitialized string buffer (#7597)
Failure of systemd to respond on the bus interface was bisected to af6b0ecc
"core: make "taint" string logic a bit more generic and output it at boot".

Failure was presumably caused by trying to append strings to an
unintialized buffer, leading to writing outside the unterminated buffer
and hence undefined behaviour.
2017-12-10 19:58:01 +09:00
Zbigniew Jędrzejewski-Szmek ba60adc623
Merge pull request #7572 from poettering/taint-manager
"taint" logic improvements and other minor fixes
2017-12-07 21:06:28 +01:00
Lennart Poettering 90d7464d83 manager: taint the manager if the overflowuid/overflowgid aren't set to 65534 2017-12-07 12:34:46 +01:00
Lennart Poettering af6b0ecc4c core: make "taint" string logic a bit more generic and output it at boot
The tainting logic existed for a long time, but was hidden inside the
bus interfaces. Let's give it a small bit more coverage, by logging its
value early at boot during initialization.
2017-12-07 11:27:07 +01:00
Lennart Poettering e27fe688f2 manager: don't check /usr state of initrd to determine "taint-usr" taint 2017-12-07 11:09:09 +01:00
Lennart Poettering 5eb397cfad manager: don't bother with creating /run/systemd/units/ in test mode
This makes sure running "systemd --test" works again on systems running
older systemd versions where the dir doesn't exist yet.
2017-12-07 11:07:55 +01:00
Lennart Poettering 279d81dd46 manager: split out code that sets up run_queue event source into function of its own
Let's shorten manager_new() a bit.
2017-12-07 11:02:47 +01:00
Lennart Poettering 45639f1be5 core: never remove "transient" and "control" directories from unit search path
This changes the unit search path logic to never drop the transient and
control directories from the unit search path. This is necessary as we
add new entries to both during runtime, due to the "systemctl
set-property" and transient unit logic.

Previously, the "transient" directory was created during early boot to
deal with this, but the "control" directories were not covered like
that. Creating the control directories early at boot is not possible
however, as /etc might be read-only then, and we do define a persistent
control directory. Hence, let's create these dirs on-demand when we need
them, and make sure the search path clean-up logic never drops them from
the search path even if they are initially missing.

(Also, always create these paths properly labelled)
2017-11-29 12:34:12 +01:00
Lennart Poettering 45a7b16bae core: don't reference rescue/emergency targets in --user mode
They are only defined for system mode, hence let's not check for them in
--user mode.

Follow-up for #7433
2017-11-29 12:34:12 +01:00
Yu Watanabe 706424c2e2 core/manager: check the existance of the special units (#7433)
In the user mode, not all special units exist.
So, we need to check whether the units exist or not before operate
something to the units.
Such the check was mistakenly dropped by e68537f0ba.

Fixes #7426.
2017-11-23 13:25:56 +01:00
Zbigniew Jędrzejewski-Szmek bfbcf21d75
Merge pull request #7406 from poettering/timestamp-rework
timestamping rework
2017-11-22 11:55:04 +01:00
Lennart Poettering e68537f0ba core: make use of unit_active_or_pending() where we can
Let's make use of unit_active_or_pending() where we can. Note that this
change changes beaviour in one specific case: when shutdown.target is
active we'll now also return that the system is in "stopping" state, not
only when we try to get into it. That makes sense as shutdown.target is
ordered before the actually shutdown units such as
"systemd-poweroff.service", and if the state is queried between reaching
those we should also report "stopping".
2017-11-21 11:01:34 +01:00
Lennart Poettering 49d5666cc5 manager: introduce MANAGER_IS_FINISHED() macro
Let's make our finished checks a bit more readable. Checking the
timestamp is not entirely obvious, hence let's abstract that a bit by
adding a macro that shows what we are doing here, not how we doing it.

This is particularly useful if we want to change the definition of
"finished" later on, in particular, when we try to fix #7023.
2017-11-21 11:01:34 +01:00
Lennart Poettering 713f6f901d manager: add manager_get_dump_string()
It's like manager_dump(), but returns a string. This allows us to reduce
some duplicate code. Also, while we are at it, turn off stdio locking
while we write to the memory FILE *f.
2017-11-21 11:01:34 +01:00
Lennart Poettering ad75b9e765 core: add manager_dump() call, and make it output timestamp data
It's a wrapper around manager_dump_units() and manager_dump_jobs(), and
outputs some additional timestamp data.

Also, port two users of this over.
2017-11-21 10:22:28 +01:00
Lennart Poettering 9f9f034271 manager: rework the timestamps logic, so that they are an enum-index array
This makes things quite a bit more systematic I think, as we can
systematically operate on all timestamps, for example for the purpose of
serialization/deserialization.

This rework doesn't necessarily make things shorter in the individual
lines, but it does reduce the line count a bit.

(This is useful particularly when we want to add additional timestamps,
for example to solve #7023)
2017-11-21 10:22:28 +01:00
Shawn Landden 4831981d89 tree-wide: adjust fall through comments so that gcc is happy
Distcc removes comments, making the comment silencing
not work.

I know there was a decision against a macro in commit
ec251fe7d5
2017-11-20 13:06:25 -08:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering fd1306121d core: never apply first boot presets in the initrd
Presets are useful to initialize uninitialized /etc, but that doesn't
apply to the initrd.

Also, let's rename etc_empty → first_boot. After all, the variable
doesn't actually reflect whether /etc is really empty, it just reflects
whether /etc/machine-id existed originally or not. Moreover, we later on
directly initialize manager_set_first_boot() from it, hence let's just
name it the same way all through the codepath, to make this all less
confusing.

See: #7100
2017-11-17 11:28:17 +01:00
Lennart Poettering d3070fbdf6 core: implement /run/systemd/units/-based path for passing unit info from PID 1 to journald
And let's make use of it to implement two new unit settings with it:

1. LogLevelMax= is a new per-unit setting that may be used to configure
   log priority filtering: set it to LogLevelMax=notice and only
   messages of level "notice" and lower (i.e. more important) will be
   processed, all others are dropped.

2. LogExtraFields= is a new per-unit setting for configuring per-unit
   journal fields, that are implicitly included in every log record
   generated by the unit's processes. It takes field/value pairs in the
   form of FOO=BAR.

Also, related to this, one exisiting unit setting is ported to this new
facility:

3. The invocation ID is now pulled from /run/systemd/units/ instead of
   cgroupfs xattrs. This substantially relaxes requirements of systemd
   on the kernel version and the privileges it runs with (specifically,
   cgroupfs xattrs are not available in containers, since they are
   stored in kernel memory, and hence are unsafe to permit to lesser
   privileged code).

/run/systemd/units/ is a new directory, which contains a number of files
and symlinks encoding the above information. PID 1 creates and manages
these files, and journald reads them from there.

Note that this is supposed to be a direct path between PID 1 and the
journal only, due to the special runtime environment the journal runs
in. Normally, today we shouldn't introduce new interfaces that (mis-)use
a file system as IPC framework, and instead just an IPC system, but this
is very hard to do between the journal and PID 1, as long as the IPC
system is a subject PID 1 manages, and itself a client to the journal.

This patch cleans up a couple of types used in journal code:
specifically we switch to size_t for a couple of memory-sizing values,
as size_t is the right choice for everything that is memory.

Fixes: #4089
Fixes: #3041
Fixes: #4441
2017-11-16 12:40:17 +01:00
Zbigniew Jędrzejewski-Szmek 17f01ace62 core/manager: just return an error if we fail halfway
We would continue, but still return an error at the end. This isn't useful
because we'd still error-out in main().

Also, add a missing error message when we fail to mkdir.
2017-11-15 22:58:24 +01:00
Lennart Poettering eef85c4a3f core: track why unit dependencies came to be
This replaces the dependencies Set* objects by Hashmap* objects, where
the key is the depending Unit, and the value is a bitmask encoding why
the specific dependency was created.

The bitmask contains a number of different, defined bits, that indicate
why dependencies exist, for example whether they are created due to
explicitly configured deps in files, by udev rules or implicitly.

Note that memory usage is not increased by this change, even though we
store more information, as we manage to encode the bit mask inside the
value pointer each Hashmap entry contains.

Why this all? When we know how a dependency came to be, we can update
dependencies correctly when a configuration source changes but others
are left unaltered. Specifically:

1. We can fix UDEV_WANTS dependency generation: so far we kept adding
   dependencies configured that way, but if a device lost such a
   dependency we couldn't them again as there was no scheme for removing
   of dependencies in place.

2. We can implement "pin-pointed" reload of unit files. If we know what
   dependencies were created as result of configuration in a unit file,
   then we know what to flush out when we want to reload it.

3. It's useful for debugging: "systemd-analyze dump" now shows
   this information, helping substantially with understanding how
   systemd's dependency tree came to be the way it came to be.
2017-11-10 19:45:29 +01:00
Michal Sekletar 41dfa61d35 manager: fix connecting to bus when dbus is actually around (#7205)
manager_connect_bus() is called *before* manager_coldplug(). As a last
thing in service_coldplug() we set service state to
s->deserialized_state, and thus before we do that all services are
inactive and try_connect always evaluates to false. To fix that we must
look at deserialized state instead of current unit state.

Fixes #7146
2017-11-01 10:25:48 +01:00
Zbigniew Jędrzejewski-Szmek 0c2826c60c core: in --user mode, report READY=1 as soon as basic.target is reached (#7102)
When a user logs in, systemd-pam will wait for the user manager instance to
report readiness. We don't need to wait for all the jobs to finish, it
is enough if the basic startup is done and the user manager is responsive.

systemd --user will now send out a READY=1 notification when either of two
conditions becomes true:
- basic.target/start job is gone,
- the initial transaction is done.

Also fixes #2863.
2017-10-24 14:48:54 +02:00
Lennart Poettering 4aa1d31c89 Merge pull request #6974 from keszybz/clean-up-defines
Clean up define definitions
2017-10-04 19:25:30 +02:00
Yu Watanabe 4c70109600 tree-wide: use IN_SET macro (#6977) 2017-10-04 16:01:32 +02:00
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that is the name is mispellt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
Lennart Poettering c621849539 core: fix special directories for user services
The system paths were listed where the user paths should have been
listed. Correct that.
2017-10-02 17:41:44 +02:00
Lennart Poettering 72fd17682d core: usually our enum's _INVALID and _MAX special values are named after the full type
In most cases we followed the rule that the special _INVALID and _MAX
values we use in our enums use the full type name as prefix (in contrast
to regular values that we often make shorter), do so for
ExecDirectoryType as well.

No functional changes, just a little bit of renaming to make this code
more like the rest.
2017-10-02 17:41:43 +02:00
Andreas Rammhold ec2ce0c5d7
tree-wide: use `!IN_SET(..)` for `a != b && a != c && …`
The included cocci was used to generate the changes.

Thanks to @flo-wer for pointing this case out.
2017-10-02 13:09:56 +02:00
Andreas Rammhold 3742095b27
tree-wide: use IN_SET where possible
In addition to the changes from #6933 this handles cases that could be
matched with the included cocci file.
2017-10-02 13:09:54 +02:00
Lennart Poettering 09e2465407 cgroup: after determining that a cgroup is empty, asynchronously dispatch this
This makes sure that if we learn via inotify or another event source
that a cgroup is empty, and we checked that this is indeed the case (as
we might get spurious notifications through inotify, as the inotify
logic through the "cgroups.event" is pretty unspecific and might be
trigger for a variety of reasons), then we'll enqueue a defer event for
it, at a priority lower than SIGCHLD handling, so that we know for sure
that if there's waitid() data for a process we used it before
considering the cgroup empty notification.

Fixes: #6608
2017-09-27 18:26:18 +02:00
Lennart Poettering 91a6073ef7 core: rename cgroup_queue → cgroup_realize_queue
We are about to add second cgroup-related queue, called
"cgroup_empty_queue", hence let's rename "cgroup_queue" to
"cgroup_realize_queue" (as that is its purpose) to minimize confusion
about the two queues.

Just a rename, no functional changes.
2017-09-27 17:59:25 +02:00
Lennart Poettering f1c50becda core: make sure to log invocation ID of units also when doing structured logging 2017-09-22 15:24:55 +02:00
Lennart Poettering bd389aa734 manager: initialize timeouts when allocating a naked Manager object
This way we can safely run manager objects from tests and good timeouts
apply. Without this all timeouts are set 0, which means they fire
instantly, when run from tests which do not explicitly configure them
(the way main.c does).
2017-09-22 15:24:54 +02:00
Zbigniew Jędrzejewski-Szmek e0a3da1fd2 Make test_run into a flags field and disable generators again
Now generators are only run in systemd --test mode, where this makes
most sense (how are you going to test what would happen otherwise?).

Fixes #6842.

v2:
- rename test_run to test_run_flags
2017-09-19 20:14:05 +02:00
Zbigniew Jędrzejewski-Szmek a1f31f4715 core/manager: when running in test mode, use a temp dir for generated stuff
When running through systemd-analyze verify or with --test, we would
not run generators (environment or unit). But at the end, we would nuke
the generator dirs anyway.

Simplify things by actually running generators of both types, but redirecting
their output to a temporary directory. This has the advantage that we test more
code, and the verification is more complete.

Since now we are not touching the real generator directories, we also don't
delete them, which fixes #5609.
2017-09-14 19:41:24 +02:00
Zbigniew Jędrzejewski-Szmek 81fe6cdee2 pid1: improve the check guarding unit_file_preset_all()
When running in systemd-analyze verify, first_boot was initialized to -1
and never changed, so we'd try to run unit_file_preset_all(). Change the
check to > 0 which is more correct. Also, add a separate test for !test_run,
since we wouldn't want to run presets even if we were in first boot
(or /etc was empty for whatever other reason).
2017-09-14 19:07:44 +02:00
Zbigniew Jędrzejewski-Szmek c5aaaebced Merge pull request #6780 from poettering/agent-message
Three minor fixes.
2017-09-09 22:32:37 +02:00
Lennart Poettering d5f1532657 core: downgrade log message about inability to propagate cgroup release message
If dbus is already down during shutdown, we can't propagate the cgroup
release message anymore, but that's expected and nothing to warn about.
Hence let's downgrade the message from LOG_WARN to LOG_DEBUG.

Fixes: #6777
2017-09-08 17:24:57 +02:00
Michal Sekletar 5463fa0a88 manager: when reexecuting try to connect to bus only when dbus.service is around (#6773)
Trying to connect otherwise is pointless, because if socket isn't around
we won't connect. However, when dbus.socket is present we attempt to
connect. That attempt can't succeed because we are then supposed
to activate dbus.service as a response to connection from
us. This results in deadlock.

Fixes #6303
2017-09-08 15:41:44 +02:00
Alan Jenkins d60cb656fc manager: fix job mode when signalled to shutdown etc
The irreversible job mode is required to ensure that shutdown is not
interrupted by the activation of a unit with a conflict.

We already used the correct job mode for `ctrl-alt-del.target`.  But not
for `exit.target` (SIGINT of user manager).  The SIGRT shutdown signals
also needed fixing.

Also change SIGRTMIN+0 to isolate default.target, instead of starting
it.  The previous behaviour was documented.  However there was no reason
given for it, nor can we provide one.  The problem that isolate is too
aggressive anywhere outside of emergency.target (#2607) is orthogonal.
This feature is "accessible by different means and only really a safety
net"; it is confusing for it to differ from `systemctl default` without
explanation.

`AllowIsolate=yes` is retained on poweroff.target etc. for backwards
compatibility.

`sigpwr.target` is also an obvious candidate for linking to a shutdown
target.  Unforunately it is also a possible hook for implementing some
logic like system V init did, reading `/etc/powerstatus`.  If we switched
to starting `sigpwr.target` with REPLACE_IRREVERSIBLY, attempts to run
`systemctl shutdown` from it would fail, if they had not thought to set
`DefaultDependencies=no`.  We had provided no examples for `sigpwr`, and
the whole idea is cruft to keep legacy people happy.  For the moment, I
leave `sigpwr` alone, with no risk of disrupting anyone's
previously-working, half-working, or untested setup.

Fixes #6484.  See also #6471
2017-08-31 16:17:42 +01:00
Alan Jenkins c75fbadac6 manager: remove fallback for user/exit.target
The comment here was misleading: the job can fail to enqueue for reasons
other than the target not existing.

The fallback caused an error to be logged, and dates back to when the
"user" directory was named "session".  units/session/exit.target was added
later the same year.

This is consistent with the documentation (man systemd), and the handling
of similar signals.  It's also consistent with `systemctl exit`, which is
what most people would expect.
2017-08-31 16:17:41 +01:00
Lennart Poettering 19bbdd985e core: manager_set_exec_params() cannot fail, hence make it void
Let's simplify things a bit.
2017-08-10 15:02:50 +02:00
Lennart Poettering 8679efde21 execute: add one more ExecFlags flag, for controlling unconditional directory chowning
Let's decouple the Manager object from the execution logic a bit more
here too, and simply pass along the fact whether we should
unconditionally chown the runtime/... directories via the ExecFlags
field too.
2017-08-10 14:44:58 +02:00
Lennart Poettering af635cf377 execute: let's decouple execute.c a bit from the unit logic
Let's try to decouple the execution engine a bit from the Unit/Manager
concept, and hence pass one more flag as part of the ExecParameters flags
field.
2017-08-10 14:44:58 +02:00
Jouke Witteveen 15d167f8a3 core: propagate reload from RELOADING=1 notification (#6550) 2017-08-07 11:27:24 +02:00
Luca Bruno 28dd66ecfc core: evaluate presets after generators have run (#6526)
This commit moves the first-boot system preset-settings evaluation out
of main and into the manager startup logic itself. Notably, it reverses
the order between generators and presets evaluation, so that any changes
performed by first-boot generators are taken into the account by presets
logic.

After this change, units created by a generator can be enabled as part
of a preset.
2017-08-06 09:24:24 -04:00
Zbigniew Jędrzejewski-Szmek 0742986650 core: properly handle deserialization of unknown unit types (#6476)
We just abort startup, without printing any error. Make sure we always
print something, and when we cannot deserialize some unit, just ignore it and
continue.

Fixup for 4bc5d27b94. Without this, we would hang
in daemon-reexec after upgrade.
2017-07-31 08:05:35 +02:00
Lennart Poettering 4b61c87511 tree-wide: fput[cs]() → fput[cs]_unlocked() wherever that makes sense (#6396)
As a follow-up for db3f45e2d2 let's do the
same for all other cases where we create a FILE* with local scope and
know that no other threads hence can have access to it.

For most cases this shouldn't change much really, but this should speed
dbus introspection and calender time formatting up a bit.
2017-07-21 10:35:45 +02:00
Yu Watanabe 35aba85a73 core/manager: fix memory leak (#6400)
This fixes the memory leak introduced by 3536f49e8f,
which forgot to free the prefixes in the manager.

Fixes #6398.
2017-07-18 17:30:52 +03:00
Yu Watanabe 3536f49e8f core: add {State,Cache,Log,Configuration}Directory= (#6384)
This introduces {State,Cache,Log,Configuration}Directory= those are
similar to RuntimeDirectory=. They create the directories under
/var/lib, /var/cache/, /var/log, or /etc, respectively, with the mode
specified in {State,Cache,Log,Configuration}DirectoryMode=.

This also fixes #6391.
2017-07-18 14:34:52 +02:00
Zbigniew Jędrzejewski-Szmek d233c99ac8 manager: just warn about an invalid environment entry
Apart from bugs (as in #6152), this can happen if we ever make
our requirements for environment entries more stringent. As with
the rest of deserialization, we should just warn and continue.
2017-06-23 20:46:33 -04:00
Zbigniew Jędrzejewski-Szmek 62c460c6e2 manager: raise level of notices about failed deserialization
This is something that shouldn't happen. If it does, we want to
know about it.
2017-06-23 20:46:33 -04:00
Lennart Poettering 00c83b4300 core: return a friendlier error for a dbus path referring to a non-existent unit
See: #6059
2017-06-22 20:54:54 -04:00
Lennart Poettering c22800e40e cgroup: rename cg_unified() → cg_unified_controller()
cg_unified() is a bit generic a name, let's make clear that it checks
whether a specified controller is in unified mode.
2017-02-24 18:00:04 +01:00
Lennart Poettering b4cccbc13a cgroup: change cg_unified() to possibly return errors again
We use our cgroup APIs in various contexts, including from our libraries
sd-login, sd-bus. As we don#t control those environments we can't rely
that the unified cgroup setup logic succeeds, and hence really shouldn't
assert on it.

This more or less reverts 415fc41cea.
2017-02-24 17:52:58 +01:00
Lennart Poettering ecc0eab247 Merge pull request #4670 from htejun/systemd-controller-on-unified-v2
Systemd controller on unified v2
2017-02-23 16:23:02 +01:00
Lennart Poettering a4dde27d73 Merge pull request #5131 from keszybz/environment-generators
Environment generators
2017-02-21 11:11:44 +01:00
Zbigniew Jędrzejewski-Szmek 64691d2024 manager: run environment generators
Environment file generators are a lot like unit file generators, but not
exactly:

1. environment file generators are run for each manager instance, and their
   output is (or at least can be) individualized.

   The generators themselves are system-wide, the same for all users.

2. environment file generators are run sequentially, in priority order.

Thus, the lifetime of those files is tied to lifecycle of the manager
instance. Because generators are run sequentially, later generators can use or
modify the output of earlier generators.

Each generator is run with no arguments, and the whole state is stored in the
environment variables. The generator can echo a set of variable assignments to
standard output:

  VAR_A=something
  VAR_B=something else

This output is parsed, and the next and subsequent generators run with those
updated variables in the environment. After the last generator is done, the
environment that the manager itself exports is updated.

Each generator must return 0, otherwise the output is ignored.

The generators in */user-env-generator are for the user session managers,
including root, and the ones in */system-env-generator are for pid1.
2017-02-20 18:49:14 -05:00
Zbigniew Jędrzejewski-Szmek fe902fa496 core/manager: move environment serialization out to basic/env-util.c
This protocol is generally useful, we might just as well reuse it for the
env. generators.

The implementation is changed a bit: instead of making a new strv and freeing
the old one, just mutate the original. This is much faster with larger arrays,
while in fact atomicity is preserved, since we only either insert the new
entry or not, without being in inconsistent state.

v2:
- fix confusion with return value
2017-02-20 18:49:14 -05:00
Zbigniew Jędrzejewski-Szmek 71cb7d306a core/manager: fix grammar in comment 2017-02-20 18:49:14 -05:00
Zbigniew Jędrzejewski-Szmek c6e47247a7 basic/exec-util: add support for synchronous (ordered) execution
The output of processes can be gathered, and passed back to the callee.
(This commit just implements the basic functionality and tests.)

After the preparation in previous commits, the change in functionality is
relatively simple. For coding convenience, alarm is prepared *before* any
children are executed, and not before. This shouldn't matter usually, since
just forking of the children should be pretty quick. One could also argue that
this is more correct, because we will also catch the case when (for whatever
reason), forking itself is slow.

Three callback functions and three levels of serialization are used:
- from individual generator processes to the generator forker
- from the forker back to the main process
- deserialization in the main process

v2:
- replace an structure with an indexed array of callbacks
2017-02-20 18:49:13 -05:00
Zbigniew Jędrzejewski-Szmek 504afd7c34 core/manager: split out creation of serialization fd out to a helper
There is a slight change in behaviour: the user manager for root will create a
temporary file in /run/systemd, not /tmp. I don't think this matters, but
simplifies implementation.
2017-02-20 18:49:09 -05:00
Tejun Heo 415fc41cea core: simplify cg_[all_]unified()
cg_[all_]unified() test whether a specific controller or all controllers are on
the unified hierarchy.  While what's being asked is a simple binary question,
the callers must assume that the functions may fail any time, which
unnecessarily complicates their usages.  This complication is unnecessary.
Internally, the test result is cached anyway and there are only a few places
where the test actually needs to be performed.

This patch simplifies cg_[all_]unified().

* cg_[all_]unified() are updated to return bool.  If the result can't be
  decided, assertion failure is triggered.  Error handlings from their callers
  are dropped.

* cg_unified_flush() is updated to calculate the new result synchrnously and
  return whether it succeeded or not.  Places which need to flush the test
  result are updated to test for failure.  This ensures that all the following
  cg_[all_]unified() tests succeed.

* Places which expected possible cg_[all_]unified() failures are updated to
  call and test cg_unified_flush() before calling cg_[all_]unified().  This
  includes functions used while setting up mounts during boot and
  manager_setup_cgroup().
2017-02-18 17:51:13 -05:00
Zbigniew Jędrzejewski-Szmek 2b0445262a tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
  "conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.

Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:

$ size .libs/systemd{,-logind,-journald}{.old,}
   text	   data	    bss	    dec	    hex	filename
1265204	 149564	   4808	1419576	 15a938	.libs/systemd.old
1260268	 149564	   4808	1414640	 1595f0	.libs/systemd
 246805	  13852	    209	 260866	  3fb02	.libs/systemd-logind.old
 240973	  13852	    209	 255034	  3e43a	.libs/systemd-logind
 146839	   4984	     34	 151857	  25131	.libs/systemd-journald.old
 146391	   4984	     34	 151409	  24f71	.libs/systemd-journald

It is also much easier to check if a certain binary uses a certain MESSAGE_ID:

$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x

$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
2017-02-15 00:45:12 -05:00
Zbigniew Jędrzejewski-Szmek 4440b27d41 core/manager: silence gcc warning about unitialized variable
At -O3, this was printed a hundred times for various callers of
manager_add_job_by_name(). AFAICT, there is no error and `unit` is always
intialized. Nevertheless, add explicit initialization to silence the noise.

src/core/manager.c: In function 'manager_start_target':
src/core/manager.c:1413:16: warning: 'unit' may be used uninitialized in this function [-Wmaybe-uninitialized]
         return manager_add_job(m, type, unit, mode, e, ret);
                ^
src/core/manager.c:1401:15: note: 'unit' was declared here
         Unit *unit;
               ^
2017-02-12 12:56:40 -05:00
Zbigniew Jędrzejewski-Szmek 7a6a095a9e core/manager: make manager_load_unit*() functions always take output arg
We were inconsistent, manager_load_unit_prepare() would crash if _ret was ever NULL.
But none of the callers use NULL. So simplify things and require it to be non-NULL.
2017-02-12 12:40:09 -05:00
Zbigniew Jędrzejewski-Szmek 89711996b3 basic/util: move execute_directory() to separate file
It's a fairly specialized function. Let's make new files for it and the tests.
2017-02-11 18:21:06 -05:00
Lennart Poettering d53333d4b1 core: use a memfd for serialization
If we can, use a memfd for serializing state during a daemon reload or
reexec. Fall back to a file in /run/systemd or /tmp only if memfds are
not available.

See: #5016
2017-02-06 16:58:35 +01:00
Lennart Poettering ae57dad3f9 manager: refuse reloading/reexecing when /run is overly full
Let's add an extra safety check: before entering a reload/reexec, let's
verify that there's enough room in /run for it.

Fixes: #5016
2017-02-06 16:58:06 +01:00
Zbigniew Jędrzejewski-Szmek a80c157506 core: downgrade "Time has been changed" to debug (#4906)
That message is emitted by every systemd instance on every resume:
Dec 06 08:03:38 laptop systemd[1]: Time has been changed
Dec 06 08:03:38 laptop systemd[823]: Time has been changed
Dec 06 08:03:38 laptop systemd[916]: Time has been changed
Dec 07 08:00:32 laptop systemd[1]: Time has been changed
Dec 07 08:00:32 laptop systemd[823]: Time has been changed
Dec 07 08:00:32 laptop systemd[916]: Time has been changed
-- Reboot --
Dec 07 08:02:46 laptop systemd[836]: Time has been changed
Dec 07 08:02:46 laptop systemd[1]: Time has been changed
Dec 07 08:02:46 laptop systemd[926]: Time has been changed
Dec 07 19:48:12 laptop systemd[1]: Time has been changed
Dec 07 19:48:12 laptop systemd[836]: Time has been changed
Dec 07 19:48:12 laptop systemd[926]: Time has been changed
...

Fixes #4896.
2016-12-18 13:21:19 +01:00
Zbigniew Jędrzejewski-Szmek 5a1d6cb19d pid1,catalog: use a different MESSAGE_ID for user manager startup
This add a new message id for the end of user instance startup.
User manager startup is a different beast then the system startup.
Their descriptions are completely different too. Let's just separate
them.

Partially fixes #3351.

Also remove "successful" from the description, since we don't know if
the startup was successful or not.
2016-12-11 12:41:23 -05:00
Reverend Homer 8fb3f00997 tree-wide: replace all readdir cycles with FOREACH_DIRENT{,_ALL} (#4853) 2016-12-09 10:04:30 +01:00
Lennart Poettering 2e6dbc0fcd Merge pull request #4538 from fbuihuu/confirm-spawn-fixes
Confirm spawn fixes/enhancements
2016-11-18 11:08:06 +01:00
Franck Bui b0eb29449e core: add 'c' in confirmation_spawn to resume the boot process 2016-11-17 18:16:50 +01:00
Franck Bui 7d5ceb6416 core: allow to redirect confirmation messages to a different console
It's rather hard to parse the confirmation messages (enabled with
systemd.confirm_spawn=true) amongst the status messages and the kernel
ones (if enabled).

This patch gives the possibility to the user to redirect the confirmation
message to a different virtual console, either by giving its name or its path,
so those messages are separated from the other ones and easier to read.
2016-11-17 18:16:16 +01:00
Franck Bui 42bf1ae17b core: prevent the cylon when confirmation_spawn=yes (#2194)
When booting with systemd.confirm_spawn=true, the eye of cylon
animation kicks in pretty quickly so user doesn't have any chance to
answer the questions which services to start before the confirmation
message is screwed by the cylon.

This basically breaks the confirm_spawn functionality completely.

This patch prevents the cylon animation to kick in when
confirmation_spawn=yes.

Fixes: #2194
2016-11-17 18:11:21 +01:00
Lennart Poettering c5a97ed132 core: GC redundant device jobs from the run queue
In contrast to all other unit types device units when queued just track
external state, they cannot effect state changes on their own. Hence unless a
client or other job waits for them there's no reason to keep them in the job
queue. This adds a concept of GC'ing jobs of this type as soon as no client or
other job waits for them anymore.

To ensure this works correctly we need to track which clients actually
reference a job (i.e. which ones enqueued it). Unfortunately that's pretty
nasty to do for direct connections, as sd_bus_track doesn't work for
them. For now, work around this, by simply remembering in a boolean that a job
was requested by a direct connection, and reset it when we notice the direct
connection is gone. This means the GC logic works fine, except that jobs are
not immediately removed when direct connections disconnect.

In the longer term, a rework of the bus logic should fix this properly. For now
this should be good enough, as GC works for fine all cases except this one, and
thus is a clear improvement over the previous behaviour.

Fixes: #1921
2016-11-16 15:03:26 +01:00
Lennart Poettering a2d72e265a core: drop n_in_gc_queue field of Manager structure
We count the units in the GC queue with this, but actually never make use of
it, hence drop it.
2016-11-16 15:03:26 +01:00
Lennart Poettering 493fd52f1a Merge pull request #4510 from keszybz/tree-wide-cleanups
Tree wide cleanups
2016-11-03 13:59:20 -06:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Zbigniew Jędrzejewski-Szmek fb4650aa34 tree-wide: use startswith return value to avoid hardcoded offset
I think it's an antipattern to have to count the number of bytes in
the prefix by hand. We should do this automatically to avoid wasting
programmer time, and possible errors. I didn't any offsets that were
wrong, so this change is mostly to make future development easier.
2016-10-22 16:15:46 -04:00
Lukas Nykryn ae8c7939df core: use emergency_action for ctr+alt+del burst
Fixes #4306
2016-10-21 15:13:50 +02:00
Zbigniew Jędrzejewski-Szmek 3ce40911bd pid1: downgrade some rlimit warnings
Since we ignore the result anyway, downgrade errors to warning.

log_oom() will still emit an error, but that's mostly theoretical, so it
is not worth complicating the code to avoid the small inconsistency
2016-10-19 22:17:16 -04:00
Zbigniew Jędrzejewski-Szmek 6b430fdb7c tree-wide: use mfree more 2016-10-16 23:35:39 -04:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Zbigniew Jędrzejewski-Szmek 8f4d640135 core: only warn on short reads on signal fd 2016-10-07 10:05:04 -04:00
Lennart Poettering 875ca88da5 manager: tighten incoming notification message checks
Let's not accept datagrams with embedded NUL bytes. Previously we'd simply
ignore everything after the first NUL byte. But given that sending us that is
pretty ugly let's instead complain and refuse.

With this change we'll only accept messages that have exactly zero or one NUL
bytes at the very end of the datagram.
2016-10-07 12:14:33 +02:00
Lennart Poettering 045a3d5989 manager: be stricter with incomining notifications, warn properly about too large ones
Let's make the kernel let us know the full, original datagram size of the
incoming message. If it's larger than the buffer space provided by us, drop the
whole message with a warning.

Before this change the kernel would truncate the message for us to the buffer
space provided, and we'd not complain about this, and simply process the
incomplete message as far as it made sense.
2016-10-07 12:12:10 +02:00
Lennart Poettering c55ae51e77 manager: don't ever busy loop when we get a notification message we can't process
If the kernel doesn't permit us to dequeue/process an incoming notification
datagram message it's still better to stop processing the notification messages
altogether than to enter a busy loop where we keep getting notified but can't
do a thing about it.

With this change, manager_dispatch_notify_fd() behaviour is changed like this:

- if an error indicating a spurious wake-up is seen on recvmsg(), ignore it
  (EAGAIN/EINTR)

- if any other error is seen on recvmsg() propagate it, thus disabling
  processing of further wakeups

- if any error is seen on later code in the function, warn about it but do not
  propagate it, as in this cas we're not going to busy loop as the offending
  message is already dequeued.
2016-10-07 12:08:51 +02:00
Lukáš Nykrýn 24dd31c19e core: add possibility to set action for ctrl-alt-del burst (#4105)
For some certification, it should not be possible to reboot the machine through ctrl-alt-delete. Currently we suggest our customers to mask the ctrl-alt-delete target, but that is obviously not enough.

Patching the keymaps to disable that is really not a way to go for them, because the settings need to be easily checked by some SCAP tools.
2016-10-06 21:08:21 -04:00
Zbigniew Jędrzejewski-Szmek a63ee40751 core: do not try to create /run/systemd/transient in test mode
This prevented systemd-analyze from unprivileged operation on older systemd
installations, which should be possible.
Also, we shouldn't touch the file system in test mode even if we can.
2016-10-01 22:53:17 +02:00
Zbigniew Jędrzejewski-Szmek 5fd2c135f1 core: update warning message
"closing all" might suggest that _all_ fds received with the notification message
will be closed. Reword the message to clarify that only the "unused" ones will be
closed.
2016-10-01 11:01:31 +02:00
Zbigniew Jędrzejewski-Szmek c4bee3c40e core: get rid of unneeded state variable
No functional change.
2016-10-01 11:01:31 +02:00
Zbigniew Jędrzejewski-Szmek a86b76753d pid1: more informative error message for ignored notifications
It's probably easier to diagnose a bad notification message if the
contents are printed. But still, do anything only if debugging is on.
2016-09-29 22:57:57 +02:00
Zbigniew Jędrzejewski-Szmek 8523bf7dd5 pid1: process zero-length notification messages again
This undoes 531ac2b234. I acked that patch without looking at the code
carefully enough. There are two problems:
- we want to process the fds anyway
- in principle empty notification messages are valid, and we should
  process them as usual, including logging using log_unit_debug().
2016-09-29 22:57:57 +02:00