Commit graph

757 commits

Author SHA1 Message Date
Benjamin Berg cccf570355 core: Move environment generator path lookup into path-lookup.c 2020-03-04 11:24:33 +01:00
Zbigniew Jędrzejewski-Szmek 52c222db11
Merge pull request #14992 from keszybz/syslog-address-length-fix
Syslog address length fix
2020-03-02 21:31:24 +01:00
Zbigniew Jędrzejewski-Szmek f36a9d5909 tree-wide: use the return value from sockaddr_un_set_path()
It fully initializes the address structure, so no need for pre-initialization,
and also returns the length of the address, so no need to recalculate using
SOCKADDR_UN_LEN().

socklen_t is unsigned, so let's not use an int for it. (It doesn't matter, but
seems cleaner and more portable to not assume anything about the type.)
2020-03-02 15:55:44 +01:00
Zbigniew Jędrzejewski-Szmek 0d066dd1a4 pid1: add new mode systemd.show-status=error and use it when 'quiet' is passed
systemd.show-status=error is useful for the case where people care about errors
only.

If people want to have a quiet boot, they most likely don't want to see all
status output even if there is a delay in boot, so make "quiet" imply
systemd.show-status=error instead of systemd.show-status=auto.

Fixes #14976.
2020-03-01 11:48:23 +01:00
Zbigniew Jędrzejewski-Szmek 5bcf34ebf3 pid1: when showing error status, do not switch to status=temporary
We would flip to status=temporary mode on the first error, and then switch back
to status=auto after the initial transaction was done. This isn't very useful,
because usually all the messages about successfully started units and not
related to the original failure. In fact, all those messages most likely cause
the information about the prime error to scroll off screen. And if the user
requested quiet boot, there's no reason to think that they care about those
success messages.

Also, when logging about dependency cycles, treat this similarly to a unit
error and show the message even if the status is "soft disabled" (before we
wouldn't show it in that case).
2020-03-01 11:42:42 +01:00
Zbigniew Jędrzejewski-Szmek 1b4154a891 pid1: make cylon timeout significantly bigger when not showing any messages
When we are booting with show-status=on, normally new status updates happen a
few times per second. Thus, it is reasonable to start showing the cylon eye
after 5 s, because that means a significant delay has happened. When we are
running with show-status=off or show-status=auto (and no error had occured),
the user is expecting maybe 15 to 90 seconds with no output (because that's
usually how long the whole boot takes). So we shouldn't bother the user with
information about a few seconds of delay. Let's make the timeout 25s if we are
not showing any messages.

Conversly, when we are outputting status messages, we can show the cylon eye
with a shorter delay, now that we removed the connection to enablement status.
Let's make this 2s, so users get feedback about delays more quickly.
2020-03-01 11:42:35 +01:00
Zbigniew Jędrzejewski-Szmek ef15d3e1ab pid1: touch the /run/systemd/show-status just once
We know if we created the file before, no need to repeat the operation. The
state in /run should always match our internal state. Since we call
manager_set_show_status() quite often internally, this saves quite a few
pointless syscalls.
2020-03-01 11:42:26 +01:00
Zbigniew Jędrzejewski-Szmek 7365a29670 pid1: when printing status message status, give reason 2020-03-01 11:42:19 +01:00
Zbigniew Jędrzejewski-Szmek 5ca02bfc39 core: fix message about show status state
We would say "Enabling" also for SHOW_STATUS_AUTO, which is actually
"soft off". So just print the exact state to make things easier to understand.
Also add a helper function to avoid repeating the enum value list.

For #14814.
2020-03-01 11:42:12 +01:00
Lennart Poettering 96462ae998 core: show the UID we cannot parse 2020-01-21 11:51:26 +01:00
Lennart Poettering 19d22d433d core: add user/group resolution varlink interface to PID 1 2020-01-15 15:28:55 +01:00
Lennart Poettering fc67a943d9 core: drop initial ListNames() bus call from PID 1
Previously, when first connecting to the bus after connecting to it we'd
issue a ListNames() bus call to the driver to figure out which bus names
are currently active. This information was then used to initialize the
initial state for services that use BusName=.

This change removes the whole code for this and replaces it with
something vastly simpler.

First of all, the ListNames() call was issues synchronosuly, which meant
if dbus was for some reason synchronously calling into PID1 for some
reason we'd deadlock. As it turns out there's now a good chance it does:
the nss-systemd userdb hookup means that any user dbus-daemon resolves
might result in a varlink call into PID 1, and dbus resolves quite a lot
of users while parsing its policy. My original goal was to fix this
deadlock.

But as it turns out we don't need the ListNames() call at all anymore,
since #12957 has been merged. That PR was supposed to fix a race where
asynchronous installation of bus matches would cause us missing the
initial owner of a bus name when a service is first started. It fixed it
(correctly) by enquiring with GetOwnerName() who currently owns the
name, right after installing the match. But this means whenever we start watching a bus name we anyway
issue a GetOwnerName() for it, and that means also when first connecting
to the bus we don't need to issue ListNames() anymore since that just
tells us the same info: which names are currently owned.

hence, let's drop ListNames() and instead make better use of the
GetOwnerName() result: if it failed the name is not owned.

Also, while we are at it, let's simplify the unit's owner_name_changed()
callback(): let's drop the "old_owner" argument. We never used that
besides logging, and it's hard to synthesize from just the return of a
GetOwnerName(), hence don't bother.
2020-01-06 15:21:47 +01:00
Anita Zhang 2f8c48b605 core,journal: export user units' InvocationID and use as _SYSTEMD_INVOCATION_ID
Write a user unit's invocation ID to /run/user/<uid>/systemd/units/ similar
to how a system unit's invocation ID is written to /run/systemd/units/.

This lets the journal read and add a user unit's invocation ID to the
_SYSTEMD_INVOCATION_ID field of logs instead of the user manager's
invocation ID.

Fixes #12474
2019-12-19 17:42:17 -08:00
Zbigniew Jędrzejewski-Szmek 3a0f06c41a core: make TasksMax a partially dynamic property
TasksMax= and DefaultTasksMax= can be specified as percentages. We don't
actually document of what the percentage is relative to, but the implementation
uses the smallest of /proc/sys/kernel/pid_max, /proc/sys/kernel/threads-max,
and /sys/fs/cgroup/pids.max (when present). When the value is a percentage,
we immediately convert it to an absolute value. If the limit later changes
(which can happen e.g. when systemd-sysctl runs), the absolute value becomes
outdated.

So let's store either the percentage or absolute value, whatever was specified,
and only convert to an absolute value when the value is used. For example, when
starting a unit, the absolute value will be calculated when the cgroup for
the unit is created.

Fixes #13419.
2019-11-14 18:41:54 +01:00
Zbigniew Jędrzejewski-Szmek 754499fab2
Merge pull request #13904 from keur/job_mode_triggering
Job mode triggering
2019-11-07 08:36:26 +01:00
Kevin Kuehler 1f0f9f21c1 core: Add triggering job mode
When used with systemctl stop, follows TRIGGERED_BY dependencies and
adds them to the same transaction.

Fixes: #3043
2019-11-05 11:17:38 -08:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 455fa9610c tree-wide: drop string.h when string-util.h or friends are included 2019-11-04 00:30:32 +09:00
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Zbigniew Jędrzejewski-Szmek 0e7f5ad9d3 Move PLYMOUTH_SOCKET define to def.h and nuke plymouth-util.h
Let's not have a file with a single define.
2019-10-24 11:48:08 +02:00
Franck Bui 27c3112dcb fs-util: introduce inotify_add_watch_and_warn() helper
The default message for ENOSPC is very misleading: it says that the disk is
filled, but in fact the inotify watch limit is the problem.

So let's introduce and use a wrapper that simply calls inotify_add_watch(2) and
which fixes the error message up in case ENOSPC is returned.
2019-10-05 08:08:20 +02:00
Zbigniew Jędrzejewski-Szmek 6123dfaa72 pid1: disable printk ratelimit in early boot
We have the problem that many early boot or late shutdown issues are harder
to solve than they could be because we have no logs. When journald is not
running, messages are redirected to /dev/kmsg. It is also the time when many
things happen in a rapid succession, so we tend to hit the kernel printk
ratelimit fairly reliably. The end result is that we get no logs from the time
where they would be most useful. Thus let's disable the kernels ratelimit.

Once the system is up and running, the ratelimit is not a problem. But during
normal runtime, things also log to journald, and not to /dev/kmsg, so the
ratelimit is not useful. Hence, there doesn't seem to be much point in trying
to restore the ratelimit after boot is finished and journald is up and running.

See kernel's commit 750afe7babd117daabebf4855da18e4418ea845e for the
description of the kenrel interface. Our setting has lower precedence than
explicit configuration on the kenrel command line.
2019-09-20 16:05:53 +02:00
Zbigniew Jędrzejewski-Szmek 8c227e7f2b Drop RATELIMIT macros
Using plain structure initialization is both shorter _and_ more clearer.
We get type safety for free.
2019-09-20 16:05:53 +02:00
Zbigniew Jędrzejewski-Szmek db11487d10 manager: put bin before sbin for user instances
Traditionally, user logins had a $PATH in which /bin was before /sbin, while
root logins had a $PATH with /sbin first. This allows the tricks that
consolehelper is doing to work. But even if we ignore consolehelper, having the
path in this order might have been used by admins for other purposes, and
keeping the order in user sessions will make it easier the adoption of systemd
user sessions a bit easier.

Fixes #733.
https://bugzilla.redhat.com/show_bug.cgi?id=1744059

OOM handling in manager_default_environment wasn't really correct.
Now the (theorertical) malloc failure in strv_new() is handled.

Please note that this has no effect on:
- systems with merged /bin-/sbin (e.g. arch)

- when there are no binaries that differ between the two locations.

  E.g. on my F30 laptop there is exactly one program that is affected:
  /usr/bin/setup -> consolehelper.

  There is less and less stuff that relies on consolehelper, but there's still
  some.

So for "clean" systems this makes no difference, but helps with legacy setups.

$ dnf repoquery --releasever=31 --qf %{name} --whatrequires usermode
anaconda-live
audit-viewer
beesu
chkrootkit
driftnet
drobo-utils-gui
hddtemp
mate-system-log
mock
pure-ftpd
setuptool
subscription-manager
system-config-httpd
system-config-rootpassword
system-switch-java
system-switch-mail
usermode-gtk
vpnc-consoleuser
wifi-radar
xawtv
2019-08-27 18:24:44 +02:00
Zbigniew Jędrzejewski-Szmek 581fef8d56 core: stop removing non-existent and duplicate lookup paths
When we would iterate over the lookup paths for each unit, making the list as
short as possible was important for performance. With the current cache, it
doesn't matter much. Two classes of paths were being removed:
- paths which don't exist in the filesystem
- paths which symlink to a path earlier in the search list
Both of those points cause problems with the caching code:
- if a user creates a directory that didn't exist before and puts units there,
  now we will notice the new mtime an properly load the unit. When the path
  was removed from list, we wouldn't.
- we now properly detect whether a unit path is on the path or not.
  Before, if e.g. /lib/systemd/system, /usr/lib/systemd/systemd were both on
  the path, and /lib was a symlink to /usr/lib, the second directory would be
  pruned from the path. Then, the code would think that a symlink
  /etc/systemd/system/foo.service→/lib/systemd/system/foo.service is an alias,
  but /etc/systemd/system/foo.service→/usr/lib/systemd/system/foo.service would
  be considered a link (in the systemctl link sense).

Removing the pruning has a slight negative performance impact in case of
usr-merge systems which have systemd compiled with non-usr-merge paths.
Non-usr-merge systems are deprecated, and this impact should be very small, so
I think it's OK. If it turns out to be an issue, the loop in function that
builds the cache could be improved to skip over "duplicate" directories with
same logic that the cache pruning did before. I didn't want to add this,
becuase it complicates the code to improve a corner case.

Fixes #13272.
2019-08-27 18:12:20 +02:00
Lennart Poettering ea7584329b manager: simplify manager_get_confirm_spawn() a bit
Let's use our usual way of storing error codes.

Let's remove a redundant temporary variable we never change
2019-08-20 17:34:19 +02:00
Lennart Poettering 5756bff6f1
Merge pull request #13119 from keszybz/unit-loading-2
Rework unit loading to take into account all aliases
2019-07-30 17:55:37 +02:00
Zbigniew Jędrzejewski-Szmek 91e0ee5f16 pid1: drop unit caches only based on mtime
v2:
- do not watch mtime of transient and generated dirs

  We'd reload the map after every transient unit we created, which we don't
  need to do, since we create those units ourselves and know their fragment
  path.
2019-07-30 14:01:46 +02:00
Zbigniew Jędrzejewski-Szmek e8630e6952 pid1: use a cache for all unit aliases
This reworks how we load units from disk. Instead of chasing symlinks every
time we are asked to load a unit by name, we slurp all symlinks from disk
and build two hashmaps:
1. from unit name to either alias target, or fragment on disk
   (if an alias, we put just the target name in the hashmap, if a fragment
    we put an absolute path, so we can distinguish both).
2. from a unit name to all aliases

Reading all this data can be pretty costly (40 ms) on my machine, so we keep it
around for reuse.

The advantage is that we can reliably know what all the aliases of a given unit
are. This means we can reliably load dropins under all names. This fixes #11972.
2019-07-30 14:01:46 +02:00
Zbigniew Jędrzejewski-Szmek a505166845
Merge pull request #13096 from keszybz/unit-loading
Preparatory work for the unit loading rework
2019-07-19 21:47:10 +02:00
Zbigniew Jędrzejewski-Szmek 5cfa33e0bc Create src/shared/unit-file.[ch] for unit-file related ops
So far we put such functinos in install.[ch], but that is tied too closely
to enable/disable. Let's start moving things to a place with a better name.
2019-07-19 16:51:14 +02:00
Michael Olbrich da8e178296 job: make the run queue order deterministic
Jobs are added to the run queue in random order. This happens because most
jobs are added by iterating over the transaction or dependency hash maps.

As a result, jobs that can be executed at the same time are started in a
different order each time.
On small embedded devices this can cause a measurable jitter for the point
in time when a job starts (~100ms jitter for 10 units that are started in
random order).
This results is a similar jitter for the boot time. This is undesirable in
general and make optimizing the boot time a lot harder.
Also, jobs that should have a higher priority because the unit has a higher
CPU weight might get executed later than others.

Fix this by turning the job run_queue into a Prioq and sort by the
following criteria (use the next if the values are equal):
- CPU weight
- nice level
- unit type
- unit name

The last one is just there for deterministic sorting to avoid any jitter.
2019-07-18 10:28:39 +02:00
Lennart Poettering b910cc72c0 tree-wide: get rid of strappend()
It's a special case of strjoin(), so no need to keep both. In particular
as typing strjoin() is even shoert than strappend().
2019-07-12 14:31:12 +09:00
Zbigniew Jędrzejewski-Szmek 2a8f53c67b Use unit->id instead of description in messages
v2:
- rename unit_identifier to unit_status_string
2019-07-10 13:35:26 +02:00
Zbigniew Jędrzejewski-Szmek 36cf45078c Add config and kernel commandline option to use short identifiers
No functional change, just docs and configuration and parsing.

v2:
- change ShortIdentifiers=yes|no to StatusUnitFormat=name|description.
2019-07-10 13:35:26 +02:00
Zbigniew Jędrzejewski-Szmek 3454129571 pid1: use monotonic timestamp in dump if realtime is not available
$ systemd-analyze dump | head -3
Timestamp firmware: (null)
Timestamp loader: (null)
Timestamp kernel: Mon 2019-07-01 17:21:02 CEST

Since this is a debugging interface, it is OK to change the output format.
The user can infer what "Timestamp firmware: 123.456ms" means.
2019-07-04 22:52:25 +02:00
Zbigniew Jędrzejewski-Szmek 1f65fd4926 basic/time-util: add helper function to check if timestamp is set
No functional change.
2019-07-04 19:12:47 +02:00
Zbigniew Jędrzejewski-Szmek b1d5246d29 core: do not enumerate units in MANAGER_TEST_RUN_MINIMAL mode
In this mode we are not supposed to "interact with the environment", so loading
all units and printing warnings about syntax errors and /var/run usage seems
inappropriate.
2019-06-26 16:25:36 +02:00
Yu Watanabe 657ee2d82b tree-wide: replace strjoin() with path_join() 2019-06-21 03:26:16 +09:00
Lennart Poettering 3aa317943c
Merge pull request #12626 from keszybz/oompolicy-check
Make the check if oom-killer fired more robust
2019-05-21 18:29:01 +02:00
Zbigniew Jędrzejewski-Szmek 2ba6ae6b2b core: do an extra check if oom was triggered when handling sigchild
Should fix #12425.
2019-05-20 16:37:06 +02:00
Topi Miettinen 0a51b45ce4 small fixes: make get_process_state() static and fix typo 2019-05-20 16:23:22 +02:00
Ben Boeckel 5238e95759 codespell: fix spelling errors 2019-04-29 16:47:18 +02:00
Lennart Poettering 3661dc349e
Merge pull request #12217 from keszybz/unlocked-operations
Refactor how we do unlocked file operations
2019-04-12 13:51:53 +02:00
Zbigniew Jędrzejewski-Szmek 2fe21124a6 Add open_memstream_unlocked() wrapper 2019-04-12 11:44:57 +02:00
Zbigniew Jędrzejewski-Szmek 17e4b07088 core: vodify one more call to mkdir
CID #1400460.
2019-04-12 09:05:02 +02:00
Lennart Poettering afcfaa695c core: implement OOMPolicy= and watch cgroups for OOM killings
This adds a new per-service OOMPolicy= (along with a global
DefaultOOMPolicy=) that controls what to do if a process of the service
is killed by the kernel's OOM killer. It has three different values:
"continue" (old behaviour), "stop" (terminate the service), "kill" (let
the kernel kill all the service's processes).

On top of that, track OOM killer events per unit: generate a per-unit
structured, recognizable log message when we see an OOM killer event,
and put the service in a failure state if an OOM killer event was seen
and the selected policy was not "continue". A new "result" is defined
for this case: "oom-kill".

All of this relies on new cgroupv2 kernel functionality: the
"memory.events" notification interface and the "memory.oom.group"
attribute (which makes the kernel kill all cgroup processes
automatically).
2019-04-09 11:17:58 +02:00
Lennart Poettering cbe83389d5 core: rearrange cgroup empty events a bit
So far the priorities for cgroup empty event handling were pretty weird.
The raw events (on cgroupsv2 from inotify, on cgroupsv1 from the agent
dgram socket) where scheduled at a lower priority than the cgroup empty
queue dispatcher. Let's swap that and ensure that we can coalesce events
more agressively: let's process the raw events at higher priority than
the cgroup empty event (which remains at the same prio).
2019-04-09 11:17:57 +02:00
Zbigniew Jędrzejewski-Szmek 237ebf61e2
Merge pull request #12013 from yuwata/fix-switchroot-11997
core: on switching root do not emit device state change based on enumeration results
2019-04-02 16:06:07 +02:00
Zbigniew Jędrzejewski-Szmek c6335c3b51
Merge pull request #12115 from poettering/verbose-job-enqueue
add "systemctl --show-transaction start" as a more verbose "systemctl start" that shows enqueued jobs
2019-03-28 11:04:26 +01:00
Zbigniew Jędrzejewski-Szmek 330b8fb379 journalctl: voidify mkdir_p() call and unify two similar code paths
Let's unify the two similar code paths to watch /run/systemd/journal.
The code in manager.c is similar, but it uses mkdir_p_label(), and unifying
that would be too much trouble, so let's just adjust the error messages to
be the same.

CID #1400224.
2019-03-27 13:01:44 +01:00
Lennart Poettering 50cbaba4fe core: add new API for enqueing a job with returning the transaction data 2019-03-27 12:37:37 +01:00
Lennart Poettering b82f71c7ff tree-wide: constify a few static string tables 2019-03-25 14:04:34 +01:00
Franck Bui f75f613d25 core: reduce the number of stalled PIDs from the watched processes list when possible
Some PIDs can remain in the watched list even though their processes have
exited since a long time. It can easily happen if the main process of a forking
service manages to spawn a child before the control process exits for example.

However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stalled PIDs from
the watched process list.

If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
2019-03-20 10:51:49 +01:00
Lennart Poettering f60a028a4e tree-wide: use ERRNO_IS_DISCONNECT() at more places 2019-03-19 15:41:30 +01:00
Yu Watanabe c6e892bc0e core: add Manager::honor_device_enumeration flag
When system manager is started first time or after switching root,
then the udev's device tag data do not exist yet.
So, let's not honor the enumeration results.

Fixes #11997.
2019-03-15 19:47:43 +09:00
Lennart Poettering e56f9ffe51 util: split out plymouth related stuff 2019-03-14 13:25:51 +01:00
Lennart Poettering 0a9707187b util: split out memcmp()/memset() related calls into memory-util.[ch]
Just some source rearranging.
2019-03-13 12:16:43 +01:00
Matthias Klumpp 4b05f0c9d9 core: Allow to configure execute_directories execution behavior
This adds a new bitfield to `execute_directories()` which allows to
configure whether to ignore non-zero exit statuses of binaries run and
whether to allow parallel execution of commands.
In case errors are not ignored, the exit status of the failed script
will now be returned for error reposrting purposes or other further
future use.
2019-02-18 16:16:02 +01:00
Lennart Poettering dcf3c3c3d9 core: export $PIDFILE env var for services, derived from PIDFile= 2019-02-15 11:32:19 +01:00
Evgeny Vereshchagin 7334ade4a7 core: free lines after reading them
Closes https://github.com/systemd/systemd/issues/11251.
2018-12-28 15:36:00 +09:00
Lennart Poettering 4a53080be6 core: don't track jobs-finishing-during-reload explicitly
Memory management is borked for this, and moreover this is unnecessary
since f0831ed2a0, i.e. since coldplug() and catchup() are two different
concepts: the former restoring the state from before a reload, the
latter than adjusting it again to the actual status in effect after the
reload.

Fixes: #10716
Mostly reverts: #8803
2018-12-12 11:15:06 +01:00
Zbigniew Jędrzejewski-Szmek 60473f0c23 pid1: fix (harmless) off-by-one in PATH_MAX comparison
PATH_MAX is supposed to include the terminating NUL byte. But we already
check that there is no NUL byte in the specified path. Hence the maximum
length we can expect is PATH_MAX - 1.

This doesn't change much, but makes this use of PATH_MAX consistent with the
rest of the codebase.
2018-12-10 11:57:26 +01:00
Zbigniew Jędrzejewski-Szmek 8b4e51a60e
Merge pull request #10797 from poettering/run-generator
add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch
2018-11-28 22:40:55 +01:00
Lennart Poettering 7af67e9a8b core: allow to set exit status when using SuccessAction=/FailureAction=exit in units
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate in when
SuccessAction=exit or FailureAction=exit is used.

When not specified let's also propagate the exit status of the main
process we fork off for the unit.
2018-11-27 09:44:40 +01:00
Lennart Poettering a885727a64 show-status: fold two bool flags function arguments into a flags
parameter
2018-11-26 18:24:12 +01:00
Lennart Poettering e3b8d0637d core: run env generators with non-zero umask
For PID 1 we adjust the umask to 0, but generators should not run that
way, given that they might be implemented as shell scripts and such.
Let's hence explicitly adjust the umask for them.

We already do this for unit generators. Let's do this for env
generators, too.
2018-11-20 23:35:04 +01:00
Lennart Poettering 79a224c460 main: when reloading PID 1 let's reset the default environment
Otherwise we keep collecting stuff from env generators, and we really
shouldn't.

This was working properly on reexec but not on reload, as for reexec we
would always start fresh, but for reload would reuse the Manager object
and hence its default environment set.

Fixes: #10671
2018-11-19 13:01:19 +01:00
Lennart Poettering 2fbbbf9a5f manager: log on two OOM occasions 2018-11-19 12:22:56 +01:00
Lennart Poettering ae3cc6ec0d
Merge pull request #10770 from poettering/unit-done-log
improvements to structure log events from PID1
2018-11-16 17:54:19 +01:00
Lennart Poettering f20db19954 cocci: simplify some if checks 2018-11-16 16:05:29 +01:00
Lennart Poettering 6e64994d69 core: make unit_start() return a distinguishable error code in case conditions didn't hold
Ideally we'd even propagate this all the way to the client, by having a
separate JobType enum value for this. But it's hard to add this without
breaking compat, hence for now let's at least internally propagate this
case differently from the case "already on it".

This is then used to call job_finish_and_invalidate() slightly
differently, with the already= parameter false, as in the failed
condition case no message was likely produced so far.
2018-11-16 15:22:48 +01:00
Lennart Poettering b39cfe21b4 core: show error code even when debug logging 2018-11-16 12:22:11 +01:00
Lennart Poettering 3dafa6bc76 core: drop dbus queue recursion check
We don't dispatch the queue recursively anymore, hence let's simplify
things a bit.

As pointed out by @fbuihuu:

https://github.com/systemd/systemd/pull/10763#discussion_r233209550
2018-11-14 20:09:11 +01:00
Lennart Poettering b8d381c477 core: make sure we don't throttle change signal generator when a reload is pending
Fixes: #10627
2018-11-13 12:51:48 +01:00
Lennart Poettering 209de5256b core: rename queued_message → pending_reload_message
This field is only used for pending Reload() replies, hence let's rename
it to be more descriptive and precise.

No change in behaviour.
2018-11-13 11:59:06 +01:00
Lennart Poettering 1ad6e8b302 core: split environment block mantained by PID 1's Manager object in two
This splits the "environment" field of Manager into two:
transient_environment and client_environment. The former is generated
from configuration file, kernel cmdline, environment generators. The
latter is the one the user can control with "systemctl set-environment"
and similar.

Both sets are merged transparently whenever needed. Separating the two
sets has the benefit that we can safely flush out the former while
keeping the latter during daemon reload cycles, so that env var settings
from env generators or configuration files do not accumulate, but
dynamic API changes are kept around.

Note that this change is not entirely transparent to users: if the user
first uses "set-environment" to override a transient variable, and then
uses "unset-environment" to unset it again things will revert to the
original transient variable now, while previously the variable was fully
removed. This change in behaviour should not matter too much though I
figure.

Fixes: #9972
2018-10-31 18:00:53 +01:00
Franck Bui 651b3b6c92 core: skip unit deserialization and move to the next one when unit_deserialize() fails
If unit_deserialize() fails (because one read line is overly long), it returns
an error and we would have assumed that the next read would point to the next
unit to deserialize.

But instead unit_deserialize() can leave the file offset in the middle of a
line.

Therefore we need to ignore and skip the current unit in this case too.

While at it, move unit deserialization in a dedicated functions. That should
make the code easier to read.
2018-10-30 14:52:13 +01:00
Lennart Poettering d68c645bd3 core: rework serialization
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.

In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.

While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
2018-10-26 10:52:41 +02:00
Lennart Poettering cdd620e346 core: make manager_serialize() a bit easier to read by adding predicate function
The predicate function manager_timestamp_shall_serialize() simply says
whether to serialize or not serialize a timestamp, and should make
things a bit easier to read.
2018-10-26 10:40:01 +02:00
Lennart Poettering c475e57d67 core: strjoina() in a loop is never OK
Let's use plain strjoin() instead.
2018-10-26 10:40:01 +02:00
Lennart Poettering 8948b3415d core: when deserializing state always use read_line(…, LONG_LINE_MAX, …)
This should be much better than fgets(), as we can read substantially
longer lines and overly long lines result in proper errors.

Fixes a vulnerability discovered by Jann Horn at Google.

CVE-2018-15686
LP: #1796402
https://bugzilla.redhat.com/show_bug.cgi?id=1639071
2018-10-26 10:40:01 +02:00
Lennart Poettering 2ff48e981e tree-wide: introduce setsockopt_int() helper and make use of it everywhere
As suggested by @heftig:

6d5e65f645 (commitcomment-30938667)
2018-10-18 19:50:29 +02:00
Lennart Poettering a42984dbc7
Merge pull request #10428 from keszybz/failure-actions
Implement manager status changes using SuccessAction=
2018-10-17 21:29:10 +02:00
Zbigniew Jędrzejewski-Szmek c7adcb1af9 core: do not "warn" about mundane emergency actions
For example in a container we'd log:
Oct 17 17:01:10 rawhide systemd[1]: Started Power-Off.
Oct 17 17:01:10 rawhide systemd[1]: Forcibly powering off: unit succeeded
Oct 17 17:01:10 rawhide systemd[1]: Reached target Power-Off.
Oct 17 17:01:10 rawhide systemd[1]: Shutting down.
and on the console we'd write (in red)
[  !!  ] Forcibly powering off: unit succeeded

This is not useful in any way, and the fact that we're calling an "emergency action"
is an internal implementation detail. Let's log about c-a-d and the watchdog actions
only.
2018-10-17 19:32:09 +02:00
Zbigniew Jędrzejewski-Szmek 1710d4beff core: limit service-watchdogs=no to actual "watchdog" commands
The setting is now only looked at when considering an action for a job timeout
or unit start limit. It is ignored for ctrl-alt-del, SuccessAction, SuccessFailure.

v2: turn the parameter into a flag field
v3: rename Options to Flags
2018-10-17 19:31:50 +02:00
Lennart Poettering fbda85b078 tree-wide: use sockaddr_un_unlink() at two more places where appropriate 2018-10-15 19:44:34 +02:00
Lennart Poettering 6d5e65f645 tree-wide: add a single version of "static const int one = 1"
All over the place we define local variables for the various sockopts
that take a bool-like "int" value. Sometimes they are const, sometimes
static, sometimes both, sometimes neither.

Let's clean this up, introduce a common const variable "const_int_one"
(as well as one matching "const_int_zero") and use it everywhere, all
acorss the codebase.
2018-10-15 19:40:51 +02:00
Lennart Poettering 15a3e96f92 tree-wide: port various users over to sockaddr_un_set_path()
CID 1396140
CID 1396141
2018-10-15 19:40:51 +02:00
Lennart Poettering 0b7e8b322d tree-wide: make use of TAKE_FD() at two more places 2018-10-15 19:35:00 +02:00
Lennart Poettering 155b687669 tree-wide: use sockaddr_un_unlink() whereever appropriate
Let's port everything over.
2018-10-15 19:35:00 +02:00
Yu Watanabe 6cc4d59b35
Merge pull request #10392 from poettering/manager-no-inotify-fail
make sure /etc/localtime issues don't cause systemd to fail boot
2018-10-15 07:15:12 +09:00
Yu Watanabe cb16b085c0 core: set _unused_ attribute to 'reloading'
Follow-up for 4df7d537c8.
2018-10-13 23:50:04 +09:00
Lennart Poettering ea5c5f680d core: ensure it's not fatal if we cannot watch /etc/localtime
See: #9602
2018-10-13 15:13:07 +02:00
Lennart Poettering 0cb21d8c60 core: add debug logging if we cant watch /etc/localtime itself 2018-10-13 15:12:58 +02:00
Lennart Poettering a6ee956610
Merge pull request #10356 from dtardon/covscan
assorted coverity/clang fixes
2018-10-12 18:43:04 +02:00
David Tardon f369f47c26 be consistent about sun_path length
Most places use the whole buffer for name, without leaving extra space
for the trailing NUL.
2018-10-12 12:38:49 +02:00
Zbigniew Jędrzejewski-Szmek 05067c3c1f manager: simplify error handling in manager_deserialize()
If a memory error occurred, we would still go through the path which sets the
error on ferror(). It is unlikely that ferror() returns true, but it's seems
cleaner to just propagate the error we already have.

The handling of fgets() returning NULL is also simplified: according to the man
page, it returns NULL only on EOF or error. So if feof() returns true, I don't
think we should call ferror() again.

While at it, let's set errno to 0 and check that it is set before returning it
as an error. The man pages for fgets() and feof() do not say anything about
setting errno.
2018-10-11 14:34:02 +02:00
Zbigniew Jędrzejewski-Szmek 4df7d537c8 manager: also use the reloading "cleanup" function in manager_startup
Here the behaviour is nominally changed, because we will decrease the
counter on error. But the only caller quits the program if error occurs,
so this makes no practical difference.
2018-10-11 14:34:00 +02:00
Zbigniew Jędrzejewski-Szmek d147e2b66b manager: use the _cleanup_ mechanism to do n_reloading counter handling
No functional change.
2018-10-11 14:33:22 +02:00