There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
This adds support for naming file descriptors passed using socket
activation. The names are passed in a new $LISTEN_FDNAMES= environment
variable, that matches the existign $LISTEN_FDS= one and contains a
colon-separated list of names.
This also adds support for naming fds submitted to the per-service fd
store using FDNAME= in the sd_notify() message.
This also adds a new FileDescriptorName= setting for socket unit files
to set the name for fds created by socket units.
This also adds a new call sd_listen_fds_with_names(), that is similar to
sd_listen_fds(), but also returns the names of the fds.
systemd-activate gained the new --fdname= switch to specify a name for
testing socket activation.
This is based on #1247 by Maciej Wereski.
Fixes#1247.
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
Add a new config directive called NetClass= to CGroup enabled units.
Allowed values are positive numbers for fix assignments and "auto" for
picking a free value automatically, for which we need to keep track of
dynamically assigned net class IDs of units. Introduce a hash table for
this, and also record the last ID that was given out, so the allocator
can start its search for the next 'hole' from there. This could
eventually be optimized with something like an irb.
The class IDs up to 65536 are considered reserved and won't be
assigned automatically by systemd. This barrier can be made a config
directive in the future.
Values set in unit files are stored in the CGroupContext of the
unit and considered read-only. The actually assigned number (which
may have been chosen dynamically) is stored in the unit itself and
is guaranteed to remain stable as long as the unit is active.
In the CGroup controller, set the configured CGroup net class to
net_cls.classid. Multiple unit may share the same net class ID,
and those which do are linked together.
The current implementation directly monitor /proc/self/mountinfo and
/run/mount/utab files. It's really not optimal because utab file is
private libmount stuff without any official guaranteed semantic.
The libmount since v2.26 provides API to monitor mount kernel &
userspace changes and since v2.27 the monitor is usable for
non-root users too.
This patch replaces the current implementation with libmount based
solution.
Signed-off-by: Karel Zak <kzak@redhat.com>
Let's make sure that we follow the same codepaths when adjusting a
cgroup property via the dbus SetProperty() call, and when we execute the
StartupCPUShares= effect.
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
Let's move the actual cgroup part of it into a new separate function
manager_get_unit_by_pid_cgroup(), and then make
manager_get_unit_by_pid() just a wrapper that also checks the two pid
hashmaps.
Then, let's make sure the various calls that want to deliver events to
the owners of a PID check both hashmaps and the cgroup and deliver the
event to *each* of them. OTOH make sure bus calls like GetUnitByPID()
continue to check the PID hashmaps first and the cgroup only as
fallback.
This adds a new PID_TO_PTR() macro, plus PTR_TO_PID() and makes use of
it wherever we maintain processes in a hash table. Previously we
sometimes used LONG_TO_PTR() and other times ULONG_TO_PTR() for that,
hence let's make this more explicit and clean up things.
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.
A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).
It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.
The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.
This patch also removes cg_delete() which is unused now.
On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.
This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.
This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.
The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.
To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.
This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.
When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
Currently, PID1 installs an unfiltered NameOwnerChanged signal match, and
dispatches the signals itself. This does not scale, as right now, PID1
wakes up every time a bus client connects.
To fix this, install individual matches once they are requested by
unit_watch_bus_name(), and remove the watches again through their slot in
unit_unwatch_bus_name().
If the bus is not available during unit_watch_bus_name(), just store
name in the 'watch_bus' hashmap, and let bus_setup_api() do the installing
later.
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
./configure --enable/disable-kdbus can be used to set the default
behavior regarding kdbus.
If no kdbus kernel support is available, dbus-dameon will be used.
With --enable-kdbus, the kernel command line option "kdbus=0" can
be used to disable kdbus.
With --disable-kdbus, the kernel command line option "kdbus=1" is
required to enable kdbus support.
Previously, if a service A depended on a service B via Requires=, and A
was not running and B restarted this would trigger a start of A as well,
since the restart was propagated as restart independently of the state
of A.
This patch ensures that a restart of B would be propagated as a
try-restart to A, thus not changing its state if it isn't up.
http://lists.freedesktop.org/archives/systemd-devel/2015-May/032061.html
If systemd is built with GCC address sanitizer or leak sanitizer
the following memory leak ocurs:
May 12 02:02:46 linux.site systemd[326]: =================================================================
May 12 02:02:46 linux.site systemd[326]: ==326==ERROR: LeakSanitizer: detected memory leaks
May 12 02:02:46 linux.site systemd[326]: Direct leak of 101 byte(s) in 3 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5336 in strv_new_ap src/shared/strv.c:163
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: Direct leak of 29 byte(s) in 1 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5288 in strv_new_ap src/shared/strv.c:152
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: SUMMARY: AddressSanitizer: 130 byte(s) leaked in 4 allocation(s).
There is a leak due to the the use of cleanup_free instead _cleanup_strv_free_
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's to
stick to this here too.
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.
Also contains a couple of other logging improvements:
- Drops a couple of strerror() invocations in favour of using %m.
- Not only .mount units now warn if a symlinks exist for the mount
point already, .automount units do that too, now.
- A few invocations of log_struct() that didn't actually pass any
additional structured data have been replaced by simpler invocations
of log_unit_info() and friends.
- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
that works like LOG_MESSAGE() but prefixes the message with the unit
name. Similar, there's now LOG_LINK_MESSAGE() and
LOG_NETDEV_MESSAGE().
- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
LOG_NETDEV_INTERFACE() macros have been added that generate the
necessary per object fields. The old log_unit_struct() call has been
removed in favour of these new macros used in raw log_struct()
invocations. In addition to removing one more function call this
allows generated structured log messages that contain two object
fields, as necessary for example for network interfaces that are
joined into another network interface, and whose messages shall be
indexed by both.
- The LOG_ERRNO() macro has been removed, in favour of
log_struct_errno(). The latter has the benefit of ensuring that %m in
format strings is properly resolved to the specified error number.
- A number of logging messages have been converted to use
log_unit_info() instead of log_info()
- The client code in sysv-generator no longer #includes core code from
src/core/.
- log_unit_full_errno() has been removed, log_unit_full() instead takes
an errno now, too.
- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
avoid double evaluation of their parameters
Whenever systemd is re-executed, it tries to create a system bus via
kdbus. If the system did not have kdbus loaded during bootup, but the
module is loaded later on manually, this will cause two system buses
running (kdbus and dbus-daemon in parallel).
This patch makes sure we never try to create kdbus buses if it wasn't
explicitly requested on the command-line.
A variety of changes:
- Make sure all our calls distuingish OOM from other errors if OOM is
not the only error possible.
- Be much stricter when parsing escaped paths, do not accept trailing or
leading escaped slashes.
- Change unit validation to take a bit mask for allowing plain names,
instance names or template names or an combination thereof.
- Refuse manipulating invalid unit name
Change cunescape() to return a normal error code, so that we can
distuingish OOM errors from parse errors.
This also adds a flags parameter to control whether "relaxed" or normal
parsing shall be done. If set no parse failures are generated, and the
only reason why cunescape() can fail is OOM.
Even if plymouth is running, it might have not displayed the splash yet,
so we'll see a few lines on fbcon when we should have otherwise had
nothing.
Plymouth integration was added to systemd in commit
6faa11140b. That same day, Plymouth got
systemd integration [0]. As such, the Plymouth integration has always
been obsolete, and was probably only for older Plymouth's. But I can't
imagine anybody running a Plymouth from 2011 with a systemd from 2015.
Remove the Plymouth/systemd integration, and let Plymouth's code tell
systemd to print the details.
[0] http://cgit.freedesktop.org/plymouth/commit/?id=537c16422cd49f1beeaab1ad39846a00018faec1
Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net>
Cc: Daniel Drake <dsd@endlessm.com>
Cc: Ray Strode <rstrode@redhat.com>
Because the order of coldplugging is not defined, we can reference a
not-yet-coldplugged unit and read its state while it has not yet been
set to a meaningful value.
This way, already active units may get started again.
We fix this by deferring such actions until all units have been at
least somehow coldplugged.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401
Otherwise every daemon reload prints out warnings like:
systemd[1]: Unit type .busname is not supported on this system.
systemd[1]: Unit type .swap is not supported on this system.
By notifying the clients when this property is changed it's possible to
allow "system health monitor" tools to get transitions like
running<->degraded. This is an alternative to send changes on the
SystemState property since the latter is more difficult to derive.
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
include-what-you-use automatically does this and it makes finding
unnecessary harder to spot. The only content of poll.h is a include
of sys/poll.h so should be harmless.
After all it is now much more like strjoin() than strappend(). At the
same time, add support for NULL sentinels, even if they are normally not
necessary.
With this change the pull protocol implementation processes will pass
progress data to importd which then passes this information on via the
bus. We use sd_notify() as generic transport for this communication,
making importd listen to them, while matching the incoming messages to
the right transfer.
The kernel module system is not namespaced, so no container should ever
modify global options. Make sure we set the kdbus attach_flags_mask only
on a real boot as PID1.
Sometimes it is necessary to stop a generator from running. Either
because of a bug, or for testing, or some other reason. The only way
to do that would be to rename or chmod the generator binary, which is
inconvenient and does not survive upgrades. Allow masking and
overriding generators similarly to units and other configuration
files.
For the systemd instance, masking would be more common, rather than
overriding generators. For the user instances, it may also be useful
for users to have generators in $XDG_CONFIG_HOME to augment or
override system-wide generators.
Directories are searched according to the usual scheme (/usr/lib,
/usr/local/lib, /run, /etc), and files with the same name in higher
priority directories override files with the same name in lower
priority directories. Empty files and links to /dev/null mask a given
name.
https://bugs.freedesktop.org/show_bug.cgi?id=87230
Remove the optional sepearate opening of the directory,
it would be just too complicated with the change to
multiple directories.
Move the middle of execute_directory() to a seperate
function to make it easier to grok.
With this change it is possible to send file descriptors to PID 1, via
sd_pid_notify_with_fds() which PID 1 will store individually for each
service, and pass via the usual fd passing logic on next invocation.
This is useful for enable daemon reload schemes where daemons serialize
their state to /run, push their fds into PID 1 and terminate, restoring
their state on next start from the data in /run and passed in from PID
1.
The fds are kept by PID 1 as long as no POLLHUP or POLLERR is seen on
them, and the service they belong to are either not dead or failed, or
have a job queued.
Let's unify the code that counts the running jobs a bit, in order to
make sure we are less likely to miss one.
This is related to this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=87349
However, it probably won't fix it fully, and I cannot reproduce the issue.
The change also adds an explicit assert change when the counter is off.
Containers do not really support .device, .automount or .swap units;
Systems compiled without support for swap do not support .swap units;
Systems without kdbus do not support .busname units.
With this change attempts to start a unsupported unit types will result
in an immediate "unsupported" job result, which is a lot more
descriptive then before. Also, attempts to start device units in
containers will now immediately fail instead of causing jobs to be
enqueued that never go away.
During upgrades and when transitioning between different systemd
versions in initrd and on the host we have to expect that some
serialization fields are unknown or parse incorrectly. This shouldn't
really be considered an error, hence downgrade the log messages about
it to debug. This way we can still trace it, but it doesn't confuse
users.
This kinda reverts 46849c3f.
Parsing the mount table with libmount races against the mount command,
which will handle the actual mounting before updating utab. This means
the poll event on /proc/self/mountinfo can kick of a reparse in systemd
before the utab information is available.
This change adds in an additional event source using inotify to watch
for changes to utab. It only watches for IN_MOVED_TO events, matching
libmount behavior of always overwriting this file using rename(2).
This does add a second pass through the mount table parsing when utab is
updated.
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'
Plus some whitespace, linewrap, and indent adjustments.
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
- Rename log_meta() → log_internal(), to follow naming scheme of most
other log functions that are usually invoked through macros, but never
directly.
- Rename log_info_object() to log_object_info(), simply because the
object should be before any other parameters, to follow OO-style
programming style.
strv_extend returns 0 in the case of success which means that
else if (bus_track_deserialize_item(&m->deserialized_subscribed, l) == 0)
log_warning("Unknown serialization item '%s'", l);
will be printed when value is added correctly.
kdbus has seen a larger update than expected lately, most notably with
kdbusfs, a file system to expose the kdbus control files:
* Each time a file system of this type is mounted, a new kdbus
domain is created.
* The layout inside each mount point is the same as before, except
that domains are not hierarchically nested anymore.
* Domains are therefore also unnamed now.
* Unmounting a kdbusfs will automatically also detroy the
associated domain.
* Hence, the action of creating a kdbus domain is now as
privileged as mounting a filesystem.
* This way, we can get around creating dev nodes for everything,
which is last but not least something that is not limited by
20-bit minor numbers.
The kdbus specific bits in nspawn have all been dropped now, as nspawn
can rely on the container OS to set up its own kdbus domain, simply by
mounting a new instance.
A new set of mounts has been added to mount things *after* the kernel
modules have been loaded. For now, only kdbus is in this set, which is
invoked with mount_setup_late().
This mirrors code in dbus.c when creating the private socket and
avoids error messages like:
systemd[1353]: bind(/run/user/603/systemd/notify) failed: No such file or directory
systemd[1353]: Failed to fully start up daemon: No such file or directory
The system start timeout as previously implemented would get confused by
long-running services that are included in the initial system startup
transaction for example by being cron-job-like long-running services
triggered immediately at boot. Such long-running jobs would be subject
to the default 15min timeout, esily triggering it.
Hence, remove this again. In a subsequent commit, introduce per-target
job timeouts instead, that allow us to control these timeouts more
finegrained.
Without the socket open we are going to crash and burn. If for
whatever reason we fail during deserialization we will fail when
trying to open the socket. In this case it is better to unlink the old
socket and maybe lose some messages, than to continue without the
notification socket.
Of course this situation should not happen, but we should handle
it as gracefully as possible anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1099299
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.
systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
We'll stay in "initializing" until basic.target has reached, at which
point we will enter "starting".
This is preparation so that we can change the startip timeout to only
apply to the first phase of startup, not the full procedure.
When this system-wide start-up timeout is hit we execute one of the
failure actions already implemented for services that fail.
This should not only be useful on embedded devices, but also on laptops
which have the power-button reachable when the lid is closed. This
devices, when in a backpack might get powered on by accident due to the
easily reachable power button. We want to make sure that the system
turns itself off if it starts up due this after a while.
When the system manages to fully start-up logind will suspend the
machine by default if the lid is closed. However, in some cases we don't
even get as far as logind, and the boot hangs much earlier, for example
because we ask for a LUKS password that nobody ever enters.
Yeah, this is a real-life problem on my Yoga 13, which has one of those
easily accessible power buttons, even if the device is closed.
Also add a bit of debugging output to help diagnose problems,
add missing units, and simplify cppflags.
Move test-engine to normal tests from manual tests, it should now
work without destroying the system.
As Zbigniew pointed out a new ConditionFirstBoot= appears like the nicer
way to hook in systemd-firstboot.service on first boots (those with /etc
unpopulated), so let's do this, and get rid of the generator again.
A new tool "systemd-firstboot" can be used either interactively on boot,
where it will query basic locale, timezone, hostname, root password
information and set it. Or it can be used non-interactively from the
command line when prepareing disk images for booting. When used
non-inertactively the tool can either copy settings from the host, or
take settings on the command line.
$ systemd-firstboot --root=/path/to/my/new/root --copy-locale --copy-root-password --hostname=waldi
The tool will be automatically invoked (interactively) now on first boot
if /etc is found unpopulated.
This also creates the infrastructure for generators to be notified via
an environment variable whether they are running on the first boot, or
not.
Only accept cpu quota values in percentages, get rid of period
definition.
It's not clear whether the CFS period controllable per-cgroup even has a
future in the kernel, hence let's simplify all this, hardcode the period
to 100ms and only accept percentage based quota values.
Similar to CPUShares= and BlockIOWeight= respectively. However only
assign the specified weight during startup. Each control group
attribute is re-assigned as weight by CPUShares=weight and
BlockIOWeight=weight after startup. If not CPUShares= or
BlockIOWeight= be specified, then the attribute is re-assigned to each
default attribute value. (default cpu.shares=1024, blkio.weight=1000)
If only CPUShares=weight or BlockIOWeight=weight be specified, then
that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.
Running systemctl enable/disable/set-default/... with the --root
option under strace reveals that it accessed various files and
directories in the main fs, and not underneath the specified root.
This can lead to correct results only when the layout and
configuration in the container are identical, which often is not the
case. Fix this by adding the specified root to all file access
operations.
This patch does not handle some corner cases: symlinks which point
outside of the specified root might be interpreted differently than
they would be by the kernel if the specified root was the real root.
But systemctl does not create such symlinks by itself, and I think
this is enough of a corner case not to be worth the additional
complexity of reimplementing link chasing in systemd.
Also, simplify the code in a few places and remove an hypothetical
memory leak on error.
safe_close_pair() is more like safe_close(), except that it handles
pairs of fds, and doesn't make and misleading allusion, as it works
similarly well for socketpairs() as for pipe()s...
A service with PrivateNetwork= cannot access abstract namespace sockets
of the host anymore, hence let's better not use abstract namespace
sockets for this, since we want to make sure that PrivateNetwork=
is useful and doesn't break sd_notify().
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:
fd = safe_close(fd);
Which will close an fd if it is open, and reset the fd variable
correctly.
By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
The system state knows the states starting →
running/degraded/maintenance → stopping, where:
starting = system startup
running = normal operation
degraded = at least one unit is currently in failed state
maintenance = rescue/emergency mode is active or queued
stopping = system shutdown
When the manager receives a SIGUSR2 signal, it opens a memory stream
with open_memstream(), uses the returned file handle for logging, and
dumps the logged content with log_dump().
However, the char* buffer is only safe to use after the file handle has
been flushed with fflush, as the man pages states:
When the stream is closed (fclose(3)) or flushed (fflush(3)), the
locations pointed to by ptr and sizeloc are updated to contain,
respectively, a pointer to the buffer and the current size of the
buffer.
These values remain valid only as long as the caller performs no
further output on the stream. If further output is performed, then the
stream must again be flushed before trying to access these variables.
Without that call, dump remains NULL and the daemon crashes in
log_dump().
This is primarily useful for services that need to track clients which
reference certain objects they maintain, or which explicitly want to
subscribe to certain events. Something like this is done in a large
number of services, and not trivial to do. Hence, let's unify this at
one place.
This also ports over PID 1 to use this to ensure that subscriptions to
job and manager events are correctly tracked. As a side-effect this
makes sure we properly serialize and restore the track list across
daemon reexec/reload, which didn't work correctly before.
This also simplifies how we distribute messages to broadcast to the
direct busses: we only track subscriptions for the API bus and
implicitly assume that all direct busses are subscribed. This should be
a pretty OK simplification since clients connected via direct bus
connections are shortlived anyway.
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
In some cases it is interesting to map a PID to two units at the same
time. For example, when a user logs in via a getty, which is reexeced to
/sbin/login that binary will be explicitly referenced as main pid of the
getty service, as well as implicitly referenced as part of the session
scope.
When a process dies that we can associate with a specific unit, start
watching all other processes of that unit, so that we can associate
those processes with the unit too.
Also, for service units start doing this as soon as we get the first
SIGCHLD for either control or main process, so that we can follow the
processes of the service from one to the other, as long as process that
remain are processes of the ones we watched that died and got reassigned
to us as parent.
Similar, for scope units start doing this as soon as the scope
controller abandons the unit, and thus management entirely reverts to
systemd. To abandon a unit introduce a new Abandon() scope unit method
call.
This reverts commit 28c758de94
but makes job_coldplug smarter.
In (v1) I changed the job start timestamp to be always set, so the
start time can be reported in the cylon eye message. The bug was that
when deserializing jobs, they would be ignored if their start
timestamp was unset which was synonymous with no timeout. But after
the change, jobs would have a start timestamp set despite having no
timeout. After deserialization they would be considered immediately
expired. Fix this by checking if the timeout is not zero when
considering jobs for expiration.
When set to auto, status will shown when the first ephemeral message
is shown (a job has been running for five seconds). Then until the
boot or shutdown ends, status messages will be shown.
No indication about the switch is done: I think it should be clear
for the user that first the cylon eye and the ephemeral messages appear,
and afterwards messages are displayed.
The initial arming of the event source was still wrong, but now should
really be fixed.
It would fire just once.
Also fix units from sec to usec as appropriate.
Decrease the switching interval to 1/3 s, so that when the time
remaining is displayed with 1s precision, it doesn't jump by 2s every
once in a while. Also, the system is feels noticably faster when the
status changes couple of times per second instead of every few
seconds.
Produces output like:
[ *** ] (1 of 2) A start job is running for slow.service (33s / 1min 30s)
The first nubmer is the time since job start, the second is the job timeout.
It is nicer to predefine patterns using configure time check instead of
using casts everywhere.
Since we do not need to use any flags, include "%" in the format instead
of excluding it like PRI* macros.
Information about signals which are not routinely received by systemd
are printed at info level. This should make it easier to see what is
happening in the system.
The only problem is that libgen.h #defines basename to point to it's
own broken implementation instead of the GNU one. This can be fixed
by #undefining basename.
Since we want to retain the ability to break kernel ←→ userspace ABI
after the next release, let's not make use by default of kdbus, so that
people with future kernels will not suddenly break with current systemd
versions.
kdbus support is left in all builds but must now be explicitly requested
at runtime (for example via setting $DBUS_SESSION_BUS). Via a configure
switch the old behaviour can be restored. In fact, we change autogen.sh
to do this, so that git builds (which run autogen.sh) get kdbus by
default, but tarball builds (which ue the configure defaults) do not get
it, and hence this stays out of the distros by default.
This way, we can avoid executing two /bin/swapon jobs to be dispatched
for the same swap device if it is configured for two different paths.
Previously we were just tracking the device nodes of active swap
devices, which would not allow us to recognize the identity of two swap
devices before they are active.
https://bugs.freedesktop.org/show_bug.cgi?id=69835
Given that plymouth listens on an abstract namespace socket and if
CLONE_NEWNET is not used the abstract namespace is shared with the host
we might actually end up send plymouth data to the host.
This patch converts PID 1 to libsystemd-bus and thus drops the
dependency on libdbus. The only remaining code using libdbus is a test
case that validates our bus marshalling against libdbus' marshalling,
and this dependency can be turned off.
This patch also adds a couple of things to libsystem-bus, that are
necessary to make the port work:
- Synthesizing of "Disconnected" messages when bus connections are
severed.
- Support for attaching multiple vtables for the same interface on the
same path.
This patch also fixes the SetDefaultTarget() and GetDefaultTarget() bus
calls which used an inappropriate signature.
As a side effect we will now generate PropertiesChanged messages which
carry property contents, rather than just invalidation information.