Commit graph

517 commits

Author SHA1 Message Date
Lennart Poettering ae57dad3f9 manager: refuse reloading/reexecing when /run is overly full
Let's add an extra safety check: before entering a reload/reexec, let's
verify that there's enough room in /run for it.

Fixes: #5016
2017-02-06 16:58:06 +01:00
Zbigniew Jędrzejewski-Szmek a80c157506 core: downgrade "Time has been changed" to debug (#4906)
That message is emitted by every systemd instance on every resume:
Dec 06 08:03:38 laptop systemd[1]: Time has been changed
Dec 06 08:03:38 laptop systemd[823]: Time has been changed
Dec 06 08:03:38 laptop systemd[916]: Time has been changed
Dec 07 08:00:32 laptop systemd[1]: Time has been changed
Dec 07 08:00:32 laptop systemd[823]: Time has been changed
Dec 07 08:00:32 laptop systemd[916]: Time has been changed
-- Reboot --
Dec 07 08:02:46 laptop systemd[836]: Time has been changed
Dec 07 08:02:46 laptop systemd[1]: Time has been changed
Dec 07 08:02:46 laptop systemd[926]: Time has been changed
Dec 07 19:48:12 laptop systemd[1]: Time has been changed
Dec 07 19:48:12 laptop systemd[836]: Time has been changed
Dec 07 19:48:12 laptop systemd[926]: Time has been changed
...

Fixes #4896.
2016-12-18 13:21:19 +01:00
Zbigniew Jędrzejewski-Szmek 5a1d6cb19d pid1,catalog: use a different MESSAGE_ID for user manager startup
This add a new message id for the end of user instance startup.
User manager startup is a different beast then the system startup.
Their descriptions are completely different too. Let's just separate
them.

Partially fixes #3351.

Also remove "successful" from the description, since we don't know if
the startup was successful or not.
2016-12-11 12:41:23 -05:00
Reverend Homer 8fb3f00997 tree-wide: replace all readdir cycles with FOREACH_DIRENT{,_ALL} (#4853) 2016-12-09 10:04:30 +01:00
Lennart Poettering 2e6dbc0fcd Merge pull request #4538 from fbuihuu/confirm-spawn-fixes
Confirm spawn fixes/enhancements
2016-11-18 11:08:06 +01:00
Franck Bui b0eb29449e core: add 'c' in confirmation_spawn to resume the boot process 2016-11-17 18:16:50 +01:00
Franck Bui 7d5ceb6416 core: allow to redirect confirmation messages to a different console
It's rather hard to parse the confirmation messages (enabled with
systemd.confirm_spawn=true) amongst the status messages and the kernel
ones (if enabled).

This patch gives the possibility to the user to redirect the confirmation
message to a different virtual console, either by giving its name or its path,
so those messages are separated from the other ones and easier to read.
2016-11-17 18:16:16 +01:00
Franck Bui 42bf1ae17b core: prevent the cylon when confirmation_spawn=yes (#2194)
When booting with systemd.confirm_spawn=true, the eye of cylon
animation kicks in pretty quickly so user doesn't have any chance to
answer the questions which services to start before the confirmation
message is screwed by the cylon.

This basically breaks the confirm_spawn functionality completely.

This patch prevents the cylon animation to kick in when
confirmation_spawn=yes.

Fixes: #2194
2016-11-17 18:11:21 +01:00
Lennart Poettering c5a97ed132 core: GC redundant device jobs from the run queue
In contrast to all other unit types device units when queued just track
external state, they cannot effect state changes on their own. Hence unless a
client or other job waits for them there's no reason to keep them in the job
queue. This adds a concept of GC'ing jobs of this type as soon as no client or
other job waits for them anymore.

To ensure this works correctly we need to track which clients actually
reference a job (i.e. which ones enqueued it). Unfortunately that's pretty
nasty to do for direct connections, as sd_bus_track doesn't work for
them. For now, work around this, by simply remembering in a boolean that a job
was requested by a direct connection, and reset it when we notice the direct
connection is gone. This means the GC logic works fine, except that jobs are
not immediately removed when direct connections disconnect.

In the longer term, a rework of the bus logic should fix this properly. For now
this should be good enough, as GC works for fine all cases except this one, and
thus is a clear improvement over the previous behaviour.

Fixes: #1921
2016-11-16 15:03:26 +01:00
Lennart Poettering a2d72e265a core: drop n_in_gc_queue field of Manager structure
We count the units in the GC queue with this, but actually never make use of
it, hence drop it.
2016-11-16 15:03:26 +01:00
Lennart Poettering 493fd52f1a Merge pull request #4510 from keszybz/tree-wide-cleanups
Tree wide cleanups
2016-11-03 13:59:20 -06:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Zbigniew Jędrzejewski-Szmek fb4650aa34 tree-wide: use startswith return value to avoid hardcoded offset
I think it's an antipattern to have to count the number of bytes in
the prefix by hand. We should do this automatically to avoid wasting
programmer time, and possible errors. I didn't any offsets that were
wrong, so this change is mostly to make future development easier.
2016-10-22 16:15:46 -04:00
Lukas Nykryn ae8c7939df core: use emergency_action for ctr+alt+del burst
Fixes #4306
2016-10-21 15:13:50 +02:00
Zbigniew Jędrzejewski-Szmek 3ce40911bd pid1: downgrade some rlimit warnings
Since we ignore the result anyway, downgrade errors to warning.

log_oom() will still emit an error, but that's mostly theoretical, so it
is not worth complicating the code to avoid the small inconsistency
2016-10-19 22:17:16 -04:00
Zbigniew Jędrzejewski-Szmek 6b430fdb7c tree-wide: use mfree more 2016-10-16 23:35:39 -04:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
2016-10-07 20:14:38 +02:00
Zbigniew Jędrzejewski-Szmek 8f4d640135 core: only warn on short reads on signal fd 2016-10-07 10:05:04 -04:00
Lennart Poettering 875ca88da5 manager: tighten incoming notification message checks
Let's not accept datagrams with embedded NUL bytes. Previously we'd simply
ignore everything after the first NUL byte. But given that sending us that is
pretty ugly let's instead complain and refuse.

With this change we'll only accept messages that have exactly zero or one NUL
bytes at the very end of the datagram.
2016-10-07 12:14:33 +02:00
Lennart Poettering 045a3d5989 manager: be stricter with incomining notifications, warn properly about too large ones
Let's make the kernel let us know the full, original datagram size of the
incoming message. If it's larger than the buffer space provided by us, drop the
whole message with a warning.

Before this change the kernel would truncate the message for us to the buffer
space provided, and we'd not complain about this, and simply process the
incomplete message as far as it made sense.
2016-10-07 12:12:10 +02:00
Lennart Poettering c55ae51e77 manager: don't ever busy loop when we get a notification message we can't process
If the kernel doesn't permit us to dequeue/process an incoming notification
datagram message it's still better to stop processing the notification messages
altogether than to enter a busy loop where we keep getting notified but can't
do a thing about it.

With this change, manager_dispatch_notify_fd() behaviour is changed like this:

- if an error indicating a spurious wake-up is seen on recvmsg(), ignore it
  (EAGAIN/EINTR)

- if any other error is seen on recvmsg() propagate it, thus disabling
  processing of further wakeups

- if any error is seen on later code in the function, warn about it but do not
  propagate it, as in this cas we're not going to busy loop as the offending
  message is already dequeued.
2016-10-07 12:08:51 +02:00
Lukáš Nykrýn 24dd31c19e core: add possibility to set action for ctrl-alt-del burst (#4105)
For some certification, it should not be possible to reboot the machine through ctrl-alt-delete. Currently we suggest our customers to mask the ctrl-alt-delete target, but that is obviously not enough.

Patching the keymaps to disable that is really not a way to go for them, because the settings need to be easily checked by some SCAP tools.
2016-10-06 21:08:21 -04:00
Zbigniew Jędrzejewski-Szmek a63ee40751 core: do not try to create /run/systemd/transient in test mode
This prevented systemd-analyze from unprivileged operation on older systemd
installations, which should be possible.
Also, we shouldn't touch the file system in test mode even if we can.
2016-10-01 22:53:17 +02:00
Zbigniew Jędrzejewski-Szmek 5fd2c135f1 core: update warning message
"closing all" might suggest that _all_ fds received with the notification message
will be closed. Reword the message to clarify that only the "unused" ones will be
closed.
2016-10-01 11:01:31 +02:00
Zbigniew Jędrzejewski-Szmek c4bee3c40e core: get rid of unneeded state variable
No functional change.
2016-10-01 11:01:31 +02:00
Zbigniew Jędrzejewski-Szmek a86b76753d pid1: more informative error message for ignored notifications
It's probably easier to diagnose a bad notification message if the
contents are printed. But still, do anything only if debugging is on.
2016-09-29 22:57:57 +02:00
Zbigniew Jędrzejewski-Szmek 8523bf7dd5 pid1: process zero-length notification messages again
This undoes 531ac2b234. I acked that patch without looking at the code
carefully enough. There are two problems:
- we want to process the fds anyway
- in principle empty notification messages are valid, and we should
  process them as usual, including logging using log_unit_debug().
2016-09-29 22:57:57 +02:00
Franck Bui 9987750e7a pid1: don't return any error in manager_dispatch_notify_fd() (#4240)
If manager_dispatch_notify_fd() fails and returns an error then the handling of
service notifications will be disabled entirely leading to a compromised system.

For example pid1 won't be able to receive the WATCHDOG messages anymore and
will kill all services supposed to send such messages.
2016-09-29 19:44:34 +02:00
Jorge Niedbalski 531ac2b234 If the notification message length is 0, ignore the message (#4237)
Fixes #4234.

Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
2016-09-29 05:26:16 -04:00
Zbigniew Jędrzejewski-Szmek 232f6754f6 pid1: drop kdbus_fd and all associated logic 2016-09-09 15:16:26 +01:00
Lennart Poettering 05a98afd3e core: add Ref()/Unref() bus calls for units
This adds two (privileged) bus calls Ref() and Unref() to the Unit interface.
The two calls may be used by clients to pin a unit into memory, so that various
runtime properties aren't flushed out by the automatic GC. This is necessary
to permit clients to race-freely acquire runtime results (such as process exit
status/code or accumulated CPU time) on successful service termination.

Ref() and Unref() are fully recursive, hence act like the usual reference
counting concept in C. Taking a reference is a privileged operation, as this
allows pinning units into memory which consumes resources.

Transient units may also gain a reference at the time of creation, via the new
AddRef property (that is only defined for transient units at the time of
creation).
2016-08-22 16:14:21 +02:00
Zbigniew Jędrzejewski-Szmek 2056ec1927 Merge pull request #3965 from htejun/systemd-controller-on-unified 2016-08-19 19:58:01 -04:00
Lennart Poettering 00d9ef8560 core: add RemoveIPC= setting
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e.  all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.

This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.

In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
2016-08-19 00:37:25 +02:00
Tejun Heo 5da38d0768 core: use the unified hierarchy for the systemd cgroup controller hierarchy
Currently, systemd uses either the legacy hierarchies or the unified hierarchy.
When the legacy hierarchies are used, systemd uses a named legacy hierarchy
mounted on /sys/fs/cgroup/systemd without any kernel controllers for process
management.  Due to the shortcomings in the legacy hierarchy, this involves a
lot of workarounds and complexities.

Because the unified hierarchy can be mounted and used in parallel to legacy
hierarchies, there's no reason for systemd to use a legacy hierarchy for
management even if the kernel resource controllers need to be mounted on legacy
hierarchies.  It can simply mount the unified hierarchy under
/sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies.
This disables a significant amount of fragile workaround logics and would allow
using features which depend on the unified hierarchy membership such bpf cgroup
v2 membership test.  In time, this would also allow deleting the said
complexities.

This patch updates systemd so that it prefers the unified hierarchy for the
systemd cgroup controller hierarchy when legacy hierarchies are used for kernel
resource controllers.

* cg_unified(@controller) is introduced which tests whether the specific
  controller in on unified hierarchy and used to choose the unified hierarchy
  code path for process and service management when available.  Kernel
  controller specific operations remain gated by cg_all_unified().

* "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to
  force the use of legacy hierarchy for systemd cgroup controller.

* nspawn: By default nspawn uses the same hierarchies as the host.  If
  UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all.  If
  0, legacy for all.

* nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of
  three options - legacy, only systemd controller on unified, and unified.  The
  value is passed into mount setup functions and controls cgroup configuration.

* nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount
  option is moved to mount_legacy_cgroup_hierarchy() so that it can take an
  appropriate action depending on the configuration of the host.

v2: - CGroupUnified enum replaces open coded integer values to indicate the
      cgroup operation mode.
    - Various style updates.

v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2.

v4: Restored legacy container on unified host support and fixed another bug in
    detect_unified_cgroup_hierarchy().
2016-08-17 17:44:36 -04:00
Tejun Heo ca2f6384aa core: rename cg_unified() to cg_all_unified()
A following patch will update cgroup handling so that the systemd controller
(/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel
resource controllers are on the legacy hierarchies.  This would require
distinguishing whether all controllers are on cgroup v2 or only the systemd
controller is.  In preparation, this patch renames cg_unified() to
cg_all_unified().

This patch doesn't cause any functional changes.
2016-08-15 18:13:36 -04:00
Lennart Poettering 43992e57e0 core: drop spurious newline 2016-08-03 14:52:16 +02:00
Zbigniew Jędrzejewski-Szmek dadd6ecfa5 Merge pull request #3728 from poettering/dynamic-users 2016-07-25 16:40:26 -04:00
Lennart Poettering 29206d4619 core: add a concept of "dynamic" user ids, that are allocated as long as a service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).

For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.

A simple way to test this is:

        systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22 15:53:45 +02:00
Lennart Poettering 79baeeb96d core: change TasksMax= default for system services to 15%
As it turns out 512 is max number of tasks per service is hit by too many
applications, hence let's bump it a bit, and make it relative to the system's
maximum number of PIDs. With this change the new default is 15%. At the
kernel's default pids_max value of 32768 this translates to 4915. At machined's
default TasksMax= setting of 16384 this translates to 2457.

Why 15%? Because it sounds like a round number and is close enough to 4096
which I was going for, i.e. an eight-fold increase over the old 512

Summary:

            | on the host | in a container
old default |         512 |           512
new default |        4915 |          2457
2016-07-22 15:33:13 +02:00
Thomas H. P. Andersen f8298f7be3 core: remove duplicate includes (#3771) 2016-07-21 10:52:07 +02:00
Lukáš Nykrýn ccc2c98e1b manager: don't skip sigchld handler for main and control pid for services (#3738)
During stop when service has one "regular" pid one main pid and one
control pid and the sighld for the regular one is processed first the
unit_tidy_watch_pids will skip the main and control pid and does not
remove them from u->pids(). But then we skip the sigchld event because we
already did one in the iteration and there are two pids in u->pids.

v2: Use general unit_main_pid() and unit_control_pid() instead of
reaching directly to service structure.
2016-07-16 15:04:13 -04:00
Kyle Walker 1e706c8dff manager: Fixing a debug printf formatting mistake (#3640)
A 'llu' formatting statement was used in a debugging printf statement
instead of a 'PRIu64'. Correcting that mistake here.
2016-07-01 20:03:35 +03:00
Kyle Walker 36f20ae3b2 manager: Only invoke a single sigchld per unit within a cleanup cycle
By default, each iteration of manager_dispatch_sigchld() results in a unit level
sigchld event being invoked. For scope units, this results in a scope_sigchld_event()
which can seemingly stall for workloads that have a large number of PIDs within the
scope. The stall exhibits itself as a SIG_0 being initiated for each u->pids entry
as a result of pid_is_unwaited().

v2:
This patch resolves this condition by only paying to cost of a sigchld in the underlying
scope unit once per sigchld iteration. A new "sigchldgen" member resides within the
Unit struct. The Manager is incremented via the sd event loop, accessed via
sd_event_get_iteration, and the Unit member is set to the same value as the manager each
time that a sigchld event is invoked. If the Manager iteration value and Unit member
match, the sigchld event is not invoked for that iteration.
2016-06-30 15:16:47 -04:00
Dave Reisner 222953e87f Ensure kdbus isn't used (#3501)
Delete the dbus1 generator and some critical wiring. This prevents
kdbus from being loaded or detected. As such, it will never be used,
even if the user still has a useful kdbus module loaded on their system.

Sort of fixes #3480. Not really, but it's better than the current state.
2016-06-18 17:24:23 -04:00
Lukáš Nykrýn 4892084f09 manager: reduce complexity of unit_gc_sweep (#3507)
When unit is marked as UNSURE, we are trying to find if it state was
changed over and over again. So lets not go through the UNSURE states
again. Also when we find a GOOD unit lets propagate the GOOD state to
all units that this unit reference.

This is a problem on machines with a lot of initscripts with different
starting priority, since those units will reference each other and the
original algorithm might get to n! complexity.

Thanks HATAYAMA Daisuke for the expand_good_state code.
2016-06-14 14:20:56 +02:00
Franck Bui 64c3610b55 core: disable colors when displaying cylon when systemd.log_color=off (#3495) 2016-06-10 18:33:15 +02:00
Lennart Poettering 3d0b8a55f2 manager: remove spurious newline 2016-05-26 15:34:41 +02:00
Michal Sekletar 833f92ad39 core: don't log job status message in case job was effectively NOP (#3199)
We currently generate log message about unit being started even when
unit was started already and job didn't do anything. This is because job
was requested explicitly and hence became anchor job of the transaction
thus we could not eliminate it. That is fine but, let's not pollute
journal with useless log messages.

$ systemctl start systemd-resolved
$ systemctl start systemd-resolved
$ systemctl start systemd-resolved

Current state:
$ journalctl -u systemd-resolved | grep Started

May 05 15:31:42 rawhide systemd[1]: Started Network Name Resolution.
May 05 15:31:59 rawhide systemd[1]: Started Network Name Resolution.
May 05 15:32:01 rawhide systemd[1]: Started Network Name Resolution.

After patch applied:
$ journalctl -u systemd-resolved | grep Started

May 05 16:42:12 rawhide systemd[1]: Started Network Name Resolution.

Fixes #1723
2016-05-16 11:24:51 -04:00
Lennart Poettering fc2fffe770 tree-wide: introduce new SOCKADDR_UN_LEN() macro, and use it everywhere
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.

This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
2016-05-05 22:24:36 +02:00
Lennart Poettering d8fdc62037 core: use an AF_UNIX/SOCK_DGRAM socket for cgroup agent notification
dbus-daemon currently uses a backlog of 30 on its D-bus system bus socket. On
overloaded systems this means that only 30 connections may be queued without
dbus-daemon processing them before further connection attempts fail. Our
cgroups-agent binary so far used D-Bus for its messaging, and hitting this
limit hence may result in us losing cgroup empty messages.

This patch adds a seperate cgroup agent socket of type AF_UNIX/SOCK_DGRAM.
Since sockets of these types need no connection set up, no listen() backlog
applies. Our cgroup-agent binary will hence simply block as long as it can't
enqueue its datagram message, so that we won't lose cgroup empty messages as
likely anymore.

This also rearranges the ordering of the processing of SIGCHLD signals, service
notification messages (sd_notify()...) and the two types of cgroup
notifications (inotify for the unified hierarchy support, and agent for the
classic hierarchy support). We now always process events for these in the
following order:

  1. service notification messages  (SD_EVENT_PRIORITY_NORMAL-7)
  2. SIGCHLD signals (SD_EVENT_PRIORITY_NORMAL-6)
  3. cgroup inotify and cgroup agent (SD_EVENT_PRIORITY_NORMAL-5)

This is because when receiving SIGCHLD we invalidate PID information, which we
need to process the service notification messages which are bound to PIDs.
Hence the order between the first two items. And we want to process SIGCHLD
metadata to detect whether a service is gone, before using cgroup
notifications, to decide when a service is gone, since the former carries more
useful metadata.

Related to this:
https://bugs.freedesktop.org/show_bug.cgi?id=95264
https://github.com/systemd/systemd/issues/1961
2016-05-05 12:37:04 +02:00
Lennart Poettering 03532f0ae0 coredump,basic: generalize O_TMPFILE handling a bit
This moves the O_TMPFILE handling from the coredumping code into common library
code, and generalizes it as open_tmpfile_linkable() + link_tmpfile(). The
existing open_tmpfile() function (which creates an unlinked temporary file that
cannot be linked into the fs) is renamed to open_tmpfile_unlinkable(), to make
the distinction clear. Thus, code may now choose between:

 a) open_tmpfile_linkable() + link_tmpfile()
 b) open_tmpfile_unlinkable()

Depending on whether they want a file that may be linked back into the fs later
on or not.

In a later commit we should probably convert fopen_temporary() to make use of
open_tmpfile_linkable().

Followup for: #3065
2016-04-22 16:16:53 +02:00
Lennart Poettering 4943d14306 systemctl: don't confuse sysv code with generated units
The SysV compat code checks whether there's a native unit file before looking
for a SysV init script. Since the newest rework generated units will show up in
the unit path, and hence the checks ended up assuming that there always was a
native unit file for each init script: the generated one.

With this change the generated unit file directory is suppressed from the
search path when this check is done, to avoid the confusion.
2016-04-12 13:43:32 +02:00
Lennart Poettering 9183df707b install: rename generator_paths() → generator_binary_paths()
This is too confusing, as this funciton returns the paths to the generator
binaries, while usually when we refer to the just the "generator path" we mean
the generated unit files. Let's clean this up.
2016-04-12 13:43:31 +02:00
Lennart Poettering 07a7864324 core: move flushing of generated unit files to path-lookup.c
It's very similar to the mkdir and trim operations for the generator dirs,
hence let's unify this at a single place.
2016-04-12 13:43:31 +02:00
Lennart Poettering d063a52741 core: modernize manager_build_unit_patch_cache() a bit 2016-04-12 13:43:31 +02:00
Lennart Poettering a145334304 core: rework logic to drop duplicate and non-existing items from search path
Move this into a function of its own, so that we can run it after we ran the
generators, so that it takes into account removed generator dirs.
2016-04-12 13:43:30 +02:00
Lennart Poettering cd64fd5613 path-lookup: split out logic for mkdir/rmdir of generator dirs in their own functions 2016-04-12 13:43:30 +02:00
Lennart Poettering 3959135139 core: add a separate unit directory for transient units
Previously, transient units were created below the normal runtime directory
/run/systemd/system. With this change they are created in a special transient
directory /run/systemd/transient, which only contains data for transient units.

This clarifies the life-cycle of transient units, and makes clear they are
distinct from user-provided runtime units. In particular, users may now
extend transient units via /run/systemd/system, without systemd interfering
with the life-cycle of these files.

This change also adds code so that when a transient unit exits only the
drop-ins in this new directory are removed, but nothing else.

Fixes: #2139
2016-04-12 13:43:30 +02:00
Lennart Poettering 92dd7c4965 core: reuse manager_get_runtime_prefix() at more places 2016-04-12 13:43:30 +02:00
Lennart Poettering 2c289ea833 core: introduce MANAGER_IS_RELOADING() macro
This replaces the old function call manager_is_reloading_or_reexecuting() which
was used only at very few places. Use the new macro wherever we check whether
we are reloading. This should hopefully make things a bit more readable, given
the nature of Manager:n_reloading being a counter.
2016-04-12 13:43:30 +02:00
Lennart Poettering 463d0d1569 core: remove ManagerRunningAs enum
Previously, we had two enums ManagerRunningAs and UnitFileScope, that were
mostly identical and converted from one to the other all the time. The latter
had one more value UNIT_FILE_GLOBAL however.

Let's simplify things, and remove ManagerRunningAs and replace it by
UnitFileScope everywhere, thus making the translation unnecessary. Introduce
two new macros MANAGER_IS_SYSTEM() and MANAGER_IS_USER() to simplify checking
if we are running in one or the user context.
2016-04-12 13:43:30 +02:00
Lennart Poettering a3c4eb0710 core: rework generator dir logic, move the dirs into LookupPaths structure
A long time ago – when generators where first introduced – the directories for
them were randomly created via mkdtemp(). This was changed later so that they
use fixed name directories now. Let's make use of this, and add the genrator
dirs to the LookupPaths structure and into the unit file search path maintained
in it. This has the benefit that the generator dirs are now normal part of the
search path for all tools, and thus are shown in "systemctl list-unit-files"
too.
2016-04-12 13:43:29 +02:00
Lukas Nykryn 5d512d5442 core: improve error message when starting template without instance 2016-03-30 13:54:33 +02:00
Zbigniew Jędrzejewski-Szmek 1b81db7a66 Merge pull request #2903 from keszybz/cgroup2-v3
core: cgroup2 support
2016-03-29 20:25:00 -04:00
Evgeny Vereshchagin 947292eef4 core: RuntimeWatchdogSec=infinity disables the watchdog logic 2016-03-28 17:17:32 +00:00
Tejun Heo e57051f542 core: update invoke_sigchld_event() to handle NULL ->sigchld_event()
After receiving SIGCHLD, one of the ways manager_dispatch_sigchld() maps the
now zombie $PID to its unit is through manager_get_unit_by_pid_cgroup() which
reads /proc/$PID/cgroup and looks up the unit associated with the cgroup path.

On non-unified cgroup hierarchies, a process is immediately migrated to the
root cgroup on death and the cgroup lookup would always have returned the unit
associated with it, making it rather pointless but safe.  On unified hierarchy,
a zombie remains associated with the cgroup that it was associated with at the
time of death and thus manager_get_unit_by_pid_cgroup() will look up the unit
properly.

However, by the time manager_dispatch_sigchld() is running, the original cgroup
may have become empty and it and its associated unit might already have been
removed.  If the cgroup path doesn't yield a match, manager_dispatch_sigchld()
keeps pruning the leaf component.  This means that the function may return a
slice unit for a pid and as a slice doesn't have ->sigchld_event() handler,
calling invoke_sigchld_event() on it causes a segfault.

This patch updates invoke_sigchld_event() so that it skips calling if the
handler is not set.
2016-03-26 12:06:06 -04:00
Vito Caputo 313cefa1d9 tree-wide: make ++/-- usage consistent WRT spacing
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands.  Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
2016-02-22 20:32:04 -08:00
Daniel Mack 50f48ad37a cgroup: remove support for NetClass= directive
Support for net_cls.class_id through the NetClass= configuration directive
has been added in v227 in preparation for a per-unit packet filter mechanism.
However, it turns out the kernel people have decided to deprecate the net_cls
and net_prio controllers in v2. Tejun provides a comprehensive justification
for this in his commit, which has landed during the merge window for kernel
v4.5:

  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bd1060a1d671

As we're aiming for full support for the v2 cgroup hierarchy, we can no
longer support this feature. Userspace tool such as nftables are moving over
to setting rules that are specific to the full cgroup path of a task, which
obsoletes these controllers anyway.

This commit removes support for tweaking details in the net_cls controller,
but keeps the NetClass= directive around for legacy compatibility reasons.
2016-02-10 16:38:56 +01:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering f0469b8c4a core: when determining system state, don't bother with JOB_TRY_RESTART
When we determine the current system state we check whether units like emergency.target are running or a job that
results in them being run is queued. However, this is not the case for JOB_TRY_RESTART, since that's a NOP if the unit
has not been running before. Hence, don't bother with checking for that job type.
2016-01-28 18:49:59 +01:00
Lennart Poettering 5f0f8d749d Merge pull request #2357 from keszybz/warnings-2
Remove gcc warnings v2
2016-01-19 15:09:53 +01:00
Evgeny Vereshchagin d9814c76ec core: fix memory leak on reload
==1== HEAP SUMMARY:
==1==     in use at exit: 61,728 bytes in 22 blocks
==1==   total heap usage: 258,122 allocs, 258,100 frees, 78,219,628
bytes allocated
==1==
==1== 16 bytes in 1 blocks are definitely lost in loss record 1 of 6
==1==    at 0x4C2BBCF: malloc (vg_replace_malloc.c:299)
==1==    by 0x1E350E: memdup (alloc-util.c:34)
==1==    by 0x135AFB: memdup_multiply (alloc-util.h:74)
==1==    by 0x140F97: manager_set_default_rlimits (manager.c:2929)
==1==    by 0x1303DA: manager_set_defaults (main.c:737)
==1==    by 0x133A02: main (main.c:1718)
==1==
==1== 272 bytes in 17 blocks are definitely lost in loss record 2 of 6
==1==    at 0x4C2BBCF: malloc (vg_replace_malloc.c:299)
==1==    by 0x1E350E: memdup (alloc-util.c:34)
==1==    by 0x135AFB: memdup_multiply (alloc-util.h:74)
==1==    by 0x140F97: manager_set_default_rlimits (manager.c:2929)
==1==    by 0x1303DA: manager_set_defaults (main.c:737)
==1==    by 0x13480D: main (main.c:1828)
==1==
==1== LEAK SUMMARY:
==1==    definitely lost: 288 bytes in 18 blocks
==1==    indirectly lost: 0 bytes in 0 blocks
==1==      possibly lost: 0 bytes in 0 blocks
==1==    still reachable: 61,440 bytes in 4 blocks
==1==         suppressed: 0 bytes in 0 blocks
==1== Reachable blocks (those to which a pointer was found) are not
shown.
==1== To see them, rerun with: --leak-check=full --show-leak-kinds=all
2016-01-14 07:45:03 +00:00
Zbigniew Jędrzejewski-Szmek b326715278 tree-wide: check if errno is greater than zero (2)
Compare errno with zero in a way that tells gcc that
(if the condition is true) errno is positive.
2016-01-13 15:10:17 -05:00
Evgeny Vereshchagin 37453b3a2a core: don't enable special signals in test mode
Fixes:
$ systemd-analyze verify ...
Failed to open /dev/tty0: Permission denied
2016-01-04 18:39:55 +00:00
Evgeny Vereshchagin 2ce2cce3ad core: revert "manager: do not set up signals in test mode"
This reverts commit 5aa1054521.

Fixes test-execute
$ sudo make check TESTS=test-execute
...
$ cat test-execute.log
+ test /tmp/test-exec_workingdirectory = /tmp/test-exec_workingdirectory
Test timeout when testing exec-workingdirectory.service
exec-workingdirectory.service
	UMask: 0022
	WorkingDirectory: /tmp/test-exec_workingdirectory
	RootDirectory: /
	NonBlocking: no
	PrivateTmp: no
	PrivateNetwork: no
	PrivateDevices: no
	ProtectHome: no
	ProtectSystem: no
	IgnoreSIGPIPE: yes
	RuntimeDirectoryMode: 0755
	StandardInput: null
	StandardOutput: inherit
	StandardError: inherit
FAIL test-execute (exit status: 1)
2016-01-04 04:13:00 +00:00
Lennart Poettering 3260929919 Merge pull request #2224 from keszybz/analyze-verify-warning
manager: do not set up signals in test mode
2015-12-26 18:53:50 +01:00
Zbigniew Jędrzejewski-Szmek 5aa1054521 manager: do not set up signals in test mode
When we are running in test mode, we don't expect any signals.
In fact ^C should end the program. This also avoids permission
issues when running systemd-analyze verify.
2015-12-25 00:24:16 -05:00
Daniel Mack 8936a5e34d core: re-sync bus name list after deserializing during daemon-reload
When the daemon reloads, it doesn not actually give up its DBus connection,
as wrongly stated in an earlier commit. However, even though the bus
connection stays open, the daemon flushes out all its internal state.

Hence, if there is a NameOwnerChanged signal after the flush and before the
deserialization, it cannot be matched against any pending unit.

To fix this, rename bus_list_names() to manager_sync_bus_names() and call
it explicitly at the end of the daemon reload operation.
2015-12-23 23:31:35 +01:00
Zbigniew Jędrzejewski-Szmek 4cee3a78bb manager: log log level changes uniformly
Output the same message when a request to change the log level is
received over dbus and through a signal. From the user point of view
those two operations are very similar and it's easy to think that the
dbus operation didn't work when the expected message is not emitted.

Also "downgrade" the message level to info, since this is a normal
user initiated action.
2015-12-13 14:53:52 -05:00
Zbigniew Jędrzejewski-Szmek 76b6f3f68f manager: move status output change debug messages to set function
This way we can only print the debug message when the status actually
changes. We also means we don't print anything when running in --user
mode, where status output is always disabled.
2015-12-13 14:52:19 -05:00
Lennart Poettering 4afd3348c7 tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.

With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.

The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).

This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.

Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:

       #define _cleanup_(function) __attribute__((cleanup(function)))

Or similar, to make the gcc feature easier to use.

Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.

See #2008.
2015-11-27 19:19:36 +01:00
Lennart Poettering 9ded9cd14c core: enable TasksMax= for all services by default, and set it to 512
Also, enable TasksAccounting= for all services by default, too.

See:

http://lists.freedesktop.org/archives/systemd-devel/2015-November/035006.html
2015-11-16 11:57:48 +01:00
Lennart Poettering 0af20ea2ee core: add new DefaultTasksMax= setting for system.conf
This allows initializing the TasksMax= setting of all units by default
to some fixed value, instead of leaving it at infinity as before.
2015-11-13 19:50:52 +01:00
Michal Schmidt 7152869f0a Merge pull request #1869 from poettering/kill-overridable
Remove support for RequiresOverridable= and RequisiteOverridable=
2015-11-13 14:04:34 +01:00
Evgeny Vereshchagin df0060346a core: use SD_EVENT_PRIORITY_NORMAL-n instead on -n 2015-11-12 19:54:34 +00:00
Lennart Poettering 53f1841669 core: unify code that warns about jobs we fail to enqueue
This allows us to shorten our code a bit.
2015-11-12 20:14:06 +01:00
Lennart Poettering 4bd29fe5ce core: drop "override" flag when building transactions
Now that we don't have RequiresOverridable= and RequisiteOverridable=
dependencies anymore, we can get rid of tracking the "override" boolean
for jobs in the job engine, as it serves no purpose anymore.

While we are at it, fix some error messages we print when invoking
functions that take the override parameter.
2015-11-12 19:54:07 +01:00
Lennart Poettering 9ff1a6f1d6 core: change type of distribute_fds() prototype to return void
We can't handle errors of thisc all sanely anyway, and we never actually
return any errors from the unit type that implements the call.  Hence,
let's make this void, in order to simplify things.
2015-11-10 21:03:49 +01:00
Lennart Poettering ba64af90ec core: change return value of the unit's enumerate() call to void
We cannot handle enumeration failures in a sensible way, hence let's try
hard to continue without making such failures fatal, and log about it
with precise error messages.
2015-11-10 21:03:49 +01:00
Vito Caputo ad231c7787 core: still make progress when throttling the manager loop
Don't simply continue after sleeping, it potentially puts us in a state
of spinning doing nothing slowly, if the ratelimit_test() keeps
detecting the need for limiting.

Observed in vms after the host had been suspended for a while, on resume
systemd entered a loop of making zero progress spamming the console
with:
[431942.850090] systemd[1]: Looping too fast. Throttling execution a
little.

I see no reason to have a continue here, the intention should be to
throttle execution, not circumvent it altogether.
2015-11-04 17:32:16 -08:00
Lennart Poettering a47806fafa sd-daemon: increase sd_notify() socket buffer size
Let's make sure we don't start blocking on sd_notify() earlier than
necessary, let's bump the socket buffer sizes to 8M.

We already do something similar for our logging socket buffers, hence
apply a similar bump here.
2015-10-31 19:09:20 +01:00
Lennart Poettering a1a078eef2 core: bail our earlier when doing audit
Let's make sure we don't even try to create the audit socket
2015-10-31 19:09:20 +01:00
Lennart Poettering 97044145b4 core,nspawn: minor coding style fixes 2015-10-31 19:09:20 +01:00
Lennart Poettering 96d66d89c9 core: constify a few things 2015-10-31 19:09:20 +01:00
David Herrmann b215b0ede1 core: fix priority ordering in notify-handling
Currently, we dispatch NOTIFY messages in a tight loop. Regardless how
much data is incoming, we always dispatch everything that is queued.
This, however, completely breaks priority event-handling of sd-event.
When dispatching one NOTIFY event, another completely different event
might fire, or might be queued by the NOTIFY handling. However, this
event will not get dispatched until all other further NOTIFY messages are
handled. Those might even arrive _after_ the other event fired, and as
such completely break priority ordering of sd-event (which several code
paths rely on).

Break this by never dispatching multiple messages. Just return after each
message that was read and let sd-event handle everything else.

(The patch looks scarier that it is. It basically just drops the for(;;)
 loop and re-indents the loop-content.)
2015-10-28 19:11:36 +01:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering affb60b1ef util-lib: split out umask-related code to umask-util.h 2015-10-27 13:25:56 +01:00
Lennart Poettering 8b43440b7e util-lib: move string table stuff into its own string-table.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 8fcde01280 util-lib: split stat()/statfs()/stavfs() related calls into stat-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering f4f15635ec util-lib: move a number of fs operations into fs-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 0d39fa9c69 util-lib: move more file I/O related calls into fileio.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering c004493cde util-lib: split out IO related calls to io-util.[ch] 2015-10-26 01:24:38 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Lennart Poettering 07630cea1f util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.

This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.

Also touches a few unrelated include files.
2015-10-24 23:05:02 +02:00
Lennart Poettering 4f5dd3943b util: split out escaping code into escape.[ch]
This really deserves its own file, given how much code this is now.
2015-10-24 23:04:42 +02:00
Lennart Poettering ac5b0c13d8 tree-wide: add more void casts for various syscall invocations 2015-10-19 23:07:18 +02:00
Thomas Hindoe Paaboel Andersen 74165387ee manager: remove unused function 2015-10-13 22:17:26 +02:00
Lennart Poettering 8dd4c05b54 core: add support for naming file descriptors passed using socket activation
This adds support for naming file descriptors passed using socket
activation. The names are passed in a new $LISTEN_FDNAMES= environment
variable, that matches the existign $LISTEN_FDS= one and contains a
colon-separated list of names.

This also adds support for naming fds submitted to the per-service fd
store using FDNAME= in the sd_notify() message.

This also adds a new FileDescriptorName= setting for socket unit files
to set the name for fds created by socket units.

This also adds a new call sd_listen_fds_with_names(), that is similar to
sd_listen_fds(), but also returns the names of the fds.

systemd-activate gained the new --fdname= switch to specify a name for
testing socket activation.

This is based on #1247 by Maciej Wereski.

Fixes #1247.
2015-10-06 11:52:48 +02:00
Lennart Poettering 400f1a33cf core: sort includes of manager.[ch] according to CODING_STYLE 2015-09-29 21:08:36 +02:00
Daniel Mack d11885c814 Merge pull request #1335 from poettering/some-fixes
A variety of mostly unrelated fixes
2015-09-22 17:04:38 +02:00
Lennart Poettering 1fc464f6fb cgtop: underline table header
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
2015-09-22 16:30:42 +02:00
Lennart Poettering ed0d40229b util: add safe_closedir() similar to safe_fclose() 2015-09-22 16:30:24 +02:00
Lennart Poettering 85fade1edb Merge pull request #986 from karelzak/monitor
mount: use libmount to monitor mountinfo & utab
2015-09-22 14:31:58 +02:00
Daniel Mack 32ee7d3309 cgroup: add support for net_cls controllers
Add a new config directive called NetClass= to CGroup enabled units.
Allowed values are positive numbers for fix assignments and "auto" for
picking a free value automatically, for which we need to keep track of
dynamically assigned net class IDs of units. Introduce a hash table for
this, and also record the last ID that was given out, so the allocator
can start its search for the next 'hole' from there. This could
eventually be optimized with something like an irb.

The class IDs up to 65536 are considered reserved and won't be
assigned automatically by systemd. This barrier can be made a config
directive in the future.

Values set in unit files are stored in the CGroupContext of the
unit and considered read-only. The actually assigned number (which
may have been chosen dynamically) is stored in the unit itself and
is guaranteed to remain stable as long as the unit is active.

In the CGroup controller, set the configured CGroup net class to
net_cls.classid. Multiple unit may share the same net class ID,
and those which do are linked together.
2015-09-16 00:21:55 +02:00
Karel Zak d379d44255 mount: use libmount to monitor mountinfo & utab
The current implementation directly monitor /proc/self/mountinfo and
/run/mount/utab files. It's really not optimal because utab file is
private libmount stuff without any official guaranteed semantic.

The libmount since v2.26 provides API to monitor mount kernel &
userspace changes and since v2.27 the monitor is usable for
non-root users too.

This patch replaces the current implementation with libmount based
solution.

Signed-off-by: Karel Zak <kzak@redhat.com>
2015-09-14 09:12:31 +02:00
Lennart Poettering e7ab4d1ac9 cgroup: unify how we invalidate cgroup controller settings
Let's make sure that we follow the same codepaths when adjusting a
cgroup property via the dbus SetProperty() call, and when we execute the
StartupCPUShares= effect.
2015-09-11 18:31:50 +02:00
Lennart Poettering cd72bd8a46 core: invalidate idle pipe event source in manager_close_idle_pipe()
In all occasions when this function is called we do so anyway, so let's
move this inside, to make things easier.
2015-09-11 18:31:50 +02:00
Lennart Poettering 5269eb6b32 core: allocate sets of startup and failed units on-demand
There's a good chance we never needs these sets, hence allocate them
only when needed.
2015-09-11 18:31:49 +02:00
Lennart Poettering ece174c543 tree-wide: drop {} from one-line if blocks
Patch via coccinelle.
2015-09-09 08:20:20 +02:00
Lennart Poettering a1e58e8ee1 tree-wide: use coccinelle to patch a lot of code to use mfree()
This replaces this:

        free(p);
        p = NULL;

by this:

        p = mfree(p);

Change generated using coccinelle. Semantic patch is added to the
sources.
2015-09-09 08:19:27 +02:00
Lennart Poettering 75f86906c5 basic: rework virtualization detection API
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
2015-09-07 13:42:47 +02:00
Lennart Poettering b3ac818be8 core: split up manager_get_unit_by_pid()
Let's move the actual cgroup part of it into a new separate function
manager_get_unit_by_pid_cgroup(), and then make
manager_get_unit_by_pid() just a wrapper that also checks the two pid
hashmaps.

Then, let's make sure the various calls that want to deliver events to
the owners of a PID check both hashmaps and the cgroup and deliver the
event to *each* of them. OTOH make sure bus calls like GetUnitByPID()
continue to check the PID hashmaps first and the cgroup only as
fallback.
2015-09-04 09:07:31 +02:00
Lennart Poettering fea72cc033 macro: introduce new PID_TO_PTR macros and make use of them
This adds a new PID_TO_PTR() macro, plus PTR_TO_PID() and makes use of
it wherever we maintain processes in a hash table. Previously we
sometimes used LONG_TO_PTR() and other times ULONG_TO_PTR() for that,
hence let's make this more explicit and clean up things.
2015-09-04 09:07:30 +02:00
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00
Lennart Poettering ae2a2c53dd manager: don't write first-boot flag file all the time
Instead, remember that we have already written it.
2015-09-01 17:20:56 +02:00
Lennart Poettering 90990e28c9 manager: remove ask-password fd from sd_event before closing it
Otherwise we might attempt to remove a non-existing fd from epoll.
2015-08-31 13:20:44 +02:00
Daniel Mack bbc2908635 core: dbus: track bus names per unit
Currently, PID1 installs an unfiltered NameOwnerChanged signal match, and
dispatches the signals itself. This does not scale, as right now, PID1
wakes up every time a bus client connects.

To fix this, install individual matches once they are requested by
unit_watch_bus_name(), and remove the watches again through their slot in
unit_unwatch_bus_name().

If the bus is not available during unit_watch_bus_name(), just store
name in the 'watch_bus' hashmap, and let bus_setup_api() do the installing
later.
2015-08-06 10:14:41 +02:00
Lennart Poettering dacd6cee76 tree-wide: port everything over to fflush_and_check()
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().

This also unifies the general error paths of fflush()+rename() file
writers.
2015-07-29 20:31:07 +02:00
Kay Sievers 1b09f548c7 turn kdbus support into a runtime option
./configure --enable/disable-kdbus can be used to set the default
behavior regarding kdbus.

If no kdbus kernel support is available, dbus-dameon will be used.

With --enable-kdbus, the kernel command line option "kdbus=0" can
be used to disable kdbus.

With --disable-kdbus, the kernel command line option "kdbus=1" is
required to enable kdbus support.
2015-06-17 18:01:49 +02:00
Lennart Poettering 2a1288ff89 util: introduce CMSG_FOREACH() macro and make use of it everywhere
It's only marginally shorter then the usual for() loop, but certainly
more readable.
2015-06-10 19:29:47 +02:00
Thomas Hindoe Paaboel Andersen 920b52e490 tree-wide: remove spurious space 2015-06-08 23:11:26 +02:00
Kay Sievers 6517217d49 kdbus: remove attach_flags_mask module parameter setting 2015-06-03 22:24:16 +02:00
Lennart Poettering 24882e06c1 util: split out signal-util.[ch] from util.[ch]
No functional changes.
2015-05-29 20:14:11 +02:00
Lennart Poettering 9771b62d8b manager: fix finish_timestamp calculation
http://lists.freedesktop.org/archives/systemd-devel/2015-May/032100.html
2015-05-21 21:34:36 +02:00
Lennart Poettering aad1976ffa core: don't consider boot-up finished if we are still reloading
http://lists.freedesktop.org/archives/systemd-devel/2015-May/032025.html
2015-05-19 19:26:53 +02:00
Lennart Poettering c6497ccb71 core: when propagating restart requests due to deps, downgrade restart to try-restart
Previously, if a service A depended on a service B via Requires=, and A
was not running and B restarted this would trigger a start of A as well,
since the restart was propagated as restart independently of the state
of A.

This patch ensures that a restart of B would be propagated as a
try-restart to A, thus not changing its state if it isn't up.

http://lists.freedesktop.org/archives/systemd-devel/2015-May/032061.html
2015-05-19 18:21:19 +02:00
Cristian Rodríguez f42348ace7 core: fix memory leak in manager_run_generators()
If systemd is built with GCC address sanitizer or leak sanitizer
the following memory leak ocurs:

May 12 02:02:46 linux.site systemd[326]: =================================================================
May 12 02:02:46 linux.site systemd[326]: ==326==ERROR: LeakSanitizer: detected memory leaks
May 12 02:02:46 linux.site systemd[326]: Direct leak of 101 byte(s) in 3 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5336 in strv_new_ap src/shared/strv.c:163
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: Direct leak of 29 byte(s) in 1 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5288 in strv_new_ap src/shared/strv.c:152
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: SUMMARY: AddressSanitizer: 130 byte(s) leaked in 4 allocation(s).

There is a leak due to the the use of cleanup_free instead _cleanup_strv_free_
2015-05-13 16:32:55 +02:00
Tom Gundersen 3495967790 core: drop redundant logging about notification messages
Before:
May 12 17:11:22 tomegun-x2402 systemd[1]: systemd-udevd.service: Got notification message for unit.
May 12 17:11:22 tomegun-x2402 systemd[1]: systemd-udevd.service: Got notification message from PID 195 (READY=1)
May 12 17:11:22 tomegun-x2402 systemd[1]: systemd-udevd.service: Ggot READY=1

After:
May 12 17:11:22 tomegun-x2402 systemd[1]: systemd-udevd.service: Got notification message from PID 195 (READY=1)
2015-05-12 17:25:44 +02:00
Lennart Poettering b2c23da8ce core: rename SystemdRunningAs to ManagerRunningAs
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's to
stick to this here too.
2015-05-11 22:51:49 +02:00
Lennart Poettering f2341e0a87 core,network: major per-object logging rework
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.

Also contains a couple of other logging improvements:

- Drops a couple of strerror() invocations in favour of using %m.

- Not only .mount units now warn if a symlinks exist for the mount
  point already, .automount units do that too, now.

- A few invocations of log_struct() that didn't actually pass any
  additional structured data have been replaced by simpler invocations
  of log_unit_info() and friends.

- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
  that works like LOG_MESSAGE() but prefixes the message with the unit
  name. Similar, there's now LOG_LINK_MESSAGE() and
  LOG_NETDEV_MESSAGE().

- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
  LOG_NETDEV_INTERFACE() macros have been added that generate the
  necessary per object fields. The old log_unit_struct() call has been
  removed in favour of these new macros used in raw log_struct()
  invocations. In addition to removing one more function call this
  allows generated structured log messages that contain two object
  fields, as necessary for example for network interfaces that are
  joined into another network interface, and whose messages shall be
  indexed by both.

- The LOG_ERRNO() macro has been removed, in favour of
  log_struct_errno(). The latter has the benefit of ensuring that %m in
  format strings is properly resolved to the specified error number.

- A number of logging messages have been converted to use
  log_unit_info() instead of log_info()

- The client code in sysv-generator no longer #includes core code from
  src/core/.

- log_unit_full_errno() has been removed, log_unit_full() instead takes
  an errno now, too.

- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
  avoid double evaluation of their parameters
2015-05-11 22:24:45 +02:00
Torstein Husebø ff9b60f38b treewide: Correct typos and spell plural of bus consistent 2015-05-11 15:51:30 +02:00
David Herrmann d79acc3093 bus: don't switch to kdbus if not requested
Whenever systemd is re-executed, it tries to create a system bus via
kdbus. If the system did not have kdbus loaded during bootup, but the
module is loaded later on manually, this will cause two system buses
running (kdbus and dbus-daemon in parallel).

This patch makes sure we never try to create kdbus buses if it wasn't
explicitly requested on the command-line.
2015-05-06 18:21:51 +02:00
Lennart Poettering 7410616cd9 core: rework unit name validation and manipulation logic
A variety of changes:

- Make sure all our calls distuingish OOM from other errors if OOM is
  not the only error possible.

- Be much stricter when parsing escaped paths, do not accept trailing or
  leading escaped slashes.

- Change unit validation to take a bit mask for allowing plain names,
  instance names or template names or an combination thereof.

- Refuse manipulating invalid unit name
2015-05-05 15:06:42 -07:00
Lennart Poettering 1c2e9646e4 core: simplify unit type detection logic
Introduce a new call unit_type_supported() and make use of it
everywhere.

Also, drop Manager parameter from per-type supported method prototype.
2015-04-30 01:29:00 +02:00
Mantas Mikulėnas cfa9677bd1 core: fix event source annotations
These looked like a mass-replace gone slightly wrong – two statements
with no { }'s, and no error checking.
2015-04-29 20:38:48 +02:00
Tom Gundersen 7dfbe2e3fc core: annotate event sources 2015-04-29 17:08:31 +02:00
Lennart Poettering 007c6337c6 manager: don't fail fatally if we cannot coldplug a unit
It's better to continue as good as we can, than to totally fail. Hence,
let's log about the failure and continue.
2015-04-24 16:14:46 +02:00
Lennart Poettering be847e82cf Revert "core: do not spawn jobs or touch other units during coldplugging"
This reverts commit 6e392c9c45.

We really shouldn't invent external state keeping hashmaps, if we can
keep this state in the units themselves.
2015-04-24 15:51:10 +02:00
Ronny Chevalier 288a74cce5 shared: add terminal-util.[ch] 2015-04-11 00:34:02 +02:00