Commit graph

91 commits

Author SHA1 Message Date
Lennart Poettering 03a7b521e3 core: add support for the "pids" cgroup controller
This adds support for the new "pids" cgroup controller of 4.3 kernels.
It allows accounting the number of tasks in a cgroup and enforcing
limits on it.

This adds two new setting TasksAccounting= and TasksMax= to each unit,
as well as a gloabl option DefaultTasksAccounting=.

This also updated "cgtop" to optionally make use of the new
kernel-provided accounting.

systemctl has been updated to show the number of tasks for each service
if it is available.

This patch also adds correct support for undoing memory limits for units
using a MemoryLimit=infinity syntax. We do the same for TasksMax= now
and hence keep things in sync here.
2015-09-10 18:41:06 +02:00
Michael Chapman 88ced61bf9 core: pass details to polkit for some unit actions
The following details are passed:

- unit: the primary name of the unit upon which the action was
        invoked (i.e. after resolving any aliases);
- verb: one of 'start', 'stop', 'reload', 'restart', 'try-restart',
        'reload-or-restart', 'reload-or-try-restart', 'kill',
        'reset-failed', or 'set-property', corresponding to the
        systemctl verb used to invoke the action.

Typical use of these details in a polkit policy rule might be:

  // Allow alice to manage example.service;
  // fall back to implicit authorization otherwise.
  polkit.addRule(function(action, subject) {
      if (action.id == "org.freedesktop.systemd1.manage-units" &&
          action.lookup("unit") == "example.service" &&
          subject.user == "alice") {
          return polkit.Result.YES;
      }
  });

We also supply a custom polkit message that includes the unit's name and
the requested operation.
2015-09-06 00:07:17 +10:00
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00
Lennart Poettering d79200e26e unit: unify how we assing slices to units
This adds a new call unit_set_slice(), and simplifies
unit_add_default_slice(). THis should make our code a bit more robust
and simpler.
2015-08-31 13:20:43 +02:00
Lennart Poettering 35b7ff80e2 unit: add new macros to test for unit contexts 2015-08-31 13:20:43 +02:00
Lennart Poettering 21b735e798 core: add unit_dbus_interface_from_type() to unit-name.h
Let's add a way to get the type-specific D-Bus interface of a unit from
either its type or name to src/basic/unit-name.[ch]. That way we can
share it with the client side, where it is useful in tools like cgls or
machinectl.

Also ports over machinectl to make use of this.
2015-08-28 02:10:10 +02:00
Lennart Poettering 98bac6058c core: report root cgroup as "/" over the bus
Internally, the root cgroup is stored as the empty string in
Unit.cgroup_path, and "no cgroup" as NULL. Unfortunately, D-Bus does not
know a NULL concept, hence when reporting the cgroup to clients we
should turn the root cgroup into "/", and leave the empty string for the
"no cgroup" case.

This should make sure that "systemctl status -- -.slice" works correctly
and shows the entire cgroup tree.
2015-08-25 20:42:50 +02:00
Thomas Hindoe Paaboel Andersen e861098bf2 core: fix OOM checks in dbus-unit
CID#1299014
2015-05-19 06:46:59 +02:00
Lennart Poettering be7d9ff730 core: introduce seperate reverse dependencies for Requires= and Requisite=
This allows us to ensure that Requisite= dependencies never cause
propagation between units, while Requires= dependencies might.

http://lists.freedesktop.org/archives/systemd-devel/2015-May/031742.html
2015-05-19 01:24:28 +02:00
Lennart Poettering f2341e0a87 core,network: major per-object logging rework
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.

Also contains a couple of other logging improvements:

- Drops a couple of strerror() invocations in favour of using %m.

- Not only .mount units now warn if a symlinks exist for the mount
  point already, .automount units do that too, now.

- A few invocations of log_struct() that didn't actually pass any
  additional structured data have been replaced by simpler invocations
  of log_unit_info() and friends.

- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
  that works like LOG_MESSAGE() but prefixes the message with the unit
  name. Similar, there's now LOG_LINK_MESSAGE() and
  LOG_NETDEV_MESSAGE().

- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
  LOG_NETDEV_INTERFACE() macros have been added that generate the
  necessary per object fields. The old log_unit_struct() call has been
  removed in favour of these new macros used in raw log_struct()
  invocations. In addition to removing one more function call this
  allows generated structured log messages that contain two object
  fields, as necessary for example for network interfaces that are
  joined into another network interface, and whose messages shall be
  indexed by both.

- The LOG_ERRNO() macro has been removed, in favour of
  log_struct_errno(). The latter has the benefit of ensuring that %m in
  format strings is properly resolved to the specified error number.

- A number of logging messages have been converted to use
  log_unit_info() instead of log_info()

- The client code in sysv-generator no longer #includes core code from
  src/core/.

- log_unit_full_errno() has been removed, log_unit_full() instead takes
  an errno now, too.

- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
  avoid double evaluation of their parameters
2015-05-11 22:24:45 +02:00
Lennart Poettering 7410616cd9 core: rework unit name validation and manipulation logic
A variety of changes:

- Make sure all our calls distuingish OOM from other errors if OOM is
  not the only error possible.

- Be much stricter when parsing escaped paths, do not accept trailing or
  leading escaped slashes.

- Change unit validation to take a bit mask for allowing plain names,
  instance names or template names or an combination thereof.

- Refuse manipulating invalid unit name
2015-05-05 15:06:42 -07:00
Lennart Poettering 190700621f sd-bus: drop bus parameter from message callback prototype
This should simplify the prototype a bit. The bus parameter is redundant
in most cases, and in the few where it matters it can be derived from
the message via sd_bus_message_get_bus().
2015-04-29 18:36:25 +02:00
Lennart Poettering 5ad096b3f1 core: expose consumed CPU time per unit
This adds support for showing the accumulated consumed CPU time per-unit
in the "systemctl status" output. The property is also readable via the
bus.
2015-03-02 12:15:25 +01:00
Thomas Hindoe Paaboel Andersen 2eec67acbb remove unused includes
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
2015-02-23 23:53:42 +01:00
Lennart Poettering 1d22e9068c core: rework policykit hookup
- Always issue selinux access check as early as possible, and PK check
  as late as possible.

- Introduce a new policykit action for altering environment

- Open most remaining bus calls to unprivileged clients via PK
2015-02-18 18:56:27 +01:00
Lennart Poettering 261420ba2a core: open up DefaultDependencies= property for transient units 2015-02-03 19:07:40 +01:00
Lennart Poettering 934277fe6a core: add a property that shows the current memory usage of a unit
This is exposed the memory.usage_in_bytes cgroup property on the bus,
and makes "systemctl status" show it in its default output.
2015-01-23 03:00:15 +01:00
Lennart Poettering 764458325e core: properly pass unit file state to clients via the bus 2014-12-10 19:58:50 +01:00
Lennart Poettering 96aad8d15a sd-bus: move common errors src/shared/bus-errors.h → src/libsystemd/sd-bus/bus-common-errors.h
Stuff in src/shared/ should not use stuff from src/libsystemd/ really.
2014-12-10 19:07:48 +01:00
Lennart Poettering d2dc52dbc4 systemctl: show unit file preset state in "systemctl status" output" 2014-12-02 13:23:04 +01:00
Michal Schmidt da927ba997 treewide: no need to negate errno for log_*_errno()
It corrrectly handles both positive and negative errno values.
2014-11-28 13:29:21 +01:00
Michal Schmidt 0a1beeb642 treewide: auto-convert the simple cases to log_*_errno()
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:

find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'

Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
2014-11-28 12:04:41 +01:00
Lennart Poettering 59fccdc587 core: introduce the concept of AssertXYZ= similar to ConditionXYZ=, but fatal for a start job if not met 2014-11-06 14:21:11 +01:00
Lennart Poettering cc50ef134b condition: record test state internally and beef it up to be a full enum 2014-11-06 14:21:10 +01:00
WaLyong Cho 311f6cf33f bus: use STR_IN_SET 2014-11-01 14:39:47 -04:00
WaLyong Cho 8a188de9e0 mac: add mac_ prefix to distinguish origin security apis 2014-10-28 14:31:48 +01:00
Lennart Poettering f189ab18de job: optionally, when a job timeout is hit, also execute a failure action 2014-10-28 02:19:55 +01:00
Stef Walter 283868e1dc core: Verify systemd1 DBus method callers via polkit
DBus methods that retrieve information can be called by anyone.

DBus methods that modify state of units are verified via polkit
action: org.freedesktop.systemd1.manage-units

DBus methods that modify state of unit files are verified via polkit
action: org.freedesktop.systemd1.manage-unit-files

DBus methods that reload the entire daemon state are verified via polkit
action: org.freedesktop.systemd1.reload-daemon

DBus methods that modify job state are callable from the clients
that started the job.

root (ie: CAP_SYS_ADMIN) can continue to perform all calls, property
access etc. There are several DBus methods that can only be
called by root.

Open up the dbus1 policy for the above methods.

(Heavily modified by Lennart, making use of the new
bus_verify_polkit_async() version that doesn't force us to always
pass the original callback around. Also, interactive auhentication must
be opt-in, not unconditional, hence I turned this off.)
2014-08-18 18:08:28 +02:00
Stef Walter b39a2770ba core: Rename Job.subscribed field to Job.clients
This reflects how this field will be used, to not only track where
to send signals, but also which callers (other than root) are allowed
to call DBus methods on the Job.
2014-08-15 14:07:07 +02:00
Lennart Poettering 4f4f70361a core: no need to pass bus object to selinux access check calls anymore 2014-05-14 22:58:43 +02:00
Lennart Poettering fe7f06f142 dbus: suppress duplicate and misleading messages
When we try to send a signal on a connection we didn't hae the time to
process the Disconnected message yet, don't generate multiple warning
messages, but only a single debug message.

https://bugs.freedesktop.org/show_bug.cgi?id=75874
2014-03-11 04:19:59 +01:00
Lennart Poettering 8f8f05a919 bus: add sd_bus_track object for tracking peers, and port core over to it
This is primarily useful for services that need to track clients which
reference certain objects they maintain, or which explicitly want to
subscribe to certain events. Something like this is done in a large
number of services, and not trivial to do. Hence, let's unify this at
one place.

This also ports over PID 1 to use this to ensure that subscriptions to
job and manager events are correctly tracked. As a side-effect this
makes sure we properly serialize and restore the track list across
daemon reexec/reload, which didn't work correctly before.

This also simplifies how we distribute messages to broadcast to the
direct busses: we only track subscriptions for the API bus and
implicitly assume that all direct busses are subscribed. This should be
a pretty OK simplification since clients connected via direct bus
connections are shortlived anyway.
2014-03-03 02:34:13 +01:00
Lennart Poettering 151b9b9662 api: in constructor function calls, always put the returned object pointer first (or second)
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:

1. The object the new object is derived from is put first, if there is any

2. The object we are creating will be returned in the next arguments

3. This is followed by any additional arguments

Rationale:

For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.

Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.

Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
2014-02-20 00:03:10 +01:00
Lennart Poettering 6d2357247b core: fix property changes in transient units 2014-02-17 15:49:21 +01:00
Lennart Poettering 9ceefc810f core: fix oom check 2014-01-31 17:48:36 +01:00
Zbigniew Jędrzejewski-Szmek f78e6385dc Use enums to make it obvious what boolean params mean
Suggested-by: Russ Allbery <rra@debian.org>
2013-12-26 15:49:54 -05:00
Lennart Poettering aec8de63b1 core: no need to list properties for PropertiesChanged messages anymore
Since the vtable includes this information anyway, let's just use that
2013-12-22 03:50:52 +01:00
Lennart Poettering 556089dc57 bus: decorate the various object vtables with SD_BUS_VTABLE_PROPERTY_CONST where appropriate 2013-12-22 03:50:52 +01:00
David Herrmann 559a4d4cda core: fix Unit.SetProperties argument parsing
SetProperties has signature "ba(sv)", but the bus_unit_set_properties()
helper already does a enter_container('a', "sv") so we have to skip it in
bus_unit_method_set_properties().
2013-12-10 23:28:07 +00:00
Lennart Poettering 613b411c94 service: add the ability for units to join other unit's PrivateNetwork= and PrivateTmp= namespaces 2013-11-27 20:28:48 +01:00
Lennart Poettering d420282b28 core: replace OnFailureIsolate= setting by a more generic OnFailureJobMode= setting and make use of it where applicable 2013-11-26 02:26:31 +01:00
Lennart Poettering 2c7e050f61 core: fix bus serialization of conditions 2013-11-25 17:40:54 +01:00
Lennart Poettering 39abcaee3e core: don't warn loudly if we cannot send a bus signal to a disconnected client 2013-11-22 20:19:27 +01:00
Lennart Poettering 6ce270b10a core: fix deserialization of StartTransientUnit() parameters 2013-11-21 21:12:36 +01:00
Lennart Poettering ebcf1f97de bus: rework message handlers to always take an error argument
Message handler callbacks can be simplified drastically if the
dispatcher automatically replies to method calls if errors are returned.

Thus: add an sd_bus_error argument to all message handlers. When we
dispatch a message handler and it returns negative or a set sd_bus_error
we send this as message error back to the client. This means errors
returned by handlers by default are given back to clients instead of
rippling all the way up to the event loop, which is desirable to make
things robust.

As a side-effect we can now easily turn the SELinux checks into normal
function calls, since the method call dispatcher will generate the right
error replies automatically now.

Also, make sure we always pass the error structure to all property and
method handlers as last argument to follow the usual style of passing
variables for return values as last argument.
2013-11-21 21:12:36 +01:00
Lennart Poettering df2d202e6e bus: let's simplify things by getting rid of unnecessary bus parameters 2013-11-21 02:07:35 +01:00
Lennart Poettering 718db96199 core: convert PID 1 to libsystemd-bus
This patch converts PID 1 to libsystemd-bus and thus drops the
dependency on libdbus. The only remaining code using libdbus is a test
case that validates our bus marshalling against libdbus' marshalling,
and this dependency can be turned off.

This patch also adds a couple of things to libsystem-bus, that are
necessary to make the port work:

- Synthesizing of "Disconnected" messages when bus connections are
  severed.

- Support for attaching multiple vtables for the same interface on the
  same path.

This patch also fixes the SetDefaultTarget() and GetDefaultTarget() bus
calls which used an inappropriate signature.

As a side effect we will now generate PropertiesChanged messages which
carry property contents, rather than just invalidation information.
2013-11-20 20:52:36 +01:00
Kay Sievers ca2871d9b0 bus: remove static introspection file export 2013-10-21 00:41:26 +02:00
Lennart Poettering 71fda00f32 list: make our list macros a bit easier to use by not requring type spec on each invocation
We can determine the list entry type via the typeof() gcc construct, and
so we should to make the macros much shorter to use.
2013-10-14 06:11:19 +02:00
Gao feng a255a7f144 cgroup: only check once when mode is UNIT_CHECK
If the mode is UNIT_CHECK,it means we only want to check if
the paramaters are valid. the first round of cycle already
did this check, no need to check again.
2013-08-28 08:04:56 -04:00