Systemd/src/basic/special.h
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00

121 lines
4.8 KiB
C

/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
#pragma once
/***
This file is part of systemd.
Copyright 2010 Lennart Poettering
systemd is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or
(at your option) any later version.
systemd is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with systemd; If not, see <http://www.gnu.org/licenses/>.
***/
#define SPECIAL_DEFAULT_TARGET "default.target"
/* Shutdown targets */
#define SPECIAL_UMOUNT_TARGET "umount.target"
/* This is not really intended to be started by directly. This is
* mostly so that other targets (reboot/halt/poweroff) can depend on
* it to bring all services down that want to be brought down on
* system shutdown. */
#define SPECIAL_SHUTDOWN_TARGET "shutdown.target"
#define SPECIAL_HALT_TARGET "halt.target"
#define SPECIAL_POWEROFF_TARGET "poweroff.target"
#define SPECIAL_REBOOT_TARGET "reboot.target"
#define SPECIAL_KEXEC_TARGET "kexec.target"
#define SPECIAL_EXIT_TARGET "exit.target"
#define SPECIAL_SUSPEND_TARGET "suspend.target"
#define SPECIAL_HIBERNATE_TARGET "hibernate.target"
#define SPECIAL_HYBRID_SLEEP_TARGET "hybrid-sleep.target"
/* Special boot targets */
#define SPECIAL_RESCUE_TARGET "rescue.target"
#define SPECIAL_EMERGENCY_TARGET "emergency.target"
#define SPECIAL_MULTI_USER_TARGET "multi-user.target"
#define SPECIAL_GRAPHICAL_TARGET "graphical.target"
/* Early boot targets */
#define SPECIAL_SYSINIT_TARGET "sysinit.target"
#define SPECIAL_SOCKETS_TARGET "sockets.target"
#define SPECIAL_BUSNAMES_TARGET "busnames.target"
#define SPECIAL_TIMERS_TARGET "timers.target"
#define SPECIAL_PATHS_TARGET "paths.target"
#define SPECIAL_LOCAL_FS_TARGET "local-fs.target"
#define SPECIAL_LOCAL_FS_PRE_TARGET "local-fs-pre.target"
#define SPECIAL_INITRD_FS_TARGET "initrd-fs.target"
#define SPECIAL_INITRD_ROOT_FS_TARGET "initrd-root-fs.target"
#define SPECIAL_REMOTE_FS_TARGET "remote-fs.target" /* LSB's $remote_fs */
#define SPECIAL_REMOTE_FS_PRE_TARGET "remote-fs-pre.target"
#define SPECIAL_SWAP_TARGET "swap.target"
#define SPECIAL_NETWORK_ONLINE_TARGET "network-online.target"
#define SPECIAL_TIME_SYNC_TARGET "time-sync.target" /* LSB's $time */
#define SPECIAL_BASIC_TARGET "basic.target"
/* LSB compatibility */
#define SPECIAL_NETWORK_TARGET "network.target" /* LSB's $network */
#define SPECIAL_NSS_LOOKUP_TARGET "nss-lookup.target" /* LSB's $named */
#define SPECIAL_RPCBIND_TARGET "rpcbind.target" /* LSB's $portmap */
/*
* Rules regarding adding further high level targets like the above:
*
* - Be conservative, only add more of these when we really need
* them. We need strong usecases for further additions.
*
* - When there can be multiple implementations running side-by-side,
* it needs to be a .target unit which can pull in all
* implementations.
*
* - If something can be implemented with socket activation, and
* without, it needs to be a .target unit, so that it can pull in
* the appropriate unit.
*
* - Otherwise, it should be a .service unit.
*
* - In some cases it is OK to have both a .service and a .target
* unit, i.e. if there can be multiple parallel implementations, but
* only one is the "system" one. Example: syslog.
*
* Or to put this in other words: .service symlinks can be used to
* arbitrate between multiple implementations if there can be only one
* of a kind. .target units can be used to support multiple
* implementations that can run side-by-side.
*/
/* Magic early boot services */
#define SPECIAL_FSCK_SERVICE "systemd-fsck@.service"
#define SPECIAL_QUOTACHECK_SERVICE "systemd-quotacheck.service"
#define SPECIAL_QUOTAON_SERVICE "quotaon.service"
#define SPECIAL_REMOUNT_FS_SERVICE "systemd-remount-fs.service"
/* Services systemd relies on */
#define SPECIAL_DBUS_SERVICE "dbus.service"
#define SPECIAL_DBUS_SOCKET "dbus.socket"
#define SPECIAL_JOURNALD_SOCKET "systemd-journald.socket"
#define SPECIAL_JOURNALD_SERVICE "systemd-journald.service"
/* Magic init signals */
#define SPECIAL_KBREQUEST_TARGET "kbrequest.target"
#define SPECIAL_SIGPWR_TARGET "sigpwr.target"
#define SPECIAL_CTRL_ALT_DEL_TARGET "ctrl-alt-del.target"
/* Where we add all our system units, users and machines by default */
#define SPECIAL_SYSTEM_SLICE "system.slice"
#define SPECIAL_USER_SLICE "user.slice"
#define SPECIAL_MACHINE_SLICE "machine.slice"
#define SPECIAL_ROOT_SLICE "-.slice"
/* The scope unit systemd itself lives in. */
#define SPECIAL_INIT_SCOPE "init.scope"