Commit graph

50 commits

Author SHA1 Message Date
Lennart Poettering 03a7b521e3 core: add support for the "pids" cgroup controller
This adds support for the new "pids" cgroup controller of 4.3 kernels.
It allows accounting the number of tasks in a cgroup and enforcing
limits on it.

This adds two new setting TasksAccounting= and TasksMax= to each unit,
as well as a gloabl option DefaultTasksAccounting=.

This also updated "cgtop" to optionally make use of the new
kernel-provided accounting.

systemctl has been updated to show the number of tasks for each service
if it is available.

This patch also adds correct support for undoing memory limits for units
using a MemoryLimit=infinity syntax. We do the same for TasksMax= now
and hence keep things in sync here.
2015-09-10 18:41:06 +02:00
Lennart Poettering f757855e81 nspawn: add new .nspawn files for container settings
.nspawn fiels are simple settings files that may accompany container
images and directories and contain settings otherwise passed on the
nspawn command line. This provides an efficient way to attach execution
data directly to containers.
2015-09-06 01:49:06 +02:00
Lennart Poettering 023a4f6701 core: optionally create LOGIN_PROCESS or USER_PROCESS utmp entries
When generating utmp/wtmp entries, optionally add both LOGIN_PROCESS and
INIT_PROCESS entries or even all three of LOGIN_PROCESS, INIT_PROCESS
and USER_PROCESS entries, instead of just a single INIT_PROCESS entry.

With this change systemd may be used to not only invoke a getty directly
in a SysV-compliant way but alternatively also a login(1) implementation
or even forego getty and login entirely, and invoke arbitrary shells in
a way that they appear in who(1) or w(1).

This is preparation for a later commit that adds a "machinectl shell"
operation to invoke a shell in a container, in a way that is compatible
with who(1) and w(1).
2015-08-24 22:46:45 +02:00
Lennart Poettering b02cb41c78 conf-parse: don't accept invalid bus names as BusName= arguments in service units 2015-01-07 23:44:08 +01:00
Zbigniew Jędrzejewski-Szmek 9e37c9544b core: warn and ignore SysVStartPriority=
Option was being parsed but not used for anything.
2014-11-30 19:10:40 -05:00
Zbigniew Jędrzejewski-Szmek a2c0e528b8 When warning about unsupported options, be more detailed 2014-11-30 18:49:08 -05:00
WaLyong Cho 2ca620c4ed smack: introduce new SmackProcessLabel option
In service file, if the file has some of special SMACK label in
ExecStart= and systemd has no permission for the special SMACK label
then permission error will occurred. To resolve this, systemd should
be able to set its SMACK label to something accessible of ExecStart=.
So introduce new SmackProcessLabel. If label is specified with
SmackProcessLabel= then the child systemd will set its label to
that. To successfully execute the ExecStart=, accessible label should
be specified with SmackProcessLabel=.
Additionally, by SMACK policy, if the file in ExecStart= has no
SMACK64EXEC then the executed process will have given label by
SmackProcessLabel=. But if the file has SMACK64EXEC then the
SMACK64EXEC label will be overridden.

[zj: reword man page]
2014-11-24 10:20:53 -05:00
Daniel Mack 5019962312 bus: parse BusPolicy directive in service files
Add a new directive called BusPolicy to define custom endpoint policies. If
one such directive is given, an endpoint object in the service's ExecContext is
created and the given policy is added to it.
2014-09-08 14:12:54 +02:00
Lennart Poettering a4152e3fe2 kdbus: when uploading bus name policy, resolve users/groups out-of-process
It's not safe invoking NSS from PID 1, hence fork off worker processes
that upload the policy into the kernel for busnames.
2014-06-05 13:09:46 +02:00
Lennart Poettering 1b8689f949 core: rename ReadOnlySystem= to ProtectSystem= and add a third value for also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.

With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
2014-06-04 18:12:55 +02:00
Lennart Poettering 811ba7a0e2 socket: add new Symlinks= option for socket units
With Symlinks= we can manage one or more symlinks to AF_UNIX or FIFO
nodes in the file system, with the same lifecycle as the socket itself.

This has two benefits: first, this allows us to remove /dev/log and
/dev/initctl from /dev, thus leaving only symlinks, device nodes and
directories in the /dev tree. More importantly however, this allows us
to move /dev/log out of /dev, while still making it accessible there, so
that PrivateDevices= can provide /dev/log too.
2014-06-04 16:21:17 +02:00
Lennart Poettering 417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Lennart Poettering db785129c9 cgroup: rework startup logic
Introduce a (unsigned long) -1 as "unset" state for cpu shares/block io
weights, and keep the startup unit set around all the time.
2014-05-22 07:13:56 +09:00
WaLyong Cho 95ae05c0e7 core: add startup resource control option
Similar to CPUShares= and BlockIOWeight= respectively. However only
assign the specified weight during startup. Each control group
attribute is re-assigned as weight by CPUShares=weight and
BlockIOWeight=weight after startup.  If not CPUShares= or
BlockIOWeight= be specified, then the attribute is re-assigned to each
default attribute value. (default cpu.shares=1024, blkio.weight=1000)
If only CPUShares=weight or BlockIOWeight=weight be specified, then
that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.
2014-05-22 07:13:56 +09:00
Nis Martensen f1721625e7 fix spelling of privilege 2014-05-19 00:40:44 +09:00
Lennart Poettering b2f8b02ec2 core: expose CFS CPU time quota as high-level unit properties 2014-04-25 13:27:25 +02:00
Michael Olbrich bf50056632 service: rename StartLimitAction enum to FailureAction
It's used for the FailureAction property as well.
2014-04-24 20:11:20 +02:00
Daniel Mack 54d76c9286 busname: add parser for bus name policies
There are three directives to specify bus name polices in .busname
files:

 * AllowUser [username] [access]
 * AllowGroup [groupname] [access]
 * AllowWorld [access]

Where [access] is one of

 * 'see': The user/group/world is allowed to see a name on the bus
 * 'talk': The user/group/world is allowed to talk to a name
 * 'own': The user/group/world is allowed to own a name

There is no user added yet in this commit.
2014-03-07 19:14:05 +01:00
Lennart Poettering 760b9d7cba core: don't override NoNewPriviliges= from SystemCallFilter= if it is already explicitly set 2014-03-05 04:41:01 +01:00
Lennart Poettering 94828d2ddc conf-parser: config_parse_path_strv() is not generic, so let's move it into load-fragment.c
The parse code actually checked for specific lvalue names, which is
really wrong for supposedly generic parsers...
2014-03-03 21:40:55 +01:00
Lennart Poettering 3af00fb85a core: move config_parse_set_status() into load-fragment.c
Let's keep specific config parsers close to where they are needed. Only
the really generic ones should be defined in conf-parser.[ch].
2014-03-03 21:26:53 +01:00
Lennart Poettering e66cf1a3f9 core: introduce new RuntimeDirectory= and RuntimeDirectoryMode= unit settings
As discussed on the ML these are useful to manage runtime directories
below /run for services.
2014-03-03 17:55:32 +01:00
Lennart Poettering 4298d0b512 core: add new RestrictAddressFamilies= switch
This new unit settings allows restricting which address families are
available to processes. This is an effective way to minimize the attack
surface of services, by turning off entire network stacks for them.

This is based on seccomp, and does not work on x86-32, since seccomp
cannot filter socketcall() syscalls on that platform.
2014-02-26 02:19:28 +01:00
Michael Scherer eef65bf3ee core: Add AppArmor profile switching
This permit to switch to a specific apparmor profile when starting a daemon. This
will result in a non operation if apparmor is disabled.
It also add a new build requirement on libapparmor for using this feature.
2014-02-21 03:44:20 +01:00
Lennart Poettering ac45f971a1 core: add Personality= option for units to set the personality for spawned processes 2014-02-19 03:27:03 +01:00
Lennart Poettering 5f8640fb62 core: store and expose SELinuxContext field normalized as bool + string 2014-02-17 16:52:52 +01:00
Lennart Poettering 57183d117a core: add SystemCallArchitectures= unit setting to allow disabling of non-native
architecture support for system calls

Also, turn system call filter bus properties into complex types instead
of concatenated strings.
2014-02-13 00:24:00 +01:00
Lennart Poettering 17df7223be core: rework syscall filter
- Allow configuration of an errno error to return from blacklisted
  syscalls, instead of immediately terminating a process.

- Fix parsing logic when libseccomp support is turned off

- Only keep the actual syscall set in the ExecContext, and generate the
  string version only on demand.
2014-02-12 18:30:36 +01:00
Lennart Poettering e821075a23 bus: add .busname unit type to implement kdbus-style bus activation 2013-12-02 23:32:34 +01:00
Lennart Poettering d420282b28 core: replace OnFailureIsolate= setting by a more generic OnFailureJobMode= setting and make use of it where applicable 2013-11-26 02:26:31 +01:00
Tom Gundersen 71a6151083 conf-parser: distinguish between multiple sections with the same name
Pass on the line on which a section was decleared to the parsers, so they
can distinguish between multiple sections (if they chose to). Currently
no parsers take advantage of this, but a follow-up patch will do that
to distinguish

[Address]
Address=192.168.0.1/24
Label=one

[Address]
Address=192.168.0.2/24
Label=two

from

[Address]
Address=192.168.0.1/24
Label=one
Address=192.168.0.2/24
Label=two
2013-11-25 19:35:44 +01:00
Tom Gundersen accdd018ed mount/service: drop FsckPassNo support
We now treat passno as boleans in the generators, and don't need this any more. fsck itself
is able to sequentialize checks on the same local media, so in the common case the ordering
is redundant.

It is still possible to force an order by using .d fragments, in case that is desired.
2013-10-19 12:23:17 +02:00
Lennart Poettering 8e7076caae cgroup: split out per-device BlockIOWeight= setting into BlockIODeviceWeight=
This way we can nicely map the configuration directive to properties and
back, without requiring two different signatures for the same property.
2013-07-11 20:40:18 +02:00
Lennart Poettering 4ad490007b core: general cgroup rework
Replace the very generic cgroup hookup with a much simpler one. With
this change only the high-level cgroup settings remain, the ability to
set arbitrary cgroup attributes is removed, so is support for adding
units to arbitrary cgroup controllers or setting arbitrary paths for
them (especially paths that are different for the various controllers).

This also introduces a new -.slice root slice, that is the parent of
system.slice and friends. This enables easy admin configuration of
root-level cgrouo properties.

This replaces DeviceDeny= by DevicePolicy=, and implicitly adds in
/dev/null, /dev/zero and friends if DeviceAllow= is used (unless this is
turned off by DevicePolicy=).
2013-06-27 04:17:34 +02:00
Lennart Poettering a016b9228f core: add new .slice unit type for partitioning systems
In order to prepare for the kernel cgroup rework, let's introduce a new
unit type to systemd, the "slice". Slices can be arranged in a tree and
are useful to partition resources freely and hierarchally by the user.

Each service unit can now be assigned to one of these slices, and later
on login users and machines may too.

Slices translate pretty directly to the cgroup hierarchy, and the
various objects can be assigned to any of the slices in the tree.
2013-06-17 21:36:51 +02:00
Lennart Poettering 3ecaa09bcc unit: rework trigger dependency logic
Instead of having explicit type-specific callbacks that inform the
triggering unit when a triggered unit changes state, make this generic
so that state changes are forwarded betwee any triggered and triggering
unit.

Also, get rid of UnitRef references from automount, timer, path units,
to the units they trigger and rely exclsuively on UNIT_TRIGGER type
dendencies.
2013-04-23 16:00:32 -03:00
Zbigniew Jędrzejewski-Szmek e8e581bf25 Report about syntax errors with metadata
The information about the unit for which files are being parsed
is passed all the way down. This way messages land in the journal
with proper UNIT=... or USER_UNIT=... attribution.

'systemctl status' and 'journalctl -u' not displaying those messages
has been a source of confusion for users, since the journal entry for
a misspelt setting was often logged quite a bit earlier than the
failure to start a unit.

Based-on-a-patch-by: Oleksii Shevchuk <alxchk@gmail.com>
2013-04-17 00:09:16 -04:00
Lennart Poettering 26d04f86a3 unit: rework resource management API
This introduces a new static list of known attributes and their special
semantics. This means that cgroup attribute values can now be
automatically translated from user to kernel notation for command line
set settings, too.

This also adds proper support for multi-line attributes.
2013-02-27 18:50:41 +01:00
Lennart Poettering 853b8397ac core: properly validate environment data from Environment= lines in unit files 2013-02-11 23:54:30 +01:00
Shawn Landden c2f1db8f83 use #pragma once instead of foo*foo #define guards
#pragma once has been "un-deprecated" in gcc since 3.3, and is widely supported
in other compilers.

I've been using and maintaining (rebasing) this patch for a while now, as
it annoyed me to see #ifndef fooblahfoo, etc all over the place,
almost arrogant about the annoyance of having to define all these names to
perform a commen but neccicary functionality, when a completely superior
alternative exists.

I havn't sent it till now, cause its kindof a style change, and it is bad
voodoo to mess with style that has been established by more established
editors. So feel free to lambast me as a crazy bafoon.

v2 - preserve externally used headers
2012-07-19 12:30:59 +02:00
Lennart Poettering 8351ceaea9 execute: support syscall filtering using seccomp filters 2012-07-17 04:17:53 +02:00
Lennart Poettering 8ff290af3b unit: drop the Names= option
Names= is a source of errors, simply because alias names specified like
this only become relevant after a unit has been loaded but cannot be
used to load a unit.

Let's get rid of the confusion and drop this field. To establish alias
names peope should use symlinks, which have the the benefit of being
useful as key to load a unit, even though they are not taken into
account if unit names are listed but they haven't been explicitly
referenced before.
2012-06-22 16:24:57 +02:00
Lukas Nykryn 98709151f3 service: timeout for oneshot services
Add possibility to specify timeout for oneshot services.

[ https://bugzilla.redhat.com/show_bug.cgi?id=761656
  Added minor fixups. -- michich ]
2012-06-15 16:04:06 +02:00
Lennart Poettering 213ba152fd journal: allow setting of a cutoff log level for disk storage, syslog, kmsg, console forwarding 2012-06-01 17:27:16 +02:00
Lennart Poettering d88a251b12 util: introduce a proper nsec_t and make use of it where appropriate 2012-05-31 04:27:03 +02:00
Lennart Poettering ec8927ca59 main: add configuration option to alter capability bounding set for PID 1
This also ensures that caps dropped from the bounding set are also
dropped from the inheritable set, to be extra-secure. Usually that should
change very little though as the inheritable set is empty for all our uses
anyway.
2012-05-24 04:00:56 +02:00
Lennart Poettering 49dbfa7b2b units: introduce new Documentation= field and make use of it everywhere
This should help making the boot process a bit easier to explore and
understand for the administrator. The simple idea is that "systemctl
status" now shows a link to documentation alongside the other status and
decriptionary information of a service.

This patch adds the necessary fields to all our shipped units if we have
proper documentation for them.
2012-05-21 15:14:51 +02:00
Lennart Poettering 7c8fa05c4d unit: add new dependency type RequiresMountsFor=
RequiresMountsFor= is a shortcut for adding requires and after
dependencies to all mount units neeed for the specified paths.

This solves a couple of issues regarding dep loop cycles for encrypted
swap.
2012-04-30 10:52:07 +02:00
Lennart Poettering 5430f7f2bc relicense to LGPLv2.1 (with exceptions)
We finally got the OK from all contributors with non-trivial commits to
relicense systemd from GPL2+ to LGPL2.1+.

Some udev bits continue to be GPL2+ for now, but we are looking into
relicensing them too, to allow free copy/paste of all code within
systemd.

The bits that used to be MIT continue to be MIT.

The big benefit of the relicensing is that closed source code may now
link against libsystemd-login.so and friends.
2012-04-12 00:24:39 +02:00
Kay Sievers b30e2f4c18 move libsystemd_core.la sources into core/ 2012-04-11 16:03:51 +02:00
Renamed from src/load-fragment.h (Browse further)