Commit graph

87 commits

Author SHA1 Message Date
Lennart Poettering d54c499369 install: introduce new DefaultInstance= field for [Install] sections
The DefaultInstance= name is used when enabling template units when only
specifying the template name, but no instance.

Add DefaultInstance=tty1 to getty@.service, so that when the template
itself is enabled an instance for tty1 is created.

This is useful so that we "systemctl preset-all" can work properly,
because we can operate on getty@.service after finding it, and the right
instance is created.
2014-06-17 02:43:43 +02:00
Lennart Poettering 2dbd4a9454 mount: add new SloppyOptions= setting for mount units, mapping to mount(8)'s "-s" switch 2014-06-16 01:02:27 +02:00
Lennart Poettering a55654d598 core: add new ConditionNeedsUpdate= unit condition
This new condition allows checking whether /etc or /var are out-of-date
relative to /usr. This is the counterpart for the update flag managed by
systemd-update-done.service. Services that want to be started once after
/usr got updated should use:

        [Unit]
        ConditionNeedsUpdate=/etc
        Before=systemd-update-done.service

This makes sure that they are only run if /etc is out-of-date relative
to /usr. And that it will be executed after systemd-update-done.service
which is responsible for marking /etc up-to-date relative to the current
/usr.

ConditionNeedsUpdate= will also checks whether /etc is actually
writable, and not trigger if it isn't, since no update is possible then.
2014-06-13 13:26:32 +02:00
Lennart Poettering a4152e3fe2 kdbus: when uploading bus name policy, resolve users/groups out-of-process
It's not safe invoking NSS from PID 1, hence fork off worker processes
that upload the policy into the kernel for busnames.
2014-06-05 13:09:46 +02:00
Lennart Poettering 3900e5fdff socket: add SocketUser= and SocketGroup= for chown()ing sockets in the file system
This is relatively complex, as we cannot invoke NSS from PID 1, and thus
need to fork a helper process temporarily.
2014-06-05 09:55:53 +02:00
Lennart Poettering a8330cd118 core: make sure we properly parse ProtectHome= and ProtectSystem= 2014-06-04 23:04:34 +02:00
Lennart Poettering 1b8689f949 core: rename ReadOnlySystem= to ProtectSystem= and add a third value for also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.

With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
2014-06-04 18:12:55 +02:00
Lennart Poettering 811ba7a0e2 socket: add new Symlinks= option for socket units
With Symlinks= we can manage one or more symlinks to AF_UNIX or FIFO
nodes in the file system, with the same lifecycle as the socket itself.

This has two benefits: first, this allows us to remove /dev/log and
/dev/initctl from /dev, thus leaving only symlinks, device nodes and
directories in the /dev tree. More importantly however, this allows us
to move /dev/log out of /dev, while still making it accessible there, so
that PrivateDevices= can provide /dev/log too.
2014-06-04 16:21:17 +02:00
Lennart Poettering bd1fe7c79d socket: optionally remove sockets/FIFOs in the file system after use 2014-06-04 13:12:34 +02:00
Lennart Poettering 417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Lennart Poettering 9a05490933 cgroups: simplify CPUQuota= logic
Only accept cpu quota values in percentages, get rid of period
definition.

It's not clear whether the CFS period controllable per-cgroup even has a
future in the kernel, hence let's simplify all this, hardcode the period
to 100ms and only accept percentage based quota values.
2014-05-22 11:53:12 +09:00
Lennart Poettering db785129c9 cgroup: rework startup logic
Introduce a (unsigned long) -1 as "unset" state for cpu shares/block io
weights, and keep the startup unit set around all the time.
2014-05-22 07:13:56 +09:00
WaLyong Cho 95ae05c0e7 core: add startup resource control option
Similar to CPUShares= and BlockIOWeight= respectively. However only
assign the specified weight during startup. Each control group
attribute is re-assigned as weight by CPUShares=weight and
BlockIOWeight=weight after startup.  If not CPUShares= or
BlockIOWeight= be specified, then the attribute is re-assigned to each
default attribute value. (default cpu.shares=1024, blkio.weight=1000)
If only CPUShares=weight or BlockIOWeight=weight be specified, then
that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.
2014-05-22 07:13:56 +09:00
Nis Martensen f1721625e7 fix spelling of privilege 2014-05-19 00:40:44 +09:00
Lennart Poettering b2f8b02ec2 core: expose CFS CPU time quota as high-level unit properties 2014-04-25 13:27:25 +02:00
Michael Olbrich bf50056632 service: rename StartLimitAction enum to FailureAction
It's used for the FailureAction property as well.
2014-04-24 20:11:20 +02:00
Michael Olbrich 93ae25e6fd service: add FailureAction= option
It has the same possible values as StartLimitAction= and is executed
immediately if a service fails.
2014-04-24 20:11:20 +02:00
Michael Olbrich efe6e7d33a service: add support for reboot argument when triggered by StartLimitAction=
When rebooting with systemctl, an optional argument can be passed to the
reboot system call. This makes it possible the specify the argument in a
service file and use it when the service triggers a restart.
This is useful to distinguish between manual reboots and reboots caused by
failing services.
2014-04-21 09:58:53 -04:00
Lennart Poettering 7f8aa67131 core: remove tcpwrap support
tcpwrap is legacy code, that is barely maintained upstream. It's APIs
are awful, and the feature set it exposes (such as DNS and IDENT
access control) questionnable. We should not support this natively in
systemd.

Hence, let's remove the code. If people want to continue making use of
this, they can do so by plugging in "tcpd" for the processes they start.
With that scheme things are as well or badly supported as they were from
traditional inetd, hence no functionality is really lost.
2014-03-24 20:07:42 +01:00
Lennart Poettering dedabea4b3 timer: support timers that can resume the system from suspend 2014-03-24 16:24:07 +01:00
Lennart Poettering 06642d1795 timer: add timer persistance (aka anacron-like behaviour) 2014-03-21 03:43:46 +01:00
Daniel Mack 5892a914d1 busname: introduce Activating directive
Add a new config 'Activating' directive which denotes whether a busname
is actually registered on the bus. It defaults to 'yes'.

If set to 'no', the .busname unit only uploads policy, which will remain
active as long as the unit is running.
2014-03-19 02:25:36 +01:00
Lennart Poettering 3f9da41645 core: add new AcceptFD= setting to .busname units
AcceptFD= defaults to true, thus making sure that by default fd passing
is enabled for all activatable names. Since for normal bus connections
fd passing is enabled too by default this makes sure fd passing works
correctly regardless whether a service is already activated or not.

Making this configurable on both busname units and in bus connections is
messy, but unavoidable since busnames are established and may queue
messages before the connection feature negotiation is done by the
service eventually activated. Conversely, feature negotiation on bus
connections takes place before the connection acquires its names.

Of course, this means developers really should make sure to keep the
settings in .busname units in sync with what they later intend to
negotiate.
2014-03-18 20:54:32 +01:00
Daniel Mack 54d76c9286 busname: add parser for bus name policies
There are three directives to specify bus name polices in .busname
files:

 * AllowUser [username] [access]
 * AllowGroup [groupname] [access]
 * AllowWorld [access]

Where [access] is one of

 * 'see': The user/group/world is allowed to see a name on the bus
 * 'talk': The user/group/world is allowed to talk to a name
 * 'own': The user/group/world is allowed to own a name

There is no user added yet in this commit.
2014-03-07 19:14:05 +01:00
Lennart Poettering 760b9d7cba core: don't override NoNewPriviliges= from SystemCallFilter= if it is already explicitly set 2014-03-05 04:41:01 +01:00
Lennart Poettering 94828d2ddc conf-parser: config_parse_path_strv() is not generic, so let's move it into load-fragment.c
The parse code actually checked for specific lvalue names, which is
really wrong for supposedly generic parsers...
2014-03-03 21:40:55 +01:00
Lennart Poettering ca37242e52 conf-parse: rename config_parse_level() to config_parse_log_level()
"level" is a bit too generic, let's clarify what kind of level we are
referring to here.
2014-03-03 21:14:07 +01:00
Lennart Poettering e66cf1a3f9 core: introduce new RuntimeDirectory= and RuntimeDirectoryMode= unit settings
As discussed on the ML these are useful to manage runtime directories
below /run for services.
2014-03-03 17:55:32 +01:00
Lennart Poettering 4298d0b512 core: add new RestrictAddressFamilies= switch
This new unit settings allows restricting which address families are
available to processes. This is an effective way to minimize the attack
surface of services, by turning off entire network stacks for them.

This is based on seccomp, and does not work on x86-32, since seccomp
cannot filter socketcall() syscalls on that platform.
2014-02-26 02:19:28 +01:00
Lennart Poettering 5556b5fe41 core: clean up some confusing regarding SI decimal and IEC binary suffixes for sizes
According to Wikipedia it is customary to specify hardware metrics and
transfer speeds to the basis 1000 (SI decimal), while software metrics
and physical volatile memory (RAM) sizes to the basis 1024 (IEC binary).
So far we specified everything in IEC, let's fix that and be more
true to what's otherwise customary. Since we don't want to parse "Mi"
instead of "M" we document each time what the context used is.
2014-02-23 03:19:04 +01:00
Michael Scherer eef65bf3ee core: Add AppArmor profile switching
This permit to switch to a specific apparmor profile when starting a daemon. This
will result in a non operation if apparmor is disabled.
It also add a new build requirement on libapparmor for using this feature.
2014-02-21 03:44:20 +01:00
Lennart Poettering 099524d7b0 core: add new ConditionArchitecture() that checks the architecture returned by uname()'s machine field. 2014-02-21 02:43:14 +01:00
Lennart Poettering ac45f971a1 core: add Personality= option for units to set the personality for spawned processes 2014-02-19 03:27:03 +01:00
Jasper St. Pierre acfbbf5c56 Fix gperf syntax
If we put a closing bracket on its own line, gperf will complain about
empty lines. Only occurs if the option in question is disabled. So fix the
m4 macros to work properly in both cases.
2014-02-17 22:07:02 +01:00
Lennart Poettering 6a6751fe24 core: warn when unit files with unsupported options are parsed 2014-02-17 17:49:09 +01:00
Lennart Poettering 5f8640fb62 core: store and expose SELinuxContext field normalized as bool + string 2014-02-17 16:52:52 +01:00
Lennart Poettering d3b1c50833 core: add a system-wide SystemCallArchitectures= setting
This is useful to prohibit execution of non-native processes on systems,
for example 32bit binaries on 64bit systems, this lowering the attack
service on incorrect syscall and ioctl 32→64bit mappings.
2014-02-13 01:40:50 +01:00
Lennart Poettering 57183d117a core: add SystemCallArchitectures= unit setting to allow disabling of non-native
architecture support for system calls

Also, turn system call filter bus properties into complex types instead
of concatenated strings.
2014-02-13 00:24:00 +01:00
Lennart Poettering 17df7223be core: rework syscall filter
- Allow configuration of an errno error to return from blacklisted
  syscalls, instead of immediately terminating a process.

- Fix parsing logic when libseccomp support is turned off

- Only keep the actual syscall set in the ExecContext, and generate the
  string version only on demand.
2014-02-12 18:30:36 +01:00
Michael Scherer 7b52a628f8 exec: Add SELinuxContext configuration item
This permit to let system administrators decide of the domain of a service.
This can be used with templated units to have each service in a différent
domain ( for example, a per customer database, using MLS or anything ),
or can be used to force a non selinux enabled system (jvm, erlang, etc)
to start in a different domain for each service.
2014-02-10 13:18:16 +01:00
Lennart Poettering 7f112f50fe exec: introduce PrivateDevices= switch to provide services with a private /dev
Similar to PrivateNetwork=, PrivateTmp= introduce PrivateDevices= that
sets up a private /dev with only the API pseudo-devices like /dev/null,
/dev/zero, /dev/random, but not any physical devices in them.
2014-01-20 21:28:37 +01:00
Lennart Poettering e821075a23 bus: add .busname unit type to implement kdbus-style bus activation 2013-12-02 23:32:34 +01:00
Lennart Poettering 613b411c94 service: add the ability for units to join other unit's PrivateNetwork= and PrivateTmp= namespaces 2013-11-27 20:28:48 +01:00
Lennart Poettering d420282b28 core: replace OnFailureIsolate= setting by a more generic OnFailureJobMode= setting and make use of it where applicable 2013-11-26 02:26:31 +01:00
Lennart Poettering 9f5eb56a13 timer: make timer accuracy configurable
And make it default to 1min
2013-11-21 22:08:20 +01:00
Lennart Poettering 718db96199 core: convert PID 1 to libsystemd-bus
This patch converts PID 1 to libsystemd-bus and thus drops the
dependency on libdbus. The only remaining code using libdbus is a test
case that validates our bus marshalling against libdbus' marshalling,
and this dependency can be turned off.

This patch also adds a couple of things to libsystem-bus, that are
necessary to make the port work:

- Synthesizing of "Disconnected" messages when bus connections are
  severed.

- Support for attaching multiple vtables for the same interface on the
  same path.

This patch also fixes the SetDefaultTarget() and GetDefaultTarget() bus
calls which used an inappropriate signature.

As a side effect we will now generate PropertiesChanged messages which
carry property contents, rather than just invalidation information.
2013-11-20 20:52:36 +01:00
Shawn Landden f0511bd7e3 core/socket: fix SO_REUSEPORT 2013-11-17 17:41:35 -05:00
Tom Gundersen accdd018ed mount/service: drop FsckPassNo support
We now treat passno as boleans in the generators, and don't need this any more. fsck itself
is able to sequentialize checks on the same local media, so in the common case the ordering
is redundant.

It is still possible to force an order by using .d fragments, in case that is desired.
2013-10-19 12:23:17 +02:00
Lennart Poettering a57f7e2c82 core: rework how we match mount units against each other
Previously to automatically create dependencies between mount units we
matched every mount unit agains all others resulting in O(n^2)
complexity. On setups with large amounts of mount units this might make
things slow.

This change replaces the matching code to use a hashtable that is keyed
by a path prefix, and points to a set of units that require that path to
be around. When a new mount unit is installed it is hence sufficient to
simply look up this set of units via its own file system paths to know
which units to order after itself.

This patch also changes all unit types to only create automatic mount
dependencies via the RequiresMountsFor= logic, and this is exposed to
the outside to make things more transparent.

With this change we still have some O(n) complexities in place when
handling mounts, but that's currently unavoidable due to kernel APIs,
and still substantially better than O(n^2) as before.

https://bugs.freedesktop.org/show_bug.cgi?id=69740
2013-09-26 20:20:30 +02:00
Lennart Poettering ddca82aca0 cgroup: get rid of MemorySoftLimit=
The cgroup attribute memory.soft_limit_in_bytes is unlikely to stay
around in the kernel for good, so let's not expose it for now. We can
readd something like it later when the kernel guys decided on a final
API for this.
2013-09-17 14:58:00 -05:00