Commit graph

285 commits

Author SHA1 Message Date
Ismo Puustinen a103496ca5 capabilities: keep bounding set in non-inverted format.
Change the capability bounding set parser and logic so that the bounding
set is kept as a positive set internally. This means that the set
reflects those capabilities that we want to keep instead of drop.
2016-01-12 12:14:50 +02:00
Daniel Mack 201f0c916d tree-wide: unify argument lists of IN_SET()
The new implementation will not allow passing the same values more than
once, so clean up first.
2016-01-10 18:10:08 +01:00
Lennart Poettering 4afd3348c7 tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.

With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.

The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).

This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.

Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:

       #define _cleanup_(function) __attribute__((cleanup(function)))

Or similar, to make the gcc feature easier to use.

Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.

See #2008.
2015-11-27 19:19:36 +01:00
Harald Hoyer 9d06297e26 core: Do not bind a mount unit to a device, if it was from mountinfo
If a mount unit is bound to a device, systemd tries to umount the
mount point, if it thinks the device has gone away.

Due to the uevent queue and inotify of /proc/self/mountinfo being two
different sources, systemd can never get the ordering reliably correct.

It can happen, that in the uevent queue ADD,REMOVE,ADD is queued
and an inotify of mountinfo (or libmount event) happend with the
device in question.

systemd cannot know, at which point of time the mount happend in the
ADD,REMOVE,ADD sequence.

The real ordering might have been ADD,REMOVE,ADD,mount
and systemd might think ADD,mount,REMOVE,ADD and would umount the
mountpoint.

A test script which triggered this behaviour is:
rm -f test-efi-disk.img
dd if=/dev/null of=test-efi-disk.img bs=1M seek=512 count=1
parted --script test-efi-disk.img \
  "mklabel gpt" \
  "mkpart ESP fat32 1MiB 511MiB" \
  "set 1 boot on"
LOOP=$(losetup --show -f -P test-efi-disk.img)
udevadm settle
mkfs.vfat -F32 ${LOOP}p1
mkdir -p mnt
mount ${LOOP}p1 mnt
... <dostuffwith mnt>

Without the "udevadm settle" systemd unmounted mnt while the script was
operating on mnt.

Of course the question is, why there was a REMOVE in the first place,
but this is not part of this patch.
2015-11-24 14:08:50 +01:00
Lennart Poettering df446f9603 core: Minor cleaning up of unit/log status and log logic
We only reorder a few things and modernize some constructs. No
functional changes.

- Move some if checks from the caller to the callee of a few functions.

- Use IN_SE() where we can

- Move status printing functions together
2015-11-17 20:41:52 +01:00
Lennart Poettering 7c65093ae3 core: make unit_make_transient() more thorough
Let's reset more stuff that does not apply to transient units. Also,
let's readd the unito to all queues, because it's identity now changed.
2015-11-17 20:41:52 +01:00
Lennart Poettering 0f13f3bd79 core: move check whether a unit is suitable to become transient into unit.c
Lets introduce unit_is_pristine() that verifies whether a unit is
suitable to become a transient unit, by checking that it is no
referenced yet and has no data on disk assigned.
2015-11-17 17:32:49 +01:00
Lennart Poettering 702d4e6f14 core: now that .snapshot unit are gone, we don't need the per-type .no_gc bool anymore 2015-11-13 19:50:52 +01:00
Lennart Poettering 0af20ea2ee core: add new DefaultTasksMax= setting for system.conf
This allows initializing the TasksMax= setting of all units by default
to some fixed value, instead of leaving it at infinity as before.
2015-11-13 19:50:52 +01:00
Michal Schmidt 7152869f0a Merge pull request #1869 from poettering/kill-overridable
Remove support for RequiresOverridable= and RequisiteOverridable=
2015-11-13 14:04:34 +01:00
Lennart Poettering 4bd29fe5ce core: drop "override" flag when building transactions
Now that we don't have RequiresOverridable= and RequisiteOverridable=
dependencies anymore, we can get rid of tracking the "override" boolean
for jobs in the job engine, as it serves no purpose anymore.

While we are at it, fix some error messages we print when invoking
functions that take the override parameter.
2015-11-12 19:54:07 +01:00
Lennart Poettering f32b43bda4 core: remove support for RequiresOverridable= and RequisiteOverridable=
As discussed at systemd.conf 2015 and on also raised on the ML:

http://lists.freedesktop.org/archives/systemd-devel/2015-November/034880.html

This removes the two XyzOverridable= unit dependencies, that were
basically never used, and do not enhance user experience in any way.
Most folks looking for the functionality this provides probably opt for
the "ignore-dependencies" job mode, and that's probably a good idea.

Hence, let's simplify systemd's dependency engine and remove these two
dependency types (and their inverses).

The unit file parser and the dbus property parser will now redirect
the settings/properties to result in an equivalent non-overridable
dependency. In the case of the unit file parser we generate a warning,
to inform the user.

The dbus properties for this unit type stay available on the unit
objects, but they are now hidden from usual introspection and will
always return the empty list when queried.

This should provide enough compatibility for the few unit files that
actually ever made use of this.
2015-11-12 19:27:24 +01:00
Lennart Poettering 0ec0deaa30 install: follow unit file symlinks in /usr, but not /etc when looking for [Install] data
Some distributions use alias unit files via symlinks in /usr to cover
for legacy service names. With this change we'll allow "systemctl
enable" on such aliases.

Previously, our rule was that symlinks are user configuration that
"systemctl enable" + "systemctl disable" creates and removes, while unit
files is where the instructions to do so are store. As a result of the
rule we'd never read install information through symlinks, since that
would mix enablement state with installation instructions.

Now, the new rule is that only symlinks inside of /etc are
configuration. Unit files, and symlinks in /usr are now valid for
installation instructions.

This patch is quite a rework of the whole install logic, and makes the
following addional changes:

- Adds a complete test "test-instal-root" that tests the install logic
  pretty comprehensively.

- Never uses canonicalize_file_name(), because that's incompatible with
  operation relative to a specific root directory.

- unit_file_get_state() is reworked to return a proper error, and
  returns the state in a call-by-ref parameter. This cleans up confusion
  between the enum type and errno-like errors.

- The new logic puts a limit on how long to follow unit file symlinks:
  it will do so only for 64 steps at max.

- The InstallContext object's fields are renamed to will_process and
  has_processed (will_install and has_installed) since they are also
  used for deinstallation and all kinds of other operations.

- The root directory is always verified before use.

- install.c is reordered to place the exported functions together.

- Stricter rules are followed when traversing symlinks: the unit suffix
  must say identical, and it's not allowed to link between regular units
  and templated units.

- Various modernizations

- The "invalid" unit file state has been renamed to "bad", in order to
  avoid confusion between UNIT_FILE_INVALID and
  _UNIT_FILE_STATE_INVALID. Given that the state should normally not be
  seen and is not documented this should not be a problematic change.
  The new name is now documented however.

Fixes #1375, #1718, #1706
2015-11-12 17:57:04 +01:00
Tom Gundersen 7042fc14ff Merge pull request #1837 from poettering/grabbag2
variety of fixes
2015-11-11 02:31:29 +01:00
Zbigniew Jędrzejewski-Szmek 36b4a7ba55 Remove snapshot unit type
Snapshots were never useful or used for anything. Many systemd
developers that I spoke to at systemd.conf2015, didn't even know they
existed, so it is fairly safe to assume that this type can be deleted
without harm.

The fundamental problem with snapshots is that the state of the system
is dynamic, devices come and go, users log in and out, timers fire...
and restoring all units to some state from the past would "undo"
those changes, which isn't really possible.

Tested by creating a snapshot, running the new binary, and checking
that the transition did not cause errors, and the snapshot is gone,
and snapshots cannot be created anymore.

New systemctl says:
Unknown operation snapshot.
Old systemctl says:
Failed to create snapshot: Support for snapshots has been removed.

IgnoreOnSnaphost settings are warned about and ignored:
Support for option IgnoreOnSnapshot= has been removed and it is ignored

http://lists.freedesktop.org/archives/systemd-devel/2015-November/034872.html
2015-11-10 19:33:06 -05:00
Lennart Poettering 5a6158b641 core: try to continue if coldplugging of a unit fails 2015-11-10 21:03:49 +01:00
Daniel Mack 8ea823b60d core: be more verbose when NameOwnerChanged subscriptions fail 2015-11-02 10:14:35 +01:00
Evgeny Vereshchagin cbe46ead79 core: set_unit_path overwrites SYSTEMD_UNIT_PATH 2015-10-31 14:14:37 +03:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 8fcde01280 util-lib: split stat()/statfs()/stavfs() related calls into stat-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 5f311f8c0e util: remove path_get_parent(), in favour of dirname_malloc()
We don't need two functions that do essentialy the same, hence drop
path_get_parent(), and stick to dirname_malloc(), but move it to
path-util.[ch].
2015-10-27 13:25:55 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering b1d4f8e154 util-lib: split out user/group/uid/gid calls into user-util.[ch] 2015-10-26 01:24:38 +01:00
Lennart Poettering 07630cea1f util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.

This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.

Also touches a few unrelated include files.
2015-10-24 23:05:02 +02:00
Lennart Poettering 4f5dd3943b util: split out escaping code into escape.[ch]
This really deserves its own file, given how much code this is now.
2015-10-24 23:04:42 +02:00
Thomas Hindoe Paaboel Andersen b250ea2fd6 tree-wide: remove unused functions 2015-10-19 21:46:01 +02:00
Lennart Poettering 9806e87da2 unit: allocate bus name match string on the stack
Let's use strjoina() rather than strjoin() for construct dbus match
strings.

Also, while we are at it, fix parameter ordering, so that our functions
always put the object first, like it is customary for OO-like
programming.
2015-10-17 16:48:21 +02:00
Lennart Poettering ce3cef2c52 Merge pull request #1501 from fbuihuu/fix-requires-mounts-for-directives
Make sure the mount units pulled by 'RequiresMountsFor=' are loaded, if they exist
2015-10-09 17:04:36 +03:00
Franck Bui c7c89abb9e Make sure the mount units pulled by 'RequiresMountsFor=' are loaded (if they exist)
We should make sure that mount units involved by 'RequiresMountsFor='
directives are really loaded if not required by any others units so
that Requires= dependencies on the mount units are applied and thus
the mount unit dependencies are started.
2015-10-08 21:24:22 +02:00
Lennart Poettering a34ceba66f core: add support for setting stdin/stdout/stderr for transient services
When starting a transient service, allow setting stdin/stdout/stderr fds
for it, by passing them in via the bus.

This also simplifies some of the serialization code for units.
2015-10-08 12:55:15 +02:00
Lennart Poettering 23a749f285 Merge pull request #1419 from keszybz/shell-completion
Shell completion tweaks
2015-09-30 22:32:09 +02:00
Lennart Poettering 8c8da0e0cb core: add a "Requires=" dependency between units and the slices they are located in
We place the processes we fork off in the cgroup anyway, and we probably
shouldn't be able to get that far if we couldn't set up the slice due to
resource problems or unmet conditions. Hence upgrade the dependency
between units and the slices they are located in from Wants= to
Requires=.
2015-09-29 21:55:52 +02:00
Zbigniew Jędrzejewski-Szmek 978c8b6347 Move UnitActiveState to basic/
Preparation to allow systemctl to query the list of unit states.
2015-09-28 15:09:34 -04:00
Jan Synacek ac5e3a505e core: extend KillUnit() to return error when no unit was killed 2015-09-21 10:05:49 +02:00
Daniel Mack 32ee7d3309 cgroup: add support for net_cls controllers
Add a new config directive called NetClass= to CGroup enabled units.
Allowed values are positive numbers for fix assignments and "auto" for
picking a free value automatically, for which we need to keep track of
dynamically assigned net class IDs of units. Introduce a hash table for
this, and also record the last ID that was given out, so the allocator
can start its search for the next 'hole' from there. This could
eventually be optimized with something like an irb.

The class IDs up to 65536 are considered reserved and won't be
assigned automatically by systemd. This barrier can be made a config
directive in the future.

Values set in unit files are stored in the CGroupContext of the
unit and considered read-only. The actually assigned number (which
may have been chosen dynamically) is stored in the unit itself and
is guaranteed to remain stable as long as the unit is active.

In the CGroup controller, set the configured CGroup net class to
net_cls.classid. Multiple unit may share the same net class ID,
and those which do are linked together.
2015-09-16 00:21:55 +02:00
Lennart Poettering 5269eb6b32 core: allocate sets of startup and failed units on-demand
There's a good chance we never needs these sets, hence allocate them
only when needed.
2015-09-11 18:31:49 +02:00
Lennart Poettering d53d94743c core: refactor cpu shares/blockio weight cgroup logic
Let's stop using the "unsigned long" type for weights/shares, and let's
just use uint64_t for this, as that's what we expose on the bus.

Unify parsers, and always validate the range for these fields.

Correct the default blockio weight to 500, since that's what the kernel
actually uses.

When parsing the weight/shares settings from unit files accept the empty
string as a way to reset the weight/shares value. When getting it via
the bus, uniformly map (uint64_t) -1 to unset.

Open up StartupCPUShares= and StartupBlockIOWeight= to transient units.
2015-09-11 18:31:49 +02:00
Lennart Poettering 03a7b521e3 core: add support for the "pids" cgroup controller
This adds support for the new "pids" cgroup controller of 4.3 kernels.
It allows accounting the number of tasks in a cgroup and enforcing
limits on it.

This adds two new setting TasksAccounting= and TasksMax= to each unit,
as well as a gloabl option DefaultTasksAccounting=.

This also updated "cgtop" to optionally make use of the new
kernel-provided accounting.

systemctl has been updated to show the number of tasks for each service
if it is available.

This patch also adds correct support for undoing memory limits for units
using a MemoryLimit=infinity syntax. We do the same for TasksMax= now
and hence keep things in sync here.
2015-09-10 18:41:06 +02:00
Lennart Poettering 525d3cc746 tree-wide: take benefit of the fact that hashmap_free() returns NULL
And set_free() too.

Another Coccinelle patch.
2015-09-09 23:12:07 +02:00
Lennart Poettering 6796073e33 tree-wide: make use of the fact that strv_free() returns NULL
Another Coccinelle patch.
2015-09-09 23:05:58 +02:00
Daniel Mack da323858ef Merge pull request #1190 from poettering/rework-virt
basic: rework virtualization detection API
2015-09-08 15:53:56 +02:00
Lennart Poettering d11a76451f unit: move "not supported" check after condition check in unit_start()
Make sure we always check conditions before checking whether the unit
type is supported in unit_start(), since condition checks are "clean
errors", while "not supported" errors are fatal.

This cleans up the boot output of systemd in containers, where a lot of
NOTSUPP lines were shown befor this fix.

This partially reverts 8ff4d2ab0d which
reorder the checks.
2015-09-07 14:10:53 +02:00
Lennart Poettering 75f86906c5 basic: rework virtualization detection API
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
2015-09-07 13:42:47 +02:00
Lennart Poettering 8ff4d2ab0d unit: make unit_can_start() more accurate
This funciton is exposed via CanStart on the bus, and should be as
accurate as possible. Hence: make sure to return false for units of unit
types not supported on the system, and for unit types where
configuration failed to load.

Also see #1105.
2015-09-05 20:27:52 +02:00
Lennart Poettering fea72cc033 macro: introduce new PID_TO_PTR macros and make use of them
This adds a new PID_TO_PTR() macro, plus PTR_TO_PID() and makes use of
it wherever we maintain processes in a hash table. Previously we
sometimes used LONG_TO_PTR() and other times ULONG_TO_PTR() for that,
hence let's make this more explicit and clean up things.
2015-09-04 09:07:30 +02:00
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00
Lennart Poettering d06673212e core: rework when we kill with which signal
When the user wants to explicitly send our own PID a signal, then do so.

Don't follow up SIGABRT with a SIGHUP if send_sighup is enabled. At that
point the process should have segfaulted, hence there's no point in
following up with a SIGHUP.

Send only termination signals to ourselves, never KILL or ABRT signals.
2015-09-01 18:54:08 +02:00
Lennart Poettering 102ef9829e core: don't allow changing the slice of a unit while it is active 2015-09-01 18:53:29 +02:00
Lennart Poettering b821a397c0 unit: small clean-ups
Always say when we ignore errors. Cast calls whose return value we
knowingly ingore to (void). Use "bool" where we actually mean a boolean,
even if we return it as an int later on.
2015-09-01 18:52:15 +02:00
Lennart Poettering e9db43d591 units: enable waiting for unit termination in certain cases
The legacy cgroup hierarchy does not support reliable empty
notifications in containers and if there are left-over subgroups in a
cgroup. This makes it hard to correctly wait for them running empty, and
thus we previously disabled this logic entirely.

With this change we explicitly check for the container case, and whether
the unit is a "delegation" unit (i.e. one where programs may create
their own subgroups). If we are neither in a container, nor operating on
a delegation unit cgroup empty notifications become reliable and thus we
start waiting for the empty notifications again.

This doesn't really fix the general problem around cgroup notifications
but reduces the effect around it.

(This also reorders #include lines by their focus, as suggsted in
CODING_STYLE. We have to add "virt.h", so let's do that at the right
place.)

Also see #317.
2015-09-01 17:44:17 +02:00