Commit graph

87 commits

Author SHA1 Message Date
Lennart Poettering 6f883237f1 cgroup: drop "ignore_self" argument from cg_is_empty()
In all cases where the function (or cg_is_empty_recursive()) ignoring
the calling process is actually wrong, as a process keeps a cgroup busy
regardless if its the current one or another. Hence, let's simplify
things and drop the "ignore_self" parameter.
2015-09-01 18:37:01 +02:00
Lennart Poettering e9db43d591 units: enable waiting for unit termination in certain cases
The legacy cgroup hierarchy does not support reliable empty
notifications in containers and if there are left-over subgroups in a
cgroup. This makes it hard to correctly wait for them running empty, and
thus we previously disabled this logic entirely.

With this change we explicitly check for the container case, and whether
the unit is a "delegation" unit (i.e. one where programs may create
their own subgroups). If we are neither in a container, nor operating on
a delegation unit cgroup empty notifications become reliable and thus we
start waiting for the empty notifications again.

This doesn't really fix the general problem around cgroup notifications
but reduces the effect around it.

(This also reorders #include lines by their focus, as suggsted in
CODING_STYLE. We have to add "virt.h", so let's do that at the right
place.)

Also see #317.
2015-09-01 17:44:17 +02:00
Lennart Poettering 35b7ff80e2 unit: add new macros to test for unit contexts 2015-08-31 13:20:43 +02:00
Lennart Poettering b2c23da8ce core: rename SystemdRunningAs to ManagerRunningAs
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's to
stick to this here too.
2015-05-11 22:51:49 +02:00
Ronny Chevalier 0b452006de shared: add process-util.[ch] 2015-04-10 23:54:49 +02:00
Lennart Poettering 5ad096b3f1 core: expose consumed CPU time per unit
This adds support for showing the accumulated consumed CPU time per-unit
in the "systemctl status" output. The property is also readable via the
bus.
2015-03-02 12:15:25 +01:00
Zbigniew Jędrzejewski-Szmek a3bd89ea99 core/cgroup: fix embarrassing typo
https://github.com/docker/docker/issues/10280
2015-01-31 23:03:56 -05:00
Torstein Husebø cc98b3025e treewide: fix multiple typos 2015-01-26 10:39:47 -05:00
Daniel Mack 71c2687360 cgroup: fix typo 2015-01-19 18:34:17 +01:00
Zbigniew Jędrzejewski-Szmek 7539904965 cgroup: memory limits on / are not supported 2015-01-05 19:04:10 -05:00
Zbigniew Jędrzejewski-Szmek 6da139137e cgroup: fix error message
systemd[1]: Failed to set memory.limit_in_bytes on : Invalid argument
2015-01-05 19:04:10 -05:00
Lennart Poettering 714e2e1d56 cgroup: downgrade log messages when we cannot write to cgroup trees that are mounted read-only 2015-01-05 01:40:51 +01:00
Lennart Poettering 7b3fd6313c scope: make attachment of initial PIDs a bit more robust 2014-12-10 22:06:44 +01:00
Lennart Poettering 0cd385d318 core: don't migrate PIDs for units that may contain subcgroups, do this only for leaf units
Otherwise a slice or delegation unit might move PIDs around ignoring the
fact that it is attached to a subcgroup.
2014-12-10 20:38:24 +01:00
Lennart Poettering b1491eba40 core: rename unit_destroy_cgroup() to unit_destroy_cgroup_if_empty() since it's not quite as destructive as it sounds nowadays 2014-12-09 02:31:42 +01:00
Ross Lagerwall dab5bf8599 cgroup: Handle error when destroying cgroup
If a cgroup fails to be destroyed (most likely because there are still
processes running as part of a service after the main pid exits), don't
free and remove the cgroup unit from the manager.  This fixes a
regression introduced by the cgroup rework in v205 where systemd would
forget about processes still running after the unit becomes inactive.
(This can happen when the main pid exits and KillMode=process or none).
2014-12-09 02:28:09 +01:00
Michal Schmidt 4a62c710b6 treewide: another round of simplifications
Using the same scripts as in f647962d64 "treewide: yet more log_*_errno
+ return simplifications".
2014-11-28 19:57:32 +01:00
Michal Schmidt 56f64d9576 treewide: use log_*_errno whenever %m is in the format string
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.

Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'

Plus some whitespace, linewrap, and indent adjustments.
2014-11-28 19:49:27 +01:00
Michal Schmidt 23bbb0de4e treewide: more log_*_errno + return simplifications 2014-11-28 18:24:30 +01:00
Michal Schmidt da927ba997 treewide: no need to negate errno for log_*_errno()
It corrrectly handles both positive and negative errno values.
2014-11-28 13:29:21 +01:00
Michal Schmidt 0a1beeb642 treewide: auto-convert the simple cases to log_*_errno()
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:

find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'

Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
2014-11-28 12:04:41 +01:00
Lennart Poettering a931ad47a8 core: introduce new Delegate=yes/no property controlling creation of cgroup subhierarchies
For priviliged units this resource control property ensures that the
processes have all controllers systemd manages enabled.

For unpriviliged services (those with User= set) this ensures that
access rights to the service cgroup is granted to the user in question,
to create further subgroups. Note that this only applies to the
name=systemd hierarchy though, as access to other controllers is not
safe for unpriviliged processes.

Delegate=yes should be set for container scopes where a systemd instance
inside the container shall manage the hierarchies below its own cgroup
and have access to all controllers.

Delegate=yes should also be set for user@.service, so that systemd
--user can run, controlling its own cgroup tree.

This commit changes machined, systemd-nspawn@.service and user@.service
to set this boolean, in order to ensure that container management will
just work, and the user systemd instance can run fine.
2014-11-05 18:49:14 +01:00
Zbigniew Jędrzejewski-Szmek b1d6dcf5a5 Do not format USEC_INFINITY as NULL
systemctl would print 'CPUQuotaPerSecUSec=(null)' for no limit. This
does not look right.

Since USEC_INFINITY is one of the valid values, format_timespan()
could return NULL, and we should wrap every use of it in strna() or
similar. But most callers didn't do that, and it seems more robust to
return a string ("infinity") that makes sense most of the time, even
if in some places the result will not be grammatically correct.
2014-09-29 11:09:39 -04:00
Lennart Poettering d81afec1c9 core: split up "starting" manager state into "initializing" and "starting"
We'll stay in "initializing" until basic.target has reached, at which
point we will enter "starting".

This is preparation so that we can change the startip timeout to only
apply to the first phase of startup, not the full procedure.
2014-08-22 18:10:31 +02:00
Lennart Poettering 1aeab12b19 cgroup: only generate warnings if actually writing to cgroup attributes failed 2014-08-15 18:14:37 +02:00
Lennart Poettering 6b2f67b31c cgroup: downgrade log messages about non-existant cgroup attributes to LOG_DEBUG 2014-08-15 11:57:07 +02:00
Kay Sievers 3a43da2832 time-util: add and use USEC/NSEC_INFINIY 2014-07-29 13:20:20 +02:00
Zbigniew Jędrzejewski-Szmek 0d8c31ff72 test-engine: fix access to unit load path
Also add a bit of debugging output to help diagnose problems,
add missing units, and simplify cppflags.

Move test-engine to normal tests from manual tests, it should now
work without destroying the system.
2014-07-20 19:48:16 -04:00
Lennart Poettering 9a05490933 cgroups: simplify CPUQuota= logic
Only accept cpu quota values in percentages, get rid of period
definition.

It's not clear whether the CFS period controllable per-cgroup even has a
future in the kernel, hence let's simplify all this, hardcode the period
to 100ms and only accept percentage based quota values.
2014-05-22 11:53:12 +09:00
Lennart Poettering 637f421e5c cgroups: always propagate controller membership to siblings, for all controllers
This is the behaviour the kernel cgroup rework exposes for all
controllers, hence let's do this already now for all cases.
2014-05-22 07:50:03 +09:00
Lennart Poettering db785129c9 cgroup: rework startup logic
Introduce a (unsigned long) -1 as "unset" state for cpu shares/block io
weights, and keep the startup unit set around all the time.
2014-05-22 07:13:56 +09:00
WaLyong Cho 95ae05c0e7 core: add startup resource control option
Similar to CPUShares= and BlockIOWeight= respectively. However only
assign the specified weight during startup. Each control group
attribute is re-assigned as weight by CPUShares=weight and
BlockIOWeight=weight after startup.  If not CPUShares= or
BlockIOWeight= be specified, then the attribute is re-assigned to each
default attribute value. (default cpu.shares=1024, blkio.weight=1000)
If only CPUShares=weight or BlockIOWeight=weight be specified, then
that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.
2014-05-22 07:13:56 +09:00
Łukasz Stelmach cd7affaeea core: check the right variable for failed open() 2014-05-08 13:24:34 +02:00
Kay Sievers 99a17ada9c core: require cgroups filesystem to be available
We should no longer pretend that we can run in any sensible way
without the kernel supporting us with cgroups functionality.
2014-05-05 18:52:36 +02:00
Lennart Poettering b2f8b02ec2 core: expose CFS CPU time quota as high-level unit properties 2014-04-25 13:27:25 +02:00
Lennart Poettering 7d711efb9c core: make sure we can combine DevicePolicy=closed with PrivateDevices=yes
if PrivateDevices=yes is used we need to make sure we can still
create /dev/null and so on.
2014-03-19 22:00:43 +01:00
Lennart Poettering 03e334a1c7 util: replace close_nointr_nofail() by a more useful safe_close()
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:

        fd = safe_close(fd);

Which will close an fd if it is open, and reset the fd variable
correctly.

By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
2014-03-18 19:31:34 +01:00
Lennart Poettering e41969e3d1 core: support globbing matches in DeviceAllow= when checking for device groups 2014-03-11 17:43:41 +01:00
Lennart Poettering 01efdf13a6 cgroup: certain cgroup attributes are not available in the root cgroup, hence don't bother 2014-02-24 03:38:58 +01:00
Lennart Poettering 90060676c4 cgroup: Extend DeviceAllow= syntax to whitelist groups of devices, not just particular devices nodes 2014-02-22 03:05:34 +01:00
Lennart Poettering d4fdc205a4 update TODO 2014-02-19 18:20:12 +01:00
Jan Engelhardt 73e231abde doc: update punctuation
Resolve spotted issues related to missing or extraneous commas, dashes.
2014-02-17 19:03:07 -05:00
Lennart Poettering 03b90d4bad core: find the closest parent slice that has a specfic cgroup controller enabled when enabling/disabling cgroup controllers for units 2014-02-17 15:49:21 +01:00
Lennart Poettering bc432dc7eb core: rework cgroup mask propagation
Previously a cgroup setting down tree would result in cgroup membership
additions being propagated up the tree and to the siblings, however a
unit could never lose cgroup memberships again. With this change we'll
make sure that both cgroup additions and removals propagate properly.
2014-02-17 15:49:21 +01:00
David Strauss 6414b7c981 cgroups: Cache controller masks and optimize queues. 2013-11-22 11:22:47 +10:00
Zbigniew Jędrzejewski-Szmek a94042fa9b systemd: fix memory leak in cgroup code
If the unit already was in the hashmap, path would be leaked.
2013-11-09 19:02:53 -05:00
David Strauss f366954523 Comment spelling fixes. 2013-11-06 20:03:18 +10:00
Lennart Poettering 15c60e99a9 cgroup: run PID 1 in the root cgroup
This way cleaning up the cgroup tree on shutdown is a lot easier since
we are in the root dir. Also PID 1 was previously artificially placed in
system.slice, even though our rule actually was not to have processes in
slices. The root slice otoh is magic anyway, so having PID 1 in there
sounds less surprising.

Of course, this means that PID is scheduled against the three top-level
slices.
2013-11-06 02:12:21 +01:00
Lennart Poettering 71fda00f32 list: make our list macros a bit easier to use by not requring type spec on each invocation
We can determine the list entry type via the typeof() gcc construct, and
so we should to make the macros much shorter to use.
2013-10-14 06:11:19 +02:00
Lennart Poettering 13b84ec7df cgroup: if we do a cgroup operation then do something on all supported controllers
Previously we did operations like attach, trim or migrate only on the
controllers that were enabled for a specific unit. With this changes we
will now do them for all supproted controllers, and fall back to all
possible prefix paths if the specified paths do not exist.

This fixes issues if a controller is being disabled for a unit where it
was previously enabled, and makes sure that all processes stay as "far
down" the tree as groups exist.
2013-09-25 03:38:17 +02:00