Commit graph

55 commits

Author SHA1 Message Date
Yu Watanabe db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Michal Koutný 12b975e065 cgroup: Reduce unit_get_ancestor_disable_mask use
The usage in unit_get_own_mask is redundant, we only need to apply
disable_mask at the end before application, i.e. calculating enable or
target mask.

(IOW, we allow all configurations, but disabling affects effective
controls.)

Modify tests accordingly and add testing of enable mask.

This is intended as cleanup, with no effect but changing unit_dump
output.
2020-08-19 11:41:53 +02:00
Zbigniew Jędrzejewski-Szmek 7b43295346 tests: move unit files to units/ subdirectory
We have a bazillion of those unit files, and keeping them all directly in tests/
has become rather unwieldy.
2020-03-19 16:23:27 +01:00
Zbigniew Jędrzejewski-Szmek 3a0f06c41a core: make TasksMax a partially dynamic property
TasksMax= and DefaultTasksMax= can be specified as percentages. We don't
actually document what the percentage is relative to, but the implementation
uses the smallest of /proc/sys/kernel/pid_max, /proc/sys/kernel/threads-max,
and /sys/fs/cgroup/pids.max (when present). When the value is a percentage,
we immediately convert it to an absolute value. If the limit later changes
(which can happen e.g. when systemd-sysctl runs), the absolute value becomes
outdated.

So let's store either the percentage or absolute value, whatever was specified,
and only convert to an absolute value when the value is used. For example, when
starting a unit, the absolute value will be calculated when the cgroup for
the unit is created.

Fixes #13419.
2019-11-14 18:41:54 +01:00
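
The "store what was specified, convert only on use" idea from the TasksMax commit above could be sketched roughly as follows. This is a minimal illustration, not the actual systemd code: the `TasksMaxSetting` type and the `tasks_max_resolve()` helper are hypothetical names, and `system_limit` stands in for the smallest of the kernel limits listed in the commit message at the moment of the call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: keeps the setting exactly as it was specified. */
typedef struct TasksMaxSetting {
        uint64_t value; /* absolute limit, or the numerator of a fraction */
        uint64_t scale; /* 0: value is absolute; otherwise the denominator (100 for percent) */
} TasksMaxSetting;

/* Convert to an absolute limit at the moment of use. system_limit is assumed
 * to be the smallest of pid_max, threads-max and pids.max at that moment. */
static uint64_t tasks_max_resolve(const TasksMaxSetting *s, uint64_t system_limit) {
        assert(s);

        if (s->scale == 0)
                return s->value;

        /* Split the multiplication so that large limits cannot overflow. */
        return system_limit / s->scale * s->value +
               system_limit % s->scale * s->value / s->scale;
}
```

Because the percentage itself is kept, resolving the same setting again after the system limit has changed yields a fresh absolute value instead of a stale one.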
Zbigniew Jędrzejewski-Szmek 64ad9e088d tests: modify enter_cgroup_subroot() to return the new path 2019-11-11 14:55:57 +01:00
Zbigniew Jędrzejewski-Szmek 48e98ba5c3 tests: get rid of test-helper.[ch] completely
I don't think there's any particular reason to keep those functions in a separate
file.
2019-11-11 14:55:57 +01:00
Pavel Hrdina 047f5d63d7 cgroup: introduce support for cgroup v2 CPUSET controller
Introduce support for configuring cpus and mems for processes using
cgroup v2 CPUSET controller.  This allows users to limit which cpus
and memory NUMA nodes can be used by processes to better utilize
system resources.

The cgroup v2 interfaces to control it are cpuset.cpus and cpuset.mems
where the requested configuration is written.  However, it doesn't mean
that the requested configuration will be actually used as parent cgroup
may limit the cpus or mems as well.  In order to reflect the real
configuration cgroup v2 provides read-only files cpuset.cpus.effective
and cpuset.mems.effective which are exported to users as well.
2019-09-24 15:16:07 +02:00
Chris Down 3062dddabd test: Remove superfluous error check
This is already checked above before we set any manager attributes,
immediately after manager_new().
2019-05-22 15:27:26 -04:00
Chris Down c72703e26d cgroup: Add DisableControllers= directive to disable controller in subtree
Some controllers (like the CPU controller) have a performance cost that
is non-trivial on certain workloads. While this can be mitigated and
improved to an extent, there will for some controllers always be some
overheads associated with the benefits gained from the controller.
Inside Facebook, the fix applied has been to disable the CPU controller
forcibly with `cgroup_disable=cpu` on the kernel command line.

This presents a problem: to disable or reenable the controller, a reboot
is required, but this is quite cumbersome and slow to do for many
thousands of machines, especially machines where disabling/enabling a
stateful service on a machine is a matter of several minutes.

Currently systemd provides some configuration knobs for these in the
form of `[Default]CPUAccounting`, `[Default]MemoryAccounting`, and the
like. The limitation of these is that Default*Accounting is overrideable
by individual services, of which any one could decide to reenable a
controller within the hierarchy at any point just by using a controller
feature implicitly (eg. `CPUWeight`), even if the use of that CPU
feature could just be opportunistic. Since many services are provided by
the distribution, or by upstream teams at a particular organisation,
it's not a sustainable solution to simply try to find and remove
offending directives from these units.

This commit presents a more direct solution -- a DisableControllers=
directive that forcibly disallows a controller from being enabled within
a subtree.
2018-12-03 15:40:31 +00:00
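
One way to picture how a subtree-wide disable interacts with what an individual unit asks for is the sketch below. It is an assumption-laden simplification: the `Node` type, the mask bit names and both helper functions are hypothetical and do not mirror systemd's internal data structures. It only shows that a unit's own configuration stays untouched while the effective mask has every controller removed that the unit or any ancestor disables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for controller mask bits. */
typedef unsigned ControllerMask;
enum {
        MASK_CPU    = 1u << 0,
        MASK_IO     = 1u << 1,
        MASK_MEMORY = 1u << 2,
        MASK_PIDS   = 1u << 3,
};

/* Hypothetical node in the unit tree (a slice or a service). */
typedef struct Node {
        const struct Node *parent;
        ControllerMask own_mask;     /* controllers this unit's settings ask for */
        ControllerMask disable_mask; /* controllers disabled for this subtree */
} Node;

/* Union of the disable masks of the node and all of its ancestors. */
static ControllerMask node_ancestor_disable_mask(const Node *n) {
        ControllerMask m = 0;

        for (; n; n = n->parent)
                m |= n->disable_mask;

        return m;
}

/* The requested controllers, minus anything disabled higher up. */
static ControllerMask node_effective_mask(const Node *n) {
        assert(n);
        return n->own_mask & ~node_ancestor_disable_mask(n);
}
```

With a slice that disables `MASK_CPU` and a child service that requests CPU and memory control, the child's effective mask contains only `MASK_MEMORY`, while its own mask still records both requests.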
Chris Down f98c25850f cgroup v2: Don't require CPU controller for CPU accounting in 4.15+
systemd only uses functions that are as of Linux 4.15+ provided
externally to the CPU controller (currently usage_usec), so if we have a
new enough kernel, we don't need to set CGROUP_MASK_CPU for
CPUAccounting=true as the CPU controller does not need to necessarily be
enabled in this case.

Part of this patch is modelled on an earlier patch by Ryutaroh Matsumoto
(see PR #9665).
2018-11-18 12:21:41 +00:00
Roman Gushchin 084c700780 core: support cgroup v2 device controller
Cgroup v2 provides the eBPF-based device controller, which isn't currently
supported by systemd. This commit aims to provide such support.

There are no user-visible changes, just the device policy and whitelist
start working if cgroup v2 is used.
2018-10-09 09:47:51 -07:00
Roman Gushchin 17f149556a core: refactor bpf firewall support into a pseudo-controller
The idea is to introduce a concept of bpf-based pseudo-controllers
to make adding new bpf-based features easier.
2018-10-09 09:46:08 -07:00
Zbigniew Jędrzejewski-Szmek 6d7c403324 tests: use a helper function to parse environment and open logging
The advantages are that we save a few lines, and that we can override
logging using environment variables in more test executables.
2018-09-14 09:29:57 +02:00
Zbigniew Jędrzejewski-Szmek 317bb217d3 tests: add helper to unify skipping a test and exiting 2018-09-14 09:29:57 +02:00
Filipe Brandenburger 55890a40c3 test: remove support for suffix in get_testdata_dir()
Instead, use path_join() in callers wherever needed.
2018-09-12 09:49:03 -07:00
Zbigniew Jędrzejewski-Szmek 25612ecba4 tree-wide: drop copyright lines for more authors
Acks in https://github.com/systemd/systemd/issues/9320.
2018-06-22 16:39:45 +02:00
Lennart Poettering 7675251663 tree-wide: pass NULL arguments to manager_startup() directly, avoid declaring unneeded variables 2018-06-20 23:59:29 +02:00
Lennart Poettering 96b2fb93c5 tree-wide: beautify remaining copyright statements
Let's unify and beautify our remaining copyright statements, with a
unicode ©. This means our copyright statements are now always formatted
the same way. Yay.
2018-06-14 10:20:21 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Lennart Poettering 2cb36f7c1e
Merge pull request #8575 from keszybz/non-absolute-paths
Do not require absolute paths in ExecStart and friends
2018-04-17 15:54:10 +02:00
Zbigniew Jędrzejewski-Szmek ba412430a9 tests: use manager_load_startable_unit_or_warn() to load units
Doing manager_load_unit() followed by UNIT_VTABLE(unit)->start(unit) would
result in an assertion failure in ->start() if the unit failed to load
properly. Something like this is okay-ish in tests, since the test units are
not expected to fail to load, but the reason for failure is clearer if we
fail immediately.
2018-04-16 16:08:52 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd be nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Zbigniew Jędrzejewski-Szmek e8112e67e4 Make MANAGER_TEST_RUN_MINIMAL just allocate data structures
When running tests like test-unit-name, there is no point in setting
up the cgroup and signals and interacting with the environment. Similarly
when running fuzz testing of the parser.

Add new MANAGER_TEST_RUN_BASIC which takes the role of MANAGER_TEST_RUN_MINIMAL,
and redefine MANAGER_TEST_RUN_MINIMAL to just create the basic data structures.
2018-03-11 16:33:59 +01:00
Zbigniew Jędrzejewski-Szmek c70cac548a Introduce _cleanup_(manager_freep) 2018-03-11 16:33:57 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering ec635a2d21 cgroup: improve cg_mask_to_string a bit, and add tests for it 2017-11-13 10:24:03 +01:00
Zbigniew Jędrzejewski-Szmek 651d47d14b tests: skip tests when cg_pid_get_path fails (#7033)
v2:
- cast the fstype_t type to ull, because it varies between arches.
  Making it long long should be on the safe side.
2017-10-10 20:55:20 +02:00
Yu Watanabe 4c70109600 tree-wide: use IN_SET macro (#6977) 2017-10-04 16:01:32 +02:00
Zbigniew Jędrzejewski-Szmek e0a3da1fd2 Make test_run into a flags field and disable generators again
Now generators are only run in systemd --test mode, where this makes
most sense (how are you going to test what would happen otherwise?).

Fixes #6842.

v2:
- rename test_run to test_run_flags
2017-09-19 20:14:05 +02:00
Lennart Poettering 8c759b33a4 tests: when running a manager object in a test, migrate to private cgroup subroot first (#6576)
Without this "meson test" will end up running all tests in the same
cgroup root, and they all will try to manage it. Which usually isn't too
bad, except when they end up clearing up each other's cgroups. This race
is hard to trigger but has caused various CI runs to fail spuriously.

With this change we simply move every test that runs a manager object
into their own private cgroup. Note that we don't clean up the cgroup at
the end, we leave that to the cgroup manager around it.

This fixes races that become visible by test runs throwing out errors
like this:

```
exec-systemcallfilter-failing.service: Passing 0 fds to service
exec-systemcallfilter-failing.service: About to execute: /bin/echo 'This should not be seen'
exec-systemcallfilter-failing.service: Forked /bin/echo as 5693
exec-systemcallfilter-failing.service: Changed dead -> start
exec-systemcallfilter-failing.service: Failed to attach to cgroup /exec-systemcallfilter-failing.service: No such file or directory
Received SIGCHLD from PID 5693 ((echo)).
Child 5693 ((echo)) died (code=exited, status=219/CGROUP)
exec-systemcallfilter-failing.service: Child 5693 belongs to exec-systemcallfilter-failing.service
exec-systemcallfilter-failing.service: Main process exited, code=exited, status=219/CGROUP
exec-systemcallfilter-failing.service: Changed start -> failed
exec-systemcallfilter-failing.service: Unit entered failed state.
exec-systemcallfilter-failing.service: Failed with result 'exit-code'.
exec-systemcallfilter-failing.service: cgroup is empty
Assertion 'service->main_exec_status.status == status_expected' failed at ../src/src/test/test-execute.c:71, function check(). Aborting.
```

BTW, I tracked this race down by using perf:

```
        # perf record -e cgroup:cgroup_mkdir,cgroup_rmdir
        …
        # perf script
```

Thanks a lot @iaguis, @alban for helping me figure out how to use perf for this.

Fixes #5895.
2017-08-09 09:42:49 -04:00
Martin Pitt cc100a5a9b test: drop TEST_DATA_DIR, fold into get_testdata_dir()
Drop the TEST_DATA_DIR macro as this was using alloca() within a
function call which is allegedly unsafe. So add a "suffix" argument to
get_testdata_dir() instead and call that directly.
2017-02-16 21:45:57 +01:00
Martin Pitt 3e29e810ae test: setup test data dir before fake runtime dir
That way, if the test directory does not exist we don't leave behind
temporary files (as in that case or on test failure the cleanup actions
don't run).
2017-02-16 21:36:30 +01:00
Martin Pitt f853c6efb5 test: make unit tests relocatable
It is useful to package test-* binaries and run them as root under
autopkgtest or manually on particular machines. They currently have a
built-in hardcoded absolute path to their test data, which does not work
when running the test programs from any other path than the original
build directory.

By default, make the tests look for their data in
<test_exe_directory>/testdata/ so that they can be called from any
directory (provided that the corresponding test data is installed
correctly). As we don't have a fixed static path in the build tree (as
build and source tree are independent), set $TEST_DIR with "make check"
to point to <srcdir>/test/, as we previously did with an automake
variable.
2017-02-13 22:31:13 +01:00
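
The lookup order described in the relocatable-tests commit above (prefer `$TEST_DIR`, otherwise fall back to `testdata/` next to the test executable) might look roughly like this. The function name `resolve_testdata_dir()` and its signature are hypothetical; the sketch takes the environment value and the executable path as parameters so that the logic can be exercised without touching the real environment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the test data directory into buf.
 * Returns 0 on success, -1 if the result does not fit into buf or if
 * exe_path has no directory component. */
static int resolve_testdata_dir(const char *test_dir_env, const char *exe_path,
                                char *buf, size_t size) {
        const char *slash;
        int n;

        assert(exe_path);
        assert(buf);

        if (test_dir_env && test_dir_env[0] != '\0')
                /* An explicit override wins, e.g. set by "make check". */
                n = snprintf(buf, size, "%s", test_dir_env);
        else {
                /* Otherwise use <directory of the executable>/testdata. */
                slash = strrchr(exe_path, '/');
                if (!slash)
                        return -1;
                n = snprintf(buf, size, "%.*s/testdata", (int) (slash - exe_path), exe_path);
        }

        return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

A caller would typically pass `getenv("TEST_DIR")` and the target of `/proc/self/exe` as the first two arguments.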
Lennart Poettering f9e26ecc48 Merge pull request #3290 from htejun/cgroup2-io-compat
Implement compat translation between IO* and BlockIO* settings
2016-05-20 18:53:11 +02:00
Evgeny Vereshchagin f942504e4f basic: remove rm_rf_and_free, add rm_rf_physical_and_free, use rm_rf_physical_and_freep in tests (#3292)
Some distros don't mount /tmp as tmpfs.
For example:
https://lists.ubuntu.com/archives/ubuntu-cloud/2016-January/001009.html

Some tests:
* print 'Attempted to remove disk file system, and we can't allow that.'
* don't really clean up /tmp
2016-05-20 15:08:24 +02:00
Tejun Heo 538b48524c core: translate between IO and BlockIO settings to ease transition
Due to the substantial interface changes in cgroup unified hierarchy, new IO
settings are introduced.  Currently, IO settings apply only to unified
hierarchy and BlockIO to legacy.  While the transition is necessary, it's
painful for users to have to provide configs for both.  This patch implements
translation from one config set to another for configs which make sense.

* The translation takes place during application of the configs.  Users won't
  see IO or BlockIO settings appearing without being explicitly created.

* The translation takes place only if there is no config for the matching
  cgroup hierarchy type at all.

While this doesn't provide comprehensive compatibility, it should considerably
ease transition to the new IO settings which are a superset of BlockIO
settings.

v2:

- Update test-cgroup-mask.c so that it accounts for the fact that
  CGROUP_MASK_IO and CGROUP_MASK_BLKIO move together.  Also, test/parent.slice
  now sets IOWeight instead of BlockIOWeight.
2016-05-18 17:35:12 -07:00
Lennart Poettering d2120590ff tests: override XDG_RUNTIME_DIR where we use the user runtime dir
We don't really support systems where XDG_RUNTIME_DIR is not supported for
systemd --user. Hence, let's always set our own XDG_RUNTIME_DIR for tests that
involve systemd --user, so that we know it is set, and that it doesn't pollute
the user's actual runtime dir.
2016-04-12 13:43:33 +02:00
Lennart Poettering 463d0d1569 core: remove ManagerRunningAs enum
Previously, we had two enums ManagerRunningAs and UnitFileScope, that were
mostly identical and converted from one to the other all the time. The latter
had one more value UNIT_FILE_GLOBAL however.

Let's simplify things, and remove ManagerRunningAs and replace it by
UnitFileScope everywhere, thus making the translation unnecessary. Introduce
two new macros MANAGER_IS_SYSTEM() and MANAGER_IS_USER() to simplify checking
if we are running in the system or the user context.
2016-04-12 13:43:30 +02:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so no need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Zbigniew Jędrzejewski-Szmek def8b4c5d6 test-cgroup-mask: check return value
CID #1339830.
2016-01-20 18:55:56 -05:00
Thomas Hindoe Paaboel Andersen cf0fbc49e6 tree-wide: sort includes
Sort the includes according to the new coding style.
2015-11-16 22:09:36 +01:00
Lennart Poettering 9ded9cd14c core: enable TasksMax= for all services by default, and set it to 512
Also, enable TasksAccounting= for all services by default, too.

See:

http://lists.freedesktop.org/archives/systemd-devel/2015-November/035006.html
2015-11-16 11:57:48 +01:00
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support for the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possible to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and add each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified hierarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicely, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefully makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00
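
The statfs()-based mode check described in the unified hierarchy commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the `Hierarchy` enum and both function names are hypothetical, and the sketch compares against the `CGROUP2_SUPER_MAGIC` value that current kernels report for the unified hierarchy, which may differ from what the kernel reported when this commit was written.

```c
#include <assert.h>
#include <sys/vfs.h>

/* Filesystem magic numbers, with the values found in <linux/magic.h>. */
#define MAGIC_TMPFS   0x01021994UL /* TMPFS_MAGIC */
#define MAGIC_CGROUP2 0x63677270UL /* CGROUP2_SUPER_MAGIC (assumption, see lead-in) */

/* Hypothetical result type for the check. */
typedef enum {
        HIERARCHY_UNKNOWN,
        HIERARCHY_LEGACY,
        HIERARCHY_UNIFIED,
} Hierarchy;

/* Pure classification step: tmpfs means legacy, cgroupfs means unified. */
static Hierarchy hierarchy_from_fs_type(unsigned long f_type) {
        if (f_type == MAGIC_TMPFS)
                return HIERARCHY_LEGACY;
        if (f_type == MAGIC_CGROUP2)
                return HIERARCHY_UNIFIED;
        return HIERARCHY_UNKNOWN;
}

/* Query the filesystem type of path (normally "/sys/fs/cgroup"). */
static Hierarchy detect_hierarchy(const char *path) {
        struct statfs fs;

        assert(path);

        if (statfs(path, &fs) < 0)
                return HIERARCHY_UNKNOWN;

        return hierarchy_from_fs_type((unsigned long) fs.f_type);
}
```

Keeping the classification separate from the statfs() call makes the decision logic testable on machines where /sys/fs/cgroup is mounted differently or not at all.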
Filipe Brandenburger 2bf25eeff8 test-cgroup-mask: unit_get_sibling_mask ignores cgroup_supported
The result of unit_get_sibling_mask returns bits for the sibling cgroups
even if they are not supported in the local system.

I caught this on a machine where my kernel was misconfigured with
CONFIG_MEMCG unset, but the rest of the cgroup infrastructure enabled.

Tested with `make check` on a host running a kernel where CONFIG_MEMCG
is not set.
2015-06-11 20:12:01 -07:00
Lennart Poettering b2c23da8ce core: rename SystemdRunningAs to ManagerRunningAs
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's
stick to this here too.
2015-05-11 22:51:49 +02:00
Thomas Hindoe Paaboel Andersen 2eec67acbb remove unused includes
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
2015-02-23 23:53:42 +01:00
Thomas Hindoe Paaboel Andersen bdf7026e95 test: only use assert_se
The asserts used in the tests should never be allowed to be
optimized away
2014-10-04 23:55:35 +02:00
Zbigniew Jędrzejewski-Szmek 8328d8c633 test-cgroup-mask: fix masks in test and enable by default
Commit 637f421e5c ("cgroups: always propagate controller membership
to siblings") changed the mask propagation logic, but the test wasn't
updated.

Move to normal tests from manual tests, it should not touch the system
anymore.
2014-07-20 19:48:16 -04:00
Zbigniew Jędrzejewski-Szmek c2ef6f8427 test-cgroup-mask: pass on kernels without memory controller
It seems that unit_get_siblings_mask returns the controllers
filtered by what is available, but get_members_mask and
get_cgroup_mask do not. This just fixes the test following the
symptoms.
2014-07-20 19:48:16 -04:00
Zbigniew Jędrzejewski-Szmek 0d8c31ff72 test-engine: fix access to unit load path
Also add a bit of debugging output to help diagnose problems,
add missing units, and simplify cppflags.

Move test-engine to normal tests from manual tests, it should now
work without destroying the system.
2014-07-20 19:48:16 -04:00