Systemd

Commit Graph

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	2fe21124a6	Add open_memstream_unlocked() wrapper	2019-04-12 11:44:57 +02:00
Lennart Poettering	eefc66aa8f	util: split out some stuff into a new file limits-util.[ch]	2019-03-13 12:16:43 +01:00
Filipe Brandenburger	527ede0c63	core: downgrade CPUQuotaPeriodSec= clamping logs to debug After the first warning log, further messages are downgraded to LOG_DEBUG.	2019-02-14 11:04:42 -08:00
Filipe Brandenburger	10f2864111	core: add CPUQuotaPeriodSec= This new setting allows configuration of CFS period on the CPU cgroup, instead of using a hardcoded default of 100ms. Tested: - Legacy cgroup + Unified cgroup - systemctl set-property - systemctl show - Confirmed that the cgroup settings (such as cpu.cfs_period_us) were set appropriately, including updating the CPU quota (cpu.cfs_quota_us) when CPUQuotaPeriodSec= is updated. - Checked that clamping works properly when either period or (quota * period) are below the resolution of 1ms, or if period is above the max of 1s.	2019-02-14 11:04:42 -08:00
Chris Down	c72703e26d	cgroup: Add DisableControllers= directive to disable controller in subtree Some controllers (like the CPU controller) have a performance cost that is non-trivial on certain workloads. While this can be mitigated and improved to an extent, there will for some controllers always be some overheads associated with the benefits gained from the controller. Inside Facebook, the fix applied has been to disable the CPU controller forcibly with `cgroup_disable=cpu` on the kernel command line. This presents a problem: to disable or reenable the controller, a reboot is required, but this is quite cumbersome and slow to do for many thousands of machines, especially machines where disabling/enabling a stateful service on a machine is a matter of several minutes. Currently systemd provides some configuration knobs for these in the form of `[Default]CPUAccounting`, `[Default]MemoryAccounting`, and the like. The limitation of these is that Default*Accounting is overrideable by individual services, of which any one could decide to reenable a controller within the hierarchy at any point just by using a controller feature implicitly (eg. `CPUWeight`), even if the use of that CPU feature could just be opportunistic. Since many services are provided by the distribution, or by upstream teams at a particular organisation, it's not a sustainable solution to simply try to find and remove offending directives from these units. This commit presents a more direct solution -- a DisableControllers= directive that forcibly disallows a controller from being enabled within a subtree.	2018-12-03 15:40:31 +00:00
Chris Down	f98c25850f	cgroup v2: Don't require CPU controller for CPU accounting in 4.15+ systemd only uses functions that are as of Linux 4.15+ provided externally to the CPU controller (currently usage_usec), so if we have a new enough kernel, we don't need to set CGROUP_MASK_CPU for CPUAccounting=true as the CPU controller does not need to necessarily be enabled in this case. Part of this patch is modelled on an earlier patch by Ryutaroh Matsumoto (see PR #9665).	2018-11-18 12:21:41 +00:00
Tejun Heo	6ae4283cb1	core: add IODeviceLatencyTargetSec This adds support for the following proposed latency based IO control mechanism. https://lkml.org/lkml/2018/6/5/428	2018-08-22 16:46:18 +02:00
Tejun Heo	4842263577	core: add MemoryMin The kernel added support for a new cgroup memory controller knob memory.min in bf8d5d52ffe8 ("memcg: introduce memory.min") which was merged during v4.18 merge window. Add MemoryMin to support memory.min.	2018-07-12 08:21:43 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Lennart Poettering	57e84e7535	core: rework how we validate DeviceAllow= settings Let's make sure we don't validate "char-" and "block-" expressions as paths.	2018-06-11 18:01:06 +02:00
Lennart Poettering	9d5e9b4add	cgroup: relax checks for block device cgroup settings This drops needless safety checks that ensure we only reference block devices for blockio/io settings. The backing code was already able to accept regular file system paths too, in which case the backing device node of that file system would be used. Hence, let's drop the artificial restrictions and open up this underlying functionality.	2018-06-11 18:01:06 +02:00
Giuseppe Scrivano	ef42f561fc	src/core/dbus-cgroup.c: fix typo contoller -> controller (#8717) Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2018-04-14 11:06:11 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	906bdbf5e7	core/cgroup: accepts MemorySwapMax=0 (#8366) Also, this moves two macros from dbus-util.h to dbus-cgroup.c, as they are only used in dbus-cgroup.c. Fixes #8363.	2018-03-09 11:34:50 +01:00
Lennart Poettering	2ae7ee58fa	bpf: beef up bpf detection, check if BPF_F_ALLOW_MULTI is supported This improves the BPF/cgroup detection logic, and looks whether BPF_F_ALLOW_MULTI is supported. This flag allows execution of multiple BPF filters in a recursive fashion for a whole cgroup tree. It enables us to properly report IP accounting for slice units, as well as delegation of BPF support to units without breaking our own IP accounting.	2018-02-21 16:43:36 +01:00
Lennart Poettering	1d9cc8768f	cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only Currently we allowed delegation for all units with cgroup backing except for slices. Let's make this a bit more strict for now, and only allow this in service and scope units. Let's also add a generic accessor unit_cgroup_delegate() for checking whether a unit has delegation turned on that checks the new bool first. Also, when doing transient units, let's explicitly refuse turning on delegation for unit types that don't support it. This is mostly cosmetic as we wouldn't act on the delegation request anyway, but certainly helpful for debugging.	2018-02-12 11:34:00 +01:00
Lennart Poettering	60644c3dea	cgroup: fix handling of TasksAccounting= property	2018-01-23 21:22:50 +01:00
Yu Watanabe	681ae88e06	dbus-cgroup: simplify bus_cgroup_set_property()	2018-01-03 02:33:16 +09:00
Yu Watanabe	fffbc1dc7f	dbus-cgroup: add missing space	2018-01-03 02:33:06 +09:00
Yu Watanabe	32048f5414	cgroup: IODeviceWeight= or friends can take device node files in /run/systemd/inaccessible/ systemd creates several device nodes in /run/systemd/inaccessible/. This allows the cgroup settings related to IO to take device node files in that directory.	2017-12-23 19:32:42 +09:00
Yu Watanabe	13ec20d42a	dbus-cgroup: merge several blocks which operate almost same tasks	2017-12-23 19:32:36 +09:00
Yu Watanabe	d9f7305fd7	cgroup: move path checking logic to dbus-cgroup.c	2017-12-23 19:32:29 +09:00
Lennart Poettering	0d53667334	tree-wide: use __fsetlocking() instead of fxyz_unlocked() Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f, FSETLOCKING_BYCALLER). This turns off locking for the entire FILE, instead of doing individual per-call decision whether to use normal calls or _unlocked() calls. This has various benefits: 1. It's easier to read and easier not to forget 2. It's more comprehensive, as fprintf() and friends are covered too (as these functions have no _unlocked() counterpart) 3. Philosophically, it's a bit more correct, because it's more a property of the file handle really whether we ever pass it on to another thread, not of the operations we then apply to it. This patch reworks all pieces of code that so far used fxyz_unlocked() calls to use __fsetlocking() instead. It also reworks all places that use open_memstream(), i.e. use stdio FILE for string manipulations. Note that this is in some way a revert of `4b61c87511`.	2017-12-14 10:42:25 +01:00
Lennart Poettering	66a892ae3d	core: accept MemorySwapMax= properties that are scaled, too Let's do what we already do for MemoryMax= and friends for MemorySwapMax= too.	2017-11-29 20:12:26 +01:00
Lennart Poettering	e74f76ca86	tree-wide: generate SD_BUS_ERROR_INVALID_ARGS when we get invalid arguments on bus calls Let's make sure that when we return a D-Bus error, we return a native one, if we generate it ourselves, and use errno-based error synthetization only if we received an errno ourselves. Yes, this makes things slightly longer, but is highly misleading as we propagate D-Bus errors, and not errnos to the client.	2017-11-29 12:34:12 +01:00
Lennart Poettering	2e59b241ca	core: add proper escaping to writing of drop-ins/transient unit files This majorly refactors the transient unit file and drop-in writing logic, so that we properly C-escape and specifier-escape (% → %%) everything we write out, so that when we read it back again, no specifiers are parsed that aren't supposed to be parsed. This renames unit_write_drop_in() and friends to unit_write_setting(). The name change is supposed to clarify that the functions are not only used to write drop-in files, but also transient unit files. The previous "mode" parameter to this function is replaced by a more generic "flags", which knows additional flags for implicit C-style and specifier escaping before writing things out. This can cover most properties where either form of escaping is defined. For the cases where this isn't sufficient, we add helpers unit_escape_setting() and unit_concat_strv() for escaping individual strings or strvs properly. While we are at it, we also prettify generation of transient unit files: we try to reduce the number of section headers written out: previously we'd write the right section header out for each setting. With this change we do so only if the setting lives in a different section than the one before. (This should also be considered preparation for when we add proper APIs to systemd to write normal, persistent unit files through the bus API)	2017-11-29 12:34:12 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Yu Watanabe	1bdfc7b951	core/cgroup: assigning empty string to Delegate= resets list of controllers (#7336) Before this, assigning empty string to Delegate= makes no change to the controller list. This is inconsistent to the other options that take list of strings. After this, when empty string is assigned to Delegate=, the list of controllers is reset. Such behavior is consistent to other options and useful for drop-in configs. Closes #7334.	2017-11-17 10:04:25 +01:00
Lennart Poettering	ab8519c28b	core: only warn about BPF/cgroup missing once per runtime (#7319) Let's reduce the amount of noise a bit, there's little point in complaining loudly about every single unit like this, let's complain only about the first one, and then downgrade the log level to LOG_DEBUG for the other cases. Fixes: #7188	2017-11-13 22:02:51 +01:00
Lennart Poettering	0263828039	core: rework the Delegate= unit file setting to take a list of controller names Previously it was not possible to select which controllers to enable for a unit where Delegate=yes was set, as all controllers were enabled. With this change, this is made configurable, and thus delegation units can pick specifically what they want to manage themselves, and what they don't care about.	2017-11-13 10:49:15 +01:00
Lennart Poettering	4fe66c8681	core: improve dbus-cgroup error message As suggested by @keszybz in the review of #6764	2017-09-26 23:49:40 +02:00
Lennart Poettering	078ba556da	core: warn loudly if IP firewalling is configured but not in effect	2017-09-22 15:24:55 +02:00
Lennart Poettering	1274b6c687	ip-address-access: minimize IP address lists Let's drop redundant items from the IP address list after parsing. Let's also mask out redundant bits hidden by the prefixlength.	2017-09-22 15:24:55 +02:00
Lennart Poettering	3dc5ca9787	core: support IP firewalling to be configured for transient units	2017-09-22 15:24:55 +02:00
Lennart Poettering	27458ed629	tree-wide: use path_startswith() rather than startswith() where ever that's appropriate When checking path prefixes we really should use the right APIs, just in case people add multiple slashes to their paths...	2017-08-09 19:03:39 +02:00
Lennart Poettering	4b61c87511	tree-wide: fput[cs]() → fput[cs]_unlocked() wherever that makes sense (#6396) As a follow-up for `db3f45e2d2` let's do the same for all other cases where we create a FILE* with local scope and know that no other threads hence can have access to it. For most cases this shouldn't change much really, but this should speed dbus introspection and calendar time formatting up a bit.	2017-07-21 10:35:45 +02:00
Zbigniew Jędrzejewski-Szmek	d4bf82fcac	pid1: properly encode infinity when writing CPUQuota snippet (#6141) We would write [Slice] CPUQuota=1844674407370955% which is (numerically) correct, but it seems better to just write [Slice] CPUQuota= which is interpreted as USEC_INFINITY by the parser in config_parse_cpu_quota(). Fixes #5965.	2017-06-18 11:18:41 +02:00
WaLyong Cho	96e131ea09	core: introduce MemorySwapMax= Similar to MemoryMax=, MemorySwapMax= limits swap usage. This controls the "memory.swap.max" attribute in unified cgroup.	2016-08-30 11:11:45 +09:00
Tejun Heo	66ebf6c0a1	core: add cgroup CPU controller support on the unified hierarchy Unfortunately, due to the disagreements in the kernel development community, CPU controller cgroup v2 support has not been merged and enabling it requires applying two small out-of-tree kernel patches. The situation is explained in the following documentation. https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu While it isn't clear what will happen with CPU controller cgroup v2 support, there are critical features which are possible only on cgroup v2 such as buffered write control making cgroup v2 essential for a lot of workloads. This commit implements systemd CPU controller support on the unified hierarchy so that users who choose to deploy CPU controller cgroup v2 support can easily take advantage of it. On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max" replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config options are added with the usual compat translation. CPU quota settings remain unchanged and apply to both legacy and unified hierarchies. v2: - Error in man page corrected. - CPU config application in cgroup_context_apply() refactored. - CPU accounting now works on unified hierarchy.	2016-08-07 09:45:39 -04:00
Lennart Poettering	f7903e8db6	core: rename MemoryLimitByPhysicalMemory transient property to MemoryLimitScale That way, we can neatly keep this in line with the new TasksMaxScale= option. Note that we didn't release a version with MemoryLimitByPhysicalMemory= yet, hence this change should be unproblematic without breaking API.	2016-07-22 15:33:12 +02:00
Lennart Poettering	83f8e80857	core: support percentage specifications on TasksMax= This adds support for a TasksMax=40% syntax for specifying values relative to the system's configured maximum number of processes. This is useful in order to neatly subdivide the available room for tasks within containers.	2016-07-22 15:33:12 +02:00
Alessandro Puccetti	31d28eabc1	nspawn: enable major=0/minor=0 devices inside the container (#3773) https://github.com/systemd/systemd/pull/3685 introduced /run/systemd/inaccessible/{chr,blk} to map inaccessible devices, this patch allows systemd running inside a nspawn container to create /run/systemd/inaccessible/{chr,blk}.	2016-07-21 17:39:38 +02:00
Lennart Poettering	d58d600efd	systemctl: allow percent-based MemoryLimit= settings via systemctl set-property The unit files already accept relative, percent-based memory limit specification, let's make sure "systemctl set-property" support this too. Since we want the physical memory size of the destination machine to apply we pass the percentage in a new set of properties that only exist for this purpose, and can only be set.	2016-06-14 20:01:45 +02:00
Lennart Poettering	799ec13412	core: make sure to use "infinity" in unit files, not "max" The latter is a kernelism, we only understand "infinity".	2016-06-14 19:50:38 +02:00
Lennart Poettering	cd0a7a8e58	core: when receiving a memory limit via the bus, refuse 0 When parsing unit files we already refuse unit memory limits of zero, let's also refuse it when the value is set via the bus.	2016-06-14 19:50:38 +02:00
Zbigniew Jędrzejewski-Szmek	b27b4b51c6	tree-wide: remove newlines from unit_write_drop_in This reverts part of #3329, but all for a good cause.	2016-05-28 16:29:42 -04:00
Tejun Heo	da4d897e75	core: add cgroup memory controller support on the unified hierarchy (#3315) On the unified hierarchy, memory controller implements three control knobs - low, high and max which enables more useable and versatile control over memory usage. This patch implements support for the three control knobs. * MemoryLow, MemoryHigh and MemoryMax are added for memory.low, memory.high and memory.max, respectively. * As all absolute limits on the unified hierarchy use "max" for no limit, make memory limit parse functions accept "max" in addition to "infinity" and document "max" for the new knobs. * Implement compatibility translation between MemoryMax and MemoryLimit. v2: - Fixed missing else's in config_parse_memory_limit(). - Fixed missing newline when writing out drop-ins. - Coding style updates to use "val > 0" instead of "val". - Minor updates to documentation.	2016-05-27 18:10:18 +02:00
Tejun Heo	0c2d96f5f5	core: fix missing newlines when writing out drop-ins for cgroup settings Except for per-device BlockIO, IO and DeviceAllow/Deny settings, all were missing newline causing the next drop-in to be concatenated at the end of the line. Fix it.	2016-05-23 16:48:46 -04:00
Tejun Heo	6fb0926976	core: fix the reversed sanity check when setting StartupBlockIOWeight over dbus bus_cgroup_set_property() was rejecting if the input value was in range. Reverse it.	2016-05-23 16:48:46 -04:00

96 Commits