Systemd

Commit Graph

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	2e2ed88062	pid1,systemctl: allow symbolic exit code names	2019-07-29 15:54:53 +02:00
Zbigniew Jędrzejewski-Szmek	23d5dd1687	shared/exit-status: use Bitmap instead of Sets I opted to embed the Bitmap structure directly in the ExitStatusSet. This means that memory usage is a bit higher for units which don't define this setting: Service changes: /* size: 2720, cachelines: 43, members: 73 / / sum members: 2680, holes: 9, sum holes: 39 / / sum bitfield members: 7 bits, bit holes: 1, sum bit holes: 1 bits / / last cacheline: 32 bytes / / size: 2816, cachelines: 44, members: 73 / / sum members: 2776, holes: 9, sum holes: 39 / / sum bitfield members: 7 bits, bit holes: 1, sum bit holes: 1 bits */ But this way the code is simpler and we do less pointer chasing.	2019-07-29 15:54:53 +02:00
Zbigniew Jędrzejewski-Szmek	4ec8514142	Rename EXTRACT_QUOTES to EXTRACT_UNQUOTE Whenever I see EXTRACT_QUOTES, I'm always confused whether it means to leave the quotes in or to take them out. Let's say "unquote", like we say "cunescape".	2019-06-28 11:35:05 +02:00
Zbigniew Jędrzejewski-Szmek	cae90de3d3	Reindent some things for readability	2019-06-28 11:19:24 +02:00
Zbigniew Jędrzejewski-Szmek	9266f31e61	core: skip whitespace after "\|" and "!" in the condition parser We'd skip any whitespace immediately after "=", but then we'd treat whitespace that is between "\|" or "!" and the value as significant. This is rather confusing, let's ignore it too.	2019-06-27 10:54:37 +02:00
Frantisek Sumsal	a07a7324ad	core: move config_parse_* functions to a shared module Apart from making the code a little bit more clean, it should allow us to write a fuzzer around the config-parsing functions in the future	2019-06-25 22:35:02 +09:00
Kai Lüke	fab347489f	bpf-firewall: custom BPF programs through IP(Ingress\|Egress)FilterPath= Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be specified multiple times. An empty assignment resets all previous filters. Closes https://github.com/systemd/systemd/issues/10227	2019-06-25 09:56:16 +02:00
Michal Sekletar	b070c7c0e1	core: introduce NUMAPolicy and NUMAMask options Make possible to set NUMA allocation policy for manager. Manager's policy is by default inherited to all forked off processes. However, it is possible to override the policy on per-service basis. Currently we support, these policies: default, prefer, bind, interleave, local. See man 2 set_mempolicy for details on each policy. Overall NUMA policy actually consists of two parts. Policy itself and bitmask representing NUMA nodes where is policy effective. Node mask can be specified using related option, NUMAMask. Default mask can be overwritten on per-service level.	2019-06-24 16:58:54 +02:00
Yu Watanabe	657ee2d82b	tree-wide: replace strjoin() with path_join()	2019-06-21 03:26:16 +09:00
Zbigniew Jędrzejewski-Szmek	0985c7c4e2	Rework cpu affinity parsing The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size of the array, but operations which take two (CPU_EQUAL_S) or even three arrays (CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must be of the same size, or buffer overruns will occur. This is exactly what our code would do, if it received an array of unexpected size over the network. ("Unexpected" here means anything different from what cpu_set_malloc() detects as the "right" size.) Let's rework this, and store the size in bytes of the allocated storage area. The code will now parse any number up to 8191, independently of what the current kernel supports. This matches the kernel maximum setting for any architecture, to make things more portable. Fixes #12605.	2019-05-29 10:20:42 +02:00
Chris Down	22bf131be2	cgroup: Support 0-value for memory protection directives These make sense to be explicitly set at 0 (which has a different effect than the default, since it can affect processing of `DefaultMemoryXXX`). Without this, it's not easily possible to relinquish memory protection for a subtree, which is not great.	2019-05-08 12:06:32 +01:00
Chris Down	7ad5439e06	unit: Add DefaultMemoryMin	2019-04-16 18:45:04 +01:00
Jan Klötzke	dc653bf487	service: handle abort stops with dedicated timeout When shooting down a service with SIGABRT the user might want to have a much longer stop timeout than on regular stops/shutdowns. Especially in the face of short stop timeouts the time might not be sufficient to write huge core dumps before the service is killed. This commit adds a dedicated (Default)TimeoutAbortSec= timer that is used when stopping a service via SIGABRT. In all other cases the existing TimeoutStopSec= is used. The timer value is unset by default to skip the special handling and use TimeoutStopSec= for state 'stop-watchdog' to keep the old behaviour. If the service is in state 'stop-watchdog' and the service should be stopped explicitly we still go to 'stop-sigterm' and re-apply the usual TimeoutStopSec= timeout.	2019-04-12 17:32:52 +02:00
Chris Down	c52db42b78	cgroup: Implement default propagation of MemoryLow with DefaultMemoryLow In cgroup v2 we have protection tunables -- currently MemoryLow and MemoryMin (there will be more in future for other resources, too). The design of these protection tunables requires not only intermediate cgroups to propagate protections, but also the units at the leaf of that resource's operation to accept it (by setting MemoryLow or MemoryMin). This makes sense from an low-level API design perspective, but it's a good idea to also have a higher-level abstraction that can, by default, propagate these resources to children recursively. In this patch, this happens by having descendants set memory.low to N if their ancestor has DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow value. Any affected unit can opt out of this propagation by manually setting `MemoryLow` to some value in its unit configuration. A unit can also stop further propagation by setting `DefaultMemoryLow=` with no argument. This removes further propagation in the subtree, but has no effect on the unit itself (for that, use `MemoryLow=0`). Our use case in production is simplifying the configuration of machines which heavily rely on memory protection tunables, but currently require tweaking a huge number of unit files to make that a reality. This directive makes that significantly less fragile, and decreases the risk of misconfiguration. After this patch is merged, I will implement DefaultMemoryMin= using the same principles.	2019-04-12 17:23:58 +02:00
Lennart Poettering	afcfaa695c	core: implement OOMPolicy= and watch cgroups for OOM killings This adds a new per-service OOMPolicy= (along with a global DefaultOOMPolicy=) that controls what to do if a process of the service is killed by the kernel's OOM killer. It has three different values: "continue" (old behaviour), "stop" (terminate the service), "kill" (let the kernel kill all the service's processes). On top of that, track OOM killer events per unit: generate a per-unit structured, recognizable log message when we see an OOM killer event, and put the service in a failure state if an OOM killer event was seen and the selected policy was not "continue". A new "result" is defined for this case: "oom-kill". All of this relies on new cgroupv2 kernel functionality: the "memory.events" notification interface and the "memory.oom.group" attribute (which makes the kernel kill all cgroup processes automatically).	2019-04-09 11:17:58 +02:00
Zbigniew Jędrzejewski-Szmek	58f6ab4454	pid1: pass unit name to seccomp parser when we have no file location Building on previous commit, let's pass the unit name when parsing dbus message or builtin whitelist, which is better than nothing. seccomp_parse_syscall_filter() is not needed anymore, so it is removed, and seccomp_parse_syscall_filter_full() is renamed to take its place.	2019-04-03 09:17:42 +02:00
Lennart Poettering	dc44c96d97	core: pass parse error to log functions when parsing timer expressions	2019-04-01 18:25:43 +02:00
Lennart Poettering	25a04ae55e	core: simply timer expression parsing by using ".ltype" field of conf-parser logic No change of behaviour. Let's just not parse the lvalue all the time with timer_base_from_string() if we can already pass it in parsed.	2019-04-01 18:25:43 +02:00
Zbigniew Jędrzejewski-Szmek	983616735e	Merge pull request #12137 from poettering/socket-var-run warn about sockets in /var/run/ too	2019-03-29 15:00:25 +01:00
Lennart Poettering	4a66b5c9bf	core: complain and correct /var/run/ → /run/ for listening sockets We already do that for PIDFile= paths, and for tmpfiles.d/ snippets, let's also do this for .socket paths.	2019-03-28 16:59:57 +01:00
Lennart Poettering	7d2c9c6b50	load-fragment: use TAKE_PTR() where we can	2019-03-28 16:46:27 +01:00
Lennart Poettering	acd142af79	core: break overly long line	2019-03-28 12:09:38 +01:00
Lennart Poettering	2f6b9110fc	core: parse '@default' seccomp group permissively We are about to add system calls (rseq()) not available on old libseccomp/old kernels, and hence we need to be permissive when parsing our definitions.	2019-03-28 12:09:38 +01:00
Lennart Poettering	d8b4d14df4	util: split out nulstr related stuff to nulstr-util.[ch]	2019-03-14 13:25:52 +01:00
Lennart Poettering	eefc66aa8f	util: split out some stuff into a new file limits-util.[ch]	2019-03-13 12:16:43 +01:00
Lennart Poettering	55dadc5c57	core: warn if people use the undocumented/depreacted ConditionNull= Triggered by: https://github.com/systemd/systemd/issues/11812	2019-03-05 13:54:20 +01:00
Anita Zhang	7ca69792e5	core: add ':' prefix to ExecXYZ= skip env var substitution	2019-02-20 17:58:14 +01:00
Filipe Brandenburger	10f2864111	core: add CPUQuotaPeriodSec= This new setting allows configuration of CFS period on the CPU cgroup, instead of using a hardcoded default of 100ms. Tested: - Legacy cgroup + Unified cgroup - systemctl set-property - systemctl show - Confirmed that the cgroup settings (such as cpu.cfs_period_ns) were set appropriately, including updating the CPU quota (cpu.cfs_quota_ns) when CPUQuotaPeriodSec= is updated. - Checked that clamping works properly when either period or (quota * period) are below the resolution of 1ms, or if period is above the max of 1s.	2019-02-14 11:04:42 -08:00
Yu Watanabe	b8055c05e2	core/load-fragment: empty assignment to PIDFile= resets the value Follow-up for `a9353a5c5b`.	2019-02-06 17:58:24 +01:00
Chris Lamb	4605de118d	Correct more spelling errors.	2019-01-23 23:34:52 +01:00
Khem Raj	baa162cecd	core: Fix use after free case in load_from_path() ensure that mfree() on filename is called after the logging function which uses the string pointed by filename Signed-off-by: Khem Raj <raj.khem@gmail.com>	2018-12-16 22:02:00 -08:00
Chris Down	e92aaed30e	tree-wide: Remove O_CLOEXEC from fdopen fdopen doesn't accept "e", it's ignored. Let's not mislead people into believing that it actually sets O_CLOEXEC. From `man 3 fdopen`: > e (since glibc 2.7): > Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen() As mentioned by @jlebon in #11131.	2018-12-12 20:47:40 +01:00
Yu Watanabe	3843e8260c	missing: rename securebits.h to missing_securebits.h	2018-12-04 07:49:24 +01:00
Chris Down	c72703e26d	cgroup: Add DisableControllers= directive to disable controller in subtree Some controllers (like the CPU controller) have a performance cost that is non-trivial on certain workloads. While this can be mitigated and improved to an extent, there will for some controllers always be some overheads associated with the benefits gained from the controller. Inside Facebook, the fix applied has been to disable the CPU controller forcibly with `cgroup_disable=cpu` on the kernel command line. This presents a problem: to disable or reenable the controller, a reboot is required, but this is quite cumbersome and slow to do for many thousands of machines, especially machines where disabling/enabling a stateful service on a machine is a matter of several minutes. Currently systemd provides some configuration knobs for these in the form of `[Default]CPUAccounting`, `[Default]MemoryAccounting`, and the like. The limitation of these is that Default*Accounting is overrideable by individual services, of which any one could decide to reenable a controller within the hierarchy at any point just by using a controller feature implicitly (eg. `CPUWeight`), even if the use of that CPU feature could just be opportunistic. Since many services are provided by the distribution, or by upstream teams at a particular organisation, it's not a sustainable solution to simply try to find and remove offending directives from these units. This commit presents a more direct solution -- a DisableControllers= directive that forcibly disallows a controller from being enabled within a subtree.	2018-12-03 15:40:31 +00:00
Yu Watanabe	d2b42d63c4	core,run: make SocketProtocol= accept protocol name in upper case an protocol number	2018-12-02 06:13:47 +01:00
Yu Watanabe	da96ad5ae2	util: rename socket_protocol_{from,to}_name() to ip_protocol_{from,to}_name()	2018-12-02 05:48:27 +01:00
Zbigniew Jędrzejewski-Szmek	049af8ad0c	Split out part of mount-util.c into mountpoint-util.c The idea is that anything which is related to actually manipulating mounts is in mount-util.c, but functions for mountpoint introspection are moved to the new file. Anything which requires libmount must be in mount-util.c. This was supposed to be a preparation for further changes, with no functional difference, but it results in a significant change in linkage: $ ldd build/libnss_*.so.2 (before) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff77bf5000) librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000) libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000) /lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffc19cc0000) librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000) libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000) libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000) /lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe8ef8e000) librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000) libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000) libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000) /lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffda6d17000) librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000) libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000) /lib64/ld-linux-x86-64.so.2 (0x00007f610b995000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000) (after) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff0b5e2000) librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000) libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000) /lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffdc30a7000) librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000) libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000) /lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe95747000) librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fa56a808000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000) libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000) /lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffe67b51000) librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000) libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000) libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000) /lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000) I don't quite understand what is going on here, but let's not be too picky.	2018-11-29 21:03:44 +01:00
Zbigniew Jędrzejewski-Szmek	8b4e51a60e	Merge pull request #10797 from poettering/run-generator add new "systemd-run-generator" for running arbitrary commands from the kernel command line as system services using the "systemd.run=" kernel command line switch	2018-11-28 22:40:55 +01:00
Yu Watanabe	acf4d15893	util: make *_from_name() returns negative errno on error	2018-11-28 20:20:50 +09:00
Lennart Poettering	7af67e9a8b	core: allow to set exit status when using SuccessAction=/FailureAction=exit in units This adds SuccessActionExitStatus= and FailureActionExitStatus= that may be used to configure the exit status to propagate in when SuccessAction=exit or FailureAction=exit is used. When not specified let's also propagate the exit status of the main process we fork off for the unit.	2018-11-27 09:44:40 +01:00
Lennart Poettering	49fe5c0996	tree-wide: port various places over to STARTSWITH_SET()	2018-11-26 14:08:46 +01:00
Lennart Poettering	a9353a5c5b	core: log about /var/run/ prefix used in PIDFile=, patch it to be /run instead In a way this is a follow-up for `a2d1fb882c`, but adds a similar warning for PIDFile=. There's a much stronger case for doing this kind of notification in tmpfiles.d (since it helps relating lines to each other for the purpose of merging them). Doing this for PIDFile= is mostly about being systematic and copying tmpfiles.d/ behaviour here. While we are at it, let's also support relative filenames in PIDFile= now, and prefix them with /run, to make them absolute. Fixes: #10657	2018-11-10 19:17:00 +01:00
Zbigniew Jędrzejewski-Szmek	469f76f170	core: accept system mode emergency action specifiers with a warning Before we would only accept those "system" values, so there wasn't other chocie. Let's provide backwards compatiblity in case somebody made use of this functionality in user mode. v2: use 'exit-force' not 'exit' v3: use error value in log_syntax	2018-10-17 19:31:50 +02:00
Zbigniew Jędrzejewski-Szmek	54fcb6192c	core: define "exit" and "exit-force" actions for user units and only accept that We would accept e.g. FailureAction=reboot-force in user units and then do an exit in the user manager. Let's be stricter, and define "exit"/"exit-force" as the only supported actions in user units. v2: - rename 'exit' to 'exit-force' and add new 'exit' - add test for the parsing function	2018-10-17 19:31:49 +02:00
Yu Watanabe	958b8c7bd7	core: fix member access within null pointer config_parse_tasks_max() is also used for parsing system.conf or user.conf. In that case, userdata is NULL. Fixes #10362.	2018-10-11 22:23:39 +02:00
Zbigniew Jędrzejewski-Szmek	5a72417084	pid1: drop unused path parameter to add_two_dependencies_by_name()	2018-09-15 20:02:00 +02:00
Zbigniew Jędrzejewski-Szmek	35d8c19ace	pid1: drop now-unused path parameter to add_dependency_by_name()	2018-09-15 19:57:52 +02:00
Tejun Heo	6ae4283cb1	core: add IODeviceLatencyTargetSec This adds support for the following proposed latency based IO control mechanism. https://lkml.org/lkml/2018/6/5/428	2018-08-22 16:46:18 +02:00
Yu Watanabe	fd870bac25	core: introduce cgroup_add_device_allow()	2018-08-06 13:42:14 +09:00
Lennart Poettering	f806dfd345	tree-wide: increase granularity of percent specifications all over the place to permille We so far had various placed we'd parse percentages with parse_percent(). Let's make them use parse_permille() instead, which is downward compatible (as it also parses percent values), and increases the granularity a bit. Given that on the wire we usually normalize relative specifications to something like UINT32_MAX anyway changing from base-100 to base-1000 calculations can be done easily without breaking compat. This commit doesn't document this change in the man pages. While allowing more precise specifcations permille is not as commonly understood as perent I guess, hence let's keep this out of the docs for now.	2018-07-25 16:14:45 +02:00

1 2 3 4 5 ...

482 Commits