Systemd

Commit Graph

Author	SHA1	Message	Date
Kristijan Gjoshev	acf24a1a84	timer: add new feature FixedRandomDelay= FixedRandomDelay=yes will use `siphash24(sd_id128_get_machine() || MANAGER_IS_SYSTEM(m) || getuid() || u->id)`, where || is concatenation, instead of a random number to choose a value between 0 and RandomizedDelaySec= as the timer delay. This essentially sets up a fixed, but seemingly random, offset for each timer iteration rather than having a random offset recalculated each time it fires. Closes #10355 Co-author: Anita Zhang <the.anitazha@gmail.com>	2020-11-05 10:59:33 -08:00
Lennart Poettering	9b1915256c	core: add Timestamping= option for socket units This adds a way to control SO_TIMESTAMP/SO_TIMESTAMPNS socket options for sockets PID 1 binds to. This is useful in journald so that we get proper timestamps even for ingress log messages that are submitted before journald is running. We recently turned on packet info metadata from PID 1 for these sockets, but the timestamping info was still missing. Let's correct that.	2020-10-27 14:12:39 +01:00
Anita Zhang	f561e8c659	core: move where we send unit change updates to oomd Post-merge suggestion from #15206	2020-10-19 02:46:07 -07:00
Anita Zhang	620ed14e44	core: reindent and align table in load-fragment-gperf.gperf.m4	2020-10-19 02:46:07 -07:00
Anita Zhang	e30bbc90c9	core: add varlink call to get cgroup paths of units using ManagedOOM*=	2020-10-07 16:17:23 -07:00
Anita Zhang	4d824a4e0b	core: add ManagedOOM*= properties to configure systemd-oomd on the unit This adds the hook ups so it can be read with the usual systemd utilities. Used in later commits by systemd-oomd.	2020-10-07 16:17:23 -07:00
Topi Miettinen	9df2cdd8ec	exec: SystemCallLog= directive With new directive SystemCallLog= it's possible to list system calls to be logged. This can be used for auditing or temporarily when constructing system call filters. --- v5: drop intermediary, update HASHMAP_FOREACH_KEY() use v4: skip useless debug messages, actually parse directive v3: don't declare unused variables with old libseccomp v2: fix build without seccomp or old libseccomp	2020-09-15 12:54:17 +03:00
Renaud Métrich	3e5f04bf64	socket: New option 'FlushPending' (boolean) to flush socket before entering listening state Disabled by default. When Enabled, before listening on the socket, flush the content. Applies when Accept=no only.	2020-09-01 17:20:23 +02:00
Lennart Poettering	bb0c0d6f29	core: add credentials logic Fixes: #15778 #16060	2020-08-25 19:45:35 +02:00
Lennart Poettering	f053c9477b	core: drop redundant comment Since `625a164069` we don't need to update analyze-condition.c separately anymore, hence drop the comment suggesting otherwise.	2020-08-25 07:47:50 +02:00
Lennart Poettering	4e39995371	core: introduce ProtectProc= and ProcSubset= to expose hidepid= and subset= procfs mount options Kernel 5.8 gained a hidepid= implementation that is truly per procfs, which allows us to mount a distinct one into every unit, with individual hidepid= settings. Let's expose this via two new settings: ProtectProc= (wrapping hidepid=) and ProcSubset= (wrapping subset=). Replaces: #11670	2020-08-24 20:11:02 +02:00
Lennart Poettering	476cfe626d	core: remove support for ConditionNull= The concept is flawed, and mostly useless. Let's finally remove it. It has been deprecated since `90a2ec10f2` (6 years ago) and we started to warn since `55dadc5c57` (1.5 years ago). Let's get rid of it altogether.	2020-08-20 14:01:25 +02:00
Lennart Poettering	4f55a5b0bf	core: add missing conditions/asserts to unit file parsing	2020-08-20 13:56:14 +02:00
Luca Boccassi	b3d133148e	core: new feature MountImages Follows the same pattern and features as RootImage, but allows an arbitrary mount point under / to be specified by the user, and multiple values - like BindPaths. Original implementation by @topimiettinen at: https://github.com/systemd/systemd/pull/14451 Reworked to use dissect's logic instead of bare libmount() calls and other review comments. Thanks Topi for the initial work to come up with and implement this useful feature.	2020-08-05 21:34:55 +01:00
Luca Boccassi	18d7370587	service: add new RootImageOptions feature Allows to specify mount options for RootImage. In case of multi-partition images, the partition number can be prefixed followed by colon. Eg: RootImageOptions=1:ro,dev 2:nosuid nodev In absence of a partition number, 0 is assumed.	2020-07-29 17:17:32 +01:00
Luca Boccassi	d4d55b0d13	core: add RootHashSignature service parameter Allow to explicitly pass root hash signature as a unit option. Takes precedence over implicit checks.	2020-06-25 08:45:21 +01:00
Luca Boccassi	0389f4fa81	core: add RootHash and RootVerity service parameters Allow to explicitly pass root hash (explicitly or as a file) and verity device/file as unit options. Take precedence over implicit checks.	2020-06-23 10:50:09 +02:00
Jan Klötzke	bf76080180	core: let user define start-/stop-timeout behaviour The usual behaviour when a timeout expires is to terminate/kill the service. This is what users usually want in production systems. To debug services that fail to start/stop (especially sporadic failures) it might be necessary to trigger the watchdog machinery and write core dumps, though. Likewise, it is usually just a waste of time to gracefully stop a stuck service. Instead it might save time to go directly into kill mode. This commit adds two new options to services: TimeoutStartFailureMode= and TimeoutStopFailureMode=. Both take the same values and tweak the behavior of systemd when a start/stop timeout expires: * 'terminate': is the default behaviour as it has always been, * 'abort': triggers the watchdog machinery and will send SIGABRT (unless WatchdogSignal was changed) and * 'kill' will directly send SIGKILL. To handle the stop failure mode in stop-post state too, a new final-watchdog state needs to be introduced.	2020-06-09 10:04:57 +02:00
Lennart Poettering	a3d19f5d99	core: add new PassPacketInfo= socket unit property	2020-05-27 22:40:38 +02:00
Martin Hundebøll	c600357ba6	mount: add ReadWriteOnly property to fail on read-only mounts Systems where a mount point is expected to be read-write need a way to fail mount units that fall back to read-only. Add a property to allow setting the -w option when calling mount(8).	2020-05-01 13:23:30 +02:00
Zbigniew Jędrzejewski-Szmek	ad21e542b2	manager: add CoredumpFilter= setting Fixes #6685.	2020-04-09 14:08:48 +02:00
Lennart Poettering	91dd5f7cbe	core: add new LogNamespace= execution setting	2020-01-31 15:01:43 +01:00
Kevin Kuehler	fc64760dda	core: shared: Add ProtectClock= to systemd.exec	2020-01-26 12:23:33 -08:00
Lennart Poettering	eb34a981d6	core: initialize priority_set when parsing swap unit files Fixes: #14524	2020-01-09 17:08:31 +01:00
Zbigniew Jędrzejewski-Szmek	0b8d307587	pid1: fix the names of AllowedCPUs= and AllowedMemoryNodes= The original PR was submitted with CPUSetCpus and CPUSetMems, which was later changed to AllowedCPUs and AllowedMemoryNodes everywhere (including the parser used by systemd-run), but not in the parser for unit files. Since we already released -rc1, let's keep support for the old names. I think we can remove it in a release or two if anyone remembers to do that. Fixes #14126. Follow-up for `047f5d63d7`.	2019-11-25 14:02:14 +01:00
Kevin Kuehler	8470304018	core: Add ProtectKernelLogs If seccomp is enabled, load the SYSCALL_FILTER_SET_SYSLOG into the seccomp filter set. Drop the CAP_SYSLOG capability.	2019-11-11 12:12:02 -08:00
Yu Watanabe	f5947a5e92	tree-wide: drop missing.h	2019-10-31 17:57:03 +09:00
Zbigniew Jędrzejewski-Szmek	a5f6f346d3	Merge pull request #13423 from pwithnall/12035-session-time-limits Add `RuntimeMaxSec=` support to scope units (time-limited login sessions)	2019-10-28 14:57:00 +01:00
Philip Withnall	9ed7de605d	scope: Support RuntimeMaxSec= directive in scope units Just as `RuntimeMaxSec=` is supported for service units, add support for it to scope units. This will gracefully kill a scope after the timeout expires from the moment the scope enters the running state. This could be used for time-limited login sessions, for example. Signed-off-by: Philip Withnall <withnall@endlessm.com> Fixes: #12035	2019-10-28 09:44:31 +01:00
Zbigniew Jędrzejewski-Szmek	a232ebcc2c	core: add support for RestartKillSignal= to override signal used for restart jobs v2: - if RestartKillSignal= is not specified, fall back to KillSignal=. This is necessary to preserve backwards compatibility (and keep KillSignal= generally useful).	2019-10-02 14:01:25 +02:00
Pavel Hrdina	047f5d63d7	cgroup: introduce support for cgroup v2 CPUSET controller Introduce support for configuring cpus and mems for processes using cgroup v2 CPUSET controller. This allows users to limit which cpus and memory NUMA nodes can be used by processes to better utilize system resources. The cgroup v2 interfaces to control it are cpuset.cpus and cpuset.mems where the requested configuration is written. However, it doesn't mean that the requested configuration will be actually used as parent cgroup may limit the cpus or mems as well. In order to reflect the real configuration cgroup v2 provides read-only files cpuset.cpus.effective and cpuset.mems.effective which are exported to users as well.	2019-09-24 15:16:07 +02:00
Zbigniew Jędrzejewski-Szmek	5ac1530eca	tree-wide: say "ratelimit" not "rate_limit" "ratelimit" is a real word, so we don't need to use the other form anywhere. We had both forms in various places, let's standardize on the shorter and more correct one.	2019-09-20 16:05:53 +02:00
Zbigniew Jędrzejewski-Szmek	7bf081a1e5	pid1: rename start_limit to start_ratelimit This way it is clearer what the type is. We also have auto_stop_ratelimit adjacent, and it feels ugly to have a different suffix for those two.	2019-09-20 16:05:53 +02:00
Zbigniew Jędrzejewski-Szmek	6b4f7fb08c	Merge pull request #13385 from yuwata/core-remove-private-directories-13355 core: also remove private directories by systemctl clean	2019-08-31 09:28:39 +02:00
Yu Watanabe	12213aed12	core: move timeout_clean_usec from Service to ExecContext	2019-08-28 23:09:54 +09:00
Zbigniew Jędrzejewski-Szmek	ae480f0b09	shared/user-util: allow usernames with dots in specific fields People do have usernames with dots, and it makes them very unhappy that systemd doesn't like that. It seems that there is no actual problem with allowing dots in the username. In particular chown declares ":" as the official separator, and internally in systemd we never rely on "." as the separator between user and group (nor do we call chown directly). Using dots in the name is probably not a very good idea, but we don't need to care. Debian tools (adduser) do not allow users with dots to be created. This patch allows existing names with dots to be used in User, Group, SupplementaryGroups, SocketUser, SocketGroup fields, both in unit files and on the command line. DynamicUsers and sysusers still follow the strict policy. user@.service and tmpfiles already allowed arbitrary user names, and this remains unchanged. Fixes #12754.	2019-08-19 21:19:13 +02:00
Anita Zhang	31cd5f63ce	core: ExecCondition= for services Closes #10596	2019-07-17 11:35:02 +02:00
Lennart Poettering	4c2f584230	core: hook up service unit type with the new clean operation The implementation is pretty straightforward: when we get a request to clean some type of resources we fork off a process doing that, and while it is running we are in the "cleaning" state.	2019-07-11 12:18:51 +02:00
Zbigniew Jędrzejewski-Szmek	edfea9fe0d	analyze: add 'condition' verb We didn't have a straightforward way to parse and evaluate those strings. Prompted by #12881.	2019-06-27 10:54:37 +02:00
Kai Lüke	fab347489f	bpf-firewall: custom BPF programs through IP(Ingress|Egress)FilterPath= Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be specified multiple times. An empty assignment resets all previous filters. Closes https://github.com/systemd/systemd/issues/10227	2019-06-25 09:56:16 +02:00
Michal Sekletar	b070c7c0e1	core: introduce NUMAPolicy and NUMAMask options Make it possible to set the NUMA allocation policy for the manager. The manager's policy is by default inherited by all forked-off processes. However, it is possible to override the policy on a per-service basis. Currently we support these policies: default, prefer, bind, interleave, local. See man 2 set_mempolicy for details on each policy. The overall NUMA policy actually consists of two parts: the policy itself and a bitmask representing the NUMA nodes where the policy is effective. The node mask can be specified using the related option, NUMAMask. The default mask can be overwritten on a per-service level.	2019-06-24 16:58:54 +02:00
Chris Down	7e7223b3d5	cgroup: Readd some plumbing for DefaultMemoryMin Somehow these got lost in the previous PR, rendering DefaultMemoryMin not very useful.	2019-05-08 12:06:32 +01:00
Jan Klötzke	dc653bf487	service: handle abort stops with dedicated timeout When shooting down a service with SIGABRT the user might want to have a much longer stop timeout than on regular stops/shutdowns. Especially in the face of short stop timeouts the time might not be sufficient to write huge core dumps before the service is killed. This commit adds a dedicated (Default)TimeoutAbortSec= timer that is used when stopping a service via SIGABRT. In all other cases the existing TimeoutStopSec= is used. The timer value is unset by default to skip the special handling and use TimeoutStopSec= for state 'stop-watchdog' to keep the old behaviour. If the service is in state 'stop-watchdog' and the service should be stopped explicitly we still go to 'stop-sigterm' and re-apply the usual TimeoutStopSec= timeout.	2019-04-12 17:32:52 +02:00
Chris Down	c52db42b78	cgroup: Implement default propagation of MemoryLow with DefaultMemoryLow In cgroup v2 we have protection tunables -- currently MemoryLow and MemoryMin (there will be more in future for other resources, too). The design of these protection tunables requires not only intermediate cgroups to propagate protections, but also the units at the leaf of that resource's operation to accept it (by setting MemoryLow or MemoryMin). This makes sense from a low-level API design perspective, but it's a good idea to also have a higher-level abstraction that can, by default, propagate these resources to children recursively. In this patch, this happens by having descendants set memory.low to N if their ancestor has DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow value. Any affected unit can opt out of this propagation by manually setting `MemoryLow` to some value in its unit configuration. A unit can also stop further propagation by setting `DefaultMemoryLow=` with no argument. This removes further propagation in the subtree, but has no effect on the unit itself (for that, use `MemoryLow=0`). Our use case in production is simplifying the configuration of machines which heavily rely on memory protection tunables, but currently require tweaking a huge number of unit files to make that a reality. This directive makes that significantly less fragile, and decreases the risk of misconfiguration. After this patch is merged, I will implement DefaultMemoryMin= using the same principles.	2019-04-12 17:23:58 +02:00
Lennart Poettering	afcfaa695c	core: implement OOMPolicy= and watch cgroups for OOM killings This adds a new per-service OOMPolicy= (along with a global DefaultOOMPolicy=) that controls what to do if a process of the service is killed by the kernel's OOM killer. It has three different values: "continue" (old behaviour), "stop" (terminate the service), "kill" (let the kernel kill all the service's processes). On top of that, track OOM killer events per unit: generate a per-unit structured, recognizable log message when we see an OOM killer event, and put the service in a failure state if an OOM killer event was seen and the selected policy was not "continue". A new "result" is defined for this case: "oom-kill". All of this relies on new cgroupv2 kernel functionality: the "memory.events" notification interface and the "memory.oom.group" attribute (which makes the kernel kill all cgroup processes automatically).	2019-04-09 11:17:58 +02:00
Davide Cavalca	639dd43a36	core: fix build failure if seccomp is disabled	2019-04-03 13:46:32 +09:00
Lennart Poettering	f69567cbe2	core: expose SUID/SGID restriction as new unit setting RestrictSUIDSGID=	2019-04-02 16:56:48 +02:00
Lennart Poettering	efebb613c7	core: optionally, trigger .timer units on timezone and clock changes Fixes: #6228	2019-04-02 08:20:10 +02:00
Lennart Poettering	25a04ae55e	core: simplify timer expression parsing by using ".ltype" field of conf-parser logic No change of behaviour. Let's just not parse the lvalue all the time with timer_base_from_string() if we can already pass it in parsed.	2019-04-01 18:25:43 +02:00
Lennart Poettering	a8d08f39d1	core: add new setting NetworkNamespacePath= for configuring a netns by path for a service Fixes: #2741	2019-03-07 16:55:23 +01:00

279 Commits
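
The FixedRandomDelay= commit (acf24a1a84) describes deriving a timer offset from a hash of stable identifiers instead of drawing a fresh random number each time. A minimal Python sketch of that idea follows. It is not systemd's code: BLAKE2b from hashlib stands in for siphash24, and the function name, the byte encoding of each field, and the modulo reduction are all assumptions made for illustration.

```python
import hashlib


def fixed_random_delay_usec(machine_id: bytes, is_system: bool, uid: int,
                            unit_id: str, max_delay_usec: int) -> int:
    """Return a stable, seemingly random delay in [0, max_delay_usec).

    Hypothetical helper: it mirrors the concatenation named in the commit
    message (machine ID || system-manager flag || UID || unit ID), but the
    field encodings and the hash function are illustrative assumptions.
    """
    if max_delay_usec <= 0:
        # No randomized delay configured, so there is no offset to apply.
        return 0

    # Concatenate the identifying fields into one byte string.
    material = b"".join([
        machine_id,
        b"\x01" if is_system else b"\x00",
        uid.to_bytes(4, "little"),
        unit_id.encode("utf-8"),
    ])

    # BLAKE2b with an 8-byte digest is used here in place of siphash24.
    digest = hashlib.blake2b(material, digest_size=8).digest()

    # Reduce the 64-bit value into the configured delay range.
    return int.from_bytes(digest, "little") % max_delay_usec


if __name__ == "__main__":
    # Example values only; the machine ID below is made up.
    example_machine_id = bytes.fromhex("0123456789abcdef0123456789abcdef")
    one_hour_usec = 3_600_000_000
    print(fixed_random_delay_usec(example_machine_id, True, 0,
                                  "backup.timer", one_hour_usec))
```

Calling the function twice with the same arguments returns the same offset, which is the property the commit message describes: the delay looks random across machines and units but stays fixed for a given timer.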