Systemd

Author	SHA1	Message	Date
Lennart Poettering	74e1252072	execute: add helper for checking if root_directory/root_image are set in ExecContext	2020-10-01 11:02:11 +02:00
Zbigniew Jędrzejewski-Szmek	5e98086d16	core: remember when we set ExecContext.mount_apivfs No functional change intended so far.	2020-09-24 10:03:18 +02:00
Topi Miettinen	9df2cdd8ec	exec: SystemCallLog= directive With new directive SystemCallLog= it's possible to list system calls to be logged. This can be used for auditing or temporarily when constructing system call filters. --- v5: drop intermediary, update HASHMAP_FOREACH_KEY() use v4: skip useless debug messages, actually parse directive v3: don't declare unused variables with old libseccomp v2: fix build without seccomp or old libseccomp	2020-09-15 12:54:17 +03:00
Lennart Poettering	bb0c0d6f29	core: add credentials logic Fixes: #15778 #16060	2020-08-25 19:45:35 +02:00
Lennart Poettering	4e39995371	core: introduce ProtectProc= and ProcSubset= to expose hidepid= and subset= procfs mount options Kernel 5.8 gained a hidepid= implementation that is truly per procfs, which allows us to mount a distinct one into every unit, with individual hidepid= settings. Let's expose this via two new settings: ProtectProc= (wrapping hidepid=) and ProcSubset= (wrapping subset=). Replaces: #11670	2020-08-24 20:11:02 +02:00
Luca Boccassi	b3d133148e	core: new feature MountImages Follows the same pattern and features as RootImage, but allows an arbitrary mount point under / to be specified by the user, and multiple values - like BindPaths. Original implementation by @topimiettinen at: https://github.com/systemd/systemd/pull/14451 Reworked to use dissect's logic instead of bare libmount() calls and other review comments. Thanks Topi for the initial work to come up with and implement this useful feature.	2020-08-05 21:34:55 +01:00
Luca Boccassi	18d7370587	service: add new RootImageOptions feature Allows to specify mount options for RootImage. In case of multi-partition images, the partition number can be prefixed followed by colon. Eg: RootImageOptions=1:ro,dev 2:nosuid nodev In absence of a partition number, 0 is assumed.	2020-07-29 17:17:32 +01:00
Zbigniew Jędrzejewski-Szmek	56a13a495c	pid1: create ro private tmp dirs when /tmp or /var/tmp is read-only Read-only /var/tmp is more likely, because it's backed by a real device. /tmp is (by default) backed by tmpfs, but it doesn't have to be. In both cases the same consideration applies. If we boot with read-only /var/tmp, any unit with PrivateTmp=yes would fail because we cannot create the subdir under /var/tmp to mount the private directory. But many services actually don't require /var/tmp (either because they only use it occasionally, or because they only use /tmp, or even because they don't use the temporary directories at all, and PrivateTmp=yes is used to isolate them from the rest of the system). To handle both cases let's create a read-only directory under /run/systemd and mount it as the private /tmp or /var/tmp. (Read-only to not fool the service into dumping too much data in /run.) $ sudo systemd-run -t -p PrivateTmp=yes bash Running as unit: run-u14.service Press ^] three times within 1s to disconnect TTY. [root@workstation /]# ls -l /tmp/ total 0 [root@workstation /]# ls -l /var/tmp/ total 0 [root@workstation /]# touch /tmp/f [root@workstation /]# touch /var/tmp/f touch: cannot touch '/var/tmp/f': Read-only file system This commit has more changes than I like to put in one commit, but it's touching all the same paths so it's hard to split. exec_runtime_make() was using the wrong cleanup function, so the directory would be left behind on error.	2020-07-14 19:47:15 +02:00
Luca Boccassi	d4d55b0d13	core: add RootHashSignature service parameter Allow to explicitly pass root hash signature as a unit option. Takes precedence over implicit checks.	2020-06-25 08:45:21 +01:00
Lennart Poettering	6b000af4f2	tree-wide: avoid some loaded terms https://tools.ietf.org/html/draft-knodel-terminology-02 https://lwn.net/Articles/823224/ This gets rid of most but not all occasions of these loaded terms: 1. scsi_id and friends are something that is supposed to be removed from our tree (see #7594) 2. The test suite defines an API used by the ubuntu CI. We can remove this too later, but this needs to be done in sync with the ubuntu CI. 3. In some cases the terms are part of APIs we call or where we expose concepts the kernel names the way it names them. (In particular all remaining uses of the word "slave" in our codebase are like this, it's used by the POSIX PTY layer, by the network subsystem, the mount API and the block device subsystem). Getting rid of the term in these contexts would mean doing some major fixes of the kernel ABI first. Regarding the replacements: when whitelist/blacklist is used as noun we replace it with allow list/deny list, and when used as verb with allow-list/deny-list.	2020-06-25 09:00:19 +02:00
Luca Boccassi	0389f4fa81	core: add RootHash and RootVerity service parameters Allow to explicitly pass root hash (explicitly or as a file) and verity device/file as unit options. Take precedence over implicit checks.	2020-06-23 10:50:09 +02:00
Lennart Poettering	f3dc6af20f	core: automatically update StandardOutput=syslog to =journal (and similar for StandardError=) Let's go one step further and upgrade implicitly. Usually =syslog assignments are historic artifacts only. Let's upgrade the lines automatically, and politely suggest people update their unit files/configuration (and drop the lines altogether, without replacement). Fixes: #15807	2020-05-15 00:05:46 +02:00
Zbigniew Jędrzejewski-Szmek	ad21e542b2	manager: add CoredumpFilter= setting Fixes #6685.	2020-04-09 14:08:48 +02:00
Michal Sekletár	e2b2fb7f56	core: add support for setting CPUAffinity= to special "numa" value systemd will automatically derive CPU affinity mask from NUMA node mask. Fixes #13248	2020-03-16 08:57:28 +01:00
Michal Sekletár	1808f76870	shared: split out NUMA code from cpu-set-util.c to numa-util.c	2020-03-16 08:23:18 +01:00
Lennart Poettering	91dd5f7cbe	core: add new LogNamespace= execution setting	2020-01-31 15:01:43 +01:00
Kevin Kuehler	fc64760dda	core: shared: Add ProtectClock= to systemd.exec	2020-01-26 12:23:33 -08:00
Kevin Kuehler	8470304018	core: Add ProtectKernelLogs If seccomp is enabled, load the SYSCALL_FILTER_SET_SYSLOG into the seccomp filter set. Drop the CAP_SYSLOG capability.	2019-11-11 12:12:02 -08:00
Zbigniew Jędrzejewski-Szmek	5ac1530eca	tree-wide: say "ratelimit" not "rate_limit" "ratelimit" is a real word, so we don't need to use the other form anywhere. We had both forms in various places, let's standardize on the shorter and more correct one.	2019-09-20 16:05:53 +02:00
Yu Watanabe	12213aed12	core: move timeout_clean_usec from Service to ExecContext	2019-08-28 23:09:54 +09:00
Lennart Poettering	6b7b2ed96b	core: add type of resource string table	2019-07-11 12:18:51 +02:00
Lennart Poettering	4c2f584230	core: hook up service unit type with the new clean operation The implementation is pretty straightforward: when we get a request to clean some type of resources we fork off a process doing that, and while it is running we are in the "cleaning" state.	2019-07-11 12:18:51 +02:00
Lennart Poettering	380dc8b0a2	core: add generic "clean" operation to units This adds basic infrastructure to implement a "clean" operation for unit types. This "clean" operation is supposed to remove on-disk resources of units, and is supposed to be used in a later commit to clean our RuntimeDirectory=, StateDirectory= and so on of service units. Later commits will open this up to the bus, and hook up service units with this. This also adds a new generic ActiveState called UNIT_MAINTENANCE. It's supposed to cover all kinds of "maintenance" state of units. Specifically, this is supposed to cover the "cleaning" operations later added for service units which might take a bit of time. This high-level, generic, abstract state is called UNIT_MAINTENANCE instead of the more specific "UNIT_CLEANING", since I think this should be kept open for different operations possibly later on that could be nicely subsumed under this (for example, maybe a recursive chown()ing operation could be covered by this, and similar).	2019-07-11 12:18:51 +02:00
Michal Sekletar	b070c7c0e1	core: introduce NUMAPolicy and NUMAMask options Make it possible to set the NUMA allocation policy for the manager. The manager's policy is by default inherited by all forked off processes. However, it is possible to override the policy on a per-service basis. Currently we support these policies: default, prefer, bind, interleave, local. See man 2 set_mempolicy for details on each policy. The overall NUMA policy actually consists of two parts: the policy itself and a bitmask representing the NUMA nodes where the policy is effective. The node mask can be specified using the related option, NUMAMask. The default mask can be overwritten on a per-service level.	2019-06-24 16:58:54 +02:00
Anita Zhang	b3d593673c	core: add ExecStartXYZEx= with dbus support for executable prefixes Closes #11654	2019-05-30 20:41:42 -07:00
Zbigniew Jędrzejewski-Szmek	0985c7c4e2	Rework cpu affinity parsing The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size of the array, but operations which take two (CPU_EQUAL_S) or even three arrays (CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must be of the same size, or buffer overruns will occur. This is exactly what our code would do, if it received an array of unexpected size over the network. ("Unexpected" here means anything different from what cpu_set_malloc() detects as the "right" size.) Let's rework this, and store the size in bytes of the allocated storage area. The code will now parse any number up to 8191, independently of what the current kernel supports. This matches the kernel maximum setting for any architecture, to make things more portable. Fixes #12605.	2019-05-29 10:20:42 +02:00
Lennart Poettering	f69567cbe2	core: expose SUID/SGID restriction as new unit setting RestrictSUIDSGID=	2019-04-02 16:56:48 +02:00
Lennart Poettering	0a6991e0bb	tree-wide: reorder various structures to make them smaller and use fewer cache lines Some "pahole" spelunking.	2019-03-27 18:11:11 +01:00
Zbigniew Jędrzejewski-Szmek	ca78ad1de9	headers: remove unneeded includes from util.h This means we need to include many more headers in various files that simply included util.h before, but it seems cleaner to do it this way.	2019-03-27 11:53:12 +01:00
Lennart Poettering	6f765baf23	core: rework how we reset the TTY after use by a service This makes two changes: 1. Instead of resetting the configured service TTY each time after a process exited, let's do so only when the service goes back to "dead" state. This should be preferable in case the started processes leave background child processes around that still reference the TTY. 2. chmod() and chown() the TTY at the same time. This should make it safe to run "systemd-run -p DynamicUser=1 -p StandardInput=tty -p TTYPath=/dev/tty8 /bin/bash" without leaving a TTY owned by a dynamic user around.	2019-03-20 21:28:02 +01:00
Lennart Poettering	a8d08f39d1	core: add new setting NetworkNamespacePath= for configuring a netns by path for a service Fixes: #2741	2019-03-07 16:55:23 +01:00
Anita Zhang	7ca69792e5	core: add ':' prefix to ExecXYZ= skip env var substitution	2019-02-20 17:58:14 +01:00
Topi Miettinen	aecd5ac621	core: ProtectHostname= feature Let services use a private UTS namespace. In addition, a seccomp filter is installed on set{host,domain}name and a ro bind mounts on /proc/sys/kernel/{host,domain}name.	2019-02-20 10:50:44 +02:00
Alexey Bogdanenko	8f9f3cb724	core: fix KeyringMode for user services KeyringMode option is useful for user services. Also, documentation for the option suggests that the option applies to user services. However, setting the option to any of its allowed values has no effect. This commit fixes that and removes EXEC_NEW_KEYRING flag. The flag is no longer necessary: instead of checking if the flag is set we can check if keyring_mode is not equal to EXEC_KEYRING_INHERIT.	2018-12-17 16:56:36 +01:00
Yu Watanabe	e93672eeac	tree-wide: drop missing.h from headers and use relevant missing_*.h	2018-12-06 13:31:16 +01:00
Lennart Poettering	78f93209fc	core: when Delegate=yes is set for a unit, run ExecStartPre= and friends in a subcgroup of the unit Otherwise we might conflict with the "no-processes-in-inner-cgroup" rule of cgroupsv2. Consider nspawn starting up and initializing its cgroup hierarchy with "supervisor/" and "payload/" as subcgroup, with itself moved into the former and the payload into the latter. Now, if an ExecStartPre= is run right after it cannot be placed in the main cgroup, because that is now in inner cgroup with populated children. Hence, let's run these helpers in another sub-cgroup .control/ below it. This is somewhat ugly since it weakens the clear separation of ownership, but given that this is an explicit contract, and double opt-in should be acceptable. Fixes: #10482	2018-11-26 18:43:23 +01:00
Yu Watanabe	b9c04eafb8	core: introduce exec_params_clear() Follow-up for `1ad6e8b302`. Fixes #10677.	2018-11-08 09:36:37 +01:00
Lennart Poettering	7593c3ecbf	tree-wide: add a few missing _sentinel_ decorators	2018-10-26 19:49:15 +02:00
Anita Zhang	90fc172e19	core: implement per unit journal rate limiting Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for services. If provided, these values get passed to the journald client context, and those values are used in the rate limiting function in the journal over the journald.conf values. Part of #10230	2018-10-18 09:56:20 +02:00
Yu Watanabe	cf4b2f9906	tree-wide: use unsigned for refcount	2018-08-27 13:48:04 +09:00
Zbigniew Jędrzejewski-Szmek	5b316330be	Merge pull request #9624 from poettering/service-state-flush flush out ExecStatus structures when a new service cycle begins	2018-08-02 09:50:39 +02:00
Lennart Poettering	5686391b00	core: introduce new Type=exec service type Users are often surprised that "systemd-run" command lines like "systemd-run -p User=idontexist /bin/true" will return successfully, even though the logs show that the process couldn't be invoked, as the user "idontexist" doesn't exist. This is because Type=simple will only wait until fork() succeeded before returning start-up success. This patch adds a new service type Type=exec, which is very similar to Type=simple, but waits until the child process completed the execve() before returning success. It uses a pipe that has O_CLOEXEC set for this logic, so that the kernel automatically sends POLLHUP on it when the execve() succeeded but leaves the pipe open if not. This means PID 1 waits exactly until the execve() succeeded in the child, and not longer and not shorter, which is the desired functionality. Making use of this new functionality, the command line "systemd-run -p User=idontexist -p Type=exec /bin/true" will now fail, as expected.	2018-07-25 22:48:11 +02:00
Lennart Poettering	ce0d60a7c4	execute: use our usual syntax for defining bit masks	2018-07-25 22:48:11 +02:00
Lennart Poettering	25b583d7ff	core: swap order of "n_storage_fds" and "n_socket_fds" parameters When processing fd lists to pass to activated programs we always place the socket activation fds first, and the storage fds last. Irritatingly in almost all calls the "n_storage_fds" parameter (i.e. the number of storage fds to pass) came first so far, and the "n_socket_fds" parameter second. Let's clean this up, and specify the number of fds in the order the fds themselves are passed. (Also, let's fix one more case where "unsigned" was used to size an array, while we should use "size_t" instead.)	2018-07-25 22:48:11 +02:00
Lennart Poettering	6a1d4d9fa6	core: properly reset all ExecStatus structures when entering a new unit cycle Whenever a unit is started fresh we should flush out any runtime data from the previous cycle. We are pretty good at that already, but what so far we missed was the ExecStart=/ExecStop=/… command exit status data. Let's fix that, and properly flush out that stuff too. Consider this service: [Service] ExecStart=/bin/sleep infinity ExecStop=/bin/false When this service is started, then stopped and then started again "systemctl status" would show the ExecStop= results of the previous run along with the ExecStart= results of the current one, which is very confusing. With this patch this is corrected: the data is kept right until the moment the new service cycle starts, and then flushed out. Hence "systemctl status" in that case will only show the ExecStart= data, but no ExecStop= data, like it should be. This should fix part of the confusion of #9588	2018-07-23 13:36:47 +02:00
Lennart Poettering	42cb05d5ff	execute: document what the different structures are for in comments	2018-07-23 13:36:47 +02:00
Lennart Poettering	ee39ca20c6	core: drop "argv" field from ExecParameter structure We always initialize it from the same field in ExecCommand anyway, hence there's no point in passing it separately to exec_spawn(), after all we already pass the ExecCommand structure itself anyway. No change in behaviour.	2018-07-23 13:36:47 +02:00
Zsolt Dollenstein	566b7d23eb	Add support for opening files for appending Addresses part of #8983	2018-07-20 03:54:22 -07:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
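
Note on commit 5686391b00 (Type=exec): the message describes waiting on a pipe with O_CLOEXEC set so that the parent learns exactly when execve() succeeded in the child. The following is only a minimal Python sketch of that general close-on-exec pipe technique, not systemd's C implementation. The function name spawn_and_wait_for_exec is hypothetical, and the detail that the child writes its errno into the pipe on failure is an assumption borrowed from the classic form of the trick (the commit message only says the pipe stays open when execve() fails).

```python
import errno
import os
import struct
import sys


def spawn_and_wait_for_exec(argv):
    """Fork, try to exec argv, and report whether the exec itself succeeded.

    Returns 0 if the exec succeeded, otherwise the errno reported by the
    child. (Hypothetical helper, for illustration only.)
    """
    # Since Python 3.4, os.pipe() returns non-inheritable descriptors, i.e.
    # they are closed automatically by the kernel on a successful exec.
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, then try to exec.
        os.close(read_fd)
        try:
            os.execv(argv[0], argv)
        except OSError as e:
            # exec failed: the write end is still open, so report errno.
            os.write(write_fd, struct.pack("i", e.errno))
        os._exit(127)

    # Parent: close our copy of the write end so EOF can be observed.
    os.close(write_fd)
    data = b""
    while len(data) < struct.calcsize("i"):
        chunk = os.read(read_fd, struct.calcsize("i") - len(data))
        if not chunk:
            # EOF: every write end is closed, which happens as soon as the
            # exec succeeds in the child.
            break
        data += chunk
    os.close(read_fd)

    # Reap the child to avoid a zombie in this demo; a service manager
    # would instead keep the child running and track it.
    os.waitpid(pid, 0)
    return struct.unpack("i", data)[0] if data else 0


if __name__ == "__main__":
    rc = spawn_and_wait_for_exec(["/nonexistent/definitely-missing-binary"])
    print(errno.errorcode.get(rc, rc))
```

Reading EOF on the pipe means the kernel closed the write end during a successful exec, so the parent waits neither longer nor shorter than the exec itself, which is the property the commit message asks for.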


169 commits