Systemd

Commit Graph

Author	SHA1	Message	Date
Yu Watanabe	fb2042dd55	core: add new environment variable $RUNTIME_DIRECTORY= or friends The variable is generated from RuntimeDirectory= or friends. If multiple directories are set, then they are concatenated with the separator ':'.	2018-09-13 17:02:58 +09:00
Yu Watanabe	7c1cb6f198	core: add one more assert()	2018-09-13 17:02:58 +09:00
Yu Watanabe	76a9460d44	core: fix assert() about number of built environment variables Follow-up for `4b58153dd2` and `fd63e712b2`.	2018-09-13 17:02:58 +09:00
Yu Watanabe	52e4d62550	Merge pull request #9852 from poettering/namespace-errno namespace: be more careful when handling namespacing failures	2018-08-22 11:16:29 +09:00
Lennart Poettering	1beab8b0d0	namespace: be more careful when handling namespacing failures gracefully This makes two changes to the namespacing code: 1. We'll only gracefully skip service namespacing on access failure if exclusively sandboxing options were selected, and not mount-related options that result in a very different view of the world. For example, ignoring RootDirectory=, RootImage= or Bind= is really problematic, but ReadOnlyPaths= is just a weaker sandbox. 2. The namespacing code will now return a clearly recognizable error code when it cannot enforce its namespacing, so that we cannot confuse EPERM errors from mount() with those from unshare(). Only the errors from the first unshare() are now taken as hint to gracefully disable namespacing. Fixes: #9844 #9835	2018-08-21 20:00:33 +02:00
Zbigniew Jędrzejewski-Szmek	7692fed98b	Merge pull request #9783 from poettering/get-user-creds-flags beef up get_user_creds() a bit and other improvements	2018-08-21 10:09:33 +02:00
Lennart Poettering	fafff8f1ff	user-util: rework get_user_creds() Let's fold get_user_creds_clean() into get_user_creds(), and introduce a flags argument for it to select "clean" behaviour. This flags parameter also learns two other new flags: - USER_CREDS_SYNTHESIZE_FALLBACK: in this mode the user records for root/nobody are only synthesized as fallback. Normally, the synthesized records take precedence over what is in the user database. With this flag set this is reversed, and the user database takes precedence, and the synthesized records are only used if they are missing there. This flag should be set in cases where doing NSS is deemed safe, and where there's interest in knowing the correct shell, for example if the admin changed root's shell to zsh or suchlike. - USER_CREDS_ALLOW_MISSING: if set, and a UID/GID is specified by numeric value, and there's no user/group record for it accept it anyway. This allows us to fix #9767. This then also ports all users to set the most appropriate flags. Fixes: #9767 [zj: remove one isempty() call]	2018-08-20 15:58:21 +02:00
Lennart Poettering	3cd24c1aa9	core: when setting up PAM, try to get tty of STDIN_FILENO if not set explicitly When stdin/stdout/stderr is initialized from an fd, let's read the tty name of it if we can, and pass that to PAM. This makes sure that "machinectl shell" sessions have proper TTY fields initialized that "loginctl" then shows.	2018-08-20 12:28:17 +02:00
Yu Watanabe	4c3a2b84d8	core/execute: fix dump format for Limit*= Fixes #9846.	2018-08-10 11:59:16 +02:00
Yu Watanabe	7e8d494b33	core: use memcpy_safe() Fixes #9738.	2018-08-08 17:11:43 +09:00
Zbigniew Jędrzejewski-Szmek	5b316330be	Merge pull request #9624 from poettering/service-state-flush flush out ExecStatus structures when a new service cycle begins	2018-08-02 09:50:39 +02:00
Zbigniew Jędrzejewski-Szmek	54fe2ce1b9	Merge pull request #9504 from poettering/nss-deadlock some nss deadlock love	2018-07-26 10:16:25 +02:00
Lennart Poettering	5686391b00	core: introduce new Type=exec service type Users are often surprised that "systemd-run" command lines like "systemd-run -p User=idontexist /bin/true" will return successfully, even though the logs show that the process couldn't be invoked, as the user "idontexist" doesn't exist. This is because Type=simple will only wait until fork() succeeded before returning start-up success. This patch adds a new service type Type=exec, which is very similar to Type=simple, but waits until the child process completed the execve() before returning success. It uses a pipe that has O_CLOEXEC set for this logic, so that the kernel automatically sends POLLHUP on it when the execve() succeeded but leaves the pipe open if not. This means PID 1 waits exactly until the execve() succeeded in the child, and not longer and not shorter, which is the desired functionality. Making use of this new functionality, the command line "systemd-run -p User=idontexist -p Type=exec /bin/true" will now fail, as expected.	2018-07-25 22:48:11 +02:00
Lennart Poettering	25b583d7ff	core: swap order of "n_storage_fds" and "n_socket_fds" parameters When processing fd lists to pass to activated programs we always place the socket activation fds first, and the storage fds last. Irritatingly in almost all calls the "n_storage_fds" parameter (i.e. the number of storage fds to pass) came first so far, and the "n_socket_fds" parameter second. Let's clean this up, and specify the number of fds in the order the fds themselves are passed. (Also, let's fix one more case where "unsigned" was used to size an array, while we should use "size_t" instead.)	2018-07-25 22:48:11 +02:00
Lennart Poettering	6a1d4d9fa6	core: properly reset all ExecStatus structures when entering a new unit cycle Whenever a unit is started fresh we should flush out any runtime data from the previous cycle. We are pretty good at that already, but what so far we missed was the ExecStart=/ExecStop=/… command exit status data. Let's fix that, and properly flush out that stuff too. Consider this service: [Service] ExecStart=/bin/sleep infinity ExecStop=/bin/false When this service is started, then stopped and then started again "systemctl status" would show the ExecStop= results of the previous run along with the ExecStart= results of the current one, which is very confusing. With this patch this is corrected: the data is kept right until the moment the new service cycle starts, and then flushed out. Hence "systemctl status" in that case will only show the ExecStart= data, but no ExecStop= data, like it should be. This should fix part of the confusion of #9588	2018-07-23 13:36:47 +02:00
Lennart Poettering	ee39ca20c6	core: drop "argv" field from ExecParameter structure We always initialize it from the same field in ExecCommand anyway, hence there's no point in passing it separately to exec_spawn(), after all we already pass the ExecCommand structure itself anyway. No change in behaviour.	2018-07-23 13:36:47 +02:00
Lennart Poettering	2ed26ed065	execute: use structure initialization when filling in exec status	2018-07-23 13:36:47 +02:00
Lennart Poettering	d521916d0f	pid1: tell PAM/NSS modules why we are calling them	2018-07-20 16:57:35 +02:00
Zsolt Dollenstein	566b7d23eb	Add support for opening files for appending Addresses part of #8983	2018-07-20 03:54:22 -07:00
Chris Lamb	3fe910794b	Correct a number of trivial typos.	2018-06-18 22:44:44 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Lennart Poettering	228af36fff	core: add new PrivateMounts= unit setting This new setting is supposed to be useful in most cases where "MountFlags=slave" is currently used, i.e. as an explicit way to run a service in its own mount namespace and decouple propagation from all mounts of the new mount namespace towards the host. The effect of MountFlags=slave and PrivateMounts=yes is mostly the same, as both cause a CLONE_NEWNS namespace to be opened, and both will result in all mounts within it to be mounted MS_SLAVE. The difference is mostly on the conceptual/philosophical level: configuring the propagation mode is nothing people should have to think about, in particular as the matter is not precisely easy to grok. Moreover, MountFlags= allows configuration of "private" and "shared" modes which don't really make much sense to use in real-life and are quite confusing. In particular MountFlags=private means mounts made on the host stay pinned for good by the service which is particularly nasty for removable media mount. And MountFlags=shared is in most ways a NOP when used alone... The main technical difference between setting only MountFlags=slave or only PrivateMounts=yes in a unit file is that the former remounts all mounts to MS_SLAVE and leaves them there, while the latter remounts them to MS_SHARED again right after. The latter is generally a nicer approach, since it disables propagation, while MS_SHARED is afterwards in effect, which is really nice as that means further namespacing down the tree will get MS_SHARED logic by default and we unify how applications see our mounts as we always pass them as MS_SHARED regardless whether any mount namespacing is used or not. The effect of PrivateMounts=yes was implied already by all the other mount namespacing options. With this new option we add an explicit knob for it, to request it without any other option used as well. See: #4393	2018-06-12 16:12:10 +02:00
Zbigniew Jędrzejewski-Szmek	a1230ff972	basic/log: add the log_struct terminator to macro This way all callers do not need to specify it. Exhaustively tested by running test-log under valgrind ;)	2018-06-04 13:46:03 +02:00
Yu Watanabe	37c56f89d2	core: setup mount namespace when RootDirectory= and RuntimeDirectory= or friends are set The directories specified by RuntimeDirectory= or friends are created on host. So, it is necessary to bind-mount them on root directory.	2018-05-25 17:33:03 +09:00
Yu Watanabe	5609f6888b	core: make StateDirectory= or friends work with DynamicUser= and RootDirectory=/RootImage= The symbolic links to private directories specified by StateDirectory= or its friends are created on the host. So, when DynamicUser= and RootDirectory=/RootImage= are set, then the executed process cannot access the private directory. This makes the private directories be mounted on the non-private place when both DynamicUser= and RootDirectory=/RootImage= are set. Fixes #8965.	2018-05-25 17:25:17 +09:00
Lennart Poettering	cdc0f9be92	Merge pull request #8817 from yuwata/cleanup-nsflags core: allow to specify RestrictNamespaces= multiple times	2018-05-24 16:49:13 +02:00
Yu Watanabe	fdff1da299	core: chown RuntimeDirectory= if DynamicUser= is set When DynamicUser= is set, then RuntimeDirectory= should be always chowned, as the service unit may enable RuntimeDirectoryPreserve=, and the uid or gid may have changed from the last run. This also makes it easier to migrate the service to use DynamicUser=.	2018-05-22 22:26:22 +09:00
Lennart Poettering	9f8168eb23	process-util: add new helper call for adjusting the OOM score And let's make use of it in execute.c	2018-05-17 20:47:21 +02:00
Lennart Poettering	34a5df58da	rlimit-util: introduce setrlimit_closest_all() This new call applies all configured resource limits in one.	2018-05-17 20:40:04 +02:00
Lennart Poettering	31ce987c2b	rlimit-util: add a common destructor call for arrays of struct rlimit	2018-05-17 20:36:52 +02:00
Lennart Poettering	6550c24c7f	rlimit-util: rework rlimit_{from\|to}_string() to work without "Limit" prefix let's make the call more generic, so that we can also easily use it for parsing "RLIMIT_xyz" style constants.	2018-05-17 20:36:52 +02:00
Felipe Sateler	57b7a260c2	core: undo the dependency inversion between unit.h and all unit types	2018-05-15 14:24:34 -04:00
Yu Watanabe	130d3d22e9	tree-wide: use strv_free_and_replace() macro	2018-05-10 00:57:34 +09:00
Yu Watanabe	aa9d574de9	load-fragment: allow to specify RestrictNamespaces= multiple times If multiple RestrictNamespaces= settings are set, then merge the settings. This also drops supporting "~yes" and "~no".	2018-05-05 11:07:37 +09:00
Yu Watanabe	86c2a9f1c2	nsflags: drop namespace_flag_{from,to}_string() This also drops namespace_flag_to_string_many_with_check(), and renames namespace_flag_{from,to}_string_many() to namespace_flags_{from,to}_string().	2018-05-05 11:07:37 +09:00
Yu Watanabe	b5a33299b0	core: disable namespace sandboxing for '+' prefixed lines Fixes #8842.	2018-05-01 13:44:06 +09:00
Lennart Poettering	da6053d0a7	tree-wide: be more careful with the type of array sizes Previously we were a bit sloppy with the index and size types of arrays, we'd regularly use unsigned. While I don't think this ever resulted in real issues I think we should be more careful there and follow a stricter regime: unless there's a strong reason not to use size_t for array sizes and indexes, size_t it should be. Any allocations we do ultimately will use size_t anyway, and converting forth and back between unsigned and size_t will always be a source of problems. Note that on 32bit machines "unsigned" and "size_t" are equivalent, and on 64bit machines our arrays shouldn't grow that large anyway, and if they do we have a problem, however that kind of overly large allocation we have protections for usually, but for overflows we do not have that so much, hence let's add it. So yeah, it's a story of the current code being already "good enough", but I think some extra type hygiene is better. This patch tries to be comprehensive, but it probably isn't and I missed a few cases. But I guess we can cover that later as we notice it. Among smaller fixes, this changes: 1. strv_length()'s return type becomes size_t 2. the unit file changes array size becomes size_t 3. DNS answer and query array sizes become size_t Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745	2018-04-27 14:29:06 +02:00
Lennart Poettering	5d13a15b1d	tree-wide: drop spurious newlines (#8764) Double newlines (i.e. one empty line) are great to structure code. But let's avoid triple newlines (i.e. two empty lines), quadruple newlines, quintuple newlines, …, that's just spurious whitespace. It's an easy way to drop 121 lines of code, and keeps the coding style of our sources a bit tighter.	2018-04-19 12:13:23 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	1cc6c93a95	tree-wide: use TAKE_PTR() and TAKE_FD() macros	2018-04-05 14:26:26 +09:00
Dimitri John Ledkov	e64c2d0b5f	core: use setreuid/setregid trick to create session keyring with right ownership (#8447) Re-use the hacks used to link user keyring, when creating the session keyring. This way changing ownership of the keyring is not required, and thus invocation_id can be correctly created in restricted environments. Creating invocation_id with root permissions works and linking it into session keyring works, as at that point session keyring is possessed. Simple way to validate this is with following commands: $ journalctl -f & $ sudo systemd-run --uid 1000 /bin/sh -c 'keyctl describe @s; keyctl list @s; keyctl read `keyctl search @s user invocation_id`' which now works in LXD containers as well as on the host. Fixes: https://github.com/systemd/systemd/issues/7655	2018-03-27 12:58:10 +02:00
Lennart Poettering	959071cac2	Merge pull request #8552 from keszybz/test-improvements Test and diagnostics improvements	2018-03-23 15:26:54 +01:00
Zbigniew Jędrzejewski-Szmek	37c1d5e97d	tree-wide: warn when a directory path already exists but has bad mode/owner/type When we are attempting to create a directory somewhere in the bowels of /var/lib and get an error that it already exists, it can be quite hard to diagnose what is wrong (especially for a user who is not aware that the directory must have the specified owner, and permissions not looser than what was requested). Let's print a warning in most cases. A warning is appropriate, because such state is usually a sign of a borked installation and needs to be resolved by the administrator. $ build/test-fs-util Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing. (or) Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing. (or) Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing. Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting. No functional change except for the new log lines.	2018-03-23 10:26:38 +01:00
Lennart Poettering	ae2a15bc14	macro: introduce TAKE_PTR() macro This macro will read a pointer of any type, return it, and set the pointer to NULL. This is useful as an explicit concept of passing ownership of a memory area between pointers. This takes inspiration from Rust: https://doc.rust-lang.org/std/option/enum.Option.html#method.take and was suggested by Alan Jenkins (@sourcejedi). It drops ~160 lines of code from our codebase, which makes me like it. Also, I think it clarifies passing of ownership, and thus helps readability a bit (at least for the initiated who know the new macro)	2018-03-22 20:21:42 +01:00
Zbigniew Jędrzejewski-Szmek	d50b5839b0	basic/mkdir: convert bool flag to enum In preparation for subsequent changes...	2018-03-22 15:57:56 +01:00
Lennart Poettering	2b33ab0957	tree-wide: port various places over to use new rearrange_stdio()	2018-03-02 11:42:10 +01:00
Zbigniew Jędrzejewski-Szmek	30c81ce2ce	pid1: when creating service directories, don't chown existing files (#8181) This partially reverts `3536f49e8f` and `3536f49e8f`. When the user is dynamic, and we are setting up state, cache, or logs dirs, behaviour is unchanged, we always do a recursive chown. This is necessary because the user number might change between invocations. But when setting up a directory for non-dynamic user, or a runtime directory for a dynamic user, do any ownership or mode changes only when the directory is initially created. Nothing says that the files under those directories have to be all recursively owned by our user. This restores behaviour before `3536f49e8f`, so modifications to the state of the runtime directory persist between ExecStartPre's and ExecStart's, and even longer in case the directory is persistent. I think it _would_ be a nice property if setting a user would automatically propagate to ownership of any Runtime/Logs/Cache directories. But this is incompatible with another nice property, namely preserving changes to those directories made by an admin, and with allowing change of ownership of files in those directories by the service (e.g. to allow other users to access them). Of the two, I think the second property is more important. Also, it's backwards compatible. https://bugzilla.redhat.com/show_bug.cgi?id=1508495 There is no need to chmod a directory we just created, so move that step up into a branch. After that, 'effective' is only used once, so get rid of it too.	2018-02-22 11:30:59 +01:00
Yu Watanabe	2abd4e388a	core: add new setting TemporaryFileSystem= This introduces a new setting TemporaryFileSystem=. This is useful to hide files not relevant to the processes invoked by unit, while necessary files or directories can be still accessed by combining with Bind{,ReadOnly}Paths=.	2018-02-21 09:17:52 +09:00
Yu Watanabe	4ca763a902	core/namespace: make '-' prefix in Bind{,ReadOnly}Paths= work Each path in `Bind{,ReadOnly}Paths=` accepts a '-' prefix. However, the prefix is completely ignored. This makes it work as expected.	2018-02-21 09:07:56 +09:00
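
Commit 5686391b00 in the table above describes how Type=exec detects a completed execve() through a pipe marked close-on-exec. The following is a minimal sketch of that idea, not systemd's implementation. The helper name spawn_wait_for_exec() is hypothetical. Instead of polling for POLLHUP as the commit message describes, it uses a common variant of the same trick: the parent reads from the pipe, treats end-of-file as "exec succeeded", and the child writes its errno into the pipe if the exec fails.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of systemd): fork, execute argv[0], and
 * return only once the outcome of execv() in the child is known.
 * Returns 0 and stores the child PID on success, or a negative errno. */
static int spawn_wait_for_exec(char *const argv[], pid_t *ret_pid) {
        int p[2], e = 0;
        ssize_t n;
        pid_t pid;

        assert(argv && argv[0] && ret_pid);

        if (pipe(p) < 0)
                return -errno;

        /* Mark the write end close-on-exec: a successful execv() closes it. */
        if (fcntl(p[1], F_SETFD, FD_CLOEXEC) < 0) {
                e = errno;
                close(p[0]);
                close(p[1]);
                return -e;
        }

        pid = fork();
        if (pid < 0) {
                e = errno;
                close(p[0]);
                close(p[1]);
                return -e;
        }

        if (pid == 0) {
                /* Child: the write() is only reached if execv() failed. */
                close(p[0]);
                execv(argv[0], argv);
                e = errno;
                n = write(p[1], &e, sizeof(e));
                (void) n;
                _exit(127);
        }

        /* Parent: drop our copy of the write end, then wait for EOF or data. */
        close(p[1]);
        do
                n = read(p[0], &e, sizeof(e));
        while (n < 0 && errno == EINTR);
        if (n < 0)
                e = errno;
        close(p[0]);

        if (n == 0) {
                /* EOF without data: the exec closed the write end. */
                *ret_pid = pid;
                return 0;
        }

        /* The exec (or the read) failed: reap the child, report the error. */
        while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
                ;
        return e > 0 ? -e : -EIO;
}
```

With this sketch, calling the helper on a path that does not exist returns -ENOENT instead of reporting success right after fork(), which is the difference between Type=simple and Type=exec that the commit message explains.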

508 Commits
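
Commit ae2a15bc14 in the table above introduces TAKE_PTR(), a macro that reads a pointer, returns it, and sets the original to NULL to make ownership transfer explicit. The sketch below shows how such a macro can be written and used. It assumes a compiler with the GNU C extensions for statement expressions and typeof (GCC or Clang), and make_greeting() is a hypothetical function invented for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Read the pointer, reset the original to NULL, and yield the old value.
 * Relies on GNU C statement expressions and __typeof__. */
#define TAKE_PTR(ptr)                                   \
        ({                                              \
                __typeof__(ptr) _ptr_ = (ptr);          \
                (ptr) = NULL;                           \
                _ptr_;                                  \
        })

/* Hypothetical example: build "Hello, <name>" and hand the buffer to the
 * caller. Returns 0 on success or a negative errno. */
static int make_greeting(const char *name, char **ret) {
        char *buf;

        assert(name && ret);

        /* 7 bytes for "Hello, " plus the name plus the trailing NUL. */
        buf = malloc(strlen(name) + 8);
        if (!buf)
                return -ENOMEM;

        strcpy(buf, "Hello, ");
        strcat(buf, name);

        /* Ownership moves to the caller; buf is NULL from here on. */
        *ret = TAKE_PTR(buf);

        /* A shared cleanup path can now free unconditionally: free(NULL)
         * is a no-op, so the caller's buffer is not touched. */
        free(buf);
        return 0;
}
```

The benefit is that the local variable can never be freed by accident after the hand-off, because it no longer points at the buffer.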