Commit graph

836 commits

Author SHA1 Message Date
Lennart Poettering 18e51a022c Merge pull request #4351 from keszybz/nspawn-debugging
Enhance nspawn debug logs for mount/unmount operations
2016-10-12 11:21:11 +02:00
Zbigniew Jędrzejewski-Szmek 3ccb886283 Allow block and char classes in DeviceAllow bus properties (#4353)
Allowed paths are unified between the configuration file parser and the bus
property checker. The biggest change is that the bus code now allows "block-"
and "char-" classes. In addition, path_startswith("/dev") was used in the bus
code, and startswith("/dev") was used in the config file code. It seems
reasonable to use path_startswith() which allows a slightly broader class of
strings.

Fixes #3935.
2016-10-12 11:12:11 +02:00
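Illustration for 3ccb886283 (not part of the commit): a minimal sketch of how a component-wise prefix test differs from a plain string prefix test. The function name path_startswith_sketch() is hypothetical, and the real path_startswith() may differ in details. Repeated slashes are tolerated, which is the "slightly broader class of strings" mentioned above; on the other hand a name such as "/devfoo" is not treated as being below "/dev".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if "prefix" is a leading run of whole path
 * components of "path". Repeated '/' characters are skipped. */
static bool path_startswith_sketch(const char *path, const char *prefix) {
        assert(path);
        assert(prefix);

        /* Both must be absolute, or both relative. */
        if ((path[0] == '/') != (prefix[0] == '/'))
                return false;

        for (;;) {
                size_t a, b;

                /* Skip any run of slashes on both sides. */
                path += strspn(path, "/");
                prefix += strspn(prefix, "/");

                if (*prefix == '\0')
                        return true;    /* prefix fully consumed */
                if (*path == '\0')
                        return false;   /* path is shorter than prefix */

                /* Compare one whole component. */
                a = strcspn(path, "/");
                b = strcspn(prefix, "/");
                if (a != b || memcmp(path, prefix, a) != 0)
                        return false;

                path += a;
                prefix += b;
        }
}
```

With this sketch, "//dev///null" is accepted for the prefix "/dev", while a plain strncmp() against "/dev" rejects it.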
Zbigniew Jędrzejewski-Szmek 7ef7147041 missing: add a bunch of mount flags 2016-10-11 17:24:03 -04:00
Zbigniew Jędrzejewski-Szmek 60e76d4897 nspawn,mount-util: add [u]mount_verbose and use it in nspawn
This makes it easier to debug failed nspawn invocations:

Mounting sysfs on /var/lib/machines/fedora-rawhide/sys (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV "")...
Mounting tmpfs on /var/lib/machines/fedora-rawhide/dev (MS_NOSUID|MS_STRICTATIME "mode=755,uid=1450901504,gid=1450901504")...
Mounting tmpfs on /var/lib/machines/fedora-rawhide/dev/shm (MS_NOSUID|MS_NODEV|MS_STRICTATIME "mode=1777,uid=1450901504,gid=1450901504")...
Mounting tmpfs on /var/lib/machines/fedora-rawhide/run (MS_NOSUID|MS_NODEV|MS_STRICTATIME "mode=755,uid=1450901504,gid=1450901504")...
Bind-mounting /sys/fs/selinux on /var/lib/machines/fedora-rawhide/sys/fs/selinux (MS_BIND "")...
Remounting /var/lib/machines/fedora-rawhide/sys/fs/selinux (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")...
Mounting proc on /proc (MS_NOSUID|MS_NOEXEC|MS_NODEV "")...
Bind-mounting /proc/sys on /proc/sys (MS_BIND "")...
Remounting /proc/sys (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")...
Bind-mounting /proc/sysrq-trigger on /proc/sysrq-trigger (MS_BIND "")...
Remounting /proc/sysrq-trigger (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")...
Mounting tmpfs on /tmp (MS_STRICTATIME "mode=1777,uid=0,gid=0")...
Mounting tmpfs on /sys/fs/cgroup (MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_STRICTATIME "mode=755,uid=0,gid=0")...
Mounting cgroup on /sys/fs/cgroup/systemd (MS_NOSUID|MS_NOEXEC|MS_NODEV "none,name=systemd,xattr")...
Failed to mount cgroup on /sys/fs/cgroup/systemd (MS_NOSUID|MS_NOEXEC|MS_NODEV "none,name=systemd,xattr"): No such file or directory
2016-10-11 16:50:07 -04:00
Zbigniew Jędrzejewski-Szmek b744e8937c Merge pull request #4067 from poettering/invocation-id
Add an "invocation ID" concept to the service manager
2016-10-11 13:40:50 -04:00
Lennart Poettering 1f0958f640 core: when determining whether a process exit status is clean, consider whether it is a command or a daemon
SIGTERM should be considered a clean exit code for daemons (i.e. long-running
processes, as a daemon without SIGTERM handler may be shut down without issues
via SIGTERM still) while it should not be considered a clean exit code for
commands (i.e. short-running processes).

Let's add two different clean checking modes for this, and use the right one at
the appropriate places.

Fixes: #4275
2016-10-10 22:57:01 +02:00
Lennart Poettering 41e2036eb8 exit-status: kill is_clean_exit_lsb(), move logic to sysv-generator
Let's get rid of is_clean_exit_lsb(), let's move the logic for the special
handling of the two LSB exit codes into the sysv-generator by writing out
appropriate SuccessExitStatus= lines if the LSB header exists. This is not only
semantically more correct, but also fixes a bug: the code in service.c that
chose between is_clean_exit_lsb() and is_clean_exit() based its decision on
whether a native unit file was available for the unit. However, that check had
been bogus for a long time, ever since the SysV generator was introduced and
native SysV script support was removed from PID 1, as from then on a unit file
always existed.
2016-10-10 21:48:08 +02:00
Lennart Poettering 56ecbcc048 exit-status: reorder the exit status switch table
Let's make sure it's in the same order as the actual enum defining the exit
statuses.
2016-10-10 20:11:21 +02:00
Lennart Poettering 65e3fd83c9 exit-status: remove ExitStatus typedef
Do not make up our own type for ExitStatus, but use the type used by POSIX for
this, which is "int".  In particular as we never used that type outside of the
definition of exit_status_to_string() where we internally cast the parameter to
(int) every single time we used it.

Hence, let's simplify things, drop the type and use the kernel type directly.
2016-10-10 20:08:41 +02:00
Evgeny Vereshchagin a0f72a24e0 Merge pull request #4310 from keszybz/nspawn-autodetect
Autodetect systemd version in containers started by systemd-nspawn
2016-10-10 20:47:25 +03:00
Zbigniew Jędrzejewski-Szmek 5a46d55fc8 path-util: add a function to peek into a container and guess systemd version
This is a bit crude and only works for new systemd versions which
have libsystemd-shared.
2016-10-08 14:48:41 -04:00
Susant Sahani e63be0847c networkd: address add support to configure flags (#4201)
This patch makes the following address flags configurable:

IFA_F_HOMEADDRESS
IFA_F_NODAD
IFA_F_MANAGETEMPADDR
IFA_F_NOPREFIXROUTE
IFA_F_MCAUTOJOIN
2016-10-08 13:05:41 +02:00
Lennart Poettering 4b58153dd2 core: add "invocation ID" concept to service manager
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from an inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycle of it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation ID race-freely from the
messages.
2016-10-07 20:14:38 +02:00
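Illustration for 4b58153dd2 (not part of the commit): a minimal sketch of how a service could read its invocation ID from the environment without linking against libsystemd. It assumes the variable is named INVOCATION_ID and holds 32 hexadecimal characters; neither detail is stated in the message above. The function name is hypothetical; sd_id128_get_invocation() is the API the commit actually adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Convert one hex digit, or return -1 if it is not one. */
static int unhex_sketch(char c) {
        if (c >= '0' && c <= '9')
                return c - '0';
        if (c >= 'a' && c <= 'f')
                return c - 'a' + 10;
        if (c >= 'A' && c <= 'F')
                return c - 'A' + 10;
        return -1;
}

/* Hypothetical helper: parse $INVOCATION_ID (assumed name and format)
 * into 16 bytes. Returns 0 on success or a negative errno-style code. */
static int invocation_id_from_env_sketch(uint8_t ret[16]) {
        uint8_t tmp[16];
        const char *e;

        assert(ret);

        e = getenv("INVOCATION_ID");
        if (!e)
                return -ENXIO;          /* not started with an invocation ID */
        if (strlen(e) != 32)
                return -EINVAL;

        for (size_t i = 0; i < 16; i++) {
                int hi = unhex_sketch(e[2 * i]);
                int lo = unhex_sketch(e[2 * i + 1]);

                if (hi < 0 || lo < 0)
                        return -EINVAL;
                tmp[i] = (uint8_t) ((hi << 4) | lo);
        }

        /* Only touch the output once the whole string validated. */
        memcpy(ret, tmp, sizeof(tmp));
        return 0;
}
```

As the message notes, a service may alter its own environment, so a value read this way is a convenience and not a trusted identifier.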
Lennart Poettering e5d855d364 util: use SPECIAL_ROOT_SLICE macro where appropriate 2016-10-07 20:14:38 +02:00
Lennart Poettering 0474ef7b3e log: minor fixes
Most important is a fix to negate the error number if necessary, before we
first access it.
2016-10-07 20:14:38 +02:00
Lennart Poettering 4a39c77419 strv: fix STRV_FOREACH_BACKWARDS() to be a single statement only
Let's make sure people invoking STRV_FOREACH_BACKWARDS() as a single statement
of an if statement don't fall into a trap, and find the tail for the list via
strv_length().
2016-10-07 20:14:38 +02:00
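Illustration for 4a39c77419 (not part of the commit): a minimal sketch of a backwards iteration macro that expands to exactly one for statement, so it can be used as the unbraced body of an if. All names carry a _sketch suffix and are hypothetical; the real macro is written differently. Note that the macro argument l is evaluated more than once here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries in a NULL-terminated string vector; 0 for NULL. */
static size_t strv_length_sketch(char **l) {
        size_t n = 0;

        if (!l)
                return 0;
        while (l[n])
                n++;
        return n;
}

/* Expands to a single for statement. The tail of the list is located via
 * the length, and the index never goes below zero. */
#define STRV_FOREACH_BACKWARDS_SKETCH(s, l)                             \
        for (size_t _i = strv_length_sketch(l);                         \
             _i > 0 && ((s) = (l) + _i - 1, 1);                         \
             _i--)

/* Collect the first character of every entry, last entry first. */
static void first_chars_backwards_sketch(char **l, char *out, size_t out_size) {
        char **s;
        size_t k = 0;

        assert(out && out_size > 0);

        /* The macro is the sole statement of the if branch; the else
         * still pairs with the if because the macro is one statement. */
        if (l)
                STRV_FOREACH_BACKWARDS_SKETCH(s, l) {
                        if (k + 1 < out_size)
                                out[k++] = (*s)[0];
                }
        else
                k = 0;

        out[k] = '\0';
}
```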
rwmjones 171b533800 architecture: Add support for the RISC-V architecture. (#4305)
RISC-V is an open source ISA in development since 2010 at UCB.
For more information, see https://riscv.org/

I am adding RISC-V support to Fedora:
https://fedoraproject.org/wiki/Architectures/RISC-V

There are three major variants of the architecture (32-, 64- and
128-bit).  The 128-bit variant is a paper exercise, but the other
two really exist in silicon.  RISC-V is always little endian.

On Linux, the default kernel uname(2) can return "riscv" for all
variants.  However a patch was added recently which makes the kernel
return one of "riscv32" or "riscv64" (or in future "riscv128").  So
systemd should be prepared to handle any of "riscv", "riscv32" or
"riscv64" (in future, "riscv128" but that is not included in the
current patch).  If the kernel returns "riscv" then you need to use
the pointer size in order to know the real variant.

The Fedora/RISC-V kernel only ever returns "riscv64" since we're
only doing Fedora for 64 bit at the moment, and we've patched the
kernel so it doesn't return "riscv".

As well as the major bitsize variants, there are also architecture
extensions.  However I'm trying to ensure that uname(2) does *not*
return any other information about those in utsname.machine, so that
we don't end up with "riscv64abcde" nonsense.  Instead those
extensions will be exposed in /proc/cpuinfo similar to how flags
work in x86.
2016-10-07 14:56:27 +02:00
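Illustration for 171b533800 (not part of the commit): a minimal sketch of the mapping described above, from the utsname.machine string to the bit width of the variant. The function name is hypothetical. For the bare "riscv" string it falls back to the pointer size of the running binary, as the message suggests.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper: returns 32 or 64 for a RISC-V machine string,
 * or 0 if the string does not name a RISC-V variant handled here. */
static int riscv_bits_from_uname_sketch(const char *machine) {
        assert(machine);

        if (strcmp(machine, "riscv32") == 0)
                return 32;
        if (strcmp(machine, "riscv64") == 0)
                return 64;

        /* Unpatched kernels report "riscv" for all variants, so the
         * pointer size has to disambiguate. */
        if (strcmp(machine, "riscv") == 0)
                return (int) (sizeof(void *) * CHAR_BIT);

        return 0;
}
```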
Lennart Poettering 97f0e76f18 user-util: rework maybe_setgroups() a bit
Let's drop the caching of the setgroups /proc field for now. While there's a
strict regime in place when it changes states, let's better not cache it since
we cannot really be sure we follow that regime correctly.

More importantly however, this is not in performance sensitive code, and
there's no indication the cache is really beneficial, hence let's drop the
caching and make things a bit simpler.

Also, while we are at it, rework the error handling a bit, and always return
negative errno-style error codes, following our usual coding style. This has
the benefit that we can sensibly handle read_one_line_file() errors, without
having to update errno explicitly.
2016-10-06 19:04:10 +02:00
Lennart Poettering 429b435026 sd-device/networkd: unify code to get a socket for issuing netdev ioctls on
As suggested here:

https://github.com/systemd/systemd/pull/4296#issuecomment-251911349

Let's try AF_INET first as socket, but let's fall back to AF_NETLINK, so that
we can use a protocol-independent socket here if possible. This has the benefit
that our code will still work even if AF_INET/AF_INET6 is made unavailable (for
example via seccomp), at least on current kernels.
2016-10-06 19:04:01 +02:00
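Illustration for 429b435026 (not part of the commit): a minimal, Linux-specific sketch of the fallback described above. The function names and the exact socket types and protocol are assumptions; the message only states that AF_INET is tried first and AF_NETLINK second.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Hypothetical helper: get a socket fd that is only used as a handle for
 * netdev ioctls. Returns the fd, or a negative errno-style code. */
static int socket_ioctl_fd_sketch(void) {
        int fd;

        /* Preferred: a plain AF_INET datagram socket. */
        fd = socket(AF_INET, SOCK_DGRAM | SOCK_CLOEXEC, 0);
        if (fd < 0)
                /* Fallback if AF_INET is unavailable, e.g. blocked by seccomp. */
                fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_GENERIC);
        if (fd < 0)
                return -errno;

        return fd;
}

/* Hypothetical helper: release an fd obtained above. */
static void socket_ioctl_fd_close_sketch(int fd) {
        assert(fd >= 0);
        (void) close(fd);
}
```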
Giuseppe Scrivano 36d854780c core: do not fail in a container if we can't use setgroups
It might be blocked through /proc/PID/setgroups
2016-10-06 11:49:00 +02:00
Giuseppe Scrivano f006b30bd5 audit: disable if cannot create NETLINK_AUDIT socket 2016-10-06 11:49:00 +02:00
Stefan Schweter 629ff674ac tree-wide: remove consecutive duplicate words in comments 2016-10-04 17:06:25 +02:00
Michael Olbrich 5076f4219e list: LIST_INSERT_BEFORE: update head if necessary (#4261)
If the new item is inserted before the first item in the list, then the
head must be updated as well.
Add a test to the list unit test to check for this.
2016-10-04 16:15:37 +02:00
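Illustration for 5076f4219e (not part of the commit): a minimal sketch of inserting into a doubly linked list before a given item, including the head update the fix is about. The struct and function names are hypothetical; the real LIST_INSERT_BEFORE() is a macro over arbitrary item types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list item. */
typedef struct ItemSketch {
        int value;
        struct ItemSketch *prev, *next;
} ItemSketch;

/* Insert "item" directly before "pos". If "pos" was the first item,
 * the head pointer has to move to the new item. */
static void list_insert_before_sketch(ItemSketch **head, ItemSketch *pos, ItemSketch *item) {
        assert(head);
        assert(pos);
        assert(item);

        item->next = pos;
        item->prev = pos->prev;

        if (pos->prev)
                pos->prev->next = item;
        else {
                /* "pos" was the first item: update the head as well. */
                assert(*head == pos);
                *head = item;
        }

        pos->prev = item;
}
```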
Evgeny Vereshchagin cc238590e4 Merge pull request #4185 from endocode/djalal-sandbox-first-protection-v1
core:sandbox: Add new ProtectKernelTunables=, ProtectControlGroups=, ProtectSystem=strict and fixes
2016-09-28 04:50:30 +03:00
Martin Pitt b8fafaf4a1 Merge pull request #4220 from keszybz/show-and-formatting-fixes
Show and formatting fixes
2016-09-27 16:25:27 +02:00
Susant Sahani 629abfc23f basic: fix for IPv6 status (#4224)
Even if IPv6 is disabled:

```
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
1
```

cat /proc/net/sockstat6 still produces output:

```
TCP6: inuse 2
UDP6: inuse 1
UDPLITE6: inuse 0
RAW6: inuse 0
FRAG6: inuse 0 memory 0
```

Looking for /proc/net/if_inet6 is the right choice.
2016-09-27 15:55:13 +02:00
Lennart Poettering d944dc9553 namespace: chase symlinks for mounts to set up in userspace
This adds logic to chase symlinks for all mount points that shall be created in
a namespace environment in userspace, instead of leaving this to the kernel.
This has the advantage that we can correctly handle absolute symlinks that
shall be taken relative to a specific root directory. Moreover, we can properly
handle mounts created on symlinked files or directories as we can merge their
mounts as necessary.

(This also drops the "done" flag in the namespace logic, which was never
actually working, but was supposed to permit a partial rollback of the
namespace logic, which however is only mildly useful as it wasn't clear in
which case it would or would not be able to roll back.)

Fixes: #3867
2016-09-25 10:42:18 +02:00
Lennart Poettering 6b7c9f8bce namespace: rework how ReadWritePaths= is applied
Previously, if ReadWritePaths= was nested inside a ReadOnlyPaths=
specification, then we'd first recursively apply the ReadOnlyPaths= paths, and
make everything below read-only, only in order to then flip the read-only bit
again for the subdirs listed in ReadWritePaths= below it.

This is not only ugly (as for the dirs in question we first turn on the RO bit,
only to turn it off again immediately after), but also problematic in
containers, where a container manager might have marked a set of dirs read-only
and this code will undo this if ReadWritePaths= is set for any.

With this patch behaviour in this regard is altered: ReadOnlyPaths= will not be
applied to the children listed in ReadWritePaths= in the first place, so that
we do not need to turn off the RO bit for those after all.

This means that ReadWritePaths=/ReadOnlyPaths= may only be used to turn on the
RO bit, but never to turn it off again. Or to say this differently: if some
dirs are marked read-only via some external tool, then ReadWritePaths= will not
undo it.

This is not only the safer option, but also more in-line with what the man page
currently claims:

        "Entries (files or directories) listed in ReadWritePaths= are
        accessible from within the namespace with the same access rights as
        from outside."

To implement this change bind_remount_recursive() gained a new "blacklist"
string list parameter, which when passed may contain subdirs that shall be
excluded from the read-only mounting.

A number of functions are updated to add more debug logging to make this more
digestible.
2016-09-25 10:40:51 +02:00
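Illustration for 6b7c9f8bce (not part of the commit): a minimal sketch of the decision the new "blacklist" parameter enables: a mount point at or below a blacklisted directory is left alone and never gets the read-only bit. Function names are hypothetical, and the path comparison assumes already normalized absolute paths (no repeated or trailing slashes).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if "path" equals "dir" or lies below it. Assumes normalized paths. */
static bool path_at_or_below_sketch(const char *path, const char *dir) {
        size_t n;

        assert(path);
        assert(dir);

        if (strcmp(dir, "/") == 0)
                return path[0] == '/';

        n = strlen(dir);
        return strncmp(path, dir, n) == 0 &&
               (path[n] == '\0' || path[n] == '/');
}

/* Hypothetical helper: should this mount point be remounted read-only,
 * given a NULL-terminated list of excluded subdirectories? */
static bool should_remount_read_only_sketch(const char *mount_point,
                                            const char *const *blacklist) {
        assert(mount_point);

        if (!blacklist)
                return true;

        for (const char *const *b = blacklist; *b; b++)
                if (path_at_or_below_sketch(mount_point, *b))
                        return false;   /* excluded: RO bit is never set */

        return true;
}
```

Because excluded paths are skipped up front, nothing is first made read-only and then flipped back, which is the behaviour change described above.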
Lennart Poettering be39ccf3a0 execute: move suppression of HOME=/ and SHELL=/bin/nologin into user-util.c
This adds a new call get_user_creds_clean(), which is just like
get_user_creds() but returns NULL in the home/shell parameters if they contain
no useful information. This code previously lived in execute.c, but by
generalizing this we can reuse it in run.c.
2016-09-25 10:18:57 +02:00
Zbigniew Jędrzejewski-Szmek c7bf9d5183 basic/strv: add STRPTR_IN_SET
Also some trivial tests for STR_IN_SET and STRPTR_IN_SET.
2016-09-24 20:13:28 -04:00
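Illustration for c7bf9d5183 (not part of the commit): a minimal sketch of set-membership macros built on C99 compound literals. It assumes the difference between the two is that the STRPTR variant accepts a NULL string and reports "not in set"; the message does not spell this out. All names carry a _SKETCH or _sketch suffix and are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if "x" equals one entry of the NULL-terminated list. */
static bool str_in_list_sketch(const char *x, const char *const *list) {
        assert(x);
        assert(list);

        for (; *list; list++)
                if (strcmp(x, *list) == 0)
                        return true;
        return false;
}

/* Same, but a NULL "x" is allowed and never matches (assumed semantics). */
static bool strptr_in_list_sketch(const char *x, const char *const *list) {
        return x && str_in_list_sketch(x, list);
}

/* The variadic arguments become a temporary NULL-terminated array. */
#define STR_IN_SET_SKETCH(x, ...) \
        str_in_list_sketch((x), (const char *const[]) { __VA_ARGS__, NULL })

#define STRPTR_IN_SET_SKETCH(x, ...) \
        strptr_in_list_sketch((x), (const char *const[]) { __VA_ARGS__, NULL })
```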
Martin Pitt 6ac288a990 Merge pull request #4123 from keszybz/network-file-dropins
Network file dropins
2016-09-17 10:00:19 +02:00
Zbigniew Jędrzejewski-Szmek 43688c49d1 tree-wide: rename config_parse_many to …_nulstr
In preparation for adding a version which takes a strv.
2016-09-16 10:32:03 -04:00
Zbigniew Jędrzejewski-Szmek e77e0f51fe Merge pull request #4131 from intelfx/update-done-timestamps-precision
condition: ignore nanoseconds in timestamps for ConditionNeedsUpdate=

Fixes #4130.
2016-09-15 22:53:00 -04:00
Ivan Shapovalov 3a730176b3 time-util: export timespec_load_nsec() 2016-09-15 05:21:09 +03:00
Martin Pitt 2d88def959 Merge pull request #4133 from keszybz/strerror-removal
Strerror removal and other janitorial cleanups
2016-09-14 11:17:58 +02:00
Zbigniew Jędrzejewski-Szmek 481a2b02a3 Always use unicode ellipsis when ellipsizing
We were already unconditionally using the unicode character when the
input string was not pure ASCII, leading to different behaviour
depending on the input string.

systemd[1]: Starting printit.service.
python3[19962]: foooooooooooooooooooooooooooooooooooo…oooo
python3[19964]: fooąęoooooooooooooooooooooooooooooooo…oooo
python3[19966]: fooąęoooooooooooooooooooooooooooooooo…ąęąę
python3[19968]: fooąęoooooooooooooooooąęąęąęąęąęąęąęą…ąęąę
systemd[1]: Started printit.service.
2016-09-13 20:10:57 -04:00
Topi Miettinen 646853bdd8 fileio: simplify mkostemp_safe() (#4090)
According to its manual page, flags given to mkostemp(3) shouldn't include
O_RDWR, O_CREAT or O_EXCL flags as these are always included. Beyond
those, the only flag that all callers (except a few tests where it
probably doesn't matter) use is O_CLOEXEC, so set that unconditionally.
2016-09-13 08:20:38 +02:00
Seraphime Kirkovski 07b0b339d6 machinectl: split OS field in two; print ip addresses (#4058)
This splits the OS field in two : one for the distribution name
and one for the version id.
Dashes are written for missing fields.
This also prints ip addresses of known machines. The `--max-addresses`
option specifies how many IP addresses we want to see. The default is 1.
When more than one address is written for a machine, a `,` follows it.
If there are more ips than `--max-addresses`, `...` follows the last
address.
2016-08-31 20:06:57 +02:00
Yann E. MORIN 1d9ed17178 basic/fileio: we always have O_TMPFILE now
fileio makes use of O_TMPFILE when it is available.

We now always have O_TMPFILE, defined in missing.h if missing
from the toolchain headers.

Have fileio include missing.h and drop the guards around the
use of O_TMPFILE.
2016-08-29 12:49:10 +02:00
Yann E. MORIN daad709a7c missing.h: add missing definitions for __O_TMPFILE
Currently, a missing __O_TMPFILE was only defined for i386 and x86_64,
leaving any other architecture with an "old" toolchain to fail miserably
at build time:
    src/import/export-raw.c: In function 'reflink_snapshot':
    src/import/export-raw.c:271:26: error: 'O_TMPFILE' undeclared (first use in this function)
             new_fd = open(d, O_TMPFILE|O_CLOEXEC|O_NOCTTY|O_RDWR, 0600);
                              ^

__O_TMPFILE (and O_TMPFILE) are available since glibc 2.19. However, a
lot of existing toolchains are still using glibc-2.18, and some even
before that, and it is not really possible to update those toolchains.

Instead of defining it only for i386 and x86_64, define __O_TMPFILE
with the specific values for those archs where it is different from the
generic value. Use the values as found in the Linux kernel (v4.8-rc3,
current as of time of commit).

---
Note: tested on ARM (build+run), with glibc-2.18 and linux headers 3.12.
Untested on other archs, though (I have no board to test this).

Changes v1 -> v2:
  - add a comment specifying that some values are hexadecimal, others octal.
2016-08-29 12:40:22 +02:00
Zbigniew Jędrzejewski-Szmek 2056ec1927 Merge pull request #3965 from htejun/systemd-controller-on-unified 2016-08-19 19:58:01 -04:00
0xAX e6c9fa74a5 terminal-util: remove unnecessary check of result of isatty() (#4000)
After the call to isatty() we check its result twice in
open_terminal(). There is no sense in checking whether the result of isatty()
is less than zero and returning -errno, because as described in the
documentation:

isatty() returns 1 if fd is an open file descriptor referring to a
terminal;  otherwise 0 is returned, and errno is set to indicate the
error.

So it can't be less than zero.
2016-08-19 18:51:54 -04:00
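Illustration for e6c9fa74a5 (not part of the commit): a minimal sketch of a check that relies on the documented contract quoted above, where isatty() yields 1 or 0 and never a negative value. The function name is hypothetical and this is not the actual open_terminal() code.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if "fd" refers to a terminal, otherwise
 * the negated errno that isatty() reported. */
static int fd_require_tty_sketch(int fd) {
        int r;

        r = isatty(fd);

        /* Documented return values are 1 and 0; there is no "< 0" case. */
        assert(r == 0 || r == 1);

        if (r == 0)
                return -errno;

        return 0;
}
```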
Evgeny Vereshchagin 29272c04a7 Merge pull request #3909 from poettering/mount-tool
add a new tool for creating transient mount and automount units
2016-08-19 23:33:49 +03:00
Lennart Poettering 16d901e251 Merge pull request #3987 from keszybz/console-color-setup
Rework console color setup
2016-08-19 19:36:09 +02:00
Zbigniew Jędrzejewski-Szmek acf553b04d terminal-util: use getenv_bool for $SYSTEMD_COLORS
This changes the semantics a bit: before, SYSTEMD_COLORS= would be treated as
"yes", same as SYSTEMD_COLORS=xxx and SYSTEMD_COLORS=1, and only
SYSTEMD_COLORS=0 would be treated as "no". Now, only valid booleans are treated
as "yes". This actually matches how $SYSTEMD_COLORS was announced in NEWS.
2016-08-19 11:57:37 -04:00
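Illustration for acf553b04d (not part of the commit): a minimal sketch of boolean parsing of an environment variable, showing why SYSTEMD_COLORS= and SYSTEMD_COLORS=xxx no longer count as "yes". The function names, the list of accepted spellings, and the error codes are assumptions for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <strings.h>

/* Returns 1 for true, 0 for false, -EINVAL for anything else. The
 * accepted spellings are an assumption of this sketch. */
static int parse_boolean_sketch(const char *v) {
        static const char *const yes[] = { "1", "yes", "y", "true", "t", "on", NULL };
        static const char *const no[] = { "0", "no", "n", "false", "f", "off", NULL };

        assert(v);

        for (size_t i = 0; yes[i]; i++)
                if (strcasecmp(v, yes[i]) == 0)
                        return 1;
        for (size_t i = 0; no[i]; i++)
                if (strcasecmp(v, no[i]) == 0)
                        return 0;

        return -EINVAL;
}

/* Returns 1/0 for a valid boolean, -ENXIO if the variable is unset,
 * -EINVAL if it is set to something that is not a boolean. */
static int getenv_bool_sketch(const char *name) {
        const char *e;

        assert(name);

        e = getenv(name);
        if (!e)
                return -ENXIO;

        return parse_boolean_sketch(e);
}
```

A caller would then enable colors only for a positive result and decide separately what an unset or invalid value should mean.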
Zbigniew Jędrzejewski-Szmek 158fbf7661 systemd: ignore lack of tty when checking whether colors should be enabled
When started by the kernel, we are connected to the console, and we'll set TERM
properly to some value in fixup_environment(). We'll then enable or disable
colors based on the value of $SYSTEMD_COLORS and $TERM.

When reexecuting, TERM should be already set, so we can use this value.
Effectively, behaviour is the same as before affd7ed1a was reverted, but instead
of reopening the console before configuring color output, we just ignore what
stdout is connected to and decide based on the variables only.
2016-08-19 11:34:22 -04:00
Lennart Poettering cbf138ebef Merge pull request #3988 from keszybz/journald-dynamic-users
Journald dynamic users
2016-08-19 10:41:26 +02:00
Zbigniew Jędrzejewski-Szmek 61755fdae0 journald: do not create split journals for dynamic users
Dynamic users should be treated like system users, and their logs
should end up in the main system journal.
2016-08-18 23:34:40 -04:00
Tejun Heo f50582649f logind: update empty and "infinity" handling for [User]TasksMax (#3835)
The parsing functions for [User]TasksMax were inconsistent.  Empty string and
"infinity" were interpreted as no limit for TasksMax but not accepted for
UserTasksMax.  Update them so that they're consistent with other knobs.

* Empty string indicates the default value.
* "infinity" indicates no limit.

While at it, replace opencoded (uint64_t) -1 with CGROUP_LIMIT_MAX in TasksMax
handling.

v2: Update empty string to indicate the default value as suggested by Zbigniew
    Jędrzejewski-Szmek.

v3: Fixed empty UserTasksMax handling.
2016-08-18 22:57:53 -04:00
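Illustration for f50582649f (not part of the commit): a minimal sketch of the unified parsing rules listed above, where the empty string selects the default and "infinity" selects no limit. The function and macro names are hypothetical, and only plain decimal numbers are handled besides the two special cases.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for "no limit", i.e. (uint64_t) -1. */
#define LIMIT_MAX_SKETCH UINT64_MAX

/* Returns 0 and stores the parsed limit in *ret, or a negative
 * errno-style code on failure. */
static int parse_tasks_max_sketch(const char *s, uint64_t default_value, uint64_t *ret) {
        unsigned long long v;
        char *end = NULL;

        assert(s);
        assert(ret);

        if (s[0] == '\0') {             /* empty string: default value */
                *ret = default_value;
                return 0;
        }

        if (strcmp(s, "infinity") == 0) {   /* explicit "no limit" */
                *ret = LIMIT_MAX_SKETCH;
                return 0;
        }

        /* strtoull() would accept signs and leading whitespace; refuse them. */
        if (s[0] < '0' || s[0] > '9')
                return -EINVAL;

        errno = 0;
        v = strtoull(s, &end, 10);
        if (errno != 0)
                return -errno;          /* e.g. ERANGE on overflow */
        if (*end != '\0')
                return -EINVAL;         /* trailing garbage */

        *ret = (uint64_t) v;
        return 0;
}
```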
Lennart Poettering 2ae0858e6c hostnamectl: rework pretty hostname validation (#3985)
Rework 17eb9a9ddb a bit.

Let's make sure we don't clobber the input parameter args[1], following our
coding style to not clobber parameters unless explicitly indicated. (in
particular, as we don't want to have our changes appear in the command line
shown in "ps"...)

No functional change.
2016-08-18 21:16:16 -04:00