Previously, device_copy_properties() copies all properties to both
sd_device::properties and ::properties_db. Thus, on move uevent,
also tentative properties, e.g. DEVPATH or INTERFACE, are stored to
::properties_db, and saved to udev database.
This makes such tentative properties be copied to only ::properties,
and thus not saved to udev database.
Fixes#9426.
Fixes#3374. The problem is that we set MACPolicy=persistent (i.e. we would
like to generate persistent MAC addresses for interfaces which don't have a
fixed MAC address), but various virtual interfaces including bridges, tun/tap,
bonds, etc., do not not have the necessary ID_NET_NAME_* attributes and udev
would not assing the address and warn:
Could not generate persistent MAC address for $name: No such file or directory
Basic requirements which I think a solution for this needs to satisfy:
1. No changes to MAC address generation for those cases which are currently
handled successfully. This means that net_get_unique_predictable_data() must
keep returning the same answer, which in turn means net_get_name() must keep
returning the same answer. We can only add more things we look at with lower
priority so that we start to cover cases which were not covered before.
2. Like 1, but for IPvLL seed and DHCP IAD. This is less important, but "nice
to have".
3. Keep MACPolicy=persistent. If people don't want it, they can always apply
local configuration, but in general stable MACs are a good thing. I have never
seen anyone complain about that.
== Various approaches that have been proposed
=== https://github.com/systemd/systemd/issues/3374#issuecomment-223753264 (tomty89)
if !ID_BUS and INTERFACE, use INTERFACE
I think this almost does the good thing, but I don't see the reason to reject ID_BUS
(i.e. physical hardware). Stable MACs are very useful for physical hardware that has
no physical MAC.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-224733069 (teg)
if (should_rename(device, true))
This means looking at name_assign_type. In particular for
NET_NAME_USER should_rename(..., true) returns true. It only returns false
for NET_NAME_PREDICTABLE. So this would cover stuff like br0, bond0, etc,
but would not cover lo and other devices with predictable names. That doesn't
make much sense.
But did teg mean should_rename() or !should_rename()?
=== https://github.com/systemd/systemd/issues/3374#issuecomment-234628502 (tomty89):
+ if (!should_rename(device, true))
+ return udev_device_get_sysname(device)
This covers only devices with NET_NAME_PREDICTABLE. Since the problem applies as
much to bridges and such, this isn't neough.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-281745967 (grafi-tt)
+ /* if the machine doesn't provide data about the device, use the ifname specified by userspace
+ * (this is the case when the device is virtual, e.g., bridge or bond) */
+ s = udev_device_get_sysattr_value(device, "name_assign_type");
+ if (s && safe_atou(s, &type) >= 0 && type == NET_NAME_USER)
+ return udev_device_get_sysname(device);
This does not cover bond0, vnet0, tun/tap and similar.
grafi-tt also proposes patching the kernel, but *not* setting name_assign_type
seems intentional in those cases, because the device name is a result of
enumeration, not set by the userspace.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-288882355 (tomty89)
(also PR #11372)
- MACAddressPolicy=persistent
This break requirement 3. above. It would solve the immediate problem, but I
think the disruption is too big.
=== This patch
This patch means that we will set a "stable" MAC for pretty much any virtual
device by default, where "stable" means keyed off the machine-id and interface
name.
It seems like a big change, but we already did this for most physical devices.
Doing it also for virtual devices doesn't seem like a big issue. It will make
the setup and monitoring of virtualized networks slightly nicer. I don't think
anyone is depending on having the MAC address changed when those devices are
destoryed and recreated. If they do, they'd have to change MACAddressPolicy=.
== Implementation
net_get_name() is called from dhcp_ident_set_iaid() so I didn't change
net_get_name() like in grafi-tt's patch, but net_get_unique_predictable_data().
net_get_unique_predictable_data() is called from get_mac() in link-config.c
and sd_ipv4ll_set_address_seed(), so both of those code paths are affected
and will now get data in some cases where they errored out previously.
The return code is changed to -ENODATA since that gives a nicer error string.
Otherwise, LIST_REMOVE() in network_free() fails.
This fixes the following assertion:
```
systemd-networkd[2595]: Bus bus-api-network: changing state UNSET → OPENING
systemd-networkd[2595]: Bus bus-api-network: changing state OPENING → AUTHENTICATING
systemd-networkd[2595]: timestamp of '/etc/systemd/network' changed
systemd-networkd[2595]: /etc/systemd/network/10-hoge.network:1: Invalid section header '[Network]Address=192.168.0.1'
systemd-networkd[2595]: /etc/systemd/network/10-hoge.network:1: Failed to parse file: Bad message
systemd-networkd[2595]: Assertion '*_head == _item' failed at ../../home/watanabe/git/systemd/src/network/networkd-network.c:378, function network_free(). Aborting.
valgrind[2595]: ==2595==
valgrind[2595]: ==2595== Process terminating with default action of signal 6 (SIGABRT): dumping core
valgrind[2595]: ==2595== at 0x4BCA53F: raise (in /usr/lib64/libc-2.28.so)
valgrind[2595]: ==2595== by 0x4BB4894: abort (in /usr/lib64/libc-2.28.so)
valgrind[2595]: ==2595== by 0x4955F09: log_assert_failed_realm (log.c:795)
valgrind[2595]: ==2595== by 0x417101: network_free (networkd-network.c:378)
valgrind[2595]: ==2595== by 0x415E99: network_freep (networkd-network.h:282)
valgrind[2595]: ==2595== by 0x416AB2: network_load_one (networkd-network.c:101)
valgrind[2595]: ==2595== by 0x416C39: network_load (networkd-network.c:293)
valgrind[2595]: ==2595== by 0x414031: manager_load_config (networkd-manager.c:1502)
valgrind[2595]: ==2595== by 0x40B258: run (networkd.c:82)
valgrind[2595]: ==2595== by 0x40B74A: main (networkd.c:117)
valgrind[2595]: ==2595==
valgrind[2595]: ==2595== HEAP SUMMARY:
valgrind[2595]: ==2595== in use at exit: 32,621 bytes in 201 blocks
valgrind[2595]: ==2595== total heap usage: 746 allocs, 545 frees, 241,027 bytes allocated
valgrind[2595]: ==2595==
valgrind[2595]: ==2595== LEAK SUMMARY:
valgrind[2595]: ==2595== definitely lost: 0 bytes in 0 blocks
valgrind[2595]: ==2595== indirectly lost: 0 bytes in 0 blocks
valgrind[2595]: ==2595== possibly lost: 0 bytes in 0 blocks
valgrind[2595]: ==2595== still reachable: 32,621 bytes in 201 blocks
valgrind[2595]: ==2595== suppressed: 0 bytes in 0 blocks
valgrind[2595]: ==2595== Reachable blocks (those to which a pointer was found) are not shown.
valgrind[2595]: ==2595== To see them, rerun with: --leak-check=full --show-leak-kinds=all
valgrind[2595]: ==2595==
valgrind[2595]: ==2595== For counts of detected and suppressed errors, rerun with: -v
valgrind[2595]: ==2595== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
systemd-coredump[2600]: Process 2595 (memcheck-amd64-) of user 192 dumped core.
```
Gentoo's Portage package manager uses a PID namespace for process
isolation and cleanup. In this namespace, PID 1 has UID=250 (portage).
Bug: https://bugs.gentoo.org/674458
Fixes: #11499
Let's return -EAGAIN so that on state change, unit_process_job tries to
add our job to run_queue again so that all the reloads that coalesced
into the installed reload (which itself merged into a running one)
inititally atleast runs *once*. This should ensure service picks up all
config changes reliably.
See the issue being fixed for a detailed explanation.
When we synthesize a "struct rlimit" structure to pass on for
RLIMIT_NOFILE to our children, let's explicitly make sure that the soft
limit is not above FD_SETSIZE, for compat reason with select().
Note this only applies when we derive the "struct rlimit" from what we
inherited. If the user configures something explicitly it always takes
precedence.
Fixes: #11305Fixes: #3260
Related: #11456
So, here's what happens in the described scenario in #11305. A unit goes
down, and that triggeres stop jobs for the other two units as they were
bound to it. Now, the timer for manager triggered restarts kicks in and
schedules a restart job with the JOB_FAIL job mode. This means there is
a stop job installed on those units, and now due to them being bound to
us they also get a restart job enqueued. This however is a conflicts, as
neither stop can merge into restart, nor restart into stop. However,
restart should be able to replace stop in any case. If the stop
procedure is ongoing, it can cancel the stop job, install itself, and
then after reaching dead finish and convert itself to a start job.
However, if we increase the timer, then it can always take those units
from inactive -> auto-restart.
We change the job mode to JOB_REPLACE so the restart job cancels the
stop job and installs itself.
Also, the original bug could be worked around by bumping RestartSec= to
avoid the conflicting.
This doesn't seem to be something that is going to break uses. That is
because for those who already had it working, there must have never been
conflicting jobs, as that would result in a desctructive transaction by
virtue of the job mode used.
After this change, the test case is able to work nicely without issues.
Instead of having just a single INITRD field, add support for all
additional parameters being INITRD fields in order.
Signed-off-by: Mike Auty <mike.auty@gmail.com>
@bl33pbl0p, please fix your editor
(Apparently you never configured the source tree? If you did, then the
git pre-commit hook would have been enabled which doesn't allow
commiting non-whitespace clean stuff...)
This doesn't really change much, but feels more correct to do, as it
ensures that all messages currently queued in the bus connections are
definitely unreffed and thus destryoing of the connection object will
follow immediately.
Strictly speaking this change is entirely unnecessary, since nothing
else could have acquired a ref to the connection and queued a message
in, however, now that we have the new sd_bus_close_unref() helper it
makes a lot of sense to use it here, to ensure that whatever happens
nothing that might have been queued fucks with us.
It's similar to sd_bus_flush_close_unref() but doesn't do the flushing.
This is useful since this will still discnnect the connection properly
but not synchronously wait for the peer to take our messages.
Primary usecase is within _cleanup_() expressions where synchronously
waiting on the peer is not OK.
Fixes#10659.
This changes the behaviour of parsing environment files to more closely
follow POSIX shell standards.
This has the effect that these variables defined in a file:
VAR1='\value'
VAR2="\value"
Are now interpreted as `\value` instead of interpreting the `\`
character and interpreting them as `value`.
For more information about the behaviour followed, see:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02
If "keep" policy is specified, and the interface has a name that is
NET_NAME_USER or NET_NAME_RENAMED, we stop processing rules. "keep" should
probably be specified either first or last depending on the preference.
This partially reimplements 55b6530baa, in the
sense that if the "keep" policy is not specified, and if the interface has
a NamingPolicy, it will be renamed, even if it had a name previously.
So this breaks backwards compatibility in this case, but that's more in line
with what users expect.
Closes#9006.
The error string for operations that are not supported (e.g. "shutdown" for
user-defined units) should take two arguments, where the first one is the type
of action being defined (i.e. "FailureAction" vs. "SuccessAction") and the
second is the string that was invalid.
Currently, the code prints this:
$ systemd-run --user --wait -p SuccessAction=poweroff true
Failed to start transient service unit: EmergencyAction setting invalid for manager type: SuccessAction
Change the format string to instead print:
$ systemd-run --user --wait -p SuccessAction=poweroff true
Failed to start transient service unit: SuccessAction setting invalid for manager type: poweroff
This function returns 0 on success and a negative value on failure. On success,
it writes the parsed action to the address passed in its third argument.
`bus_set_transient_emergency_action` does this:
r = parse_emergency_action(s, system, &v);
if (v < 0)
// handle failure
However, `v` is not updated if the function fails, and this should be checking
`r` instead of `v`.
The result of this is that if an invalid failure (or success) action is
specified, systemd ends up creating the unit anyway and then misbehaves if it
tries to run the failure action because the action value comes from
uninitialized stack data. In my case, this resulted in a failed assertion:
Program received signal SIGABRT, Aborted.
0x00007fe52cca0d7f in raise () from /snap/usr/lib/libc.so.6
(gdb) bt
#0 0x00007fe52cca0d7f in raise () from /snap/usr/lib/libc.so.6
#1 0x00007fe52cc8b672 in abort () from /snap/usr/lib/libc.so.6
#2 0x00007fe52d66f169 in log_assert_failed_realm (realm=LOG_REALM_SYSTEMD, text=0x56177ab8e000 "action < _EMERGENCY_ACTION_MAX", file=0x56177ab8dfb8 "../src/core/emergency-action.c", line=33, func=0x56177ab8e2b0 <__PRETTY_FUNCTION__.14207> "emergency_action") at ../src/basic/log.c:795
#3 0x000056177aa98cf4 in emergency_action (m=0x56177c992cb0, action=2059118610, options=(unknown: 0), reboot_arg=0x0, exit_status=1, reason=0x7ffdd2df4290 "unit run-u0.service failed") at ../src/core/emergency-action.c:33
#4 0x000056177ab2b739 in unit_notify (u=0x56177c9eb340, os=UNIT_ACTIVE, ns=UNIT_FAILED, flags=(unknown: 0)) at ../src/core/unit.c:2504
#5 0x000056177aaf62ed in service_set_state (s=0x56177c9eb340, state=SERVICE_FAILED) at ../src/core/service.c:1104
#6 0x000056177aaf8a29 in service_enter_dead (s=0x56177c9eb340, f=SERVICE_SUCCESS, allow_restart=true) at ../src/core/service.c:1712
#7 0x000056177aaf9233 in service_enter_signal (s=0x56177c9eb340, state=SERVICE_FINAL_SIGKILL, f=SERVICE_SUCCESS) at ../src/core/service.c:1854
#8 0x000056177aaf921b in service_enter_signal (s=0x56177c9eb340, state=SERVICE_FINAL_SIGTERM, f=SERVICE_SUCCESS) at ../src/core/service.c:1852
#9 0x000056177aaf8eb3 in service_enter_stop_post (s=0x56177c9eb340, f=SERVICE_SUCCESS) at ../src/core/service.c:1788
#10 0x000056177aaf91eb in service_enter_signal (s=0x56177c9eb340, state=SERVICE_STOP_SIGKILL, f=SERVICE_SUCCESS) at ../src/core/service.c:1850
#11 0x000056177aaf91bc in service_enter_signal (s=0x56177c9eb340, state=SERVICE_STOP_SIGTERM, f=SERVICE_FAILURE_EXIT_CODE) at ../src/core/service.c:1848
#12 0x000056177aaf9759 in service_enter_running (s=0x56177c9eb340, f=SERVICE_FAILURE_EXIT_CODE) at ../src/core/service.c:1941
#13 0x000056177ab005b7 in service_sigchld_event (u=0x56177c9eb340, pid=112, code=1, status=1) at ../src/core/service.c:3296
#14 0x000056177aad84b5 in manager_invoke_sigchld_event (m=0x56177c992cb0, u=0x56177c9eb340, si=0x7ffdd2df48f0) at ../src/core/manager.c:2444
#15 0x000056177aad88df in manager_dispatch_sigchld (source=0x56177c994710, userdata=0x56177c992cb0) at ../src/core/manager.c:2508
#16 0x00007fe52d72f807 in source_dispatch (s=0x56177c994710) at ../src/libsystemd/sd-event/sd-event.c:2846
#17 0x00007fe52d730f7d in sd_event_dispatch (e=0x56177c993530) at ../src/libsystemd/sd-event/sd-event.c:3229
#18 0x00007fe52d73142e in sd_event_run (e=0x56177c993530, timeout=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:3286
#19 0x000056177aad9f71 in manager_loop (m=0x56177c992cb0) at ../src/core/manager.c:2906
#20 0x000056177aa7c876 in invoke_main_loop (m=0x56177c992cb0, ret_reexecute=0x7ffdd2df4bff, ret_retval=0x7ffdd2df4c04, ret_shutdown_verb=0x7ffdd2df4c58, ret_fds=0x7ffdd2df4c70, ret_switch_root_dir=0x7ffdd2df4c48, ret_switch_root_init=0x7ffdd2df4c50, ret_error_message=0x7ffdd2df4c60) at ../src/core/main.c:1792
#21 0x000056177aa7f251 in main (argc=2, argv=0x7ffdd2df4e78) at ../src/core/main.c:2573
Fix this by checking the correct variable.
What policy we dicide to use it rather important, but this bit of information
wasn't logged. Let's always do that.
The code was also written in a confusing way, which probably contributed to the
unintended effects of 55b6530baa and other commits.
We would loop over all policies, and note if "kernel" was specified, and then
possibly unset the result at the end. Let's immediately log the result and cut
to the end if we can figure out the answer.
No functional change intended, except for the new log lines.
Using goto is not very elegant, but we can't use break because of the switch,
and there are multiple conditions to break the loop, so using goto is cleanest.
AFAICT, it was never hooked up to meson, so it'd only work if somebody manually
defined the flag. I think it's preferable to drop the ifdef: it removes only a
single access call to a fixed location, which is realy cheap. If poeple don't
want to make use of this, they should just not create the file. Making this
both compile-time optional and opt-in in the filesystem is unnecessary.
Previously, this system call was included in @system-service since it is
a "getter" only, i.e. only queries information, and doesn't change
anything, and hence was considered not risky.
However, as it turns out, mincore() is actually security sensitive, see
the discussion here:
https://lwn.net/Articles/776034/
Hence, let's adjust the system call filter and drop mincore() from it.
This constitues a compatibility break to some level, however I presume
we can get away with this as the systemcall is pretty exotic. The fact
that it is pretty exotic is also reflected by the fact that the kernel
intends to majorly change behaviour of the system call soon (see the
linked LWN article)
This reverts commit 55b6530baa.
This commit description says "Always rename an interface to its name specified
in config if no NamePolicy= is specified", but it does much more:
1. It completely changes the meaning of NamePolicy=kernel. Before, it meant that an interface
with type==NAMEPOLICY_KERNEL would not be renamed. After, the kernel name only works as
a fallback, if no policy matches.
2. The "if no NamePolicy= is specified" part is not true at all, the interface will be renamed
according to the specified NamePolicy=.
After 55b6530baa, the should_rename() function is named very misleadingly: it is only used
to mean "respect kernel predictable name if no naming policy matches".
Let's revert, and start with a clean slate. This fixes#11436.
Makes it easier to understand what was merged (and easier to realize why).
Example is a start job running, and another unit triggering a verify-active job. It is not clear what job was it that from baz.service that merged into the installed job for bar.service in the debug logs. This makes it useful when debugging issues.
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Trying to enqueue job baz.service/start/replace
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Installed new job baz.service/start as 498
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged into installed job bar.service/start as 497
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Enqueued job baz.service/start as 498
It becomes:
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged bar.service/verify-active into installed job bar.service/start as 497
So it's apparently problematic that we use STRV_MAKE() (i.e. a compound
initializer) outside of the {} block we use it in (and that includes
outside of the ({}) block, too). Hence, let's rework the macro to not
need that.
This also makes the macro shorter, which is definitely a good and more
readable. Moreover, it will now complain if the iterator is a "char*"
instead of a "const char*", which is good too.
Fixes: #11394
If an IPv6 route is added with a source address that is still
tentative, the kernel will refuse to install it.
Previously, once we sent the messages to the kernel to add the
addresses, we would immediately proceed to add the routes. The
addresses would usually still be tentative at this point, so
adding static IPv6 routes was broken - see issue #5882.
Now, only begin to configure routes once the addresses are ready,
by restructuring the state machine, and tracking when addresses are
ready, not just added.
Fixes: #5882
Signed-off-by: Daniel Axtens <dja@axtens.net>
c4397d94c3 introduces
link_detach_from_manager() and netdev_detach_from_manager(), and they
set Link::manager or NetDev::manager NULL.
But, at the time e.g. link is removed, hence link_drop() is called,
there may be still some asynchronous netlink call is waiting, and
their callbacks hit assertion.
This make {link,netdev}_detach_from_manager() just drop all references
from manager, but keep the pointer to manager.
Fixes#11411.
This reverts commit 69bd76f2b9.
$DBUS_SESSION_BUS_ADDRESS is again set only if the socket exists.
Quoting https://github.com/systemd/systemd/pull/11327#issuecomment-452019027:
> [setting $DBUS_SESSION_BUS_ADDRESS unconditionally] makes pam_systemd
> incompatible with installations and distributions where dbus was not
> configured with --enable-user-session, and the session dbus-daemon is started
> by autolaunching or dbus-launch (as opposed to dbus.socket). I don't think
> that's wise: using autolaunching or dbus-launch, and disabling or not
> installing dbus.socket and dbus.service on the systemd user instance, is our
> compatibility story for people who still need a D-Bus session bus per X11
> session for whatever reason.
>
> For example, Debian can currently do either way, with a dbus-user-session
> package strongly recommended but not actually mandatory. dbus-user-session
> requires libpam-systemd; if pam_systemd now requires dbus.socket (which is in
> the dbus-user-session package), that's a circular dependency, which we
> normally try hard to avoid.
For systems that use dbus.socket this doesn't matter much, because the
user session is ordered after the user managaer, which pulls in dbus.socket
very early. For example, when logging over ssh:
sshd[20796]: pam_systemd(sshd:session): pam-systemd initializing
sshd[20796]: pam_systemd(sshd:session): Asking logind to create session: uid=1001 pid=20796 service=sshd type=tty class=user desktop= seat= vtnr=0 tty= display= remote=yes remote_user= remote_host=::1
sshd[20796]: pam_systemd(sshd:session): Session limits: memory_max=n/a tasks_max=n/a cpu_weight=n/a io_weight=n/a
systemd[1]: Created slice User Slice of UID 1001.
systemd[1]: Starting User Runtime Directory /run/user/1001...
systemd-logind[1210]: New session 3796 of user guest.
systemd[1]: Started User Runtime Directory /run/user/1001.
systemd[1]: Starting User Manager for UID 1001...
systemd[20805]: pam_systemd(systemd-user:session): pam-systemd initializing
systemd[20805]: Starting D-Bus User Message Bus Socket.
...
systemd[20805]: Reached target Sockets.
systemd[20805]: Reached target Basic System.
systemd[1]: Started User Manager for UID 1001.
systemd[1]: Started Session 3796 of user guest.
sshd[20796]: pam_systemd(sshd:session): Reply from logind: id=3796 object_path=/org/freedesktop/login1/session/_33796 runtime_path=/run/user/1001 session_fd=13 seat= vtnr=0 original_uid=1001
sshd[20796]: pam_unix(sshd:session): session opened for user guest by (uid=0)
Hence, everything in the ssh session is ordered after the user instance.
And in the user instance, services should be orderd after dbus.socket using
inter-unit dependencies. dbus.socket in turns does
systemctl --user set-environment DBUS_SESSION_BUS_ADDRESS=unix:path=%t/bus.
So there should be no race between starting of the dbus socket and our check
if it exists.
The alternative would be to set the "DBUS_SESSION_BUS_ADDRESS=unix:path=%s/bus;autolaunch:".
AFAICT, this would work as well. But I don't see any case where it actually works
better. Since this is an area with many compatiblity concerns, let's stick to
the previous setup which seems to work well.
When `timedatectl set-time` is called, NTP client may be in
'activating' or something. For safety, let's treat such states as
'active'.
This also changes all unit file status except for 'masked' or 'disabled'
are treated as 'enabled'.
Before:
IMPORT builtin 'hwdb' fails: No such file or directory
After:
IMPORT builtin 'hwdb' fails: No data available
Previous log is confusing and may be understood as hwdb file not exist.
Now, not a few udevd debug logs come from sd-device or sd-hwdb.
Only setting LOG_REALM_UDEV may not sufficient to debug.
We have already similar code in main() and udevadm.
Found by inspecting results of running this small program:
int main(int argc, const char **argv) {
for (int i = 1; i < argc; i++) {
FILE *f;
char line[1024], prev[1024], *r;
int lineno;
prev[0] = '\0';
lineno = 1;
f = fopen(argv[i], "r");
if (!f)
exit(1);
do {
r = fgets(line, sizeof(line), f);
if (!r)
break;
if (strcmp(line, prev) == 0)
printf("%s:%d: error: dup %s", argv[i], lineno, line);
lineno++;
strcpy(prev, line);
} while (!feof(f));
fclose(f);
}
}
By b1c097af8d (#10239), the receive buffer
size for uevents was set by SO_RCVBUF at first, and fallback to
use SO_RCVBUFFORCE. So, as SO_RCVBUF limits to the buffer size
net.core.rmem_max, which is usually much smaller than 128MB udevd requests,
uevents buffer size was not sufficient.
This fixes the ordering of the request: SO_RCVBUFFORCE first, and
fallback to SO_RCVBUF. Then, udevd's uevent buffer size can be set to
128MB.
This also revert 903893237a.
Fixes#11314 and #10754.
This reverts commit b26c904113.
I don't see anythign wrong, but Ubuntu autopkgtest CI started failing fairly
consistently since this was merged. Let's see if reverting fixes things.
The message is changed from
Cannot create session: Already running in a session...
to
Not creating session: Already running in a session...
This is more neutral and avoids suggesting a problem.
"Will not create session: ..." was suggested, but it sounds like the action
would have yet to be performed. I think Using present continuous is better.
Fixes#10822 (for good now, I hope).
c4b69e990f effectively moved the initalization of socket.
Before that commit:
run → listen_fds → udev_ctrl_new → udev_ctrl_new_from_fd → socket()
After:
run → main_loop → manager_new → udev_ctrl_new_from_fd → socket()
The problem is that main_loop was called after daemonization. Move manager_new
out of main_loop and before daemonization.
Fixes#11314 (hopefully ;)).
v2: Yu Watanabe
sd_event is initialized in main_loop().
We're about to need it to be later in the file for the next commit.
Moving it now means that when we change it in the next commit, it's
not intermingled with the move.
No functional change intended.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Existing use of E2BIG is replaced with ENOBUFS (entry too long), and E2BIG is
reused for the new error condition (too many fields).
This matches the change done for systemd-journald, hence forming the second
part of the fix for CVE-2018-16865
(https://bugzilla.redhat.com/show_bug.cgi?id=1653861).
Calling mhd_respond(), which ulimately calls MHD_queue_response() is
ineffective at point, becuase MHD_queue_response() immediately returns
MHD_NO signifying an error, because the connection is in state
MHD_CONNECTION_CONTINUE_SENT.
As Christian Grothoff kindly explained:
> You are likely calling MHD_queue_repsonse() too late: once you are
> receiving upload_data, HTTP forces you to process it all. At this time,
> MHD has already sent "100 continue" and cannot take it back (hence you
> get MHD_NO!).
>
> In your request handler, the first time when you are called for a
> connection (and when hence *upload_data_size == 0 and upload_data ==
> NULL) you must check the content-length header and react (with
> MHD_queue_response) based on this (to prevent MHD from automatically
> generating 100 continue).
If we ever encounter this kind of error, print a warning and immediately
abort the connection. (The alternative would be to keep reading the data,
but ignore it, and return an error after we get to the end of data.
That is possible, but of course puts additional load on both the
sender and reciever, and doesn't seem important enough just to return
a good error message.)
Note that sending of the error does not work (the connection is always aborted
when MHD_queue_response is used with MHD_RESPMEM_MUST_FREE, as in this case)
with libµhttpd 0.59, but works with 0.61:
https://src.fedoraproject.org/rpms/libmicrohttpd/pull-request/1
We immediately read the whole contents into memory, making thigs much more
expensive. Sealed fds should be used instead since they are more efficient
on our side.
We'd first parse all or most of the message, and only then consider if it
is not too large. Also, when encountering a single field over the limit,
we'd still process the preceding part of the message. Let's be stricter,
and check size limits early, and let's refuse the whole message if it fails
any of the size limits.
We allocate a iovec entry for each field, so with many short entries,
our memory usage and processing time can be large, even with a relatively
small message size. Let's refuse overly long entries.
CVE-2018-16865
https://bugzilla.redhat.com/show_bug.cgi?id=1653861
What from I can see, the problem is not from an alloca, despite what the CVE
description says, but from the attack multiplication that comes from creating
many very small iovecs: (void* + size_t) for each three bytes of input message.
If creation of the message failed, we'd write a bogus entry:
systemd-coredump[1400]: Cannot store coredump of 416 (systemd-journal): No space left on device
systemd-coredump[1400]: MESSAGE=Process 416 (systemd-journal) of user 0 dumped core.
systemd-coredump[1400]: Coredump diverted to
This affects systemd-journald and systemd-coredump.
Example entry:
$ journalctl -o export -n1 'MESSAGE=Something logged'
__CURSOR=s=976542d120c649f494471be317829ef9;i=34e;b=4871e4c474574ce4a462dfe3f1c37f06;m=c7d0c37dd2;t=57c4ac58f3b98;x=67598e942bd23dc0
__REALTIME_TIMESTAMP=1544035467475864
__MONOTONIC_TIMESTAMP=858200964562
_BOOT_ID=4871e4c474574ce4a462dfe3f1c37f06
PRIORITY=6
_UID=1000
_GID=1000
_CAP_EFFECTIVE=0
_SELINUX_CONTEXT=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
_AUDIT_SESSION=1
_AUDIT_LOGINUID=1000
_SYSTEMD_OWNER_UID=1000
_SYSTEMD_UNIT=user@1000.service
_SYSTEMD_SLICE=user-1000.slice
_SYSTEMD_USER_SLICE=-.slice
_SYSTEMD_INVOCATION_ID=1c4a469986d448719cb0f9141a10810e
_MACHINE_ID=08a5690a2eed47cf92ac0a5d2e3cf6b0
_HOSTNAME=krowka
_TRANSPORT=syslog
SYSLOG_FACILITY=17
SYSLOG_IDENTIFIER=syslog-caller
MESSAGE=Something logged
_COMM=poc
_EXE=/home/zbyszek/src/systemd-work3/poc
_SYSTEMD_CGROUP=/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
_SYSTEMD_USER_UNIT=gnome-terminal-server.service
SYSLOG_PID=4108
SYSLOG_TIMESTAMP=Dec 5 19:44:27
_PID=4108
_CMDLINE=./poc AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
_SOURCE_REALTIME_TIMESTAMP=1544035467475848
$ journalctl -o export -n1 'MESSAGE=Something logged' --output-fields=_CMDLINE|wc
6 2053 2097410
2MB might be hard for some clients to use meaningfully, but OTOH, it is
important to log the full commandline sometimes. For example, when the program
is crashing, the exact argument list is useful.
This fixes a crash where we would read the commandline, whose length is under
control of the sending program, and then crash when trying to create a stack
allocation for it.
CVE-2018-16864
https://bugzilla.redhat.com/show_bug.cgi?id=1653855
The message actually doesn't get written to disk, because
journal_file_append_entry() returns -E2BIG.
systemd-coredump[9982]: MESSAGE=Process 771 (systemd-journal) of user 0 dumped core.
systemd-coredump[9982]: Coredump diverted to /var/lib/systemd/coredump/core...
log_dispatch() calls log_dispatch_internal() which calls write_to_journal()
which appends MESSAGE= on its own.
In all three cases, sd_event_loop() will return the exit code anyway.
If sd_event_loop() returns negative, failure is logged and results in an
immediate return. Otherwise, we don't care if sd_event_loop() returns 0
or positive, because the return value feeds into DEFINE_MAIN_FUNCTION(), which
doesn't make the distinction.
Empty line between setting the output parameter and return is removed. I like
to think about both steps as part of returning from the function, and there's
no need to separate them.
Similarly, if we need to unset a pointer after successfully passing ownership,
use TAKE_PTR and do it immediately after the ownership change, without an empty
line inbetween.
When running PROGRAM="...", we would log
systemd-udevd[447]: Failed to wait spawned command '...': Input/output error
no matter why the program actually failed, at error level.
The code wouldn't distinguish between an internal failure and a failure in the
program being called and run sd_event_exit(..., -EIO) on any kind of error. EIO
is rather misleading here, becuase it suggests a serious error.
on_spawn_sigchld is updated to set the return code to distinguish failure to
spawn, including the program being killed by a signal (a negative return value),
and the program failing (positive return value).
The logging levels are adjusted, so that for PROGRAM= calls, which are
essentially "if" statements, we only log at debug level (unless we get a
timeout or segfault or another unexpected error).
There's very little lost if the variable is set for a socket that isn't
connectible, but a lot lost (races, ...) if it's not set but the socket exists.
Also, drop the FIXME note, since we don't plan to revert this revert any time
soon.
It seems quite useful to provide this additional information in public exported
functions.
This is a c99 feature, not supported in C++. Without the check in _sd-common.h:
FAILED: test-bus-vtable-cc@exe/src_libsystemd_sd-bus_test-bus-vtable-cc.cc.o
...
In file included from ../src/libsystemd/sd-bus/test-bus-vtable-cc.cc:9:
In file included from ../src/systemd/sd-bus-vtable.h:26:
In file included from ../src/systemd/sd-bus.h:26:
../src/systemd/sd-id128.h:38:47: error: static array size is a C99 feature, not permitted in C++
char *sd_id128_to_string(sd_id128_t id, char s[static SD_ID128_STRING_MAX]);
^
In .c files, I opted to use the define for consistency, even though we don't support
compilation with a C++ compiler, so the unconditional keyword would work too.
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html
This only works with clang, unfortunately gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).
Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
^ ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
^~~~~~~~~~~~
This reverts commit 2b2b7228bf.
Fixes#11293.
Removing the environment variable causes problems, e.g. Xfce and Chromium and
... don't communicate with the running dbus instance. If they attempt to start their
own instance, things become even more confusing. Those packages could be fixed
one by one, but removing the variable right now is causing too many problems.
MIPS/O32's st_rdev member of struct stat is unsigned long, which
is 32bit, while dev_t is defined as 64bit, which make some problems
in device_path_parse_major_minor.
Don't pass st.st_rdev, st_mode to device_path_parse_major_minor,
while pass 2 seperate variables. The result of stat is alos copied
out into these 2 variables. Fixes: #11247
The test fails under valgrind, so there was an exception for valgrind.
Unfortunately that check only works when valgrind-devel headers are
available during build. But it is possible to have just valgrind installed,
or simply install it after the build, and then "valgrind test-json" would
fail.
It also seems that even without valgrind, this fails on some arm32 CPUs.
Let's do the usual-style test for absolute and relative differences.
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.
Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`.
I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
As devpath may not be set yet.
When debug logging is enabled, log_device_*() calls
sd_device_get_sysname(). So, we should not assume that devpath is always
set.
Fixes#11258.
The current support in kernel-install for initrd images doesn't copy
over the initrd file or allow a means for it to be specified (it
requires a specific filename in a particular directory).
This patchset adds support for (optionally) providing the name of
initial ramdisk file to copied over and used by kernel-install.
"src/basic/time-util.c" is part of src/basic and should not include
"src/shared/serialize.h". It is one of the few cases where this
is done. Also, it's not even required.