This moves pretty much all uses of getpid() over to getpid_raw(). I
didn't specifically check whether the optimization is worth it for each
replacement, but in order to keep things simple and systematic I
switched over everything at once.
This patch ensures that session devices are saved for each session.
In order to make the revokation logic work when logind is restarted, the
session devices are now saved in the session state files and their respective
file descriptors sent to PID1's fdstore in order to keep them open accross
restart.
This is mandatory in order to keep the revokation logic working. Indeed in case
of input-devices, the same file descriptors must be shared by logind and a
given session controller in order EVIOCREVOKE to work otherwise multiple
sessions can have device access in parallel.
This should be the only remaining and missing piece for making logind fully
restartable.
Fixes: #1163
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
The parsing functions for [User]TasksMax were inconsistent. Empty string and
"infinity" were interpreted as no limit for TasksMax but not accepted for
UserTasksMax. Update them so that they're consistent with other knobs.
* Empty string indicates the default value.
* "infinity" indicates no limit.
While at it, replace opencoded (uint64_t) -1 with CGROUP_LIMIT_MAX in TasksMax
handling.
v2: Update empty string to indicate the default value as suggested by Zbigniew
Jędrzejewski-Szmek.
v3: Fixed empty UserTasksMax handling.
Let's change from a fixed value of 12288 tasks per user to a relative value of
33%, which with the kernel's default of 32768 translates to 10813. This is a
slight decrease of the limit, for no other reason than "33%" sounding like a nice
round number that is close enough to 12288 (which would translate to 37.5%).
(Well, it also has the nice effect of still leaving a bit of room in the PID
space if there are 3 cooperating evil users that try to consume all PIDs...
Also, I like my bikesheds blue).
Since the new value is taken relative, and machined's TasksMax= setting
defaults to 16384, 33% inside of containers is usually equivalent to 5406,
which should still be ample space.
To summarize:
| on the host | in the container
old default | 12288 | 12288
new default | 10813 | 5406
For similar reasons as the recent addition of a limit on sessions.
Note that we don't enforce a limit on inhibitors per-user currently, but
there's an implicit one, since each inhibitor takes up one fd, and fds are
limited via RLIMIT_NOFILE, and the limit on the number of processes per user.
Let's make sure we process session and inhibitor pipe fds (that signal
sessions/inhibtors going away) at a higher priority
than new bus calls that might create new sessions or inhibitors. This helps
ensuring that the number of open sessions stays minimal.
We really should put limits on all resources we manage, hence add one to the
number of concurrent sessions, too. This was previously unbounded, hence set a
relatively high limit of 8K by default.
Note that most PAM setups will actually invoke pam_systemd prefixed with "-",
so that the return code of pam_systemd is ignored, and the login attempt
succeeds anyway. On systems like this the session will be created but is not
tracked by systemd.
This ensures that users sessions are properly cleaned up after.
The admin can still enable or disable linger for specific users to allow
them to run processes after they log out. Doing that through the user
session is much cleaner and provides better control.
dbus daemon can now be run in the user session (with --enable-user-session,
added in 1.10.2), and most distributions opted to pick this configuration.
In the normal case it makes a lot of sense to kill remaining processes.
The exception is stuff like screen and tmux. But it's easy enough to
work around, a simple example was added to the man page in previous
commit. In the long run those services should integrate with the systemd
users session on their own.
https://bugs.freedesktop.org/show_bug.cgi?id=94508https://github.com/systemd/systemd/issues/2900
systemd-logind uses mkdir_label and label_fix functions without calling
first mac_selinux_init. This makes /run/user/$UID/ directories not
labelled correctly on an Arch Linux system using SELinux.
Fix this by calling mac_selinux_init("/run") early in systemd-logind.
This makes files created in /etc/udev/rules.d and /var/lib/systemd to be
labelled through transitions in the SELinux policy instead of using
setfscreatecon (with mac_selinux_create_file_prepare).
Issue #2388 suggests the current TasksMax= setting for user processes is to low. Bump it to 12K. Also, bump the
container TasksMax= from 8K to 16K, so that it remains higher than the one for user processes.
(Compare: the kernel default limit for processes system-wide is 32K).
Fixes#2388
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
This new setting configures the TasksMax= field for the slice objects we
create for each user.
This alters logind to create the slice unit as transient unit explicitly
instead of relying on implicit generation of slice units by simply
starting them. This also enables us to set a friendly description for
slice units that way.
The macro is generically useful for putting together search paths, hence
let's make it truly generic, by dropping the implicit ".d" appending it
does, and leave that to the caller. Also rename it from
CONF_DIRS_NULSTR() to CONF_PATHS_NULSTR(), since it's not strictly about
dirs that way, but any kind of file system path.
Also, mark CONF_DIR_SPLIT_USR() as internal macro by renaming it to
_CONF_PATHS_SPLIT_USR() so that the leading underscore indicates that
it's internal.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
The following functions return immediately if a null pointer was passed.
* calendar_spec_free
* link_address_free
* manager_free
* sd_bus_unref
* sd_journal_close
* udev_monitor_unref
* udev_unref
It is therefore not needed that a function caller repeats a corresponding check.
This issue was fixed by using the software Coccinelle 1.0.1.
Use mfree() where we can.
Drop unnecessary {}.
Drop unnecessary variable declarations.
Cast syscall invocations where explicitly don't care for the return
value to (void).
Reword a comment.
Let logind use the sd_bus_track helper object to track the controllers of
sessions. This does not only remove quite some code but also kills the
unconditional matches for all NameOwnerChanged signals.
The latter is something we should never ever do, as it wakes up the daemon
every time a client connects, which doesn't scale.
Commit c0f32805 ("logind: use sd_event timer source for inhibitor
logic") reworked the main loop logic of logind so that it uses a
real timeout callback handler to execute delayed functions.
What the old code did, however, was to call those functions on
every iteration in the main loop, not only when the timeout
expired.
Restore that behavior by bringing back manager_dispatch_delayed(),
and call it from manager_run(). The internal event source callback
manager_inhibit_timeout_handler() was turned into a wrapper of
manager_dispatch_delayed() now.
This ports a lot of manual code over to sigprocmask_many() and friends.
Also, we now consistly check for sigprocmask() failures with
assert_se(), since the call cannot realistically fail unless there's a
programming error.
Also encloses a few sd_event_add_signal() calls with (void) when we
ignore the return values for it knowingly.
Port over more code from shutdownd and teach logind to write /run/nologin at
least 5 minutes before the system is going down, and
/run/systemd/shutdown/scheduled when a shutdown is scheduled.
Add a timer to print UTMP wall messages so that it repeatedly informs users
about a scheduled shutdown:
* every 1 minute with less than 10 minutes to go
* every 15 minutes with less than 60 minutes to go
* every 30 minutes with less than 180 minutes (3 hours) to go
* every 60 minutes if more than that to go
This functionality only active if the .EnableWallMessages DBus property
is set to true. Also, a custom string can be added to the wall message,
set through the WallMessagePrefix property.
Add a method called ScheduleShutdown in org.freedesktop.login1.Manager
which adds a timer to shut down the system at a later point in time.
The first argument holds the type of the schedule that is about to
happen, and must be one of 'reboot', 'halt' or 'poweroff'.
The second argument specifies the absolute time, based on
CLOCK_REALTIME in nanoseconds, at which the the operation should be
executed.
To cancel a previously scheduled shutdown, the CancelScheduledShutdown()
can be called, which returns a bool, indicating whether a scheduled
timeout was cancelled.
Also add a new property called ScheduledShutdown which returns the
equivalent to what was passed in via ScheduleShutdown, as '(st)' type.
Instead of open-coding the delayed action and inhibit timeout logic,
switch over to a real sd_event_source based implementation.
This is not only easier to read but also allows us to add more timers
in the future.
This introduces 'HoldoffTimeoutSec' to logind.conf to make
IGNORE_LID_SWITCH_{SUSPEND,STARTUP}_USEC configurable.
Background: If an external monitor is connected, or if the system is
docked, we want to ignore LID events. This is required to support setups
where a laptop is used with external peripherals while the LID is closed.
However, this requires us to probe all hot-plugged devices before reacting
to LID events. But with modern buses like USB, the standards do not impose
any timeout on the slots, so we have no chance to know whether a given
slot is used or not. Hence, after resume and startup, we have to wait a
fixed timeout to give the kernel a chance to probe devices. Our timeout
has always been generous enough to support even the slowest devices.
However, a lot of people didn't use these features and wanted to disable
the hold-off timer. Now we provide a knob to do that.
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'
Plus some whitespace, linewrap, and indent adjustments.
Using:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/(if\s*\([^\n]+\))\s*{\n(\s*)(log_[a-z_]*_errno\(\s*([->a-zA-Z_]+)\s*,[^;]+);\s*return\s+\g4;\s+}/\1\n\2return \3;/msg;
print;'
$f
done
And a couple of manual whitespace fixups.