Why this change
---------------
Assumption - PAM's auth stack is properly configured.
Currently account pam_systemd_home.so returns PAM_SUCCESS for non
systemd-homed users, and a variety of return values (including
PAM_SUCCESS) for homed users.
account pam_unix returns PAM_AUTHINFO_UNAVAIL for systemd-homed
users, and a variety of return values (including PAM_AUTHINFO_UNAVAIL)
for normal users.
No possible combination in the pam stack can let us preserve the
various return values of the modules. For example, the configuration
mentioned in the manpage causes account pam_unix to never be reached
since pam_systemd_home just returns a success for ordinary users. Users
with expired passwords are allowed to log in because a check cannot be
made.
More configuration examples and why they don't work are mentioned
in #16906 and the downstream discussion linked there.
After this change
-----------------
account pam_unix will continue to return wrong value for homed users.
But we can skip the module conditionally using the return value from
account pam_systemd_home. We can already do this with the auth and
password modules.
We return BUS_ERROR_NO_SUCH_UNIT a.k.a. org.freedesktop.systemd1.NoSuchUnit
in various places. In #16813:
Aug 22 06:14:48 core sudo[2769199]: pam_systemd_home(sudo:account): Failed to query user record: Unit dbus-org.freedesktop.home1.service not found.
Aug 22 06:14:48 core dbus-daemon[5311]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.home1.service': Unit dbus-org.freedesktop.home1.service not found.
Aug 22 06:14:48 core dbus-daemon[5311]: [system] Activating via systemd: service name='org.freedesktop.home1' unit='dbus-org.freedesktop.home1.service' requested by ':1.6564' (uid=0 pid=2769199 comm="sudo su ")
This particular error comes from bus_unit_validate_load_state() in pid1:
case UNIT_NOT_FOUND:
return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_UNIT, "Unit %s not found.", u->id);
It seems possible that we should return a different error, but it doesn't really
matter: if we change pid1 to return a different error, we still need to handle
BUS_ERROR_NO_SUCH_UNIT as in this patch to handle pid1 with current code.
We'd like to use it for FIDO2 tokens too, and the concept is entirely
generic, hence let's just reuse the field, but rename it. Read the old
name for compatibility, and treat the old name and the new name as
identical for most purposes.
This variable is read by the module and can be used instead of the
suspend= PAM module parameter.
It is also set for the session itself to make debugging easy.
We might pin a home through authentication and a different one through a
session, all from the same PAM context, like sudo does. Hence also store
the referencing fd keyed by the user name.
Since acquiring user records involves plenty of IPC we try to cache user
records in the PAM context between our various hooks. Previously we'd
just cache whatever we acquired, and use it from the on, forever until
the context is destroyed.
This is problematic however, since some programs (notably sudo) use the
same PAM context for multiple different operations. Specifically, sudo
first authenticates the originating user before creating a session for
the destination user, all with the same PAM context. Thankfully, there
was a safety check for this case in place that re-validated that the
cached user record actually matched our current idea of the user to
operate on, but this just meant the hook would fail entirely.
Let's rework this: let's key the cache by the user name, so that we do
not confused by the changing of the user name during the context's
lifecycle and always, strictly use the cached user record of the user we
operate on.
Essentially this just means we now include the user name in the PAM data
field.
Secondly, this gets rid of the extra PAM data field that indicates
whether a user record is from homed or something else. To simplify
things we instead just cache the user record twice: once for consumption
by pam_systemd_home (which only wants homed records) and once shared by
pam_systemd and pam_systemd_home (and whoever else wants it). The cache
entries simply have different field names.
Previously pam_systemd_home.so was relying on `PAM_PROMPT_ECHO_OFF` to
display error messages to the user and also display the next prompt.
`PAM_PROMPT_ECHO_OFF` was never meant as a way to convey information to
the user, and following the example set in pam_unix.so you can see that
it's meant to _only_ display the prompt. Details about why the
authentication failed should be done in a `PAM_ERROR_MSG` before
displaying a short prompt as per usual using `PAM_PROMPT_ECHO_OFF`.
This reworks the user validation infrastructure. There are now two
modes. In regular mode we are strict and test against a strict set of
valid chars. And in "relaxed" mode we just filter out some really
obvious, dangerous stuff. i.e. strict is whitelisting what is OK, but
"relaxed" is blacklisting what is really not OK.
The idea is that we use strict mode whenver we allocate a new user
(i.e. in sysusers.d or homed), while "relaxed" mode is when we process
users registered elsewhere, (i.e. userdb, logind, …)
The requirements on user name validity vary wildly. SSSD thinks its fine
to embedd "@" for example, while the suggested NAME_REGEX field on
Debian does not even allow uppercase chars…
This effectively liberaralizes a lot what we expect from usernames.
The code that warns about questionnable user names is now optional and
only used at places such as unit file parsing, so that it doesn't show
up on every userdb query, but only when processing configuration files
that know better.
Fixes: #15149#15090