Only log at LOG_INFO level, i.e. make this informational. During start
let's leave it at LOG_WARNING though.
Of course, it's ugly leaving processes around like that either in start
or in stop, but at start its more dangerous than on stop, so be tougher
there.
Whenever we pick up a new line in /proc/self/mountinfo and want to
synthesize a new mount unit from it, let's say which one it is.
Moreover, downgrade the log message when we encounter a mount point with
an overly long name to LOG_WARNING, since it's generally fine to ignore
such mount points.
Also, attach a catalog entry to explain the situation further.
Prompted-By: #15221
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.
hashmap_put_strdup() was already doing something similar.
(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)
No functional change intended.
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
Systems where a mount point is expected to be read-write needs a way to
fail mount units that fallback as read-only.
Add a property to allow setting the -w option when calling mount(8).
Regardless of whether a mount is setup in initrd or int the main system,
the network default dependencies _netdev should still be honored.
IOW if a mount unit use the following options "x-initrd.mount,_netdev", it
should be ordered against initrd-fs.target, network.target,
network-online.target.
/dev/vdb1 /mnt ext4 x-initrd.mount,_netdev defaults 0 0
Before this patch:
Before=umount.target initrd-fs.target
After=system.slice sysroot.mount dev-vdb1.device -.mount systemd-journald.socket blockdev@dev-vdb1.target
After this patch:
Before=initrd-fs.target umount.target
After=network-online.target -.mount blockdev@dev-vdb1.target dev-vdb1.device sysroot.mount system.slice network.target systemd-journald.socket
fstab-generator was also handling the default ordering dependencies for mount
units setup in initrd. To do that it was turning the defaults dependencies off
completely and ordered the mount unit against either local-fs.target or
initrd-fs.target or initrd-root-fs.target itself.
But it had the bad side effect to also remove all other default dependencies as
well. Thus if an initrd mount was using _netdev, the network dependencies were
missing.
In general fstab-generator shouldn't use DefaultDependecies=no because it can
handle only a small set of the default dependencies the rest are dealt by pid1.
So this patch makes pid1 handle all default dependencies.
Both fstab-generator and pid1 are duplicating the handling of
"Before=local-fs.target" dependency for mount units.
fstab-generator is correctly skipping this dep if the mount unit is activated
by an automount unit.
However the condition used by pid1 was incorrect and missed the case when a mount
unit uses "x-systemd.automount" since in this case the mount unit should be
(only) ordered against its automount unit counterpart instead.
This way we shuld be able to order mounts properly against their backing
services in case complex storage is used (i.e. LUKS), even if the device
path used for mounting the devices is different from the expected device
node of the backing service.
Specifically, if we have a LUKS device /dev/mapper/foo that is mounted
by this name all is trivial as the relationship can be established a
priori easily. But if it is mounted via a /dev/disk/by-uuid/ symlink or
similar we only can relate the device node generated to the one mounted
at the moment the device is actually established. That's because the
UUID of the fs is stored inside the encrypted volume and thus not
knowable until the volume is set up. This patch tries to improve on this
situation: a implicit After=blockdev@.target dependency is generated for
all mounts, based on the data from /proc/self/mountinfo, which should be
the actual device node, with all symlinks resolved. This means that as
soon as the mount is established the ordering via blockdev@.target will
work, and that means during shutdown it is honoured, which is what we
are looking for.
Note that specifying /etc/fstab entries via UUID= for LUKS devices still
sucks and shouldn't be done, because it means we cannot know which LUKS
device to activate to make an fs appear, and that means unless the
volume is set up at boot anyway we can't really handle things
automatically when putting together transactions that need the mount.
This extends on d253a45e1c, and instead of
merging just a single flag from previous mount entries of
/proc/self/mountinfo for the same path we merge all three.
This shouldn't change behaviour, but I think make things more readable.
Previously we'd set MOUNT_PROC_IS_MOUNTED unconditionally, we still do.
Previously we'd inherit MOUNT_PROC_JUST_MOUNTED from a previous entry on
the same line, we still do.
MOUNT_PROC_JUST_CHANGED should generally stay set too. Why that? If we
have two mount entries on the same mount point we'd first process one
and then the other, and the almost certainly different mount parameters
of the two would mean we'd set MOUNT_PROC_JUST_CHANGED for the second.
And with this we'll definitely do that still.
This also adds a comment explaining the situation a bit, and why we get
into this situation.
When starting a mount unit, systemd invokes mount command and moves the
unit's internal state to "mounting". Then it watches for updates of
/proc/self/mountinfo. When the expected mount entry newly appears in
mountinfo, the unit internal state is changed to "mounting-done".
Finally, when systemd finds the mount command has finished, it checks
whether the unit internal state is "mounting-done" and changes the state
to "mounted".
If the state was not "mounting-done" in the last step though mount command
was successfully finished, the unit is marked as "failed" with following
log messages:
Mount process finished, but there is no mount.
Failed with result 'protocol'.
If daemon-reload is done in parallel with starting mount unit, it is
possible that things happen in following order and result in above failure.
1. the mount unit state changes to "mounting"
2. daemon-reload saves the unit state
3. kernel completes the mount and /proc/self/mountinfo is updated
4. daemon-reload restores the saved unit state, that is "mounting"
5. systemd notices the mount command has finished but the unit state
is still "mounting" though it should be "mounting-done"
mount_setup_existing_unit() should take into account that MOUNT_MOUNTING
is transitional state and set MOUNT_PROC_JUST_MOUNTED flag if the unit
comes from /proc/self/mountinfo so that mount_process_proc_self_mountinfo()
later can make state transition from "mounting" to "mounting-done".
Fixes: #10872
Similar, refuse triggering deps on units that cannot trigger.
And rework how we ignore After= dependencies on device units, to work
the same way.
See: #14142
No functional change, just adjusting code to follow the same pattern
everywhere. In particular, never call _verify() on an already loaded unit,
but return early from the caller instead. This makes the code a bit easier
to follow.
unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional()
are really the same, with one minor difference in behaviour. Let's drop
the second function.
"_optional" in the name suggests that it's the "dropin" part that is optional.
(Which it is, but in this case, we mean the fragment to be optional.)
I think the new version with a flag is easier to understand.
This is the most basic consumer of the new systemd-vs-kernel checker,
both acting as a reasonable standalone exerciser of the code, and also
as a way for easy inspection of deviations from systemd internal state.
v2:
- if RestartKillSignal= is not specified, fall back to KillSignal=. This is necessary
to preserve backwards compatibility (and keep KillSignal= generally useful).
(The interesting bits about the what and why are in a comment in the
patch, please have a look there instead of looking here in the commit
msg).
Fixes: #10872
Our IO handler is only installed for one fd, hence there's no reason to
conditionalize on it again.
Also, split out the draining into a helper function of its own.
This wraps a few common steps. It is defined as inline function instead of in a
.c file to avoid having a .c file. With a .c file, we would have three choices:
- either link it into libshared, but then then libshared would have to be
linked to libmount.
- or compile the .c file into each target separately. This has the disdvantage
that configuration of every target has to be updated and stuff will be compiled
multiple times anyway, which is not too different from keeping this in the
header file.
- or create a new convenience library just for this. This also has the disadvantage
that the every target would have to be updated, and a separate library for a
10 line function seems overkill.
By keeping everything in a header file, we compile this a few times, but
otherwise it's the least painful option. The compiler can optimize most of the
function away, because it knows if 'source' is set or not.
It seems better to use just a single parsing algorithm for /proc/self/mountinfo.
Also, unify the naming of variables in all places that use mnt_table_next_fs().
It makes it easier to compare the different call sites.
Use a trivial header file to share mnt_free_tablep and mnt_free_iterp.
It would be nicer put this in mount-util.h, but libmount.h is not in the
default include path, and the build system would have to be adjusted to pass
pkg-config include path in various places, and it's just not worth the trouble.
A separate header file works nicely.
Some PIDs can remain in the watched list even though their processes have
exited since a long time. It can easily happen if the main process of a forking
service manages to spawn a child before the control process exits for example.
However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stalled PIDs from
the watched process list.
If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
Just some renaming, no change in behaviour.
Background: I'd like to add more functions unit_test_xyz() that test
various things, hence let's streamline the naming a bit.
As device units will be reloaded by systemd whenever the corresponding device generates a "changed" event, if the mount unit / cryptsetup service is wanted by its device unit, the former can be restarted by systemd unexpectedly after the user stopped them explicitly. It is not sensible at all and can be considered dangerous. Neither is the behaviour conventional (as `auto` in fstab should only affect behaviour on boot and `mount -a`) or ever documented at all (not even in systemd, see systemd.mount(5) and crypttab(5)).
The problem was introduced in a37422045fbb68ad68f734e5dc00e0a5b1759773:
we have a unit which has a fragment, and when we'd update it based on
/proc/self/mountinfo, we'd say that e.g. What=/dev/loop8 has origin-fragment.
This commit changes two things:
- origin-fragment is changed to origin-mountinfo-implicit
- when we stop a unit, mountinfo information is flushed and all deps based
on it are dropped.
The second step is important, because when we restart the unit, we want to
notice that we have "fresh" mountinfo information. We could keep the old info
around and solve this in a different way, but keeping stale information seems
inelegant.
Fixes#11342.