LOOP_CONFIGURE allows us to configure a loopback device in one ioctl
instead of two, which is not just faster but also removes the race that
udev might start probing the device before we adjusted things properly.
Unfortunately LOOP_CONFIGURE is broken in regards to LO_FLAGS_PARTSCAN
as of kernel 5.8.0. This patch contains a work-around for that, to
fallback to old behaviour if partition scanning is requested but does
not work. Sucks a bit.
Proposed upstream fix for that issue:
https://lkml.org/lkml/2020/8/6/97
Apparently, if IO is still in flight at the moment we invoke LOOP_CLR_FD
it is likely simply dropped (probably because yanking physical storage,
such as a USB stick would drop it too). Let's protect ourselves against
that and always sync explicitly before we invoke it.
If we call LOOP_CLR_FD and LOOP_CTL_REMOVE too rapidly, the kernel cannot deal
with that (5.3.13-300.fc31.x86_64 running on dual core
Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz).
$ sudo strace -eioctl build/test-dissect-image /tmp/foobar3.img
ioctl(3, TCGETS, 0x7ffcee47de20) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(4, LOOP_CTL_GET_FREE) = 9
ioctl(5, LOOP_SET_FD, 3) = 0
ioctl(5, LOOP_SET_STATUS64, {lo_offset=0, lo_number=0, lo_flags=LO_FLAGS_READ_ONLY|LO_FLAGS_AUTOCLEAR|LO_FLAGS_PARTSCAN, lo_file_name="", ...}) = 0
ioctl(5, BLKGETSIZE64, [299999744]) = 0
ioctl(5, CDROM_GET_CAPABILITY, 0) = -1 EINVAL (Invalid argument)
ioctl(5, BLKSSZGET, [512]) = 0
Waiting for device (parent + 0 partitions) to appear...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD) = 0
ioctl(3, LOOP_CTL_REMOVE, 9) = -1 EBUSY (Device or resource busy)
Failed to remove loop device: Device or resource busy
This seems to be clear race condition, and attaching strace is generally enough
to "win" the race. But even with strace attached, we will fail occasionally.
Let's wait a bit and retry. With the wait, on my machine, the second attempt
always succeeds:
...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD) = 0
ioctl(3, LOOP_CTL_REMOVE, 9) = -1 EBUSY (Device or resource busy)
ioctl(3, LOOP_CTL_REMOVE, 9) = 9
+++ exited with 0 +++
Without the wait, all 64 attempts will occasionally fail.
The function no longer returns the fd. This complicated semantics, because it
wasn't clear what holds the ownership: the return value or the output
parameter. There were no users of the fd in the return value, so let's
simplify things conceptually and only return the fd once.
Reduce the scope of variables.
LOOP_CLR_FD was called on the wrong fd. Let's use a cleanup function to make
this automatic and reduce chances of a mixup in the future.
CID 1408498.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
Let's add a common implementation for regular file checks, that are
careful to return the right error code (EISDIR/EISLNK/EBADFD) when we
are encountering a wrong file node.
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.
Let's hence drop inclusion of:
1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, dito.
3. sys/signalfd.h which we can replace by a simple struct forward
declaration
4. process-util.h which was needed for getpid_cached() which we now hide
in a funciton log_emergency_level() instead, which nicely abstracts
the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
forward declaration suffices for that too.
Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.
(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
This adds a small tool that may be used to look into OS images, and mount them
to any place. This is mostly a friendlier version of test-dissect-image.c. I am
not sure this should really become a proper command of systemd, hence for now
do not install it into bindir, but simply libexecdir.
This tool is already pretty useful since you can mount image files with it,
honouring the various partitions correctly. I figure this is going to become
more interesting if the dissctor learns luks and verity support.
This adds two new APIs to systemd:
- loop-util.h is a simple internal API for allocating, setting up and releasing
loopback block devices.
- dissect-image.h is an internal API for taking apart disk images and figuring
out what the purpose of each partition is.
Both APIs are basically refactored versions of similar code in nspawn. This
rework should permit us to reuse this in other places than just nspawn in the
future. Specifically: to implement RootImage= in the service image, similar to
RootDirectory=, but operating on a disk image; to unify the gpt-auto-discovery
generator code with the discovery logic in nspawn; to add new API to machined
for determining the OS version of a disk image (i.e. not just running
containers). This PR does not make any such changes however, it just provides
the new reworked API.
The reworked code is also slightly more powerful than the nspawn original one.
When pointing it to an image or block device with a naked file system (i.e. no
partition table) it will simply make it the root device.