Commit Graph

42 Commits

Author SHA1 Message Date
Yu Watanabe 377a9545e9 tree-wide: fix typos found by Fossies codespell report 2020-10-24 13:29:31 +02:00
Lennart Poettering b202ec2068 loop-util: wait a random time before trying again
Let's try to make collisions when multiple clients want to use the same
device less likely, by sleeping a random time on collision.

The loop device allocation protocol is inherently collision prone:
first, a program asks which is the next free loop device, then it tries
to acquire it, in a separate, unsynchronized setp. If many peers do this
all at the same time, they'll likely all collide when trying to
acquire the device, so that they need to ask for a free device again and
again.

Let's make this a little less prone to collisions, reducing the number
of failing attempts: whenever we notice a collision we'll now wait
short and randomized time, making it more likely another peer succeeds.

(This also adds a similar logic when retrying LOOP_SET_STATUS64, but
with a slightly altered calculation, since there we definitely want to
wait a bit, under all cases)
2020-10-22 14:58:28 +02:00
Lennart Poettering 021bf17528 loop-util: if a loopback device we want to use still has partitions, do something about it
On current kernels (5.8 for example) under some conditions I don't fully
grok it might happen that a detached loopback block device still has
partition block devices around. Accessing these partition block devices
results in EIO errors (that also fill up dmesg). These devices cannot be
claned up with LOOP_CLR_FD (since the main device already is officially
detached), nor with LOOP_CTL_DELETE (returns EBUSY as long as the
partitions still exist). This is a kernel bug. But it appears to apply
to all recent kernels. I cannot really pin down what triggers this,
suffice to say our heavy-duty test can trigger it.

Either way, let's do something about it: when we notice this state we'll
attach an empty file to it, which is guaranteed to have to part table.
This makes the partitions go away. After closing/reoping the device we
hence are good to go again. ugly workaround, but I think OK enough to
use.

The net result is: with this commit, we'll guarantee that by the time we
attach a file to the loopback device we have zero kernel partitions
associated with it. Thus if we then wait for the kernel partitions we
need to appear we should have entirely reliable behaviour even if
loopback devices by the name are heavily recycled and udev events reach
us very late.

Fixes: #16858
2020-10-22 14:58:27 +02:00
Lennart Poettering 95c5009248 loop-util: LOOP_CLR_FD is async, don't retry to reuse a device right after issuing it
When we fall back to classic LOOP_SET_FD logic in case LOOP_CONFIGURE
didn't work we issue LOOP_CLR_FD first. But that call turns out to be
potentially async in the kernel: if something else (let's say
udev/blkid) is accessing the device the ioctl just sets the autoclear
flag and exits. Hence quite often the LOOP_SET_FD will subsequently
fail. Let's avoid the trouble, and immediately exit with EBUSY if
LOOP_CONFIGURE fails, and but remember that LOOP_CONFIGURE is not
available so that on the next iteration we go directly for LOOP_SET_FD
instead.
2020-10-22 14:58:27 +02:00
Lennart Poettering 738f29cb53 loop-util: handle EAGAIN on LOOP_SET_STATUS64
Since
5db470e229 (i.e. kernel 5.0)
changing the .lo_offset field via LOOP_SET_STATUS64 might result in
EAGAIN. Let's handle that.

Fixes: #16858
2020-10-22 14:58:27 +02:00
Lennart Poettering 77ad674b51 loop-util: apparently opening a loop device sometimes results in ENXIO, handle this 2020-09-25 16:03:05 +02:00
Lennart Poettering 0950526afd loop-util: use right flags field 2020-09-25 16:02:56 +02:00
Lennart Poettering bb2551bdcb loop-util: LOOP_CONFIGURE ignores lo_sizelimit
It appears LOOP_CONFIGURE in 5.8 is even more broken than initially
thought: it doesn't properly propgate lo_sizelimit to the block device
layer. :-(

Let's hence check the block device size immediately after issuing
LOOP_CONFIGURE, and if it doesn't match what we just set let's fallback
to the old ioctls.

This means LOOP_CONFIGURE currently works correctly only for the most
simply case: no partition table logic and no size limit. Sad!

(Kernel people should really be told about the concepts of tests and
even CI, one day!)
2020-08-24 22:01:13 +02:00
Lennart Poettering 8dbc208cc1 loop-util: define API for syncing loopback device 2020-08-24 21:59:35 +02:00
Lennart Poettering 86c1c1f345 loop-util: use new LOOP_CONFIGURE ioctl
LOOP_CONFIGURE allows us to configure a loopback device in one ioctl
instead of two, which is not just faster but also removes the race that
udev might start probing the device before we adjusted things properly.

Unfortunately LOOP_CONFIGURE is broken in regards to LO_FLAGS_PARTSCAN
as of kernel 5.8.0. This patch contains a work-around for that, to
fallback to old behaviour if partition scanning is requested but does
not work. Sucks a bit.

Proposed upstream fix for that issue:

https://lkml.org/lkml/2020/8/6/97
2020-08-11 15:24:18 +02:00
Lennart Poettering cae1e8fb88 loop-device: implicitly sync device on detach
Apparently, if IO is still in flight at the moment we invoke LOOP_CLR_FD
it is likely simply dropped (probably because yanking physical storage,
such as a USB stick would drop it too). Let's protect ourselves against
that and always sync explicitly before we invoke it.
2020-07-30 20:56:13 +02:00
Lennart Poettering b0a94268f8 core: when we cannot open an image file for write, try read-only
Closes: #14442
2020-01-09 11:18:06 +01:00
Zbigniew Jędrzejewski-Szmek 1b49e3e3c4 shared/loop-util: rename function
As suggested in https://github.com/systemd/systemd/pull/14261#pullrequestreview-332398625.
2019-12-18 08:48:33 +01:00
Zbigniew Jędrzejewski-Szmek 0181314861 shared/loop-util: spin on open() returning ENOENT too
https://github.com/systemd/systemd/pull/14261#discussion_r355001559
2019-12-15 21:06:42 +01:00
Zbigniew Jędrzejewski-Szmek f2d9213fee shared/loop-util: spin on LOOP_CTL_REMOVE
If we call LOOP_CLR_FD and LOOP_CTL_REMOVE too rapidly, the kernel cannot deal
with that (5.3.13-300.fc31.x86_64 running on dual core
Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz).

$ sudo strace -eioctl build/test-dissect-image /tmp/foobar3.img
ioctl(3, TCGETS, 0x7ffcee47de20)        = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(4, LOOP_CTL_GET_FREE)             = 9
ioctl(5, LOOP_SET_FD, 3)                = 0
ioctl(5, LOOP_SET_STATUS64, {lo_offset=0, lo_number=0, lo_flags=LO_FLAGS_READ_ONLY|LO_FLAGS_AUTOCLEAR|LO_FLAGS_PARTSCAN, lo_file_name="", ...}) = 0
ioctl(5, BLKGETSIZE64, [299999744])     = 0
ioctl(5, CDROM_GET_CAPABILITY, 0)       = -1 EINVAL (Invalid argument)
ioctl(5, BLKSSZGET, [512])              = 0
Waiting for device (parent + 0 partitions) to appear...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD)                   = 0
ioctl(3, LOOP_CTL_REMOVE, 9)            = -1 EBUSY (Device or resource busy)
Failed to remove loop device: Device or resource busy

This seems to be clear race condition, and attaching strace is generally enough
to "win" the race. But even with strace attached, we will fail occasionally.
Let's wait a bit and retry. With the wait, on my machine, the second attempt
always succeeds:

...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD)                   = 0
ioctl(3, LOOP_CTL_REMOVE, 9)            = -1 EBUSY (Device or resource busy)
ioctl(3, LOOP_CTL_REMOVE, 9)            = 9
+++ exited with 0 +++

Without the wait, all 64 attempts will occasionally fail.
2019-12-15 21:06:42 +01:00
Zbigniew Jędrzejewski-Szmek e8af3bfd63 shared/loop-util: fix error handling in loop_device_make_full()
The function no longer returns the fd. This complicated semantics, because it
wasn't clear what holds the ownership: the return value or the output
parameter.  There were no users of the fd in the return value, so let's
simplify things conceptually and only return the fd once.

Reduce the scope of variables.

LOOP_CLR_FD was called on the wrong fd. Let's use a cleanup function to make
this automatic and reduce chances of a mixup in the future.

CID 1408498.
2019-12-15 21:06:42 +01:00
Zbigniew Jędrzejewski-Szmek 6b2a8b80b4 shared/loop-util: drop inline function with one use 2019-12-06 10:56:49 +01:00
Zbigniew Jędrzejewski-Szmek ba5450f411 shared/loop-util: fix leak of fd in error path 2019-12-06 10:40:20 +01:00
Zbigniew Jędrzejewski-Szmek 1163a2e98a shared/loop-util: operate on the right fd
'loop' is always -1 at this point in the code.
2019-12-06 10:39:14 +01:00
Lennart Poettering 10c1b18888 valgrind: temporarily handle that valgrind still doesn't know LOOP_GET_STATUS64
Should be removed once valgrind learns it.
2019-12-02 10:06:56 +01:00
Lennart Poettering 50d046993b loop-util: if we fail to fully set up a loop device, detach it again 2019-12-02 10:06:45 +01:00
Lennart Poettering b26c39ad2c loop-util: fill in the loopback number, even a posteriori 2019-12-02 10:06:34 +01:00
Lennart Poettering f1443709e0 loop-util: optionally also resize partitions 2019-12-02 10:06:23 +01:00
Lennart Poettering 441ec80468 loop-util: add api for locking the block device with flock() 2019-12-02 10:06:12 +01:00
Lennart Poettering c37878fced loop-util: allow refreshing offset 2019-12-02 10:06:00 +01:00
Lennart Poettering ed9eeb7b0b loop-util: allow creating loopback block devices with offset/length 2019-12-02 10:05:49 +01:00
Lennart Poettering 9dabc4fda5 loop-util: add API to refresh loopback device size and opening existing loopback block devices 2019-12-02 10:05:38 +01:00
Lennart Poettering e08f94acf5 loop-util: accept loopback flags when creating loopback device
This way callers can choose if they want partition scanning or not.
2019-12-02 10:05:09 +01:00
Yu Watanabe e30e8b5073 tree-wide: drop stat.h or statfs.h when stat-util.h is included 2019-11-04 00:30:32 +09:00
Lennart Poettering 0f6519d43c loop-util: invoke LOOP_CTL_GET_FREE in a loop
if we don't call it in a loop the device it tells us to open might
already be gone, taken by somebody else racing against us. Hence try a
few times.
2019-05-22 13:04:26 +02:00
Lennart Poettering 16420be1fd loop-util: tweak codepath when a loopback file is "created" from an existing block device
Be more careful with initialized of the 'relinquished' boolean field,
and let's return the fd, like we do for the regular codepath, too.
2019-01-07 17:50:39 +01:00
Lennart Poettering 26c1be0fdc loop-util: minor simplification 2019-01-07 17:50:39 +01:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Yu Watanabe 1cc6c93a95 tree-wide: use TAKE_PTR() and TAKE_FD() macros 2018-04-05 14:26:26 +09:00
Lennart Poettering 3cc4411403 stat-util: unify code that checks whether something is a regular file
Let's add a common implementation for regular file checks, that are
careful to return the right error code (EISDIR/EISLNK/EBADFD) when we
are encountering a wrong file node.
2018-02-20 15:39:31 +01:00
Lennart Poettering dccca82b1a log: minimize includes in log.h
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.

Let's hence drop inclusion of:

1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, dito.
3. sys/signalfd.h which we can replace by a simple struct forward
   declaration
4. process-util.h which was needed for getpid_cached() which we now hide
   in a funciton log_emergency_level() instead, which nicely abstracts
   the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
   forward declaration suffices for that too.

Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.

(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
2018-01-11 14:44:31 +01:00
Daniel Lockyer 5fecf46d76 Replace free and return NULL with return mfree 2017-11-24 10:31:49 +00:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering a2ea3b2fc8 dissect: add small "systemd-dissect" tool as wrapper around dissect-image.c
This adds a small tool that may be used to look into OS images, and mount them
to any place. This is mostly a friendlier version of test-dissect-image.c. I am
not sure this should really become a proper command of systemd, hence for now
do not install it into bindir, but simply libexecdir.

This tool is already pretty useful since you can mount image files with it,
honouring the various partitions correctly. I figure this is going to become
more interesting if the dissctor learns luks and verity support.
2016-12-07 18:38:41 +01:00
Lennart Poettering 8c1be37e5b util-lib: split out image dissecting code and loopback code from nspawn
This adds two new APIs to systemd:

- loop-util.h is a simple internal API for allocating, setting up and releasing
  loopback block devices.

- dissect-image.h is an internal API for taking apart disk images and figuring
  out what the purpose of each partition is.

Both APIs are basically refactored versions of similar code in nspawn.  This
rework should permit us to reuse this in other places than just nspawn in the
future. Specifically: to implement RootImage= in the service image, similar to
RootDirectory=, but operating on a disk image; to unify the gpt-auto-discovery
generator code with the discovery logic in nspawn; to add new API to machined
for determining the OS version of a disk image (i.e. not just running
containers). This PR does not make any such changes however, it just provides
the new reworked API.

The reworked code is also slightly more powerful than the nspawn original one.
When pointing it to an image or block device with a naked file system (i.e. no
partition table) it will simply make it the root device.
2016-12-07 18:38:40 +01:00