Some chattrs only work sensible if you set them right after opening a
file for create (think: FS_NOCOW_FL). Others only work when they are
applied when the file is fully written (think: FS_IMMUTABLE_FL). Let's
take that into account when copying files and applying a chattr to them.
Previously, when we'd copy an individual file we'd synthesize a
user.crtime_usec xattr with the source's creation time if we can
determine it. As the creation/birth time was until recently not
queriable form userspace this effectively just propagated the same xattr
on the source to the same xattr on the destination. However, current
kernels now allow to query the birthtime using statx() and we do make
use of that now. Which means that suddenly we started synthesizing these
xattrs much more regularly.
Doing this actually does make sense, but only in very few cases:
not for the typical regular files we copy, but certainly when dealing
with disk images. Hence, let's keep this kind of propagation, but let's
make it a flag and default to off. Then turn it on whenever we deal with
disk images, and leave it off otherwise.
This is particularly relevant as overlayfs combining a real fs, and a
tmpfs on top will result in EOPNOTSUPP when it is attempted to open a
file with xattrs for writing, as tmpfs does not support xattrs, and
hence the copy-up cannot work. Hence, let's avoid synthesizing this
needlessly, to increase compat with overlayfs.
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.
No code changes, just some rearranging of source files.
This adds two optional functions that may be passed to the various copy
functions. One is invoked whenever we start copying a new file object,
the other while we copy file payload in each loop iteration.
When the caller passes one or both they can get notifications about copy
progress, for example to log where things are.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
We do this checks as protection against bind mount cycles on the same
file system. However, the check wasn't really effective for that, as
it would only detect cycles A → B → A this way. By using
fs_is_mount_point() we'll also detect cycles A → A.
Also, while we are at it, make these file system boundary checks
optional. This is not used anywhere, but might be eventually...
Most importantly though add a longer blurb explanation the why.
This fixes the copy routines on overlay filesystem, which typically
returns the underlying st_dev for files, symlinks, etc.
The value of st_dev is guaranteed to be the same for directories, so
checking it on directories only fixes this code on overlay filesystem
and still keeps it from traversing mount points (which was the original
intent.)
There's a small side effect here, by which regular (non-directory) files
with bind mounts will be copied by the new logic (while they were
skipped by the previous logic.)
Tested: ./build/test-copy with an overlay on /tmp.
Fixes: #9134
This way we don't need to repeat the argument twice.
I didn't replace all instances. I think it's better to leave out:
- asserts
- comparisons like x & y == x, which are mathematically equivalent, but
here we aren't checking if flags are set, but if the argument fits in the
flags.
If a tool only invokes copy_bytes() a single time the _unlikely_() will always be
wrong, and is hence not useful. Let's drop it and let the compiler
figure our what to do, instead of misleading it.
Also, some coding style imprvoements.
splice() ignores O_NONBLOCK on pipes but not on other fds. Let's handle
that properly, and query O_ONBLOCK manually in that case, ensuring
systematic behaviour in either case.
Optionally, when we copy between fds with simple read/write, let's
return any remaining data we already read into the buffer if write
fails. This is useful to allow callers to use the read data otherwise,
perhaps implementing a different fallback for copying.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This changes the file copy logic of machined to set the UID/GID of all
copied files to 0 if the host and container do not share the same user
namespace.
Fixes: #4078
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.
Fall back to a normal copy operation when the backing file system isn't btrfs,
and hence doesn't support cheap snapshotting. Of course, this will be slow, but
given that the execution is asynchronous now, this should be OK.
Fixes: #1308
When recursively copying a directory tree, fix up the file times after having
created all contents in it, so that our changes don't end up altering any of
the directory times.
We previously would fail with EOPNOTSUPP when encountering an AF_UNIX socket in
the directory tree to copy. Fix that, and copy them too (even if they are dead
in the result).
Fixes: #2914
In nspawn we invoke copy_bytes() on a TTY fd. copy_file_range() returns EBADF
on a TTY and this error is considered fatal by copy_bytes() so far. Correct
that, so that nspawn's copy_bytes() operation works again.
This is a follow-up for a44202e98b.
The patch is not minimal, but a function to parse size_t is probably
going to come in handy in other places, so I think it's nicer to define
a proper parsing function than to open-code the cast.
For btrfs, c_f_r() is like BTRFS_IOC_CLONE which we already used, but also
works when max_bytes is set. We do call copy_bytes in coredump code with
max_bytes set, and for large files, so we might see some benefit from using
c_f_r() on btrfs.
For other filesystems, c_f_r() falls back to do_splice_direct(), the same as
sendfile, which we already call, so there shouldn't be much difference.
Tested with test-copy and systemd-coredump on Linux 4.3 (w/o c_f_r)
and 4.5 (w/ c_f_r).
We called sendfile with 16kb (a.k.a. COPY_BUFFER_SIZE) as the maximum
number of bytes to copy. This seems rather inefficient, especially with
large files. Instead, call sendfile with a "large" maximum.
What "large" max means is a bit tricky: current file offset + max
must fit in loff_t. This means that as we call sendfile more than once,
we have to lower the max size.
With this patch, test-copy calls sendfile twice, e.g.:
sendfile(4, 3, NULL, 9223372036854775807) = 738760
sendfile(4, 3, NULL, 9223372036854037047) = 0
The second call is necessary to determine EOF.