diff --git a/TODO b/TODO index 961767187b..ce37869ce1 100644 --- a/TODO +++ b/TODO @@ -19,6 +19,11 @@ Janitorial Clean-ups: Features: +* when dissecting images, warn about unrecognized partition flags + +* honour specifiers in unit files that resolve to some very basic + /etc/os-release data, such as ID, VERSION_ID, BUILD_ID, VARIANT_ID. + * socket units: allow creating a udev monitor socket with ListenDevices= or so, with matches, then actviate app thorugh that passing socket oveer @@ -44,8 +49,6 @@ Features: shouldn't operate in a volatile mode unless we got told so from a trusted source. -* look for /var/tmp automatically via gpt auto discovery - * figure out automatic partition discovery when combining writable root dir with immutable /usr @@ -416,8 +419,6 @@ Features: "systemd-gdb" for attaching to the start-up of any system service in its natural habitat. -* maybe introduce gpt auto discovery for /var/tmp? - * maybe add gpt-partition-based user management: each user gets his own LUKS-encrypted GPT partition with a new GPT type. A small nss module enumerates users via udev partition enumeration. UIDs are assigned in a fixed @@ -427,13 +428,6 @@ Features: with stateless/read-only/verity-enabled root. (other idea: do this based on loopback files in /home, without GPT involvement) -* gpt-auto logic: introduce support for discovering /var matching an image. For - that, use a partition type UUID that is hashed from the OS name (as encoded - in /etc/os-release), the architecture, and 4 new bits from the gpt flags - field of the root partition. This way can easily support multiple OS - installations on the same GPT partition table, without problems with - unmatched /var partitions. - * gpt-auto logic: related to the above, maybe support a "secondary" root partition, that is mounted to / and is writable, and where the actual root's /usr is mounted into. 
diff --git a/docs/DISCOVERABLE_PARTITIONS.md b/docs/DISCOVERABLE_PARTITIONS.md new file mode 100644 index 0000000000..d80a667497 --- /dev/null +++ b/docs/DISCOVERABLE_PARTITIONS.md @@ -0,0 +1,359 @@ +--- +title: Discoverable Partitions Specification +category: Concepts +layout: default +--- +# The Discoverable Partitions Specification + +_TL;DR: Let's automatically discover, mount and enable the root partition, +`/home/`, `/srv/`, `/var/` and `/var/tmp/` and the swap partitions based on +GUID Partition Tables (GPT)!_ + +The GUID Partition Table (GPT) is mandatory on EFI systems. It allows +identification of partition types with UUIDs. So far Linux has made little use +of this, and mostly just defined one UUID for file system/data partitions and +another one for swap partitions. With this specification, we introduce +additional partition types to enable automatic discovery of partitions and +their intended mountpoint. This has many benefits: + +* OS installers can automatically discover and make sense of partitions of + existing Linux installations. +* The OS can discover and mount the necessary file systems with a non-existent + or incomplete `/etc/fstab` file and without the `root=` kernel command line + option. +* Container managers (such as nspawn and libvirt-lxc) can decode and set up + file systems contained in GPT disk images automatically and mount them to the + right places, thus allowing booting the same, identical images on bare-metal + and in Linux containers. This enables true, natural portability of disk + images between physical machines and Linux containers. +* As a help to administrators and users, partition manager tools can show more + descriptive information about partition tables. + +Note that the OS side of this specification is currently implemented in +[systemd](http://systemd.io/) 211 and newer in the +[systemd-gpt-auto-generator(8)](http://www.freedesktop.org/software/systemd/man/systemd-gpt-auto-generator.html) +generator tool. 
Note that automatic discovery of the root only works if the +boot loader communicates this information to the OS, by implementing the [Boot +Loader +Interface](https://systemd.io/BOOT_LOADER_INTERFACE). + +## Defined Partition Type UUIDs +
+<table>
+  <tr>
+    <th>Partition Type UUID</th>
+    <th>Name</th>
+    <th>Allowed File Systems</th>
+    <th>Explanation</th>
+  </tr>
+  <tr>
+    <td><code>44479540-f297-41b2-9af7-d131d5f0458a</code></td>
+    <td>Root Partition (x86)</td>
+    <td rowspan="5">Any native, optionally in LUKS</td>
+    <td rowspan="5">On systems with matching architecture, the first partition with
+this type UUID on the disk containing the active EFI ESP is automatically
+mounted to the root directory <code>/</code>. If the partition is encrypted with
+LUKS or has dm-verity integrity data (see below), the device mapper file will
+be named <code>/dev/mapper/root</code>.</td>
+  </tr>
+  <tr>
+    <td><code>4f68bce3-e8cd-4db1-96e7-fbcaf984b709</code></td>
+    <td>Root Partition (x86-64)</td>
+  </tr>
+  <tr>
+    <td><code>69dad710-2ce4-4e3c-b16c-21a1d49abed3</code></td>
+    <td>Root Partition (32-bit ARM)</td>
+  </tr>
+  <tr>
+    <td><code>b921b045-1df0-41c3-af44-4c6f280d3fae</code></td>
+    <td>Root Partition (64-bit ARM/AArch64)</td>
+  </tr>
+  <tr>
+    <td><code>993d8d3d-f80e-4225-855a-9daf8ed7ea97</code></td>
+    <td>Root Partition (Itanium/IA-64)</td>
+  </tr>
+  <tr>
+    <td><code>d13c5d3b-b5d1-422a-b29f-9454fdc89d76</code></td>
+    <td>Root Verity Partition (x86)</td>
+    <td rowspan="5">A dm-verity superblock followed by hash data</td>
+    <td rowspan="5">On systems with matching architecture, contains dm-verity
+integrity hash data for the matching root partition. If this feature is used
+the partition UUID of the root partition should be the first 128 bits of the root
+hash of the dm-verity hash data, and the partition UUID of this dm-verity
+partition should be the final 128 bits of it, so that the root partition and its
+verity partition can be discovered easily, simply by specifying the root
+hash.</td>
+  </tr>
+  <tr>
+    <td><code>2c7357ed-ebd2-46d9-aec1-23d437ec2bf5</code></td>
+    <td>Root Verity Partition (x86-64)</td>
+  </tr>
+  <tr>
+    <td><code>7386cdf2-203c-47a9-a498-f2ecce45a2d6</code></td>
+    <td>Root Verity Partition (32-bit ARM)</td>
+  </tr>
+  <tr>
+    <td><code>df3300ce-d69f-4c92-978c-9bfb0f38d820</code></td>
+    <td>Root Verity Partition (64-bit ARM/AArch64)</td>
+  </tr>
+  <tr>
+    <td><code>86ed10d5-b607-45bb-8957-d350f23d0571</code></td>
+    <td>Root Verity Partition (Itanium/IA-64)</td>
+  </tr>
+  <tr>
+    <td><code>933ac7e1-2eb4-4f13-b844-0e14e2aef915</code></td>
+    <td>Home Partition</td>
+    <td>Any native, optionally in LUKS</td>
+    <td>The first partition with this type UUID on the disk containing the root
+partition is automatically mounted to <code>/home/</code>. If the partition is encrypted
+with LUKS, the device mapper file will be named <code>/dev/mapper/home</code>.</td>
+  </tr>
+  <tr>
+    <td><code>3b8f8425-20e0-4f3b-907f-1a25a76f98e8</code></td>
+    <td>Server Data Partition</td>
+    <td>Any native, optionally in LUKS</td>
+    <td>The first partition with this type UUID on the disk containing the root
+partition is automatically mounted to <code>/srv/</code>. If the partition is encrypted
+with LUKS, the device mapper file will be named <code>/dev/mapper/srv</code>.</td>
+  </tr>
+  <tr>
+    <td><code>4d21b016-b534-45c2-a9fb-5c16e091fd2d</code></td>
+    <td>Variable Data Partition</td>
+    <td>Any native, optionally in LUKS</td>
+    <td>The first partition with this type UUID on the disk containing the root
+partition is automatically mounted to <code>/var/</code>, under the condition that
+its partition UUID matches the first 128 bits of <code>HMAC-SHA256(machine-id,
+0x4d21b016b53445c2a9fb5c16e091fd2d)</code> (i.e. the SHA256 HMAC hash of the
+binary type UUID keyed by the machine ID as read from <code>/etc/machine-id</code>). This
+special requirement is made because <code>/var/</code> (unlike the other partition
+types listed here) is inherently private to a specific installation and cannot
+possibly be shared between multiple OS installations on the same disk, and thus
+should be bound to a specific instance of the OS, identified by its machine ID.
+If the partition is encrypted with LUKS, the device mapper file will be named
+<code>/dev/mapper/var</code>.</td>
+  </tr>
+  <tr>
+    <td><code>7ec6f557-3bc5-4aca-b293-16ef5df639d1</code></td>
+    <td>Temporary Data Partition</td>
+    <td>Any native, optionally in LUKS</td>
+    <td>The first partition with this type UUID on the disk containing the root
+partition is automatically mounted to <code>/var/tmp/</code>. If the partition is
+encrypted with LUKS, the device mapper file will be named
+<code>/dev/mapper/tmp</code>. Note that the intended mount point is indeed
+<code>/var/tmp/</code>, not <code>/tmp/</code>. The latter is typically maintained in
+memory via <code>tmpfs</code> and does not require a partition on disk. In some
+cases it might be desirable to make <code>/tmp/</code> persistent too, in which
+case it is recommended to make it a symlink or bind mount to
+<code>/var/tmp/</code>, thus not requiring its own partition type UUID.</td>
+  </tr>
+  <tr>
+    <td><code>0657fd6d-a4ab-43c4-84e5-0933c84b4f4f</code></td>
+    <td>Swap</td>
+    <td>Swap</td>
+    <td>All swap partitions on the disk containing the root partition are automatically enabled.</td>
+  </tr>
+  <tr>
+    <td><code>c12a7328-f81f-11d2-ba4b-00a0c93ec93b</code></td>
+    <td>EFI System Partition</td>
+    <td>VFAT</td>
+    <td>The ESP used for the current boot is automatically mounted to
+<code>/efi/</code> (or <code>/boot/</code> as fallback), unless a different partition
+is mounted there (possibly via <code>/etc/fstab</code>, or because the Extended
+Boot Loader Partition, see below, exists) or the directory is non-empty on
+the root disk. This partition type is defined by the UEFI Specification.</td>
+  </tr>
+  <tr>
+    <td><code>bc13c2ff-59e6-4262-a352-b275fd6f7172</code></td>
+    <td>Extended Boot Loader Partition</td>
+    <td>Typically VFAT</td>
+    <td>The Extended Boot Loader Partition (XBOOTLDR) used for the current boot is
+automatically mounted to <code>/boot/</code>, unless a different partition is
+mounted there (possibly via <code>/etc/fstab</code>) or the directory is non-empty
+on the root disk. This partition type is defined by the Boot Loader
+Specification.</td>
+  </tr>
+  <tr>
+    <td><code>0fc63daf-8483-4772-8e79-3d69d8477de4</code></td>
+    <td>Other Data Partitions</td>
+    <td>Any native, optionally in LUKS</td>
+    <td>No automatic mounting takes place for other Linux data partitions. This
+partition type should be used for all partitions that carry Linux file
+systems. The installer needs to mount them explicitly via entries in
+<code>/etc/fstab</code>. Optionally, these partitions may be encrypted with LUKS.</td>
+  </tr>
+</table>
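The verity rows in the table tie two partition UUIDs to one dm-verity root hash. A minimal sketch of that convention follows. It is an illustration, not code from this patch: `format_uuid()` and `verity_uuids_from_root_hash()` are hypothetical helper names, and it assumes that the hash bytes map onto the textual UUID in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UUID_STR_LEN 37 /* 36 characters plus the terminating NUL */

/* Format 16 raw bytes as a lowercase textual UUID (8-4-4-4-12 hex digits). */
static void format_uuid(const uint8_t b[16], char out[UUID_STR_LEN]) {
        snprintf(out, UUID_STR_LEN,
                 "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
                 "%02x%02x%02x%02x%02x%02x",
                 b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
                 b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
        assert(strlen(out) == UUID_STR_LEN - 1);
}

/* Hypothetical helper: split a dm-verity root hash into the two partition
 * UUIDs described in the table. The first 128 bits name the root (data)
 * partition, the final 128 bits name the verity partition. */
static void verity_uuids_from_root_hash(const uint8_t *root_hash, size_t size,
                                        char data_uuid[UUID_STR_LEN],
                                        char verity_uuid[UUID_STR_LEN]) {
        assert(root_hash && size >= 16);

        format_uuid(root_hash, data_uuid);               /* first 16 bytes */
        format_uuid(root_hash + size - 16, verity_uuid); /* last 16 bytes */
}
```

With a 32-byte SHA-256 root hash the two halves do not overlap, so knowing the root hash alone is enough to look up both partitions by UUID.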
+ +Other GPT type IDs might be used on Linux, for example to mark software RAID or +LVM partitions. The definition of those GPT types is outside the scope of +this specification. + +## Partition Names + +For partitions of the types listed above it is recommended to use +human-friendly, descriptive partition names in the GPT partition table, for +example "*Home*", "*Server* *Data*", "*Fedora* *Root*" and similar, possibly +localized. + +## Partition Flags + +For the root, server data, home, variable data, temporary data and swap +partitions, the partition flag bit 63 ("*no-auto*") may be used to turn off +auto-discovery for the specific partition. If set, the partition will not be +automatically mounted or enabled. + +For the root, server data, home, variable data and temporary data partitions, +the partition flag bit 60 ("*read-only*") may be used to mark a partition for +read-only mounts only. If set, the partition will be mounted read-only instead +of read-write. Note that the variable data partition and the temporary data +partition will generally not be able to serve their purpose if marked +read-only, since by their very definition they are supposed to be mutable. (The +home and server data partitions are generally assumed to be mutable as well, +but the requirement for them is not equally strong.) Because of that, while the +read-only flag is defined and supported, it's almost never a good idea to +actually use it for these partitions. + +Note that these two flag definitions happen to map nicely to the ones used by +Microsoft Basic Data Partitions. + +## Suggested Mode of Operation + +An *installer* that repartitions the hard disk _should_ use the above UUID +partition types for appropriate partitions it creates. + +An *installer* which supports a "manual partitioning" interface _may_ choose to +pre-populate the interface with swap, `/home/`, `/srv/` and `/var/tmp/` partitions +of pre-existing Linux installations, identified with the GPT type UUIDs +above. 
The installer should not pre-populate such an interface with any +identified root or `/var/` partition unless the intention is to overwrite an +existing operating system that might be installed. + +An *installer* _may_ omit creating entries in `/etc/fstab` for root, `/home/`, +`/srv/`, `/var/`, `/var/tmp` and for the swap partitions if they use these UUID +partition types, and are the first partitions on the disk of each type. If the +ESP shall be mounted to `/efi/` (or `/boot/`), it may additionally omit +creating the entry for it in `/etc/fstab`. If an extended boot partition is +used, or if the EFI partition shall not be mounted to `/efi/` or `/boot/`, it +_must_ create `/etc/fstab` entries for them. If other partitions are used (for +example for `/usr/` or `/var/lib/mysql/`), the installer _must_ register these +in `/etc/fstab`. The `root=` parameter passed to the kernel by the boot loader +may be omitted if the root partition is the first one on the disk of its type. +If the root partition is not the first one on the disk, the `root=` parameter +_must_ be passed to the kernel by the boot loader. An installer that mounts a +root, `/home/`, `/srv/`, `/var/`, or `/var/tmp/` file system with the partition +types defined as above which contains a LUKS header _must_ call the device +mapper device "root", "home", "srv", "var" or "tmp", respectively. This is +necessary to ensure that the automatic discovery will never result in different +device mapper names than any static configuration by the installer, thus +eliminating possible naming conflicts and ambiguities. + +An *operating* *system* _should_ automatically discover and mount the first +root partition that does not have the no-auto flag set (as described above) by +scanning the disk containing the currently used EFI ESP. 
It _should_ +automatically discover and mount the first `/home/`, `/srv/`, `/var/`, +`/var/tmp/` and swap partitions that do not have the no-auto flag set by +scanning the disk containing the discovered root partition. It should +automatically discover and mount the partition containing the currently used +EFI ESP to `/efi/` (or `/boot/` as fallback). It should automatically discover +and mount the partition containing the currently used Extended Boot Loader +Partition to `/boot/`. It _should not_ discover or automatically mount +partitions with other UUID partition types, or partitions located on other +disks, or partitions with the no-auto flag set. User configuration shall +always override automatic discovery and mounting. If a root, `/home/`, +`/srv/`, `/boot/`, `/var/`, `/var/tmp/`, `/efi/` or swap partition is +listed in `/etc/fstab` or with `root=` on the kernel command line, it _must_ +take precedence over automatically discovered partitions. If a `/home/`, +`/srv/`, `/boot/`, `/var/`, `/var/tmp/` or `/efi/` directory is found +to be populated already in the root partition, the automatic discovery _must +not_ mount any discovered file system over it. + +A *container* *manager* should automatically discover and mount the root, +`/home/`, `/srv/`, `/var/` and `/var/tmp/` partitions inside a container disk +image. It may choose to mount any discovered ESP and/or XBOOTLDR partition to +`/efi/` or `/boot/`. It should ignore any swap partitions, should they be included in a +container disk image. + +If a btrfs file system is automatically discovered and mounted by the operating +system/container manager it will be mounted with its *default* subvolume. The +installer should make sure to set the default subvolume correctly using "btrfs +subvolume set-default". 
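The "first partition without the no-auto flag" rule, combined with the read-only flag, can be sketched as below. The flag names and values are the ones listed in the man page hunk of this patch. `pick_first_auto_partition()` is a hypothetical helper, and the input is assumed to be the flag words of all partitions of one type, in partition-table order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GPT_FLAG_READ_ONLY (UINT64_C(1) << 60) /* 0x1000000000000000 */
#define GPT_FLAG_NO_AUTO   (UINT64_C(1) << 63) /* 0x8000000000000000 */

/* Hypothetical helper: return the index of the first partition that may be
 * auto-discovered, or -1 if every candidate has the no-auto flag set.
 * On success, *ret_rw says whether the partition should be mounted
 * read-write (true) or read-only (false). */
static int pick_first_auto_partition(const uint64_t *pflags, size_t n, bool *ret_rw) {
        assert(pflags || n == 0);
        assert(ret_rw);

        for (size_t i = 0; i < n; i++) {
                if (pflags[i] & GPT_FLAG_NO_AUTO)
                        continue; /* explicitly excluded from discovery */

                *ret_rw = !(pflags[i] & GPT_FLAG_READ_ONLY);
                return (int) i;
        }

        return -1;
}
```

This mirrors the shape of the `pflags` checks in the `dissect_image()` hunk of the patch, but it leaves out everything else that function does, such as the type UUID matching and the `/var/` UUID check.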
+ +## Sharing of File Systems between Installations + +If two Linux-based operating systems are installed on the same disk, the scheme +above suggests that they may share the swap, `/home/`, `/srv/`, `/var/tmp/`, +ESP, XBOOTLDR. However, they should each have their own root and `/var/` +partition. + +## Frequently Asked Questions + +### Why are you taking my `/etc/fstab` away? + +We are not. `/etc/fstab` always overrides automatic discovery and is indeed +mentioned in the specifications. We are simply trying to make the boot and +installation processes of Linux a bit more robust and self-descriptive. + +### Why did you only define the root partition for x86, x86-64, ARM, ARM64, ia64? + +The automatic discovery of the root partition is defined to operate on the disk +containing the current EFI System Partition (ESP). Since EFI only exists on +x86, x86-64, ia64, and ARM so far, we only defined root partition UUIDs for +these architectures. Should EFI become more common on other architectures, we +can define additional UUIDs for them. + +### Why define distinct root partition UUIDs for the various architectures? + +This allows disk images that may be booted on multiple architectures to use +discovery of the appropriate root partition on each architecture. + +### Doesn't this break multi-boot scenarios? + +No, it doesn't. The specification says that installers may not stop creating +`/etc/fstab` or stop including `root=` on the kernel command line, unless the used +partitions are the first ones of their type on the disk. Additionally, +`/etc/fstab` and `root=` both override automatic discovery. Multi-boot is hence +well supported, since it doesn't change anything for anything but the first +installation. + +That all said, it's not expected that generic installers generally stop setting +`root=` and creating `/etc/fstab` anyway. The option to drop these configuration +bits is primarily something for appliance-like devices. 
However, generic +installers should *still* set the right GPT partition types for the partitions +they create so that container managers, partition tools and administrators can +benefit. Phrased differently, this specification introduces A) the +*recommendation* to use the newly defined partition types to tag things +properly and B) the *option* to then drop `root=` and `/etc/fstab`. While we +advertise A) to *all* installers, we only propose B) for simpler, +appliance-like installations. + +### What partitioning tools will create a DPS-compliant partition table? + +As of util-linux 2.25.2, the fdisk tool provides type codes to create the root, +home, and swap partitions that the DPS expects, but the gdisk tool (version +0.8.10) and its variants do not support creation of a root file system with a +matching type code. By default, fdisk will create an old-style MBR, not a GPT, +so typing 'l' to list partition types will not show the choices for the root +partition with the correct UUID. You must first create an empty GPT and then +type 'l' in order for the DPS-compliant type codes to be available.
diff --git a/docs/PORTABLE_SERVICES.md b/docs/PORTABLE_SERVICES.md index d9c838f982..8248275ced 100644 --- a/docs/PORTABLE_SERVICES.md +++ b/docs/PORTABLE_SERVICES.md @@ -165,7 +165,7 @@ requirements are made for an image that can be attached/detached with an image with a partition table understood by the Linux kernel with only a single partition defined, or alternatively, a GPT partition table with a set of properly marked partitions following the [Discoverable Partitions - Specification](https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/). + Specification](https://systemd.io/DISCOVERABLE_PARTITIONS). 3. The image must at least contain one matching unit file, with the right name prefix and suffix (see above). 
The unit file is searched in the usual paths, diff --git a/man/systemd-gpt-auto-generator.xml b/man/systemd-gpt-auto-generator.xml index 0b9d88a87a..b0fa617d63 100644 --- a/man/systemd-gpt-auto-generator.xml +++ b/man/systemd-gpt-auto-generator.xml @@ -17,11 +17,9 @@ systemd-gpt-auto-generator - Generator for automatically discovering - and mounting root, /home and - /srv partitions, as well as - discovering and enabling swap partitions, based on GPT - partition type GUIDs. + Generator for automatically discovering and mounting root, /home/, + /srv/, /var/ and /var/tmp/ partitions, as + well as discovering and enabling swap partitions, based on GPT partition type GUIDs. @@ -32,11 +30,11 @@ Description systemd-gpt-auto-generator is a unit generator that automatically discovers - root, /home/, /srv/, the EFI System Partition, the Extended - Boot Loader Partition and swap partitions and creates mount and swap units for them, based on the - partition type GUIDs of GUID partition tables (GPT), see UEFI Specification, chapter 5. It implements the Discoverable Partitions + root, /home/, /srv/, /var/, + /var/tmp/, the EFI System Partition, the Extended Boot Loader Partition and swap + partitions and creates mount and swap units for them, based on the partition type GUIDs of GUID partition + tables (GPT), see UEFI Specification, chapter 5. It + implements the Discoverable Partitions Specification. Note that this generator has no effect on non-GPT systems, and on specific mount points that are directories already containing files. Also, on systems where the units are explicitly configured (for example, listed in This generator will only look for the root partition on the same physical disk the EFI System - Partition (ESP) is located on. Note that support from the boot loader is required: EFI variable - LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f is used to determine from - which partition, and hence the disk from which the system was booted. 
If the boot loader does not set - this variable, this generator will not be able to autodetect the root partition. + Partition (ESP) is located on. Note that support from the boot loader is required: the EFI variable + LoaderDevicePartUUID of the 4a67b082-0a4c-41cf-b6c7-440b29bb8c4f + vendor UUID is used to determine from which partition, and hence the disk from which the system was + booted. If the boot loader does not set this variable, this generator will not be able to autodetect the + root partition. See the Boot Loader + Interface for details. Similarly, this generator will only look for the other partitions on the same physical disk as the root partition. In this case, boot loader support is not required. These partitions will not be searched @@ -113,13 +113,25 @@ 933ac7e1-2eb4-4f13-b844-0e14e2aef915 Home Partition /home/ - The first home partition on the disk the root partition is located on is mounted to /home. + The first home partition on the disk the root partition is located on is mounted to /home/. 3b8f8425-20e0-4f3b-907f-1a25a76f98e8 Server Data Partition /srv/ - The first server data partition on the disk the root partition is located on is mounted to /srv. + The first server data partition on the disk the root partition is located on is mounted to /srv/. + + + 4d21b016-b534-45c2-a9fb-5c16e091fd2d + Variable Data Partition + /var/ + The first variable data partition on the disk the root partition is located on is mounted to /var/, under the condition that its partition UUID matches the first 128 bits of the HMAC-SHA256 of the GPT type UUID of this partition keyed by the machine ID of the installation stored in machine-id(5). + + + 7ec6f557-3bc5-4aca-b293-16ef5df639d1 + Temporary Data Partition + /var/tmp/ + The first temporary data partition on the disk the root partition is located on is mounted to /var/tmp/. 
0657fd6d-a4ab-43c4-84e5-0933c84b4f4f @@ -131,13 +143,13 @@ c12a7328-f81f-11d2-ba4b-00a0c93ec93b EFI System Partition (ESP) /efi/ or /boot/ - The first ESP located on the disk the root partition is located on is mounted to /boot or /efi, see below. + The first ESP located on the disk the root partition is located on is mounted to /boot/ or /efi/, see below. bc13c2ff-59e6-4262-a352-b275fd6f7172 Extended Boot Loader Partition /boot/ - The first Extended Boot Loader Partition is mounted to /boot, see below. + The first Extended Boot Loader Partition is mounted to /boot/, see below. @@ -164,14 +176,14 @@ GPT_FLAG_READ_ONLY 0x1000000000000000 - /, /home/, /srv/, Extended Boot Loader Partition + /, /home/, /srv/, /var/, /var/tmp/, Extended Boot Loader Partition Partition is mounted read-only GPT_FLAG_NO_AUTO 0x8000000000000000 - /, /home/, /srv/, Extended Boot Loader Partition + /, /home/, /srv/, /var/, /var/tmp/, Extended Boot Loader Partition Partition is not mounted automatically @@ -185,11 +197,12 @@ - The /home/ and /srv/ partitions may be encrypted in LUKS - format. In this case, a device mapper device is set up under the names - /dev/mapper/home and /dev/mapper/srv. Note that this might - create conflicts if the same partition is listed in /etc/crypttab with a different - device mapper device name. + The /home/, /srv/, /var/ and + /var/tmp/ partitions may be encrypted in LUKS format. In this case, a device mapper + device is set up under the names /dev/mapper/home, + /dev/mapper/srv, /dev/mapper/var and + /dev/mapper/tmp. Note that this might create conflicts if the same partition is + listed in /etc/crypttab with a different device mapper device name. When systemd is running in the initrd the / partition may be encrypted in LUKS format as well. 
In this case, a device mapper device is set up under the name /dev/mapper/root, @@ -209,7 +222,7 @@ If the disk contains an Extended Boot Loader partition, as defined in the Boot Loader Specification, it is made - available at /boot (by means of an automount point, similar to the ESP, see + available at /boot/ (by means of an automount point, similar to the ESP, see above). If both an EFI System Partition and an Extended Boot Loader partition exist the latter is preferably mounted to /boot/. Make sure to create both /efi/ and /boot/ to ensure both partitions are mounted. @@ -269,6 +282,7 @@ systemd.swap5, systemd-fstab-generator8, systemd-cryptsetup@.service8, + machine-id5, cryptsetup8, fstab5, btrfs8 diff --git a/man/systemd-nspawn.xml b/man/systemd-nspawn.xml index 9e6cc8bf1a..0b7959e06f 100644 --- a/man/systemd-nspawn.xml +++ b/man/systemd-nspawn.xml @@ -288,7 +288,7 @@ a server data partition which are mounted to the appropriate places in the container. All these partitions must be identified by the partition types defined by the Discoverable + url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification. No partition table, and a single file system spanning the whole image. diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml index ee2ca6aa10..9d98b2b411 100644 --- a/man/systemd.exec.xml +++ b/man/systemd.exec.xml @@ -117,12 +117,12 @@ RootImage= - Takes a path to a block device node or regular file as argument. This call is similar to - RootDirectory= however mounts a file system hierarchy from a block device node or loopback - file instead of a directory. The device node or file system image file needs to contain a file system without a - partition table, or a file system within an MBR/MS-DOS or GPT partition table with only a single - Linux-compatible partition, or a set of file systems within a GPT partition table that follows the Discoverable Partitions + Takes a path to a block device node or regular file as argument. 
This call is similar + to RootDirectory= however mounts a file system hierarchy from a block device node + or loopback file instead of a directory. The device node or file system image file needs to contain a + file system without a partition table, or a file system within an MBR/MS-DOS or GPT partition table + with only a single Linux-compatible partition, or a set of file systems within a GPT partition table + that follows the Discoverable Partitions Specification. When DevicePolicy= is set to closed or diff --git a/src/core/execute.c b/src/core/execute.c index 5dc111f714..2a02cc54db 100644 --- a/src/core/execute.c +++ b/src/core/execute.c @@ -2589,7 +2589,7 @@ static int apply_mount_namespace( needs_sandboxing ? context->protect_home : PROTECT_HOME_NO, needs_sandboxing ? context->protect_system : PROTECT_SYSTEM_NO, context->mount_flags, - DISSECT_IMAGE_DISCARD_ON_LOOP, + DISSECT_IMAGE_DISCARD_ON_LOOP|DISSECT_IMAGE_RELAX_VAR_CHECK, error_path); /* If we couldn't set up the namespace this is probably due to a missing capability. 
setup_namespace() reports diff --git a/src/dissect/dissect.c b/src/dissect/dissect.c index c1be6c034c..2fef95aa02 100644 --- a/src/dissect/dissect.c +++ b/src/dissect/dissect.c @@ -22,7 +22,7 @@ static enum { } arg_action = ACTION_DISSECT; static const char *arg_image = NULL; static const char *arg_path = NULL; -static DissectImageFlags arg_flags = DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_DISCARD_ON_LOOP; +static DissectImageFlags arg_flags = DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_DISCARD_ON_LOOP|DISSECT_IMAGE_RELAX_VAR_CHECK; static void *arg_root_hash = NULL; static size_t arg_root_hash_size = 0; diff --git a/src/gpt-auto-generator/gpt-auto-generator.c b/src/gpt-auto-generator/gpt-auto-generator.c index af3bd23450..e03cdbd5c0 100644 --- a/src/gpt-auto-generator/gpt-auto-generator.c +++ b/src/gpt-auto-generator/gpt-auto-generator.c @@ -710,6 +710,18 @@ static int enumerate_partitions(dev_t devnum) { r = k; } + if (m->partitions[PARTITION_VAR].found) { + k = add_partition_mount(m->partitions + PARTITION_VAR, "var", "/var", "Variable Data Partition"); + if (k < 0) + r = k; + } + + if (m->partitions[PARTITION_TMP].found) { + k = add_partition_mount(m->partitions + PARTITION_TMP, "var-tmp", "/var/tmp", "Temporary Data Partition"); + if (k < 0) + r = k; + } + if (m->partitions[PARTITION_ROOT].found) { k = add_root_rw(m->partitions + PARTITION_ROOT); if (k < 0) diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c index 791bdd3532..ef6d573bb3 100644 --- a/src/nspawn/nspawn.c +++ b/src/nspawn/nspawn.c @@ -5112,14 +5112,14 @@ static int run(int argc, char *argv[]) { loop->fd, arg_image, arg_root_hash, arg_root_hash_size, - DISSECT_IMAGE_REQUIRE_ROOT, + DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_RELAX_VAR_CHECK, &dissected_image); if (r == -ENOPKG) { /* dissected_image_and_warn() already printed a brief error message. 
Extend on that with more details */ log_notice("Note that the disk image needs to\n" " a) either contain only a single MBR partition of type 0x83 that is marked bootable\n" " b) or contain a single GPT partition of type 0FC63DAF-8483-4772-8E79-3D69D8477DE4\n" - " c) or follow http://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/\n" + " c) or follow https://systemd.io/DISCOVERABLE_PARTITIONS\n" " d) or contain a file system without a partition table\n" "in order to be bootable with systemd-nspawn."); goto finish; diff --git a/src/portable/portable.c b/src/portable/portable.c index 7a86398a4b..e18826ab26 100644 --- a/src/portable/portable.c +++ b/src/portable/portable.c @@ -387,7 +387,7 @@ static int portable_extract_by_path( if (r < 0) return log_debug_errno(r, "Failed to create temporary directory: %m"); - r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_READ_ONLY|DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_DISCARD_ON_LOOP, &m); + r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_READ_ONLY|DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_DISCARD_ON_LOOP|DISSECT_IMAGE_RELAX_VAR_CHECK, &m); if (r == -ENOPKG) sd_bus_error_setf(error, SD_BUS_ERROR_INVALID_ARGS, "Couldn't identify a suitable partition table or file system in '%s'.", path); else if (r == -EADDRNOTAVAIL) diff --git a/src/shared/dissect-image.c b/src/shared/dissect-image.c index 11d21c3a4d..087c3dc345 100644 --- a/src/shared/dissect-image.c +++ b/src/shared/dissect-image.c @@ -598,6 +598,43 @@ int dissect_image( if (!generic_node) return -ENOMEM; } + + } else if (sd_id128_equal(type_id, GPT_TMP)) { + + if (pflags & GPT_FLAG_NO_AUTO) + continue; + + designator = PARTITION_TMP; + rw = !(pflags & GPT_FLAG_READ_ONLY); + + } else if (sd_id128_equal(type_id, GPT_VAR)) { + + if (pflags & GPT_FLAG_NO_AUTO) + continue; + + if (!FLAGS_SET(flags, DISSECT_IMAGE_RELAX_VAR_CHECK)) { + sd_id128_t var_uuid; + + /* For /var we insist that the uuid of the partition matches the + * HMAC-SHA256 of the /var GPT 
partition type uuid, keyed by machine + * ID. Why? Unlike the other partitions /var is inherently + * installation specific, hence we need to be careful not to mount it + * in the wrong installation. By hashing the partition UUID from + * /etc/machine-id we can securely bind the partition to the + * installation. */ + + r = sd_id128_get_machine_app_specific(GPT_VAR, &var_uuid); + if (r < 0) + return r; + + if (!sd_id128_equal(var_uuid, id)) { + log_debug("Found a /var/ partition, but its UUID didn't match our expectations, ignoring."); + continue; + } + } + + designator = PARTITION_VAR; + rw = !(pflags & GPT_FLAG_READ_ONLY); } if (designator != _PARTITION_DESIGNATOR_INVALID) { @@ -910,6 +947,14 @@ int dissected_image_mount(DissectedImage *m, const char *where, uid_t uid_shift, if (r < 0) return r; + r = mount_partition(m->partitions + PARTITION_VAR, where, "/var", uid_shift, flags); + if (r < 0) + return r; + + r = mount_partition(m->partitions + PARTITION_TMP, where, "/var/tmp", uid_shift, flags); + if (r < 0) + return r; + boot_mounted = mount_partition(m->partitions + PARTITION_XBOOTLDR, where, "/boot", uid_shift, flags); if (boot_mounted < 0) return boot_mounted; @@ -1333,7 +1378,8 @@ int dissected_image_acquire_metadata(DissectedImage *m) { [META_HOSTNAME] = "/etc/hostname\0", [META_MACHINE_ID] = "/etc/machine-id\0", [META_MACHINE_INFO] = "/etc/machine-info\0", - [META_OS_RELEASE] = "/etc/os-release\0/usr/lib/os-release\0", + [META_OS_RELEASE] = "/etc/os-release\0" + "/usr/lib/os-release\0", }; _cleanup_strv_free_ char **machine_info = NULL, **os_release = NULL; @@ -1528,6 +1574,8 @@ static const char *const partition_designator_table[] = { [PARTITION_SWAP] = "swap", [PARTITION_ROOT_VERITY] = "root-verity", [PARTITION_ROOT_SECONDARY_VERITY] = "root-secondary-verity", + [PARTITION_TMP] = "tmp", + [PARTITION_VAR] = "var", }; DEFINE_STRING_TABLE_LOOKUP(partition_designator, int); diff --git a/src/shared/dissect-image.h b/src/shared/dissect-image.h index 
40b8035ac7..359dc877d5 100644 --- a/src/shared/dissect-image.h +++ b/src/shared/dissect-image.h @@ -33,6 +33,8 @@ enum { PARTITION_SWAP, PARTITION_ROOT_VERITY, /* verity data for the PARTITION_ROOT partition */ PARTITION_ROOT_SECONDARY_VERITY, /* verity data for the PARTITION_ROOT_SECONDARY partition */ + PARTITION_TMP, + PARTITION_VAR, _PARTITION_DESIGNATOR_MAX, _PARTITION_DESIGNATOR_INVALID = -1 }; @@ -59,6 +61,7 @@ typedef enum DissectImageFlags { DISSECT_IMAGE_MOUNT_NON_ROOT_ONLY = 1 << 7, /* Mount only non-root partitions */ DISSECT_IMAGE_VALIDATE_OS = 1 << 8, /* Refuse mounting images that aren't identifiable as OS images */ DISSECT_IMAGE_NO_UDEV = 1 << 9, /* Don't wait for udev initializing things */ + DISSECT_IMAGE_RELAX_VAR_CHECK = 1 << 10, /* Don't insist that the UUID of /var is hashed from /etc/machine-id */ } DissectImageFlags; struct DissectedImage { diff --git a/src/shared/gpt.h b/src/shared/gpt.h index 31e01bd5a5..8e9b111857 100644 --- a/src/shared/gpt.h +++ b/src/shared/gpt.h @@ -19,6 +19,8 @@ #define GPT_SWAP SD_ID128_MAKE(06,57,fd,6d,a4,ab,43,c4,84,e5,09,33,c8,4b,4f,4f) #define GPT_HOME SD_ID128_MAKE(93,3a,c7,e1,2e,b4,4f,13,b8,44,0e,14,e2,ae,f9,15) #define GPT_SRV SD_ID128_MAKE(3b,8f,84,25,20,e0,4f,3b,90,7f,1a,25,a7,6f,98,e8) +#define GPT_VAR SD_ID128_MAKE(4d,21,b0,16,b5,34,45,c2,a9,fb,5c,16,e0,91,fd,2d) +#define GPT_TMP SD_ID128_MAKE(7e,c6,f5,57,3b,c5,4a,ca,b2,93,16,ef,5d,f6,39,d1) /* Verity partitions for the root partitions above (we only define them for the root partitions, because only they are * are commonly read-only and hence suitable for verity). 
*/ diff --git a/src/shared/machine-image.c b/src/shared/machine-image.c index 15fd514353..b45efcd1e6 100644 --- a/src/shared/machine-image.c +++ b/src/shared/machine-image.c @@ -1171,7 +1171,7 @@ int image_read_metadata(Image *i) { if (r < 0) return r; - r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_REQUIRE_ROOT, &m); + r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_RELAX_VAR_CHECK, &m); if (r < 0) return r; diff --git a/src/test/test-dissect-image.c b/src/test/test-dissect-image.c index 12685dad13..a1ccf605b1 100644 --- a/src/test/test-dissect-image.c +++ b/src/test/test-dissect-image.c @@ -28,7 +28,7 @@ int main(int argc, char *argv[]) { return EXIT_FAILURE; } - r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_REQUIRE_ROOT, &m); + r = dissect_image(d->fd, NULL, 0, DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_RELAX_VAR_CHECK, &m); if (r < 0) { log_error_errno(r, "Failed to dissect image: %m"); return EXIT_FAILURE;