Fedora Project developers intend to switch the atomically updated editions of the distribution to the Composefs file system by default. If the proposal is approved by FESCo (Fedora Engineering Steering Committee), the body responsible for the technical side of Fedora development, Composefs will be used in builds of Fedora Silverblue (GNOME), Fedora Kinoite (KDE), Fedora CoreOS, Fedora IoT, Fedora Sway Atomic and Fedora Budgie Atomic.
The change will allow these builds to use a root partition mounted read-only and, later, to apply integrity verification tools to the system partition, making it possible to detect problems that arise during operation. The /etc and /var partitions will remain mounted writable.
Currently, the /usr partition is mounted read-only, while the root partition is mounted writable and changes to it are blocked with the immutable file attribute (“chattr +i /”). Integrity is checked only when the partition is updated, so damage or changes made during operation go undetected unless a full scan of all data is started with the “ostree fsck” command.
The Composefs file system is implemented as a layer on top of the OverlayFS and EROFS file systems already present in the kernel, and is optimized for efficient shared storage of the contents of several mounted disk images. EROFS (Enhanced Read-Only File System) meets the Composefs requirements starting with Linux kernel 5.15, and OverlayFS starting with kernel 6.5. Composefs makes it possible to create multi-layer file systems in which arbitrary read-only file system trees are overlaid on top of standard Linux file systems that act as the bottom layer.
What distinguishes Composefs from similar existing file systems is its support for sharing content between different disk images and its ability to verify the authenticity of the data being read. Composefs uses a content-addressed storage model in which the primary identifier is not the file name but a hash of the file's contents. This model provides deduplication: only one copy is stored of identical files that occur in different mounted images.
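The content-addressed model described above can be illustrated with a minimal sketch. This is not the Composefs implementation: the function names, the use of SHA-256 and the two-character subdirectory layout are assumptions chosen for the example. It only shows how naming files by a hash of their contents causes identical files from different images to collapse into one stored copy.

```python
import hashlib
import tempfile
from pathlib import Path


def store_object(objects_dir: Path, data: bytes) -> Path:
    """Store data under a name derived from its SHA-256 digest; return the path."""
    digest = hashlib.sha256(data).hexdigest()
    # Hypothetical layout: the first two hex characters form a subdirectory.
    target = objects_dir / digest[:2] / digest[2:]
    if not target.exists():  # identical content is written only once
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target


def import_image(objects_dir: Path, files: dict[str, bytes]) -> dict[str, Path]:
    """Map each file name of an image to its content-addressed object."""
    return {name: store_object(objects_dir, data) for name, data in files.items()}


def count_objects(objects_dir: Path) -> int:
    """Count the distinct objects actually present in the store."""
    return sum(1 for p in objects_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        objects = Path(tmp) / "objects"
        image_a = import_image(objects, {"usr/bin/tool": b"binary", "etc/motd": b"hello"})
        image_b = import_image(objects, {"usr/bin/tool": b"binary", "etc/motd": b"bye"})
        # Four file names across two images, but only three distinct contents.
        print(count_objects(objects))
        # Both images point at the same object for the identical file.
        print(image_a["usr/bin/tool"] == image_b["usr/bin/tool"])
```

In this toy store the file name appears only in the per-image mapping, so adding another image that reuses existing content costs no additional space for the data itself.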
System images usually contain many files in common, and with Composefs each such file is shared by all mounted images without resorting to tricks such as hard links. Shared files are not only stored as a single copy on disk but also occupy a single entry in the page cache, which saves both disk space and RAM.
To save disk space, Composefs separates data from metadata: at mount time a binary index is specified separately, containing all file system metadata, file names, access rights and other information. A metadata index is created for each file system image and stored in a separate file in the EROFS format (an EROFS image containing only metadata is mounted in loopback mode). The files of all mounted images are stored in a common base directory on a regular file system (ext4, xfs, btrfs) and are linked to the image through the extended attribute trusted.overlay.redirect, which OverlayFS uses to locate the required files by content hash.
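The split between a per-image metadata index and a shared base directory can be sketched as follows. This is a simplified illustration only: a JSON file stands in for the EROFS metadata image, a plain "redirect" field stands in for the trusted.overlay.redirect attribute, and the function names are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def add_file(base_dir: Path, index: dict, name: str, data: bytes, mode: int) -> None:
    """Write content into the shared base directory and record metadata in the index."""
    digest = hashlib.sha256(data).hexdigest()
    (base_dir / digest).write_bytes(data)  # identical content maps to the same object
    # "redirect" plays the role the article assigns to trusted.overlay.redirect.
    index[name] = {"mode": mode, "size": len(data), "redirect": digest}


def read_file(base_dir: Path, index_file: Path, name: str) -> bytes:
    """Resolve a name through the metadata index, then read from the base directory."""
    index = json.loads(index_file.read_text())
    return (base_dir / index[name]["redirect"]).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "objects"
        base.mkdir()
        index: dict = {}
        add_file(base, index, "/usr/bin/tool", b"binary", 0o755)
        add_file(base, index, "/usr/share/doc/README", b"docs", 0o644)
        # The index holds names and attributes only; file data lives in `base`.
        index_file = Path(tmp) / "image.index.json"
        index_file.write_text(json.dumps(index))
        print(read_file(base, index_file, "/usr/bin/tool"))
```

Because the index carries no file data, several images can each ship their own small index while pointing into the same base directory.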
To verify the contents of individual files and of the entire image when storage is shared, the fs-verity mechanism is used. When a file is accessed, it checks that the hash recorded in the binary index matches the actual content. If an attacker modifies a file in the base directory, or the data is damaged by a failure, this check reveals the discrepancy.
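The idea of checking content against a recorded hash at access time can be sketched in user space. This is a deliberate simplification: fs-verity performs verification in the kernel using a Merkle tree over the file's blocks, whereas the hypothetical verified_read helper below simply hashes the whole file with SHA-256 before returning it.

```python
import hashlib
import tempfile
from pathlib import Path


class IntegrityError(Exception):
    """Raised when stored content no longer matches the expected digest."""


def verified_read(path: Path, expected_digest: str) -> bytes:
    """Return the file's content only if its SHA-256 digest matches the expected one."""
    data = path.read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_digest:
        raise IntegrityError(f"{path.name}: expected {expected_digest}, got {actual}")
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        obj = Path(tmp) / "object"
        obj.write_bytes(b"trusted content")
        # The expected digest would come from the image's metadata index.
        expected = hashlib.sha256(b"trusted content").hexdigest()
        print(verified_read(obj, expected))

        # Simulate tampering or corruption in the base directory.
        obj.write_bytes(b"tampered content")
        try:
            verified_read(obj, expected)
        except IntegrityError:
            print("modification detected")
```

The point of the sketch is that the trusted digest lives with the metadata, not with the data, so a change to the shared file cannot go unnoticed by an image that references it.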