[PATCH 0/2] Allow booting of degraded software RAID/MD arrays

  • Open
Details
3 participants
  • Andreas Enge
  • Felix Lechner
  • Ludovic Courtès
Owner
unassigned
Submitted by
Felix Lechner
Severity
normal
Felix Lechner wrote on 24 Jun 2023 04:05
(address . guix-patches@gnu.org)(name . Felix Lechner)(address . felix.lechner@lease-up.com)
cover.1687571974.git.felix.lechner@lease-up.com
This commit series cures a dangerous condition for users of MD arrays in GNU
Guix. Such arrays are presently unlikely to boot after a drive
failure. Instead the user is dropped into an early boot Guile shell.

That behaviour contradicts the expectations of many users of such arrays.

These commits were tested over several months on two production machines. The
changes also include a system test of the new facility.

Please feel free to make any edits to this series as needed, without checking
with the author. Thanks!

Felix Lechner (2):
Offer an mdadm variant of uuids.
Provide md-array-device-mapping to start MD arrays via UUID or name.

doc/guix.texi | 28 +++++++++++----------
gnu/system/mapped-devices.scm | 38 ++++++++++++++++++++++++++++-
gnu/system/uuid.scm | 46 ++++++++++++++++++++++++++++++++---
gnu/tests/install.scm | 32 ++++++++++++------------
4 files changed, 110 insertions(+), 34 deletions(-)


base-commit: d6dc82e8cdb2d6114a12b06d449ce7f1150c7f70
--
2.40.1
Felix Lechner wrote on 24 Jun 2023 04:07
[PATCH 1/2] Offer an mdadm variant of uuids.
(address . 64259@debbugs.gnu.org)(name . Felix Lechner)(address . felix.lechner@lease-up.com)
a3f6db9b97ad87027ef74007c0782a535a553415.1687571974.git.felix.lechner@lease-up.com
The main executable to set up and manage Linux MD arrays, mdadm, uses a UUID
format that is different from other standards. The variant is here provided
for the benefit of Guix users.

* gnu/system/uuid.scm: Offer an mdadm variant of uuids.
---
gnu/system/uuid.scm | 46 +++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/gnu/system/uuid.scm b/gnu/system/uuid.scm
index 8f967387ad..dc8bb3f7b7 100644
--- a/gnu/system/uuid.scm
+++ b/gnu/system/uuid.scm
@@ -82,8 +82,9 @@ (define-syntax %network-byte-order
(identifier-syntax (endianness big)))
(define (dce-uuid->string uuid)
- "Convert UUID, a 16-byte bytevector, to its string representation, something
-like \"6b700d61-5550-48a1-874c-a3d86998990e\"."
+ "Convert UUID, a 16-byte bytevector, to its DCE string representation (see
+<https://tools.ietf.org/html/rfc4122>), which looks something like
+\"6b700d61-5550-48a1-874c-a3d86998990e\"."
;; See <https://tools.ietf.org/html/rfc4122>.
(let ((time-low (bytevector-uint-ref uuid 0 %network-byte-order 4))
(time-mid (bytevector-uint-ref uuid 4 %network-byte-order 2))
@@ -93,7 +94,7 @@ (define (dce-uuid->string uuid)
(format #f "~8,'0x-~4,'0x-~4,'0x-~4,'0x-~12,'0x"
time-low time-mid time-hi clock-seq node)))
-(define %uuid-rx
+(define %dce-uuid-rx
;; The regexp of a UUID.
(make-regexp "^([[:xdigit:]]{8})-([[:xdigit:]]{4})-([[:xdigit:]]{4})-([[:xdigit:]]{4})-([[:xdigit:]]{12})$"))
@@ -101,7 +102,7 @@ (define (string->dce-uuid str)
"Parse STR as a DCE UUID (see <https://tools.ietf.org/html/rfc4122>) and
return its contents as a 16-byte bytevector. Return #f if STR is not a valid
UUID representation."
- (and=> (regexp-exec %uuid-rx str)
+ (and=> (regexp-exec %dce-uuid-rx str)
(lambda (match)
(letrec-syntax ((hex->number
(syntax-rules ()
@@ -167,6 +168,41 @@ (define (digits->string bytes)
(parts (list year month day hour minute second hundredths)))
(string-append (string-join (map digits->string parts) "-"))))
+
+;;;
+;;; Mdadm.
+;;;
+
+(define (mdadm-uuid->string uuid)
+ "Convert UUID, a 16-byte bytevector, to its Mdadm string representation,
+which looks something like \"6b700d61:555048a1:874ca3d8:6998990e\"."
+ ;; See <https://tools.ietf.org/html/rfc4122>.
+ (format #f "~8,'0x:~8,'0x:~8,'0x:~8,'0x"
+ (bytevector-uint-ref uuid 0 %network-byte-order 4)
+ (bytevector-uint-ref uuid 4 %network-byte-order 4)
+ (bytevector-uint-ref uuid 8 %network-byte-order 4)
+ (bytevector-uint-ref uuid 12 %network-byte-order 4)))
+
+(define %mdadm-uuid-rx
+ (make-regexp "^([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8})$"))
+
+(define (string->mdadm-uuid str)
+ "Parse STR, which is in Mdadm format, and return a bytevector or #f."
+ (match (regexp-exec %mdadm-uuid-rx str)
+ (#f
+ #f)
+ (rx-match
+ (uint-list->bytevector (list (string->number
+ (match:substring rx-match 1) 16)
+ (string->number
+ (match:substring rx-match 2) 16)
+ (string->number
+ (match:substring rx-match 3) 16)
+ (string->number
+ (match:substring rx-match 4) 16))
+ %network-byte-order
+ 4))))
+
;;;
;;; FAT32/FAT16.
@@ -259,6 +295,7 @@ (define %uuid-parsers
('dce 'ext2 'ext3 'ext4 'bcachefs 'btrfs 'f2fs 'jfs 'xfs 'luks
=> string->dce-uuid)
('fat32 'fat16 'fat => string->fat-uuid)
+ ('mdadm => string->mdadm-uuid)
('ntfs => string->ntfs-uuid)
('iso9660 => string->iso9660-uuid)))
@@ -268,6 +305,7 @@ (define %uuid-printers
=> dce-uuid->string)
('iso9660 => iso9660-uuid->string)
('fat32 'fat16 'fat => fat-uuid->string)
+ ('mdadm => mdadm-uuid->string)
('ntfs => ntfs-uuid->string)))
(define* (string->uuid str #:optional (type 'dce))
--
2.40.1
Felix Lechner wrote on 24 Jun 2023 04:07
[PATCH 2/2] Provide md-array-device-mapping to start MD arrays via UUID or name.
(address . 64259@debbugs.gnu.org)(name . Felix Lechner)(address . felix.lechner@lease-up.com)
4e7eab10caeacfb1f8a0736cdab7154c517b9e36.1687571974.git.felix.lechner@lease-up.com
This commit cures the most precipitous danger for users of MD arrays in GNU
Guix, namely that their equipment may not boot after a drive failure. That
behavior likely contradicts their primary expectation for having such a disk
arrangement.

In order to facilitate a smooth transition from raid-device-mapping to
md-array-device-mapping, this commit introduces a new mapping rather than
repurpose the old one. The new mapping here is also incompatible with
raid-device-mapping in the sense that a plain string is now interpreted as the
array name from the MD superblock.

For details, please consult the mdadm manual page.

Personally, the author prefers UUIDs over array names when identifying array
components, but either will work. The system test uses the name.

The name for the new device mapping was chosen instead of the traditional RAID
to account for the fact that some modern technologies (like SSDs) and some
array configurations, such as striping, are neither redundant nor inexpensive.

Adjusts the documentation by erasing any mention of the obsolete
raid-device-mapping. No one should use that any longer. Ideally, users would
see a deprecation warning, but I was unable to adapt 'define-deprecated' to
this use case. Please feel free to make further changes.

This commit includes an updated system test for the root file system on
an-md-array.

More details for the motivation of these changes may be available here:


The author of this commit used to maintain mdadm in Debian.

Please feel free to insert better changelog messages. I had some difficulty
meeting the likely expectations of any reviewer. Please also feel free to make
any other adjustments as needed without checking with me. Thanks!

* gnu/system/mapped-devices.scm: New variable md-array-device-mapping.
* doc/guix.texi: Mention md-array-device-mapping in the documentation.
* gnu/tests/install.scm: Adjust test for root-on-md-array.
---
doc/guix.texi | 28 ++++++++++++++------------
gnu/system/mapped-devices.scm | 38 ++++++++++++++++++++++++++++++++++-
gnu/tests/install.scm | 32 ++++++++++++++---------------
3 files changed, 68 insertions(+), 30 deletions(-)

diff --git a/doc/guix.texi b/doc/guix.texi
index c961f706ec..91125479b1 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -17513,18 +17513,19 @@ the system boots up.
@table @code
@item source
-This is either a string specifying the name of the block device to be mapped,
-such as @code{"/dev/sda3"}, or a list of such strings when several devices
-need to be assembled for creating a new one. In case of LVM this is a
-string specifying name of the volume group to be mapped.
+This is either a string specifying the name of the block device to be
+mapped, such as @code{"/dev/sda3"}. For MD array devices it is either
+the UUID of the array or a string that is interpreted as the array name
+(see Mdadm documentation). In case of LVM it is a string specifying
+name of the volume group to be mapped.
@item target
This string specifies the name of the resulting mapped device. For
kernel mappers such as encrypted devices of type @code{luks-device-mapping},
specifying @code{"my-partition"} leads to the creation of
the @code{"/dev/mapper/my-partition"} device.
-For RAID devices of type @code{raid-device-mapping}, the full device name
-such as @code{"/dev/md0"} needs to be given.
+For MD array devices of type @code{md-array-device-mapping}, the full device
+name such as @code{"/dev/md18"} needs to be given.
LVM logical volumes of type @code{lvm-device-mapping} need to
be specified as @code{"VGNAME-LVNAME"}.
@@ -17544,11 +17545,12 @@ command from the package with the same name. It relies on the
@code{dm-crypt} Linux kernel module.
@end defvar
-@defvar raid-device-mapping
+@defvar md-array-device-mapping
This defines a RAID device, which is assembled using the @code{mdadm}
-command from the package with the same name. It requires a Linux kernel
-module for the appropriate RAID level to be loaded, such as @code{raid456}
-for RAID-4, RAID-5 or RAID-6, or @code{raid10} for RAID-10.
+command from the package with the same name. It requires the Linux kernel
+module for the appropriate RAID level to be loaded, such as @code{raid1}
+for mirroring, @code{raid456} for the checksum-based RAID levels 4, 5 or 6,
+or @code{raid10} for RAID-10.
@end defvar
@cindex LVM, logical volume manager
@@ -17606,9 +17608,9 @@ may be declared as follows:
@lisp
(mapped-device
- (source (list "/dev/sda1" "/dev/sdb1"))
- (target "/dev/md0")
- (type raid-device-mapping))
+ (source (uuid "33cf3e31:8e33d75b:517d64b9:0a8f7623" 'mdadm))
+ (target "/dev/md17")
+ (type md-array-device-mapping))
@end lisp
The @file{/dev/md0} device can then be used as the @code{device} of a
diff --git a/gnu/system/mapped-devices.scm b/gnu/system/mapped-devices.scm
index e6b8970c12..ffe5bc00f4 100644
--- a/gnu/system/mapped-devices.scm
+++ b/gnu/system/mapped-devices.scm
@@ -64,6 +64,7 @@ (define-module (gnu system mapped-devices)
check-device-initrd-modules ;XXX: needs a better place
luks-device-mapping
+ md-array-device-mapping
raid-device-mapping
lvm-device-mapping))
@@ -317,11 +318,46 @@ (define raid-device-mapping
(open open-raid-device)
(close close-raid-device)))
+(define (open-md-array-device source targets)
+ "Return a gexp that assembles SOURCE to the MD device
+TARGET (e.g., \"/dev/md0\"), using 'mdadm'."
+ (let ((array-selector
+ (match source
+ ((? uuid?)
+ (string-append "--uuid=" (uuid->string source)))
+ ((? string?)
+ (string-append "--name=" source))))
+ (md-device
+ (match targets
+ ((target)
+ target))))
+ (if (and array-selector md-device)
+ ;; Use 'mdadm-static' rather than 'mdadm' to avoid pulling its whole
+ ;; closure (80 MiB) in the initrd when an MD device is needed for boot.
+ #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
+ "--assemble" #$md-device
+ "--run"
+ #$array-selector))
+ #f)))
+
+(define (close-md-array-device source targets)
+ "Return a gexp that stops the MD device TARGET."
+ (match targets
+ ((target)
+ #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
+ "--stop" #$target)))))
+
+(define md-array-device-mapping
+ ;; The type of MD mapped device.
+ (mapped-device-kind
+ (open open-md-array-device)
+ (close close-md-array-device)))
+
(define (open-lvm-device source targets)
#~(and
(zero? (system* #$(file-append lvm2-static "/sbin/lvm")
"vgchange" "--activate" "ay" #$source))
- ; /dev/mapper nodes are usually created by udev, but udev may be unavailable at the time we run this. So we create them here.
+ ; /dev/mapper nodes are usually created by udev, but udev may be unavailable at the time we run this. So we create them here.
(zero? (system* #$(file-append lvm2-static "/sbin/lvm")
"vgscan" "--mknodes"))
(every file-exists? (map (lambda (file) (string-append "/dev/mapper/" file))
diff --git a/gnu/tests/install.scm b/gnu/tests/install.scm
index 0f4204d1a6..061365fd87 100644
--- a/gnu/tests/install.scm
+++ b/gnu/tests/install.scm
@@ -64,7 +64,7 @@ (define-module (gnu tests install)
%test-iso-image-installer
%test-separate-store-os
%test-separate-home-os
- %test-raid-root-os
+ %test-md-array-root-os
%test-encrypted-root-os
%test-encrypted-home-os
%test-encrypted-root-not-boot-os
@@ -612,11 +612,11 @@ (define %test-separate-store-os
;;;
-;;; RAID root device.
+;;; MD root device.
;;;
-(define-os-with-source (%raid-root-os %raid-root-os-source)
- ;; An OS whose root partition is a RAID partition.
+(define-os-with-source (%md-array-root-os %md-array-root-os-source)
+ ;; An OS whose root partition is an MD partition.
(use-modules (gnu) (gnu tests))
(operating-system
@@ -633,9 +633,9 @@ (define-os-with-source (%raid-root-os %raid-root-os-source)
(initrd-modules (cons "raid1" %base-initrd-modules))
(mapped-devices (list (mapped-device
- (source (list "/dev/vda2" "/dev/vda3"))
+ (source "marionette:mirror0")
(target "/dev/md0")
- (type raid-device-mapping))))
+ (type md-array-device-mapping))))
(file-systems (cons (file-system
(device (file-system-label "root-fs"))
(mount-point "/")
@@ -649,7 +649,7 @@ (define-os-with-source (%raid-root-os %raid-root-os-source)
(guix combinators)))))
%base-services))))
-(define %raid-root-installation-script
+(define %md-array-root-installation-script
;; Installation with a separate /gnu partition. See
;; <https://raid.wiki.kernel.org/index.php/RAID_setup> for more on RAID and
;; mdadm.
@@ -665,8 +665,8 @@ (define %raid-root-installation-script
mkpart primary ext2 1.6G 3.2G \\
set 1 boot on \\
set 1 bios_grub on
-yes | mdadm --create /dev/md0 --verbose --level=mirror --raid-devices=2 \\
- /dev/vdb2 /dev/vdb3
+yes | mdadm --create /dev/md0 --verbose --homehost=marionette --name=mirror0 \\
+ --level=mirror --raid-devices=2 /dev/vdb2 /dev/vdb3
mkfs.ext4 -L root-fs /dev/md0
mount /dev/md0 /mnt
df -h /mnt
@@ -677,21 +677,21 @@ (define %raid-root-installation-script
sync
reboot\n")
-(define %test-raid-root-os
+(define %test-md-array-root-os
(system-test
- (name "raid-root-os")
+ (name "md-array-root-os")
(description
"Test functionality of an OS installed with a RAID root partition managed
by 'mdadm'.")
(value
- (mlet* %store-monad ((images (run-install %raid-root-os
- %raid-root-os-source
+ (mlet* %store-monad ((images (run-install %md-array-root-os
+ %md-array-root-os-source
#:script
- %raid-root-installation-script
+ %md-array-root-installation-script
#:target-size (* 3200 MiB)))
(command (qemu-command* images)))
- (run-basic-test %raid-root-os
- `(,@command) "raid-root-os")))))
+ (run-basic-test %md-array-root-os
+ `(,@command) "md-array-root-os")))))
;;;
--
2.40.1
Ludovic Courtès wrote on 20 Oct 2023 23:55
(name . Felix Lechner)(address . felix.lechner@lease-up.com)(address . 64259@debbugs.gnu.org)
87bkctt0vh.fsf@gnu.org
Hi,

Felix Lechner <felix.lechner@lease-up.com> skribis:

> This commit cures the most precipitous danger for users of MD arrays in GNU
> Guix, namely that their equipment may not boot after a drive failure.

Why would that happen? Could it be because the device names specified in
the ‘source’ field of the mapped device become invalid?

> Adjusts the documentation by erasing any mention of the obsolete
> raid-device-mapping. No one should use that any longer. Ideally, users would
> be a deprecation warning, but I was unable to adapt 'define-deprecated' to
> this use case. Please feel free to make further changes.

If it has to be deprecated then yes, we try and use ‘define-deprecated’.

> Please feel free to insert better changelog messages. I had some difficulty
> meeting the likely expectations of any reviewer. Please also feel free to make
> any other adjustments as needed without checking with me. Thanks!

The reviewer may feel free, sure… :-)

> @item source
> -This is either a string specifying the name of the block device to be mapped,
> -such as @code{"/dev/sda3"}, or a list of such strings when several devices
> -need to be assembled for creating a new one. In case of LVM this is a
> -string specifying name of the volume group to be mapped.
> +This is either a string specifying the name of the block device to be
> +mapped, such as @code{"/dev/sda3"}. For MD array devices it is either
> +the UUID of the array or a string that is interpreted as the array name
> +(see Mdadm documentation). In case of LVM it is a string specifying
> +name of the volume group to be mapped.

Instead of “see Mdadm documentation”, could you add a link or a command
to type to access said documentation? Better yet, an example of what an
mdadm device name or UUID is and how to obtain it would be welcome.

> +(define (open-md-array-device source targets)
> + "Return a gexp that assembles SOURCE to the MD device
> +TARGET (e.g., \"/dev/md0\"), using 'mdadm'."
> + (let ((array-selector
> + (match source
> + ((? uuid?)
> + (string-append "--uuid=" (uuid->string source)))
> + ((? string?)
> + (string-append "--name=" source))))
> + (md-device
> + (match targets
> + ((target)
> + target))))
> + (if (and array-selector md-device)
^
This condition is always true.

> + ;; Use 'mdadm-static' rather than 'mdadm' to avoid pulling its whole
> + ;; closure (80 MiB) in the initrd when an MD device is needed for boot.
> + #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
> + "--assemble" #$md-device
> + "--run"
> + #$array-selector))
> + #f)))
> +
> +(define (close-md-array-device source targets)
> + "Return a gexp that stops the MD device TARGET."
> + (match targets
> + ((target)
> + #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
> + "--stop" #$target)))))
> +
> +(define md-array-device-mapping
> + ;; The type of MD mapped device.
> + (mapped-device-kind
> + (open open-md-array-device)
> + (close close-md-array-device)))

Instead of renaming and duplicating part of the logic, how about
supporting those new ‘source’ specifications right in ‘open-raid-device’?
It would emit a deprecation warning when ‘source’ is a list of strings.
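
One way such a combined dispatch might look, as a sketch only:
'mdadm-assemble-arguments' is a hypothetical stand-alone helper in plain
Guile, not code from Guix, and the warning text is made up for
illustration. A list of device names takes the old path and prints a
warning; a plain string is treated as an array name.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: compute the mdadm arguments for either style of
;; 'source'.  It only builds the argument list; it does not run mdadm.
(define (mdadm-assemble-arguments source target)
  "Return the mdadm arguments for assembling TARGET from SOURCE."
  (match source
    ;; Old style: a list of component device names.  Warn, then list them.
    (((? string? devices) ...)
     (display "warning: a list of devices as 'source' is deprecated\n"
              (current-error-port))
     `("--assemble" ,target ,@devices))
    ;; New style: a plain string is taken as the array name.
    ((? string? name)
     `("--assemble" ,target "--run" ,(string-append "--name=" name)))))

(mdadm-assemble-arguments "marionette:mirror0" "/dev/md0")
;; => ("--assemble" "/dev/md0" "--run" "--name=marionette:mirror0")
```

A UUID clause would slot in next to the string clause in the same way.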

Does the busy wait loop currently in ‘open-raid-device’ need to be
preserved?

Thanks,
Ludo’.
Felix Lechner wrote on 22 Oct 2023 19:44
(name . Ludovic Courtès)(address . ludo@gnu.org)
87fs22pn6a.fsf@lease-up.com
Hi,

Thanks for looking at my patch!

On Fri, Oct 20 2023, Ludovic Courtès wrote:

> Could ... the device names specified in
> the ‘source’ field of the mapped device become invalid?

In RAID failures, the devices for defective components usually stop
functioning as intended or become unavailable altogether. Listing those
devices on the mdadm command line, however, requires them to be present
for the assembly of the array. For the fault-tolerant behavior people
expect, arrays should be started via the array name or the special UUID.
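
As a rough sketch of what that looks like in an operating-system
declaration: both forms below assume the 'md-array-device-mapping' and
the 'mdadm UUID type proposed in this series, neither of which exists in
Guix without these patches. The UUID is the example from the
documentation hunk and the name is the one used by the system test.

```scheme
;; Sketch only: relies on `md-array-device-mapping' and the 'mdadm UUID
;; type from this patch series.

;; Select the array by the UUID stored in its MD superblock ...
(mapped-device
  (source (uuid "33cf3e31:8e33d75b:517d64b9:0a8f7623" 'mdadm))
  (target "/dev/md0")
  (type md-array-device-mapping))

;; ... or by its array name.
(mapped-device
  (source "marionette:mirror0")
  (target "/dev/md0")
  (type md-array-device-mapping))
```

Neither form names a component device, so a missing drive does not
change the declaration.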

> try and use ‘define-deprecated’.

Yes, thank you! I will do so and deploy locally before I update the
patch herein.

> Instead of “see Mdadm documentation”, could you add a link or a command
> to type to access said documentation?

Upon review, I am not sure that the mdadm documentation is actually very
helpful. I must have been thinking about third-party sites.

> Better yet, an example of what an
> mdadm device name or UUID is and how to obtain it would be welcome.

Yes, I will include examples on how to access both, and how to change
the "array name." The latter can be a chosen string that is optionally
prefaced by the local host name. The array name is not the same as the
device name, which looks like /dev/md12. I shall clarify all that in the
revised patch.
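
To make the UUID notation concrete, here is a stand-alone sketch in
plain Guile of the conversion that patch 1/2 adds. The procedure names
are illustrative and differ from those in gnu/system/uuid.scm. Like the
patch, it treats the UUID as four big-endian 32-bit words written in
hexadecimal and joined by colons.

```scheme
(use-modules (rnrs bytevectors)   ;bytevector-uint-ref, uint-list->bytevector
             (ice-9 format)       ;full `format' with padding directives
             (ice-9 regex))       ;match:substring

(define %mdadm-rx
  ;; Four groups of eight hexadecimal digits separated by colons.
  (make-regexp "^([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8})$"))

(define (mdadm-string->bytes str)
  "Parse STR and return a 16-byte bytevector, or #f if STR is malformed."
  (let ((m (regexp-exec %mdadm-rx str)))
    (and m
         (uint-list->bytevector
          (map (lambda (i) (string->number (match:substring m i) 16))
               '(1 2 3 4))
          (endianness big) 4))))

(define (bytes->mdadm-string bv)
  "Format the 16-byte bytevector BV in the colon-separated notation."
  (format #f "~8,'0x:~8,'0x:~8,'0x:~8,'0x"
          (bytevector-uint-ref bv 0 (endianness big) 4)
          (bytevector-uint-ref bv 4 (endianness big) 4)
          (bytevector-uint-ref bv 8 (endianness big) 4)
          (bytevector-uint-ref bv 12 (endianness big) 4)))

(bytes->mdadm-string
 (mdadm-string->bytes "33cf3e31:8e33d75b:517d64b9:0a8f7623"))
;; => "33cf3e31:8e33d75b:517d64b9:0a8f7623"
```

A string in the dashed DCE notation does not match this pattern, so
'mdadm-string->bytes' returns #f for it.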

>> + (if (and array-selector md-device)
> ^
> This condition is always true.

Okay, I may not know Guile macros well enough.
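
A minimal illustration of why that condition cannot fail, using a
hypothetical stand-alone 'array-selector' that keeps only the string
clause of the original 'match': in Scheme every value other than #f is
true, and 'match' raises an error when no clause fits instead of
returning #f.

```scheme
(use-modules (ice-9 match))

(define (array-selector source)
  ;; Simplified from open-md-array-device: only the string clause is kept.
  (match source
    ((? string?)
     (string-append "--name=" source))))

;; Every string, even the empty one, counts as true.
(if (array-selector "") 'true 'false)          ;=> true

;; A source that fits no clause raises an error rather than returning #f.
(catch #t
  (lambda () (array-selector 42))
  (lambda (key . args) 'match-error-raised))   ;=> match-error-raised
```

So the 'if' in the first version always takes its first branch, and an
unsupported source fails earlier, inside 'match'.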

> Instead of renaming and duplicating part of the logic, how about
> supporting those new ‘source’ specification right in ‘open-raid-device’?
> It would emit a deprecation warning when ‘source’ is a list of strings.

It's a good idea, but many other file systems offer RAID-type
functionality. Do you think that a raid-device-mapping based on mdadm
occupies a fair share in the common name space?

> Does the busy wait loop currently in ‘open-raid-device’ need to be
> preserved?

I personally do not believe so but I'll defer to Andreas Enge, whom I
copied on this message. I believe Andreas wrote the original device
mapping.

Kind regards
Felix
Felix Lechner wrote on 23 Nov 2023 15:56
[PATCH v2 1/2] Offer an mdadm variant of uuids.
(address . 64259@debbugs.gnu.org)
9b4c88707c00531fa2a43e5172d1fc0c4f4af3d9.1700751420.git.felix.lechner@lease-up.com
---
gnu/system/uuid.scm | 46 +++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/gnu/system/uuid.scm b/gnu/system/uuid.scm
index 8f967387ad..dc8bb3f7b7 100644
--- a/gnu/system/uuid.scm
+++ b/gnu/system/uuid.scm
@@ -82,8 +82,9 @@ (define-syntax %network-byte-order
(identifier-syntax (endianness big)))
(define (dce-uuid->string uuid)
- "Convert UUID, a 16-byte bytevector, to its string representation, something
-like \"6b700d61-5550-48a1-874c-a3d86998990e\"."
+ "Convert UUID, a 16-byte bytevector, to its DCE string representation (see
+<https://tools.ietf.org/html/rfc4122>), which looks something like
+\"6b700d61-5550-48a1-874c-a3d86998990e\"."
;; See <https://tools.ietf.org/html/rfc4122>.
(let ((time-low (bytevector-uint-ref uuid 0 %network-byte-order 4))
(time-mid (bytevector-uint-ref uuid 4 %network-byte-order 2))
@@ -93,7 +94,7 @@ (define (dce-uuid->string uuid)
(format #f "~8,'0x-~4,'0x-~4,'0x-~4,'0x-~12,'0x"
time-low time-mid time-hi clock-seq node)))
-(define %uuid-rx
+(define %dce-uuid-rx
;; The regexp of a UUID.
(make-regexp "^([[:xdigit:]]{8})-([[:xdigit:]]{4})-([[:xdigit:]]{4})-([[:xdigit:]]{4})-([[:xdigit:]]{12})$"))
@@ -101,7 +102,7 @@ (define (string->dce-uuid str)
"Parse STR as a DCE UUID (see <https://tools.ietf.org/html/rfc4122>) and
return its contents as a 16-byte bytevector. Return #f if STR is not a valid
UUID representation."
- (and=> (regexp-exec %uuid-rx str)
+ (and=> (regexp-exec %dce-uuid-rx str)
(lambda (match)
(letrec-syntax ((hex->number
(syntax-rules ()
@@ -167,6 +168,41 @@ (define (digits->string bytes)
(parts (list year month day hour minute second hundredths)))
(string-append (string-join (map digits->string parts) "-"))))
+
+;;;
+;;; Mdadm.
+;;;
+
+(define (mdadm-uuid->string uuid)
+ "Convert UUID, a 16-byte bytevector, to its Mdadm string representation,
+which looks something like \"6b700d61:555048a1:874ca3d8:6998990e\"."
+ ;; See <https://tools.ietf.org/html/rfc4122>.
+ (format #f "~8,'0x:~8,'0x:~8,'0x:~8,'0x"
+ (bytevector-uint-ref uuid 0 %network-byte-order 4)
+ (bytevector-uint-ref uuid 4 %network-byte-order 4)
+ (bytevector-uint-ref uuid 8 %network-byte-order 4)
+ (bytevector-uint-ref uuid 12 %network-byte-order 4)))
+
+(define %mdadm-uuid-rx
+ (make-regexp "^([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8}):([[:xdigit:]]{8})$"))
+
+(define (string->mdadm-uuid str)
+ "Parse STR, which is in Mdadm format, and return a bytevector or #f."
+ (match (regexp-exec %mdadm-uuid-rx str)
+ (#f
+ #f)
+ (rx-match
+ (uint-list->bytevector (list (string->number
+ (match:substring rx-match 1) 16)
+ (string->number
+ (match:substring rx-match 2) 16)
+ (string->number
+ (match:substring rx-match 3) 16)
+ (string->number
+ (match:substring rx-match 4) 16))
+ %network-byte-order
+ 4))))
+
;;;
;;; FAT32/FAT16.
@@ -259,6 +295,7 @@ (define %uuid-parsers
('dce 'ext2 'ext3 'ext4 'bcachefs 'btrfs 'f2fs 'jfs 'xfs 'luks
=> string->dce-uuid)
('fat32 'fat16 'fat => string->fat-uuid)
+ ('mdadm => string->mdadm-uuid)
('ntfs => string->ntfs-uuid)
('iso9660 => string->iso9660-uuid)))
@@ -268,6 +305,7 @@ (define %uuid-printers
=> dce-uuid->string)
('iso9660 => iso9660-uuid->string)
('fat32 'fat16 'fat => fat-uuid->string)
+ ('mdadm => mdadm-uuid->string)
('ntfs => ntfs-uuid->string)))
(define* (string->uuid str #:optional (type 'dce))

base-commit: 5283d24062be62f59ff9f14fa7095ebcfcb7a9a4
--
2.41.0
Felix Lechner wrote on 23 Nov 2023 15:57
[PATCH v2 2/2] Provide md-array-device-mapping to start MD arrays via UUID or name.
(address . 64259@debbugs.gnu.org)
c745ee04c873d62bf275923e748abee5637dca81.1700751420.git.felix.lechner@lease-up.com
This commit cures the most precipitous danger for users of MD arrays in GNU
Guix, namely that their equipment may not boot after a drive failure. That
behavior likely contradicts their primary expectation for having such a disk
arrangement.

In order to facilitate a smooth transition from raid-device-mapping to
md-array-device-mapping, this commit introduces a new mapping rather than
repurpose the old one. The new mapping here is also incompatible with
raid-device-mapping in the sense that a plain string is now interpreted as the
array name from the MD superblock.

For details, please consult the mdadm manual page.

Personally, the author prefers UUIDs over array names when identifying array
components, but either will work. The system test uses the name.

The name for the new device mapping was chosen instead of the traditional RAID
to account for the fact that some modern technologies (like SSDs) and some
array configurations, such as striping, are neither redundant nor inexpensive.

Adjusts the documentation by erasing any mention of the obsolete
raid-device-mapping. No one should use that any longer. Ideally, users would
see a deprecation warning, but I was unable to adapt 'define-deprecated' to
this use case. Please feel free to make further changes.

This commit includes an updated system test for the root file system on
an-md-array.

More details for the motivation of these changes may be available here:


The author of this commit used to maintain mdadm in Debian.

Please feel free to insert better changelog messages. I had some difficulty
meeting the likely expectations of any reviewer. Please also feel free to make
any other adjustments as needed without checking with me. Thanks!

* gnu/system/mapped-devices.scm: New variable md-array-device-mapping.
* doc/guix.texi: Mention md-array-device-mapping in the documentation.
* gnu/tests/install.scm: Adjust test for root-on-md-array.
---
Hi Ludo'

With this updated patch series, I hope to address all your questions
and concerns. Thanks!

Kind regards
Felix

doc/guix.texi | 28 ++++++------
gnu/system/mapped-devices.scm | 36 +++++++++++++++
gnu/tests/install.scm | 84 +++++++++++++++++++++++++++++++++++
3 files changed, 135 insertions(+), 13 deletions(-)

diff --git a/doc/guix.texi b/doc/guix.texi
index 94903fb5e2..7676a58d99 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -17762,18 +17762,19 @@ the system boots up.
@table @code
@item source
-This is either a string specifying the name of the block device to be mapped,
-such as @code{"/dev/sda3"}, or a list of such strings when several devices
-need to be assembled for creating a new one. In case of LVM this is a
-string specifying name of the volume group to be mapped.
+This is either a string specifying the name of the block device to be
+mapped, such as @code{"/dev/sda3"}. For MD array devices it is either
+the UUID of the array or a string that is interpreted as the array name
+(see mdadm.conf(5) in the manual). In case of LVM it is a string
+specifying name of the volume group to be mapped.
@item target
This string specifies the name of the resulting mapped device. For
kernel mappers such as encrypted devices of type @code{luks-device-mapping},
specifying @code{"my-partition"} leads to the creation of
the @code{"/dev/mapper/my-partition"} device.
-For RAID devices of type @code{raid-device-mapping}, the full device name
-such as @code{"/dev/md0"} needs to be given.
+For MD array devices of type @code{md-array-device-mapping}, the full device
+name such as @code{"/dev/md18"} needs to be given.
LVM logical volumes of type @code{lvm-device-mapping} need to
be specified as @code{"VGNAME-LVNAME"}.
@@ -17793,11 +17794,12 @@ command from the package with the same name. It relies on the
@code{dm-crypt} Linux kernel module.
@end defvar
-@defvar raid-device-mapping
+@defvar md-array-device-mapping
This defines a RAID device, which is assembled using the @code{mdadm}
-command from the package with the same name. It requires a Linux kernel
-module for the appropriate RAID level to be loaded, such as @code{raid456}
-for RAID-4, RAID-5 or RAID-6, or @code{raid10} for RAID-10.
+command from the package with the same name. It requires the Linux kernel
+module for the appropriate RAID level to be loaded, such as @code{raid1}
+for mirroring, @code{raid456} for the checksum-based RAID levels 4, 5 or 6,
+or @code{raid10} for RAID-10.
@end defvar
@cindex LVM, logical volume manager
@@ -17855,9 +17857,9 @@ may be declared as follows:
@lisp
(mapped-device
- (source (list "/dev/sda1" "/dev/sdb1"))
- (target "/dev/md0")
- (type raid-device-mapping))
+ (source (uuid "33cf3e31:8e33d75b:517d64b9:0a8f7623" 'mdadm))
+ (target "/dev/md17")
+ (type md-array-device-mapping))
@end lisp
The @file{/dev/md0} device can then be used as the @code{device} of a
diff --git a/gnu/system/mapped-devices.scm b/gnu/system/mapped-devices.scm
index e6b8970c12..e6635b531d 100644
--- a/gnu/system/mapped-devices.scm
+++ b/gnu/system/mapped-devices.scm
@@ -64,6 +64,7 @@ (define-module (gnu system mapped-devices)
check-device-initrd-modules ;XXX: needs a better place
luks-device-mapping
+ md-array-device-mapping
raid-device-mapping
lvm-device-mapping))
@@ -276,6 +277,39 @@ (define luks-device-mapping
(close close-luks-device)
(check check-luks-device)))
+(define (open-md-array-device source targets)
+ "Return a gexp that assembles SOURCE to the MD device
+TARGET (e.g., \"/dev/md0\"), using 'mdadm'."
+ (let ((array-selector
+ (match source
+ ((? uuid?)
+ (string-append "--uuid=" (uuid->string source)))
+ ((? string?)
+ (string-append "--name=" source))))
+ (md-device
+ (match targets
+ ((target)
+ target))))
+ ;; Use 'mdadm-static' rather than 'mdadm' to avoid pulling its whole
+ ;; closure (80 MiB) in the initrd when an MD device is needed for boot.
+ #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
+ "--assemble" #$md-device
+ "--run"
+ #$array-selector))))
+
+(define (close-md-array-device source targets)
+ "Return a gexp that stops the MD device TARGET."
+ (match targets
+ ((target)
+ #~(zero? (system* #$(file-append mdadm-static "/sbin/mdadm")
+ "--stop" #$target)))))
+
+(define md-array-device-mapping
+ ;; The type of MD mapped device.
+ (mapped-device-kind
+ (open open-md-array-device)
+ (close close-md-array-device)))
+
(define (open-raid-device sources targets)
"Return a gexp that assembles SOURCES (a list of devices) to the RAID device
TARGET (e.g., \"/dev/md0\"), using 'mdadm'."
@@ -317,6 +351,8 @@ (define raid-device-mapping
(open open-raid-device)
(close close-raid-device)))
+(define-deprecated raid-device-mapping md-array-device-mapping)
+
(define (open-lvm-device source targets)
#~(and
(zero? (system* #$(file-append lvm2-static "/sbin/lvm")
diff --git a/gnu/tests/install.scm b/gnu/tests/install.scm
index daa4647299..9e80b55f84 100644
--- a/gnu/tests/install.scm
+++ b/gnu/tests/install.scm
@@ -64,6 +64,7 @@ (define-module (gnu tests install)
%test-iso-image-installer
%test-separate-store-os
%test-separate-home-os
+ %test-md-array-root-os
%test-raid-root-os
%test-encrypted-root-os
%test-encrypted-home-os
@@ -610,6 +611,89 @@ (define %test-separate-store-os
(command (qemu-command* images)))
(run-basic-test %separate-store-os command "separate-store-os")))))
+
+;;;
+;;; MD root device.
+;;;
+
+(define-os-with-source (%md-array-root-os %md-array-root-os-source)
+ ;; An OS whose root partition is an MD array.
+ (use-modules (gnu) (gnu tests))
+
+ (operating-system
+ (host-name "raidified")
+ (timezone "Europe/Paris")
+ (locale "en_US.utf8")
+
+ (bootloader (bootloader-configuration
+ (bootloader grub-bootloader)
+ (targets (list "/dev/vdb"))))
+ (kernel-arguments '("console=ttyS0"))
+
+ ;; Add a kernel module for RAID-1 (aka. "mirror").
+ (initrd-modules (cons "raid1" %base-initrd-modules))
+
+ (mapped-devices (list (mapped-device
+ (source "marionette:mirror0")
+ (target "/dev/md0")
+ (type md-array-device-mapping))))
+ (file-systems (cons (file-system
+ (device (file-system-label "root-fs"))
+ (mount-point "/")
+ (type "ext4")
+ (dependencies mapped-devices))
+ %base-file-systems))
+ (users %base-user-accounts)
+ (services (cons (service marionette-service-type
+ (marionette-configuration
+ (imported-modules '((gnu services herd)
+ (guix combinators)))))
+ %base-services))))
+
+(define %md-array-root-installation-script
+ ;; Installation with the root file system on an MD (RAID-1) array. See
+ ;; <https://raid.wiki.kernel.org/index.php/RAID_setup> for more on RAID and
+ ;; mdadm.
+ "\
+. /etc/profile
+set -e -x
+guix --version
+
+export GUIX_BUILD_OPTIONS=--no-grafts
+parted --script /dev/vdb mklabel gpt \\
+ mkpart primary ext2 1M 3M \\
+ mkpart primary ext2 3M 1.6G \\
+ mkpart primary ext2 1.6G 3.2G \\
+ set 1 boot on \\
+ set 1 bios_grub on
+yes | mdadm --create /dev/md0 --verbose --homehost=marionette --name=mirror0 \\
+ --level=mirror --raid-devices=2 /dev/vdb2 /dev/vdb3
+mkfs.ext4 -L root-fs /dev/md0
+mount /dev/md0 /mnt
+df -h /mnt
+herd start cow-store /mnt
+mkdir /mnt/etc
+cp /etc/target-config.scm /mnt/etc/config.scm
+guix system init /mnt/etc/config.scm /mnt --no-substitutes
+sync
+reboot\n")
+
+(define %test-md-array-root-os
+ (system-test
+ (name "md-array-root-os")
+ (description
+ "Test functionality of an OS installed with a RAID root partition managed
+by 'mdadm'.")
+ (value
+ (mlet* %store-monad ((images (run-install %md-array-root-os
+ %md-array-root-os-source
+ #:script
+ %md-array-root-installation-script
+ #:target-size (* 3200 MiB)))
+ (command (qemu-command* images)))
+ (run-basic-test %md-array-root-os
+ `(,@command) "md-array-root-os")))))
+
;;;
;;; RAID root device.
--
2.41.0
Andreas Enge wrote on 18 Jan 15:39 +0100
Re: [bug#64259] [PATCH 2/2] Provide md-array-device-mapping to start MD arrays via UUID or name.
(name . Felix Lechner)(address . felix.lechner@lease-up.com)
Zak4KDD1lrMPqt5r@jurong
Hello,

On Sun, Oct 22, 2023 at 10:44:13AM -0700, Felix Lechner wrote:
> > Does the busy wait loop currently in ‘open-raid-device’ need to be
> > preserved?
> I personally do not believe so but I'll defer to Andreas Enge, whom I
> copied on this message. I believe Andreas wrote the original device
> mapping.

Well, I do not know whether it is still needed. It appears that when
I wrote the code for bayfront, we needed to wait a bit until the hard
disks appeared. Are there reasons to believe that this has changed
in the meantime?

Andreas
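
For readers unfamiliar with the loop under discussion, here is a minimal
sketch of that kind of wait in plain Guile. It is an illustration, not the
code in open-raid-device; the procedure name wait-for-devices, the
max-attempts parameter and the device paths are assumptions made for the
example.

```scheme
(use-modules (srfi srfi-1))             ;for 'every'

;; Hypothetical helper, not part of Guix.
(define* (wait-for-devices devices #:key (max-attempts 20))
  "Check whether every file in DEVICES exists, sleeping one second between
checks and giving up after MAX-ATTEMPTS sleeps.  Return #t if all files
appeared, #f otherwise."
  (let loop ((attempts 0))
    (cond ((every file-exists? devices) #t)
          ((>= attempts max-attempts) #f)
          (else
           (sleep 1)
           (loop (+ attempts 1))))))

;; Hypothetical usage: wait for two RAID members before assembling.
(wait-for-devices '("/dev/sda1" "/dev/sdb1"))
```

Because md-array-device-mapping identifies the array by UUID or name rather
than by a list of member devices, it has no list of paths to poll in this
way.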
Felix Lechner wrote on 18 Jan 17:46 +0100
(name . Andreas Enge)(address . andreas@enge.fr)
87zfx2zj1b.fsf@lease-up.com
Hi Andreas,

On Thu, Jan 18 2024, Andreas Enge wrote:

> when I wrote the code for bayfront, we needed to wait a bit until the
> hard disks appeared.

How long ago? I never needed it on six pieces of equipment of varying
dimensions, including an SAS server (with reflashed Dell equipment, if
that's what Bayfront is using) and a VM with NVMe SSDs.

Either way, accepting this patch as is will not break anything. We
introduce a new mapping (md-array-device-mapping) that can be used to
reconfigure Bayfront at the maintainer's leisure.

It would be easy to react to a bug report later while Bayfront continues
to use raid-device-mapping.

Kind regards
Felix
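
As a rough idea of what such a reconfiguration could look like, here is a
sketch modelled on the system test in this patch. The array name
"myhost:root0" is hypothetical (it follows the HOMEHOST:NAME form used in the
test), and the commented fields only indicate where the mapping would be
referenced in an 'operating-system' declaration.

```scheme
(use-modules (gnu))

;; Hypothetical mapped device; adjust the name to the actual array.
(define %md-root
  (mapped-device
    (source "myhost:root0")             ;array name; a UUID is also accepted
    (target "/dev/md0")
    (type md-array-device-mapping)))    ;added by this patch series

;; Inside the 'operating-system' declaration:
;;   (initrd-modules (cons "raid1" %base-initrd-modules))
;;   (mapped-devices (list %md-root))
;;   (file-systems (cons (file-system
;;                         (device (file-system-label "root-fs"))
;;                         (mount-point "/")
;;                         (type "ext4")
;;                         (dependencies (list %md-root)))
;;                       %base-file-systems))
```

A configuration that keeps using raid-device-mapping would stay as it is.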
Andreas Enge wrote on 18 Jan 17:51 +0100
(name . Felix Lechner)(address . felix.lechner@lease-up.com)
ZalW_EEVK4G5xEuu@jurong
On Thu, Jan 18, 2024 at 08:46:24AM -0800, Felix Lechner wrote:
> Either way, accepting this patch as is will not break anything. We
> introduce a new mapping (md-array-device-mapping) that can be used to
> reconfigure Bayfront at the maintainer's leisure.
> It would be easy to react to a bug report later while Bayfront continues
> to use raid-device-mapping.

Okay, that is fine with me!

Andreas