udevd error with lvm-raid array leading to race condition with luks

  • Open
Details
5 participants
  • Yann Dupont
  • Josselin Poiret
  • Adrien 'neox' Bourmault
  • Tomas Volf
  • Simon Tournier
Owner
unassigned
Submitted by
Adrien 'neox' Bourmault
Severity
normal
Adrien 'neox' Bourmault wrote on 9 Aug 2023 12:25
(address . bug-guix@gnu.org)
58f0d91f-deb8-35c3-d1f3-2ad8a99843c0@gnu.org
Hi there.
My setup is the following (LVM array containing a LUKS partition):

(mapped-devices
 (list
  (mapped-device
   (source "HOMERAID")
   (target "HOMERAID-HOMERAID_data")
   (type lvm-device-mapping))
  (mapped-device
   (source "/dev/mapper/HOMERAID-HOMERAID_data")
   (target "luks-f0a72a6c-499b-4445-8d13-21dc99337752")
   (type luks-device-mapping))))

(file-systems
 (cons*
  (file-system
   (mount-point "/")
   (device (uuid "2e44f3f7-bb6b-43ac-933a-e8992bf10d29" 'ext4))
   (type "ext4"))
  (file-system
   (mount-point "/home")
   (device "/dev/mapper/luks-f0a72a6c-499b-4445-8d13-21dc99337752")
   (type "ext4")
   (dependencies mapped-devices))
  (file-system
   (mount-point "/boot/efi")
   (device (uuid "DC58-946E" 'fat32))
   (type "vfat"))
  %base-file-systems)))

I have used Guix System since 2022 and never had any problem booting with
this configuration. However, I updated and reconfigured my system last
week, and now I can't boot. I don't have an older generation to restore
(yes, I'm dumb: I ran delete-generations to show a friend how it works),
but I think the last working one was from July 25th or slightly earlier.

My /var/log/messages contains:

Aug 9 11:40:27 localhost vmunix: [ 7.525877] udevd[515]: failed
to execute '/usr/bin/systemd-run' '/usr/bin/systemd-run --no-block
--property DefaultDependencies=no --unit lvm-activate-HOMERAID
/gnu/store/hffkn63zx2zjadawrkxpnr486frc9n74-lvm2-2.03.21/sbin/lvm
vgchange -aay --autoactivation event HOMERAID': No such file or directory

During boot, the screen fills with a wall of messages along the lines of
"Device HOMERAID-HOMERAID_data could not be found: does not exist or
access denied" (not the exact wording, sorry; I can't find it in the
syslog), and after that the system hangs.

However, I can boot successfully if and only if I press Scroll Lock just
after modesetting and before this message ("Device ... not be found")
appears. When I press it again a few seconds later (to deactivate its
effect), the LUKS passphrase prompt appears and lets me boot properly.

I can use my system, but it takes multiple tries each time x) I have
to time it precisely between the modesetting and the message xD

Thank you very much.

Freely,
--
Adrien Bourmault
Co-maintainer, GNU Boot project
Elected member, XMPP Standards Foundation
Associate member, Free Software Foundation
Trésorier, Association Libre en Communs (https://www.a-lec.org)
GPG : 1DF1132CF1658A8559025C98AAD6B069819E6979
Josselin Poiret wrote on 10 Aug 2023 10:02
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87ttt7e35d.fsf@jpoiret.xyz
Hi Adrien,

Adrien 'neox' Bourmault <neox@gnu.org> writes:

> Aug 9 11:40:27 localhost vmunix: [ 7.525877] udevd[515]: failed
> to execute '/usr/bin/systemd-run' '/usr/bin/systemd-run --no-block
> --property DefaultDependencies=no --unit lvm-activate-HOMERAID
> /gnu/store/hffkn63zx2zjadawrkxpnr486frc9n74-lvm2-2.03.21/sbin/lvm
> vgchange -aay --autoactivation event HOMERAID': No such file or directory

Since lvm2 2.03.14, the included udev rules use systemd-run to run
vgchange and activate the volume group. lvm2 was updated recently from
2.03.11 to 2.03.21, then 2.03.22 (cc'ing tobias), and probably started
exhibiting this behavior then. I think we can probably remove the
indirection through systemd-run and directly run vgchange.
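
As a rough illustration of the version boundary described above, here is a minimal Python sketch. The 2.03.14 threshold is taken from this message only and has not been checked against the upstream changelog; the names `parse_version` and `rules_use_systemd_run` are hypothetical helpers, not part of lvm2 or Guix.

```python
def parse_version(text):
    """Turn a dotted version such as "2.03.21" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Assumption from this thread: the udev rules call systemd-run since 2.03.14.
FIRST_WITH_SYSTEMD_RUN = parse_version("2.03.14")


def rules_use_systemd_run(lvm2_version):
    """Return True if this lvm2 version is at or past the assumed threshold."""
    return parse_version(lvm2_version) >= FIRST_WITH_SYSTEMD_RUN
```

Under that assumption, 2.03.11 falls before the change, while 2.03.21 and 2.03.22 fall after it.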

Best,
--
Josselin Poiret
Yann Dupont wrote on 14 Sep 2023 11:25
udevd error with lvm-raid array leading to race condition with luks
(address . 65177@debbugs.gnu.org)
9b755986-a86d-dfa1-f497-1d4b1d37bb5b@univ-nantes.fr
Hello everyone, we're also victims of this bug, in an even simpler use case.

[…]
(file-system
  (device "/dev/mapper/VG0-DATA")
  (mount-point "/VG0-DATA")
  (type "ext4"))
[…]

The culprit seems to be 69-dm-lvm.rules:

[ 18.226226] udevd[115]: failed to execute '/usr/bin/systemd-run'
'/usr/bin/systemd-run --no-block --property DefaultDependencies=no
--unit lvm-activate-VG0
/gnu/store/0hndg947ywdl5izvy63ny38hyywci66k-lvm2-2.03.22/sbin/lvm vy

I can confirm that when using time-machine to revert to lvm2 version
2.03.11, the VM boots.

cheers,
Yann Dupont wrote on 14 Sep 2023 18:23
(address . 65177@debbugs.gnu.org)
4b4bd749-fa4d-f4ea-742d-175d3da3822c@univ-nantes.fr
Hi, as suggested by Josselin, I tested the following patch, and it seems
to do the job here.

Be careful: I'm not a udev or lvm2 specialist at all, and I don't really
know whether what I did is the right way to do it.

All I can say is that the VMs now boot.

Cheers,

diff --git a/gnu/packages/linux.scm b/gnu/packages/linux.scm
index 91109c41d9..28b3c1e0bf 100644
--- a/gnu/packages/linux.scm
+++ b/gnu/packages/linux.scm
@@ -4421,6 +4421,7 @@ (define-public lvm2
               (sha256
                (base32
                 "0z6w6bknhwh1n3qfkb5ij6x57q3wjf28lq3l8kh7rkhsplinjnjc"))
+              (patches (search-patches "lvm2-no-systemd.patch"))
               (modules '((guix build utils)))
               (snippet
                '(begin
diff --git a/gnu/packages/patches/lvm2-no-systemd.patch b/gnu/packages/patches/lvm2-no-systemd.patch
new file mode 100644
index 0000000000..7e8a37abcc
--- /dev/null
+++ b/gnu/packages/patches/lvm2-no-systemd.patch
@@ -0,0 +1,13 @@
+diff --git a/udev/69-dm-lvm.rules.in b/udev/69-dm-lvm.rules.in
+index ff1568145..8879a2ef9 100644
+--- a/udev/69-dm-lvm.rules.in
++++ b/udev/69-dm-lvm.rules.in
+@@ -76,7 +76,7 @@ LABEL="lvm_scan"
+ # it's better suited to appearing in the journal.
+
+ IMPORT{program}="(LVM_EXEC)/lvm pvscan --cache --listvg --checkcomplete --vgonline --autoactivation event --udevoutput --journal=output $env{DEVNAME}"
+-ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="(SYSTEMDRUN) --no-block --property DefaultDependencies=no --unit lvm-activate-$env{LVM_VG_NAME_COMPLETE} (LVM_EXEC)/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
++ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="(LVM_EXEC)/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
+ GOTO="lvm_end"
+
+ LABEL="lvm_end"
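
To make the one-line rule change above easier to reason about, here is a minimal Python sketch that performs the same rewrite as a plain text substitution. It is only an illustration of what the patch does to the rule template, not how Guix applies it; `WRAPPER` and `drop_systemd_run` are hypothetical names, and the pattern assumes the rule is written exactly as in the template quoted above, with the `(SYSTEMDRUN)` and `(LVM_EXEC)` placeholders still unexpanded.

```python
import re

# Matches the systemd-run wrapper as written in the quoted rule template:
# "(SYSTEMDRUN) <options> --unit <unit-name> " in front of the real command.
WRAPPER = re.compile(r'RUN\+="\(SYSTEMDRUN\) .*?--unit \S+ ')


def drop_systemd_run(rule_line):
    """Return the rule with vgchange run directly instead of via systemd-run."""
    return WRAPPER.sub('RUN+="', rule_line, count=1)
```

A line without the wrapper is returned unchanged, so applying the function twice gives the same result as applying it once.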
Simon Tournier wrote on 25 Sep 2023 09:35
Re: bug#65177: udevd error with lvm-raid array leading to race condition with luks
(name . Yann Dupont)(address . Yann.Dupont@univ-nantes.fr)
87pm267lid.fsf@gmail.com
Hi,

On Thu, 14 Sep 2023 at 18:23, Yann Dupont <Yann.Dupont@univ-nantes.fr> wrote:

> All I can say is that the VMs now boot.

WDYT about this patch?

> diff --git a/gnu/packages/linux.scm b/gnu/packages/linux.scm
> index 91109c41d9..28b3c1e0bf 100644
> --- a/gnu/packages/linux.scm
> +++ b/gnu/packages/linux.scm
> @@ -4421,6 +4421,7 @@ (define-public lvm2
> (sha256
> (base32
> "0z6w6bknhwh1n3qfkb5ij6x57q3wjf28lq3l8kh7rkhsplinjnjc"))
> + (patches (search-patches "lvm2-no-systemd.patch"))
> (modules '((guix build utils)))
> (snippet
> '(begin
> diff --git a/gnu/packages/patches/lvm2-no-systemd.patch b/gnu/packages/patches/lvm2-no-systemd.patch
> new file mode 100644
> index 0000000000..7e8a37abcc
> --- /dev/null
> +++ b/gnu/packages/patches/lvm2-no-systemd.patch
> @@ -0,0 +1,13 @@
> +diff --git a/udev/69-dm-lvm.rules.in b/udev/69-dm-lvm.rules.in
> +index ff1568145..8879a2ef9 100644
> +--- a/udev/69-dm-lvm.rules.in
> ++++ b/udev/69-dm-lvm.rules.in
> +@@ -76,7 +76,7 @@ LABEL="lvm_scan"
> + # it's better suited to appearing in the journal.
> +
> + IMPORT{program}="(LVM_EXEC)/lvm pvscan --cache --listvg --checkcomplete --vgonline --autoactivation event --udevoutput --journal=output $env{DEVNAME}"
> +-ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="(SYSTEMDRUN) --no-block --property DefaultDependencies=no --unit lvm-activate-$env{LVM_VG_NAME_COMPLETE} (LVM_EXEC)/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
> ++ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="(LVM_EXEC)/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
> + GOTO="lvm_end"
> +
> + LABEL="lvm_end"

Josselin, is this what you had in mind?

Cheers,
simon
Simon Tournier wrote on 17 Oct 2023 11:00
(name . Yann Dupont)(address . Yann.Dupont@univ-nantes.fr)
865y35pqsh.fsf@gmail.com
Hi,

On Mon, 25 Sep 2023 at 09:35, Simon Tournier <zimon.toutoune@gmail.com> wrote:

>> All I can say is that the VMs now boot.
>
> WDYT about this patch?

To ease the discussion, I extracted Yann’s diff and prepared a patch
that is ready to merge. See:

https://issues.guix.gnu.org/issue/66586

Cheers,
simon
Simon Tournier wrote on 31 Oct 2023 09:59
(name . Yann Dupont)(address . Yann.Dupont@univ-nantes.fr)
87ttq7yxo0.fsf@gmail.com
Hi,

>>> All I can say is that the VMs now boot.
>>
>> WDYT about this patch?
>
> For easing the discussion, I extracted Yann’s diff and prepared a patch
> ready to merge. See:
>
> https://issues.guix.gnu.org/issue/66586

Patch pushed as c0895371c5759c7d9edb330774e90f192cc4cf2c.

Closing.

Feel free to reopen if the patch does not fix the issue.

Cheers,
simon
Closed
Tomas Volf wrote on 31 Oct 2023 19:38
Re: bug#65177: udevd error with lvm-raid array leading to race
(address . 65177@debbugs.gnu.org)
ZUFJkQsH7TbjS62u@ws
Hi,

I think the patch is wrong. I have no comment on the udev rule itself (I do not know much about it), but it does spam the logs with:

Oct 31 18:31:35 localhost vmunix: [ 60.013538] udevd[327]: invalid key/value pair in file /gnu/store/a3xvv1zxzyb53r5r752q8nh9v9vd150h-udev-rules/69-dm-lvm.rules on line 78, starting at character 1 ('I')
Oct 31 18:31:35 localhost vmunix: [ 60.013544] udevd[327]: invalid key/value pair in file /gnu/store/a3xvv1zxzyb53r5r752q8nh9v9vd150h-udev-rules/69-dm-lvm.rules on line 79, starting at character 32 (',')

The fix[0] for that is fairly trivial.
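
To locate the offending rule lines from warnings like the two quoted above, a minimal Python sketch along these lines could be used. It assumes the message wording is exactly as shown in those two log lines; `INVALID_PAIR` and `parse_invalid_pairs` are hypothetical names, not part of udev or Guix.

```python
import re

# Pattern modelled on the two udevd warnings quoted in this message.
INVALID_PAIR = re.compile(
    r"udevd\[\d+\]: invalid key/value pair in file (?P<path>\S+) "
    r"on line (?P<line>\d+), starting at character (?P<char>\d+) "
    r"\('(?P<found>.)'\)"
)


def parse_invalid_pairs(log_text):
    """Return (path, line, character, found) tuples, one per warning."""
    return [
        (m["path"], int(m["line"]), int(m["char"]), m["found"])
        for m in INVALID_PAIR.finditer(log_text)
    ]
```

Each tuple gives the rules file, the line number, and the character position udevd complained about, which is enough to open the generated 69-dm-lvm.rules at the right spot.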


Have a nice day,
W.

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.