[installer] backtrace during fresh Guix System install after during formatting

Done
Submitted by bdju.
Details
3 participants
  • bdju
  • Leo Famulari
  • Mathieu Othacehe
Owner
unassigned
Severity
normal
bdju wrote
backtrace during fresh Guix System install after during formatting
(address . bug-guix@gnu.org)
D7C8F08A-C5C0-4719-8BB3-C17C81A6AEE7@tilde.team
Using a "latest" installer image from the last day or two
Picture of error taken with my phone attached (eh, what can ya do?)

>Device /dev/sdc is still in use.

This sticks out and appears in a couple spots.
Attachment: file
Attachment: IMG_20220125_211802.jpg (2.81 MiB)
bdju wrote
Re: bug#53541: backtrace during fresh Guix System install during formatting
CHFAN1BVWNBC.18LG1NI62TCZD@masaki
On Tue Jan 25, 2022 at 9:21 PM CST, bdju via Bug reports for GNU Guix wrote:
> Using a "latest" installer image from the last day or two
> Picture of error taken with my phone attached (eh, what can ya do?)
>
> >Device /dev/sdc is still in use.
>
> This sticks out and appears in a couple spots.
If I don't view the details of /boot, it tells me it can't read
/dev/sda1 and to format it; then, if I select it, it
becomes ext4 by default and format is set to no.

Should the boot partition be fat32, ext4, or could it even be btrfs? I'm
doing an encrypted install and I want btrfs for the root partition.
Leo Famulari wrote on 26 Jan 05:15 +0100
(no subject)
(address . control@debbugs.gnu.org)
YfDK5FlNDgk/u6IN@jasmine.lan
retitle 53541 [installer] backtrace during fresh Guix System install after during formatting
Leo Famulari wrote on 26 Jan 05:18 +0100
(address . control@debbugs.gnu.org)
YfDLfRicGkaiLTJL@jasmine.lan
block 53214 with 53544
block 53214 with 53541
bdju wrote
Re: bug#53541: backtrace during fresh Guix System install during formatting
4791697A-84BE-4E11-AFED-2D219DC5BB32@tilde.team
First drive was using Ventoy. Tried a regular dd'd drive in case that worked better. Got past formatting. Pressed edit on the system config preview screen and got a new/different backtrace. Picture attached.

Sorry if this top-posts after bottom-posting, have to send the pics from my phone.

On January 25, 2022 9:42:35 PM CST, bdju <bdju@tilde.team> wrote:
>On Tue Jan 25, 2022 at 9:21 PM CST, bdju via Bug reports for GNU Guix wrote:
>> Using a "latest" installer image from the last day or two
>> Picture of error taken with my phone attached (eh, what can ya do?)
>>
>> >Device /dev/sdc is still in use.
>>
>> This sticks out and appears in a couple spots.
>if I don't view the details of /boot it tells me it can't read
>/dev/sda1 and to format it, then if I select it, it
>becomes ext4 by default and format is set to no
>
>Should the boot partition be fat32, ext4, or could it even be btrfs? I'm
>doing an encrypted install and I want btrfs for the root partition.
Attachment: file
Attachment: IMG_20220125_220553.jpg (5.50 MiB)
bdju wrote
EC10C875-F576-45DB-BD61-9D713D744E9B@tilde.team
Tried to use the same settings as before, but didn't get past formatting this time. The backtrace is similar to the one with the Ventoy drive; it just says sda instead of sdc now for which device is "still in use".

On January 25, 2022 10:08:07 PM CST, bdju <bdju@tilde.team> wrote:
>First drive was using Ventoy. Tried a regular dd'd drive in case that worked better. Got past formatting. Pressed edit on the system config preview screen and got a new/different backtrace. Picture attached.
>
>Sorry if this top-posts after bottom-posting, have to send the pics from my phone.
>
>On January 25, 2022 9:42:35 PM CST, bdju <bdju@tilde.team> wrote:
>>On Tue Jan 25, 2022 at 9:21 PM CST, bdju via Bug reports for GNU Guix wrote:
>>> Using a "latest" installer image from the last day or two
>>> Picture of error taken with my phone attached (eh, what can ya do?)
>>>
>>> >Device /dev/sdc is still in use.
>>>
>>> This sticks out and appears in a couple spots.
>>if I don't view the details of /boot it tells me it can't read
>>/dev/sda1 and to format it, then if I select it, it
>>becomes ext4 by default and format is set to no
>>
>>Should the boot partition be fat32, ext4, or could it even be btrfs? I'm
>>doing an encrypted install and I want btrfs for the root partition.
Attachment: file
Attachment: IMG_20220125_221328.jpg (5.57 MiB)
Leo Famulari wrote on 3 Feb 19:34 +0100
Re: bug#53541: backtrace during fresh Guix System install after during formatting
(name . bdju via Bug reports for GNU Guix)(address . bug-guix@gnu.org)(address . 53541@debbugs.gnu.org)
fe9905ae-8d6e-4a59-90eb-ed0c8283ce94@www.fastmail.com
On Tue, Jan 25, 2022 at 09:21:00PM -0600, bdju via Bug reports for GNU Guix wrote:
> Using a "latest" installer image from the last day or two
> Picture of error taken with my phone attached (eh, what can ya do?)
>
> >Device /dev/sdc is still in use.
>
> This sticks out and appears in a couple spots.

We made many changes to the installer code today:

------
$ git log --oneline 0d37a5df7e709cadca97cfbbf9c680dfe54b8302^..4943ac86e4f95a2e14fd209f3fdaac74a0d9ca2e
4943ac86e4 installer: Use system-wide guix for system init.
ad55ccf9b1 installer: Make dump archive creation optional and selective.
112ef30b84 installer: Turn passwords into opaque records.
41eb0f01fc installer: Use dynamic-wind to setup installer.
7cbd95a9f6 installer: Add error page when running external commands.
726d0bd2f3 installer: Use named prompt to abort or break installer steps.
59fec4a1a2 installer: Add nano to PATH.
ed6567abbf installer: Replace run-command by invoke in newt/page.scm.
dad9a1c0b2 installer: Fix run-file-textbox-page when edit-button is #f.
0a74509a69 installer: Raise condition when mklabel fails.
af59e53631 installer: Use run-command-in-installer in (gnu installer parted).
408427a36c installer: Add installer-specific run command process.
0b9fbbb4dd installer: Capture external commands output.
c57ec6ed1e installer: Remove specific logging code.
2f7f1d11e9 installer: Keep PATH inside the install container.
438bf9b840 installer: Un-export syslog syntax.
4f2fd33b4f installer: Use new installer-log-line everywhere.
7251b15d30 installer: Generalize logging facility.
4a68a00c8b installer: Use define instead of let at top-level.
0d37a5df7e installer: Add crash dump upload support.
------

Your report doesn't explain how to reproduce the bug. Can you test the
latest installer again and see if the bug still exists?

Here is the installer image that contains these changes:
https://ci.guix.gnu.org/build/460051/details

Or get it from here:
https://guix.gnu.org/en/download/latest/
bdju wrote
CHMY3I6N3QUY.3GZRXDZ9RY6YR@masaki
On Thu Feb 3, 2022 at 12:34 PM CST, Leo Famulari wrote:
> We made many changes to the installer code today:
>
> ------
> $ git log --oneline 0d37a5df7e709cadca97cfbbf9c680dfe54b8302^..4943ac86e4f95a2e14fd209f3fdaac74a0d9ca2e
> 4943ac86e4 installer: Use system-wide guix for system init.
> ad55ccf9b1 installer: Make dump archive creation optional and selective.
> 112ef30b84 installer: Turn passwords into opaque records.
> 41eb0f01fc installer: Use dynamic-wind to setup installer.
> 7cbd95a9f6 installer: Add error page when running external commands.
> 726d0bd2f3 installer: Use named prompt to abort or break installer steps.
> 59fec4a1a2 installer: Add nano to PATH.
> ed6567abbf installer: Replace run-command by invoke in newt/page.scm.
> dad9a1c0b2 installer: Fix run-file-textbox-page when edit-button is #f.
> 0a74509a69 installer: Raise condition when mklabel fails.
> af59e53631 installer: Use run-command-in-installer in (gnu installer parted).
> 408427a36c installer: Add installer-specific run command process.
> 0b9fbbb4dd installer: Capture external commands output.
> c57ec6ed1e installer: Remove specific logging code.
> 2f7f1d11e9 installer: Keep PATH inside the install container.
> 438bf9b840 installer: Un-export syslog syntax.
> 4f2fd33b4f installer: Use new installer-log-line everywhere.
> 7251b15d30 installer: Generalize logging facility.
> 4a68a00c8b installer: Use define instead of let at top-level.
> 0d37a5df7e installer: Add crash dump upload support.
> ------
>
> Your report doesn't explain how to reproduce the bug. Can you test the
> latest installer again and see if the bug still exists?
>
> Here is the installer image that contains these changes:
> https://ci.guix.gnu.org/build/460051/details
>
> Or get it from here:
> https://guix.gnu.org/en/download/latest/

It would be difficult to test now, I no longer have an empty SSD waiting
to be installed on in the same PC. I ended up getting through an install
with the last stable release using the same settings.
From memory, it was set up for EFI, btrfs root partition, encryption
enabled, boot partition was fat32, I chose no environment (wanted to use
sway), accepted networkmanager, tor, cups... I kept the default
generated config.scm because it often crashed when trying to edit and I
was sick of starting over. I think the crash was either right before the
screen where you can edit the config or right after you tell it to apply
the config. I had a second HDD (not touched by the installer) in
addition to the SSD being installed on, and /dev/sdc was the flash
drive with the installer image on it.
I am glad to hear it's (maybe) fixed. I'll download a new ISO, but I
don't have a hypervisor configured to test this sort of thing. I could
maybe try it with a different computer, but still it would be hard to
find one I'm willing to wipe I think. Sorry I can't be more help.
I have still only ever done 3 full Guix System installs, all years
apart. Unfortunately I never considered the experience fun enough to do
besides when absolutely necessary.
Leo Famulari wrote on 4 Feb 05:15 +0100
(name . bdju)(address . bdju@tilde.team)(address . 53541@debbugs.gnu.org)
Yfyoecw+Xnqqfxvm@jasmine.lan
On Thu, Feb 03, 2022 at 09:34:11PM -0600, bdju wrote:
> It would be difficult to test now, I no longer have an empty SSD waiting
> to be installed on in the same PC.

Understood.

> I ended up getting through an install
> with the last stable release using the same settings.
> From memory, it was set up for EFI, btrfs root partition, encryption
> enabled, boot partition was fat32, I chose no environment (wanted to use
> sway), accepted networkmanager, tor, cups... I kept the default
> generated config.scm because it often crashed when trying to edit and I
> was sick of starting over. I think the crash was either right before the
> screen where you can edit the config or right after you tell it to apply
> the config.

Okay. The crash when you do edit the config has been fixed, based on our
testing:


I know this report was about a different issue, but not as clearly
defined.

> I had a second HDD (not touched by the installer) in
> addition to the SSD being installed on, and /dev/sdc/ was the flash
> drive with the installer image on it.
> I am glad to hear it's (maybe) fixed. I'll download a new ISO, but I
> don't have a hypervisor configured to test this sort of thing. I could
> maybe try it with a different computer, but still it would be hard to
> find one I'm willing to wipe I think. Sorry I can't be more help.
> I have still only ever done 3 full Guix System installs, all years
> apart. Unfortunately I never considered the experience fun enough to do
> besides when absolutely necessary.

No worries! You already helped a lot by reporting the bug and giving us
all the details you can. I will do some tests in a VM, choosing
non-default filesystems and partitioning.
Mathieu Othacehe wrote on 20 Oct 17:48 +0200
Re: bug#53541: [installer] backtrace during fresh Guix System install after during formatting
(name . bdju)(address . bdju@tilde.team)
877d0uebt9.fsf_-_@gnu.org
Hello,

> From memory, it was set up for EFI, btrfs root partition, encryption
> enabled, boot partition was fat32, I chose no environment (wanted to use
> sway), accepted networkmanager, tor, cups... I kept the default
> generated config.scm because it often crashed when trying to edit and I

I was able to reproduce it on real hardware, following those
instructions. The dump is available here if people want to join the
party: dump.guix.gnu.org/download/installer-dump-304492ff.

Thanks,

Mathieu
Mathieu Othacehe wrote on 22 Oct 19:00 +0200
(name . bdju)(address . bdju@tilde.team)
871qqzpzd4.fsf@gnu.org
Hey,

> I was able to reproduce it on real hardware, following those
> instructions. The dump is available here if people want to join the
> party: dump.guix.gnu.org/download/installer-dump-304492ff.

So the backtrace suggests that we are trying to open /dev/nvme0n1p1 to
read its superblock:

9 (open "/dev/nvme0n1p1" 524288 #<undefined>)

and that it fails because the file does not exist:

1780:13 6 (_ #<&compound-exception components: (#<&external-error> #<&origin origin: "open-fdes"> #<&message message: "~A"> #<&irritants irritants: ("No such file or directory")> #<&exception-w…>)

This open call originates from check-user-partitions in (gnu installer
parted). If we arrive here, it means that the file *should* exist.

Looking at the kernel trace, the last two lines are:

[ 72.271204] nvme0n1: p1 p2 p3 p4
[ 127.415648] nvme0n1: p1 p2

so the disk partition table is updated because we move from 4 to 2
partitions. Could it be possible that during a brief period of time the
/dev/nvme0n1p1 file disappears then re-appears?
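
A minimal sketch of the suspected race, written in Python rather than the installer's Guile and using a temporary file as a hypothetical stand-in for /dev/nvme0n1p1. It only illustrates that an open attempted while the node is briefly absent fails with ENOENT ("No such file or directory"), as in the backtrace above; it does not reproduce the kernel's partition rescan, and the helper name try_open is an assumption made for this example.

```python
import errno
import os
import tempfile


def try_open(path):
    """Open PATH read-only; return None on success or the errno on failure."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return None


with tempfile.TemporaryDirectory() as dev:
    # Hypothetical stand-in for the real device node.
    node = os.path.join(dev, "nvme0n1p1")

    open(node, "w").close()      # node exists (4-partition table)
    before = try_open(node)

    os.remove(node)              # node briefly vanishes during the rescan
    during = try_open(node)

    open(node, "w").close()      # node re-appears (2-partition table)
    after = try_open(node)

print("before:", before)
print("during is ENOENT:", during == errno.ENOENT)
print("after:", after)
```

If the hypothesis holds, a caller that happens to open the node in the middle step sees the same error as the installer, even though the node exists both before and after.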

I'll try to reproduce it in a VM to conduct more testing.

Mathieu
M
(name . bdju)(address . bdju@tilde.team)
87czajiolo.fsf@gnu.org
Hey,

> so the disk partition table is updated because we move from 4 to 2
> partitions. Could it be possible that during a brief period of time the
> /dev/nvme0n1p1 file disappears then re-appears?

Looks like that's what's happening. I'm not able to reproduce it on a
VM. I guess that's because my hardware is slower.

Anyway, having a few retries of read-partition-uuid fixes it for me. This
is a bit dirty, but that's how we usually deal with that kind of
problem. A patch is attached.
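
The retry approach can be sketched as follows, in Python rather than Guile. The names call_with_retries and flaky_read are hypothetical and not part of the installer; the retry count and delay (5 retries, 1 second apart) are taken from the attached patch. Unlike the patch, which catches every exception, this sketch retries only on OSError, and the sleep function is injectable so the example runs instantly.

```python
import time


def call_with_retries(thunk, max_retries=5, delay=1.0, sleep=time.sleep):
    """Call THUNK; on OSError, wait DELAY seconds and retry.

    At most MAX_RETRIES retries follow the first attempt. Once they are
    exhausted, a RuntimeError is raised with the last OSError chained.
    """
    retries_left = max_retries
    while True:
        try:
            return thunk()
        except OSError as exc:
            if retries_left == 0:
                raise RuntimeError(
                    f"Could not open after {max_retries} retries"
                ) from exc
            retries_left -= 1
            sleep(delay)


# Usage: a reader that fails twice (node absent), then succeeds.
attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise FileNotFoundError("/dev/nvme0n1p1")
    return "uuid-placeholder"


result = call_with_retries(flaky_read, sleep=lambda _seconds: None)
print(result, "after", len(attempts), "attempts")
```

A bounded retry keeps the worst-case delay small (about 5 seconds here) while still surfacing an error if the device node never comes back.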

Running those tests I experienced a segmentation fault in libparted and
then in libblkid, but that's another story. I'll open a ticket about
that later on.

Thanks,

Mathieu
From 4407374ff4087772bd8226824cf4883537752f01 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <othacehe@gnu.org>
Date: Sat, 22 Oct 2022 22:27:57 +0200
Subject: [PATCH 1/1] installer: parted: Retry failing read-partition-uuid call.


* gnu/installer/parted.scm (read-partition-uuid/retry): New procedure.
(check-user-partitions): Use it.
---
gnu/installer/parted.scm | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/gnu/installer/parted.scm b/gnu/installer/parted.scm
index fcc936a391..82375d29e3 100644
--- a/gnu/installer/parted.scm
+++ b/gnu/installer/parted.scm
@@ -319,6 +319,25 @@ (define (find-user-partition-by-parted-object user-partitions
                   partition))
         user-partitions))
 
+(define (read-partition-uuid/retry file-name)
+  "Call READ-PARTITION-UUID with 5 retries spaced by 1 second.  This is useful
+if the partition table is updated by the kernel at the time this function is
+called, causing the underlying /dev to be absent."
+  (define max-retries 5)
+
+  (let loop ((retry max-retries))
+    (catch #t
+      (lambda ()
+        (read-partition-uuid file-name))
+      (lambda _
+        (if (> retry 0)
+            (begin
+              (sleep 1)
+              (loop (- retry 1)))
+            (error
+             (format #f (G_ "Could not open ~a after ~a retries~%.")
+                     file-name max-retries)))))))
+
 
 ;;
 ;; Devices
@@ -1108,7 +1127,7 @@ (define (check-uuid)
                (need-formatting?
                 (user-partition-need-formatting? user-partition)))
            (or need-formatting?
-               (read-partition-uuid file-name)
+               (read-partition-uuid/retry file-name)
                (raise
                 (condition
                  (&cannot-read-uuid
-- 
2.38.0
Mathieu Othacehe wrote on 31 Oct 09:35 +0100
(name . bdju)(address . bdju@tilde.team)
87pme8gzm2.fsf@gnu.org
>
> * gnu/installer/parted.scm (read-partition-uuid/retry): New procedure.
> (check-user-partitions): Use it.

Pushed as ab974ed709976d34917c8f6f9e5cc0004547af45.

Mathieu
Closed
This issue is archived.

To comment on this conversation send email to 53541@debbugs.gnu.org