Installer crashes while partitioning hard disk > 1 TiB

Status: Done
Participants: Keyhenge, Ludovic Courtès, Mathieu Othacehe, pelzflorian (Florian Pelz)
Owner: unassigned
Submitted by: Keyhenge
Severity: important
Keyhenge wrote on 13 Apr 2020 04:46
(address . bug-guix@gnu.org)
87o8rwt84t.fsf@keyhenge.xyz
Hi, I'm a new Guix user (well, trying to be) and am trying to install
Guix onto my desktop machine. Unfortunately, I'm getting four errors in
the installer which are preventing me from installing it. Also,
apologies if I format this wrong; this is my first time mailing in a bug
report.

Info about my machine:
CPU: AMD Ryzen 2700X
GPU: AMD 5700 XT
SSD: Samsung SSD 860 2TB
Memory: 32GB

The first error comes when trying to partition the SSD via the graphical
installer, choosing a guided installation. This will happen choosing with
or without encryption, and with or without a separate /home partition.

In ice-9/boot-9.scm:
829:9 19 (catch srfi-34 #procedure 2ea60c0 at
./gnu/installer/steps.scm:144:7()> #procedure 2e251e0 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 18 (catch srfi-34 #procedure 2eb8440 at
./gnu/installer/steps.scm:144:7()> #procedure 2f27e60 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 17 (catch srfi-34 #procedure 2eb8240 at
./gnu/installer/steps.scm:144:7()> #procedure 2f27e10 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 16 (catch srfi-34 #procedure 2eb8040 at
./gnu/installer/steps.scm:144:7()> #procedure 2f27dc0 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 15 (catch srfi-34 #procedure 286e200 at
./gnu/installer/steps.scm:144:7()> #procedure 17468c0 at
./gnu/installer/steps.scm:144:7 (key c)> _)
In ./gnu/installer/steps.scm
182:21 14 (_)
In ./gnu/installer/newt/partition.scm:
755:33 13 (run-partitioning-page)
In ./gnu/installer/parted.scm:
1010:14 12 (auto-partition! #<<disk> bytestructure:
#<bytestructure
0x185ba50>>#:scheme _)
863:17 11 (loop _ 3905980593 1048575)
771:25 10 (mkpart #<<disk> bytestructure: #<bytestructure
0x185ba50>> _
#:previous-partition _)
In parted/structs.scm
552:19 9 (pointer->partition _)
132:3 8 (pointer->bytestructure #<pointer 0x0>
#<bytestructure-descriptor
0x31071c0>)
In unknown file:
7 (pointer->bytevector #<pointer 0x0> 88 #<undefined>
#<undefined>)
In ice-9/boot-9.scm:
751:25 6 (dispatch-exception 5 null-pointer-error
("pointer->bytevector"
"null pointer dereference" () ()))
In ice-9/eval.scm:
619:8 5 (_ #(#(#<directory (guile-user) 1253140> #<<installer>
name: newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> final-...>) ...))
619:8 4 (_ #(#(#(#<directory (guile-user) 1253140> #<<installer>
name:
newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> f...>) ...) #))
In ice-9/ports.scm:
462:17 3 (call-with-output-file _ _ #:binary _ #:encoding _)
In ice-9/eval.scm:
619:8 2 (_ #(#(#<directory (guile-user) 1253140>
null-pointer-error
("pointer->bytevector" "null pointer dereference" () ()))
#<output:
/tmp/last-installer-error 12>))
159:9 1 (_ #(#(#<directory (guile-user) 1253140>
null-pointer-error
("pointer->bytevector" "null pointer dereference" () ()))
#<output:
/tmp/last-installer-error 12>))
In unknown file:
0 (make-stack #t)
ice-9/eval.scm:159:9: In procedure pointer->bytevector: null pointer
dereference

The second error I get unfortunately doesn't have an error message, and
comes when trying to do a manual partition via the graphical installer.
I'll be able to make a boot partition (around 512 MB) just fine, but when
trying to make a partition on the rest of the disk (or even most of it),
the installer will immediately reset and take me back to picking the
language.

I then tried the CLI installation method, which fared quite a bit better.
For one, I got past the partitioning stage, mounted the partitions to the
appropriate places, and made a configuration. It even gets through most of
the installation process, but unfortunately fails on what seems to be the
very last step, initializing the operating system under /mnt, specifically
populating /mnt. bootloader-installer fails with the following error:

Initializing operating system under '/mnt'...
copying to '/mnt'...
populating '/mnt'...
error: '/gnu/store/[hash]-grub-efi-2.02/sbin/grub-install --boot-directory
/mnt/boot --bootloader-id=Guix --efi-directory /mnt/boot/efi' exited with
status 1; output follows:
Installing for x86_64-efi platform.
Could not prepare Boot variable: No such file or directory
/gnu/store/[hash]-grub-efi-2.02/sbin/grub-install: error: efibootmgr
failed to register the boot entry: Input/output error.
guix system: error: failed to install bootloader
/gnu/store/[hash]-bootloader-installer

If you head back to the graphical install after this, you'll be treated to
another error when trying to pick your locale/keymap.

In ice-9/eval.scm:
619:8 19 (_ #(#(#<directory (guile-user) 133e140> #<<installer>
name: newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> final-...>) #))
In ice-9/boot-9.scm:
829:9 18 (catch #t #<procedure 294a2e0 at ice-9/eval.scm:330:13
()>
#<procedure 294a140 at ice-9/boot-9.scm:1048:2 _> _)
In ice-9/eval.scm:
619:8 17 (_ #(#(#(#<directory (guile-user) 133e140>
#<<installer> name:
newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> fi...> ...)) ...))
626:19 16 (_ #(#(#(#<directory (guile-user) 133e140>
#<<installer> name:
newtinit: #<procedure init ()> exit: #<procedure exit ()>
exit-error:
#<procedure exit-error (file key args)> fi...> ...)) ...))
In ./gnu/installer/steps.scm:
189:6 15 (run-installer-steps #:steps _ #:rewind-strategy _
#:menu-proc _)
In ice-9/boot-9.scm:
829:9 14 (catch srfi-34 #<procedure 2e79b80 at
./gnu/installer/steps.scm:144:7()> #<procedure 2df4320 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 13 (catch srfi-34 #<procedure 2efaf80 at
./gnu/installer/steps.scm:144:7()> #<procedure 2df4050 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 12 (catch srfi-34 #<procedure 2efad40 at
./gnu/installer/steps.scm:144:7()> #<procedure 2df4000 at
./gnu/installer/steps.scm:144:7 (key c)> _)
829:9 11 (catch srfi-34 #<procedure 2efab40 at
./gnu/installer/steps.scm:144:7()> #<procedure 2ff1fa0 at
./gnu/installer/steps.scm:144:7 (key c)> _)
In ./gnu/installer/keymap.scm:
163:7 8 (_ _)
In unknown file:
7 (scm-error misc-error #f "~A" ("Unable to locate keymap update
file")
#f)
In ice-9/boot-9.scm:
751:25 6 (dispatch-exception 4 misc-error (#f "~A" ("Unable to
locate
keymap update file") #f))
In ice-9/eval.scm:
619:8 5 (_ #(#(#<directory (guile-user) 133e140> #<<installer>
name: newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> final-...>) ...))
619:8 4 (_ #(#(#(#<directory (guile-user) 133e140> #<<installer>
name:
newt
init: #<procedure init ()> exit: #<procedure exit ()> exit-error:
#<procedure exit-error (file key args)> f...>) ...) #))
In ice-9/ports.scm:
462:17 3 (call-with-output-file _ _ #:binary _ #:encoding _)
In ice-9/eval.scm:
619:8 2 (_ #(#(#<directory (guile-user) 133e140> misc-error (#f
"~A"
("Unable to locate keymap update file") #f)) #<output:
/tmp/last-installer-error 17>))
159:9 1 (_ #(#(#<directory (guile-user) 133e140> misc-error (#f
"~A"
("Unable to locate keymap update file") #f)) #<output:
/tmp/last-installer-error 17>))
In unknown file:
0 (make-stack #t)
ice-9/eval.scm:159:9: Unable to locate keymap update file

Some side notes:
- I've redownloaded, verified through gpg, and dd'd the iso onto my USB a
couple of times to ensure it wasn't just a bad download
- While I have NixOS on another SSD, these errors persist even when I
physically remove it, so I don't think there are any /dev/sdX ordering
issues involved.
- Doing the partitions via CLI, then going back through the manual install
process results in normal behavior up until the grub-install error
- Restarting the graphical installer after the grub-install error, then
going to the CLI will let you see that all of the partitions have been
unmounted, except for the root partition which will instead be mounted to
/tmp and have a guix-install folder.
- When viewing what's under /mnt/boot/efi, you'll get the path
/mnt/boot/efi/EFI/Guix/grubx64.efi. This seems like a weird path to me with
efi/EFI, but I don't really have experience with tinkering with grub.
Changing the path to be /mnt/boot/efi/Guix/grubx64.efi doesn't change the
bootloader error.
- I've transcribed the above errors by hand, but since they seem to be
exporting the output to /tmp, there's got to be a better way. Is it
possible to email this list from inside the installation image?
Mathieu Othacehe wrote on 13 Apr 2020 10:15
(name . Keyhenge)(address . key@keyhenge.xyz)(address . 40590@debbugs.gnu.org)
87v9m3n6nr.fsf@gmail.com
Hello,

> The first error comes when trying to partition the SSD via the graphical
> installer, choosing a guided installation. This will happen choosing with
> or without encryption, and with or without a separate /home partition.

Sorry for all the errors you encountered, but many thanks for the
feedback! Let's try to see what went wrong.

Did you use 1.0.1 image or a more recent one?

> 619:8 2 (_ #(#(#<directory (guile-user) 1253140> null-pointer-error
> ("pointer->bytevector" "null pointer dereference" () ())) #<output:
> /tmp/last-installer-error 12>))
> 159:9 1 (_ #(#(#<directory (guile-user) 1253140> null-pointer-error
> ("pointer->bytevector" "null pointer dereference" () ())) #<output:
> /tmp/last-installer-error 12>))
> In unknown file:
> 0 (make-stack #t)
> ice-9/eval.scm:159:9: In procedure pointer->bytevector: null pointer
> dereference

Ok, so I suspect that this one is caused by the size of your hard
drive (2TiB). It's an unsolved issue[1] that fell through the cracks.
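
As a rough illustration of why 1 TiB could be the threshold (an assumption; the thread does not confirm the root cause): with 512-byte sectors, a signed 32-bit sector number tops out at exactly 1 TiB, and the value 3905980593 from the `loop` frame in the backtrace no longer fits. The Python sketch below only does that arithmetic. `SECTOR_SIZE`, `as_signed_32` and `fits_in_int32` are hypothetical names, and reading 3905980593 as a sector number is a guess.

```python
# Assumption: 512-byte logical sectors; the thread does not state the sector size.
SECTOR_SIZE = 512
INT32_MAX = 2**31 - 1


def as_signed_32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def fits_in_int32(sectors: int) -> bool:
    """Return True if the sector number survives storage in a signed 32-bit field."""
    return -2**31 <= sectors <= INT32_MAX


# Value copied from the "(loop _ 3905980593 1048575)" frame of the backtrace.
# Treating it as a sector number is an assumption made for illustration only.
end_sector = 3905980593

if __name__ == "__main__":
    # First byte offset that a signed 32-bit sector number cannot reach.
    print("limit in bytes:", (INT32_MAX + 1) * SECTOR_SIZE)
    print("fits in int32:", fits_in_int32(end_sector))
    print("value after truncation:", as_signed_32(end_sector))
```

If a binding did pass such a value through a 32-bit field, the library would receive a nonsensical sector number, which might explain a failed call. Whether that is what happens in the installer is not established in this thread.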

> error: '/gnu/store/[hash]-grub-efi-2.02/sbin/grub-install --boot-directory
> /mnt/boot --bootloader-id=Guix --efi-directory /mnt/boot/efi' exited with
> status 1; output follows:
> Installing for x86_64-efi platform.
> Could not prepare Boot variable: No such file or directory
> /gnu/store/[hash]-grub-efi-2.02/sbin/grub-install: error: efibootmgr
> failed to register the boot entry: Input/output error.
> guix system: error: failed to install bootloader
> /gnu/store/[hash]-bootloader-installer

Did you mount your EFI partition to "/mnt/boot/efi" before starting the
install?

> If you head back to the graphical install after this, you'll be treated to
> another error when trying to pick your locale/keymap.

Well, the graphical installer does not handle being restarted very
well, especially in the 1.0.1 release.

> - I've transcribed the above errors by hand, but since they seem to be
> exporting the output to /tmp, there's got to be a better way. Is it
> possible to email this list from inside the installation image?

That's heroic! If you have another machine around, you may use SSH to get
the error file (located in /tmp/last-installer-error). Otherwise you can
take some pictures of your screen and send them as attachments.

Now, if you could try again with:

* The 1.1.0-rc2 release available here[2].

* This custom release[3]: 1.1.0-rc2 + the patch suggested by Danny for
partitions > 1TiB.

that would be really appreciated :)

Thanks,

Mathieu

[3]: Via ipfs: QmbbkQZc7fbsDrutNwVuvpFep3dBjZx6muni3WgXjRneq7
Keyhenge wrote on 14 Apr 2020 01:59
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 40590@debbugs.gnu.org)
87k12jszs0.fsf@keyhenge.xyz
First off, thanks for your help so far!

> Did you use 1.0.1 image or a more recent one?
I was using the Guix System 1.0.1 x86_64 image here:

> Did you mount your EFI partition to "/mnt/boot/efi" before starting the
> install?
Yep. Not mounting it yields a different error (cannot find canonical
path of /mnt/boot/efi).
I'll also confirm I'm setting the esp flag on the partition.

> Now, if you could try again with:
>
> * The 1.1.0-rc2 release available here[2].
- The graphical installer displays normal behavior up to when making the
partitions, where it resets the installation without displaying an error
- Installation through TTY has the same behavior as before, efibootmgr
failed to register the boot entry: Input/output error

> * This custom release[3]: 1.1.0-rc2 + the patch suggested by Danny for
> partitions > 1TiB.
- Graphical installation has an oddity, my Samsung drive has disappeared
from the disk selection in both guided and manual partitioning
- Strangely, my WD Blue 2TB drive (also an SSD) with the NixOS
installation shows up just fine
- While it is possible to access the Samsung drive from the TTY, and
even wipe it/make new partitions, trying to populate the partitions with
filesystems will result in them saying they're in use or already
mounted. I suspect this is also why it's not showing up in the graphical
installation.
Keyhenge wrote on 14 Apr 2020 02:54
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 40590@debbugs.gnu.org)
87sgh6c2ez.fsf@keyhenge.xyz
Quick update on the previous email:

> * This custom release[3]: 1.1.0-rc2 + the patch suggested by Danny for
> partitions > 1TiB.
> [3]: Via ipfs: QmbbkQZc7fbsDrutNwVuvpFep3dBjZx6muni3WgXjRneq7
It occurred to me that the Samsung drive might have been corrupted at
some point, so I dd'd /dev/zero into the first few gigabytes of the
drive to completely wipe the partition table. Sure enough, it showed up
on the partitioning stage of the graphical installer and I was able to get
all the way through to the end... except for the same efibootmgr error.
Mathieu Othacehe wrote on 14 Apr 2020 08:59
(name . Keyhenge)(address . key@keyhenge.xyz)(address . 40590@debbugs.gnu.org)
87lfmylfif.fsf@gmail.com
Hey,

>> * This custom release[3]: 1.1.0-rc2 + the patch suggested by Danny for
>> partitions > 1TiB.
>> [3]: Via ipfs: QmbbkQZc7fbsDrutNwVuvpFep3dBjZx6muni3WgXjRneq7
> It occurred to me that the Samsung drive might have been corrupted at
> some point, so I dd'd /dev/zero into the first few gigabytes of the
> drive to completely wipe the partition table. Sure enough, it showed up
> on the partitioning stage of the graphical installer and I was able to get
> all the way through to the end... except for the same efibootmgr error.

Ok! So it seems that the additional patch is improving the situation. I
also fear that 1.1.0-rc2 is having the same error as 1.0.1 but is
swallowing it. I will investigate it further.

Regarding the efibootmgr error, I'm not really familiar with EFI. Can
you check that efivarfs is mounted as suggested here[1]?
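
A minimal sketch of such a check, assuming a Linux system that exposes its mount table in /proc/mounts with the usual "device mountpoint fstype options" layout. The function name `efivarfs_mounted` is hypothetical and is not part of the installer.

```python
EFIVARS_DIR = "/sys/firmware/efi/efivars"


def efivarfs_mounted(mounts_text: str, mount_point: str = EFIVARS_DIR) -> bool:
    """Return True if mounts_text (in /proc/mounts format) lists efivarfs at mount_point."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "efivarfs":
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as mounts:
            mounted = efivarfs_mounted(mounts.read())
    except OSError:
        # /proc/mounts is unavailable (for example, not a Linux system).
        mounted = False
    print("efivarfs mounted:", mounted)
```

If this reports that efivarfs is not mounted on a machine booted in EFI mode, grub-install may be unable to register a boot entry through efibootmgr.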

Thanks,

Mathieu

[1]: https://unix.stackexchange.com/questions/379774/grub-installation-failed

Ludovic Courtès wrote on 14 Apr 2020 14:28
control message for bug #40590
(address . control@debbugs.gnu.org)
87lfmyxndh.fsf@gnu.org
retitle 40590 Installer crashes while partitioning hard disk > 1 TiB
quit
Ludovic Courtès wrote on 14 Apr 2020 14:28
(address . control@debbugs.gnu.org)
87k12ixndd.fsf@gnu.org
severity 40590 important
quit
Keyhenge wrote on 15 Apr 2020 01:08
Re: bug#40590:
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 40590@debbugs.gnu.org)
87y2qxvf6l.fsf@keyhenge.xyz
Progress! But not quite there yet.

>> It occurred to me that the Samsung drive might have been corrupted at
>> some point, so I dd'd /dev/zero into the first few gigabytes of the
>> drive to completely wipe the partition table. Sure enough, it showed up
>> on the partitioning stage of the graphical installer and I was able to get
>> all the way through to the end... except for the same efibootmgr error.
>
> Ok! So it seems that the additional patch is improving the situation. I
> also fear that 1.1.0-rc2 is having the same error as 1.0.1 but is
> swallowing it. I will investigate it further.
>
> Regarding the efibootmgr error, I'm not really familiar with EFI. Can
> you check that efivarfs is mounted as suggested here[1]?
>
> [1]: https://unix.stackexchange.com/questions/379774/grub-installation-failed

Followed the advice on the SE answer, specifically:
$ mount -t efivarfs efivarfs /sys/firmware/efi/efivars
$ rm /sys/firmware/efi/efivars/dump-*
And it successfully installed!.. sort of.

Guix now boots to this screen[1] and then hard freezes.

Just in case it's relevant, I did the installation twice (both through
TTY in order to run the mount/rm commands):
1. 2 partitions, boot and encrypted root w/ swapfile, using desktop.scm
as a base
2. 3 partitions, boot, root, and a swap partition, using
lightweight-desktop.scm as a base
Both had the same results.

[1]:
Mathieu Othacehe wrote on 15 Apr 2020 08:44
(name . Keyhenge)(address . key@keyhenge.xyz)(address . 40590@debbugs.gnu.org)
87wo6hi6ym.fsf@gmail.com
Hello,

> Just in case it's relevant, I did the installation twice (both through
> TTY in order to run the mount/rm commands):
> 1. 2 partitions, boot and encrypted root w/ swapfile, using desktop.scm
> as a base
> 2. 3 partitions, boot, root, and a swap partition, using
> lightweight-desktop.scm as a base
> Both had the same results.

Ok, thanks for your perseverance! I see that someone is experiencing
system freezes with hardware close to yours[1].

Could you try adding the following kernel parameters (via Grub by
pressing 'e', then appending the option at the end of the cmdline):

* nomodeset

and if it doesn't work:

* amd_iommu=pt ivrs_ioapic[32]=00:14.0 iommu=soft

Thanks,

Mathieu

Keyhenge wrote on 18 Apr 2020 19:47
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 40590@debbugs.gnu.org)
87k12celeo.fsf@keyhenge.xyz
Sorry for the late reply on this.

> Could you try adding the following kernel parameters (via Grub by
> pressing 'e', then appending the option at the end of the cmdline):
>
> * nomodeset
>
> and if it doesn't work:
>
> * amd_iommu=pt ivrs_ioapic[32]=00:14.0 iommu=soft

The amd option didn't work, but nomodeset allowed me to get to a shell,
at which point I could start experimenting with various settings to see
if I could get a graphical environment. I finally did by switching over
to the proprietary kernel (which points to a graphical driver issue with
AMD Navi cards in the libre kernel), but that's obviously less than ideal.

Still, it means I can finally start using the system in earnest. Thanks
for all your help!
pelzflorian (Florian Pelz) wrote on 18 Apr 2020 20:17
(name . Keyhenge)(address . key@keyhenge.xyz)
20200418181729.lnovxwpyko227wj7@pelzflorian.localdomain
On Sat, Apr 18, 2020 at 01:47:11PM -0400, Keyhenge wrote:
> The amd option didn't work, but nomodeset allowed me to get to a shell,
> at which point I could start experimenting with various settings to see
> if I could get a graphical environment. I finally did by switching over
> to the proprietary kernel (which points to a graphical driver issue with
> AMD Navi cards in the libre kernel), but that's obviously less than ideal.
>

If you have no problems sacrificing all the acceleration by your AMD
GPU, you can try using libre uvesafb for Xorg graphics. What I wrote
was not necessary there but maybe is in your case. That said, this
situation is far from ideal, but I don’t think the AMD firmware will
be freed (if there is no other way).

Regards,
Florian
Mathieu Othacehe wrote on 19 Apr 2020 12:09
(name . Keyhenge)(address . key@keyhenge.xyz)(address . 40590@debbugs.gnu.org)
874ktfpz27.fsf@gmail.com
Hello,

Thanks for your answer!

> The amd option didn't work, but nomodeset allowed me to get to a shell,
> at which point I could start experimenting with various settings to see
> if I could get a graphical environment. I finally did by switching over
> to the proprietary kernel (which points to a graphical driver issue with
> AMD Navi cards in the libre kernel), but that's obviously less than ideal.
>
> Still, it means I can finally start using the system in earnest. Thanks
> for all your help!

It is deserved after all the efforts :p. I sent patches for the issues
you had (partitioning issue[1] and efivarfs[2] support). Now for the
graphical issue, if you find a workaround, maybe using uvesafb as
suggested by Florian, please let us know.

Thanks,

Mathieu

Mathieu Othacehe wrote on 30 Jul 2020 17:18
control message for bug #40590
(address . control@debbugs.gnu.org)
87d04d820j.fsf@cervin.i-did-not-set--mail-host-address--so-tickle-me
close 40590
quit