failing to boot, probably due to guix gc

  • Done
Details
3 participants
  • Attila Lendvai
  • Josselin Poiret
  • Maxime Devos
Owner
unassigned
Submitted by
Attila Lendvai
Severity
normal
Attila Lendvai wrote on 15 Sep 2022 21:44
(name . bug-guix@gnu.org)(address . bug-guix@gnu.org)
pa0uwc_lSNtPiZmfkmBFuzQr106M9dN93EyM5rqWUdNBIWXIrDeZMlJSbWaUlC8KF_I9XyMpqA2SnqFRpfJ-3VWm7sMfYdEWxlL_AujxQ6Q=@lendvai.name
dear Guixers,

on one of my installs i ran the following two commands as root:

guix gc --delete-generations=60d
guix system delete-generations 60d

i think i ran a reboot pretty soon after this, and the machine is failing to boot with the error "no code for module (ice-9 popen)".

i'm attaching a photo of the kernel panic screen.

the delete-generations left 3 generations, but all the three fail the same way.

i'll keep the machine as-is for a while if some inspection would help by providing logs/data. i also want to fix it and have it running soonish, so let me know if there's anything valuable i should extract from it before attempting to fix it. i hope a chroot and a reconfigure will fix it.

i think i ran the commands multiple times with different time ranges, and in different order.

i did something similar on another install, but IIRC i also ran a `guix system reconfigure` prior to the reboot, and that install didn't break.

HTH,

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Politicians never accuse you of 'greed' for wanting other people's money - only for wanting to keep your own money.”
— Joseph Sobran (1946–2010)
Attachment: kernelpanic.png
Attila Lendvai wrote on 16 Sep 2022 21:44
(No Subject)
(name . 57838@debbugs.gnu.org)(address . 57838@debbugs.gnu.org)
NF5QB02v17vqhtLsmikD5Jsfcel4zACDDVZaY3myh27khLQXLvCUbmIgZa6OgfpBge81kOFw9xSsFq3XYEwZrjWpPEfT7meWZboYcwDgaM8=@lendvai.name
i have fixed it by running a reconfigure in a chroot.

surprisingly, this has also fixed the old, previously broken system generations, not only the newly created one.

maybe some boot related references are not traversed while GC'ing?

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
If you expect the world to be fair with you because you are fair, you're fooling yourself. Stop bitching about the lion, and get busy learning how not to get eaten.
Josselin Poiret wrote on 19 Sep 2022 21:31
87tu532mb0.fsf@jpoiret.xyz
Hi Attila,
Attila Lendvai <attila@lendvai.name> writes:

> i have fixed it by running a reconfigure in a chroot.
>
> surprisingly, this has also fixed the old, previously broken system generations, not only the newly created one.
>
> maybe some boot related references are not traversed while GC'ing?

I just checked a couple of things on my own system generation:
* the boot script execl's shepherd, which specifies the (full) guile
interpreter as a full path in the shebang line;
* that guile interpreter does contain the (ice-9 popen) module;
* that same guile interpreter does appear as a requisite of the system
generation, so shouldn't be gc'd as long as it's a gc root.
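The checks listed above can be approximated with a small shell sketch. This is only an illustration, not part of the original report: the helper names `shebang_interpreter`, `store_item` and `is_requisite` are made up here, the store is assumed to live under `/gnu/store`, and the only Guix command relied on is `guix gc --requisites`.

```shell
#!/bin/sh
# Hypothetical helpers; names and layout are assumptions for illustration.

# Print the interpreter named on the shebang line of FILE.
# Returns non-zero if FILE does not start with "#!".
shebang_interpreter() {
    interp=$(sed -n '1s/^#![[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1")
    [ -n "$interp" ] && printf '%s\n' "$interp"
}

# Reduce a path such as /gnu/store/HASH-name/bin/guile to its store item,
# /gnu/store/HASH-name (assumes the default /gnu/store prefix).
store_item() {
    printf '%s\n' "$1" | cut -d/ -f1-4
}

# Succeed if store item ITEM is in the closure of ROOT, as reported by
# `guix gc --requisites`. Needs a working guix installation.
is_requisite() {
    guix gc --requisites "$2" | grep -Fxq "$1"
}
```

Assuming `$script` holds the path of the script to inspect (for instance the shepherd executable that the boot script runs), `is_requisite "$(store_item "$(shebang_interpreter "$script")")" /run/current-system` would succeed when the interpreter is part of the closure of the current system generation.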

I've got no clue :(

--
Josselin Poiret
Maxime Devos wrote on 24 Sep 2022 03:43
Re: bug#57838: failing to boot, probably due to guix gc
f3da0bc1-897a-54ae-4ceb-f4ebb08161bb@telenet.be
For these kinds of errors, I think I've an idea what's the cause in
On 15-09-2022 21:44, Attila Lendvai wrote:
> dear Guixers,
>
> on one of my installs i ran the following two commands as root:
>
> guix gc --delete-generations=60d
> guix system delete-generations 60d
>
> i think i ran a reboot pretty soon after this, and the machine is failing to boot with the error "no code for module (ice-9 popen)".
How did you reboot? Maybe whatever rebooting mechanism you use doesn't
do 'sync' first or doesn't wait for 'sync' to complete.
To test the hypothesis that there is store corruption, could you do
"guix gc --verify=contents" (assuming there are some old system
generations you can boot from)?
Greetings,
Maxime.
Attachment: OpenPGP_signature
Attila Lendvai wrote on 25 Sep 2022 18:31
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 57838@debbugs.gnu.org)
1a4KufteLMmEAWtddioo4rdZP8FCVfMEGfnY4pOJzUetKJn3321jlRsaWfsvYGybRG-tbgmewA02PdyTMcyy18SDWFvh2KZexfuvxfFZ918=@lendvai.name
> > on one of my installs i ran the following two commands as root:
> >
> > guix gc --delete-generations=60d
> > guix system delete-generations 60d
> >
> > i think i ran a reboot pretty soon after this, and the machine is failing to boot with the error "no code for module (ice-9 popen)".
>
>
> How did you reboot? Maybe whatever rebooting mechanism you use doesn't
> do 'sync' first or doesn't wait for 'sync' to complete.


i'm pretty sure i have issued a `reboot` in an ssh session.


> To test the hypothesis that there is store corruption, could you do
> "guix gc --verify=contents" (assuming there are some old system
> generations you can boot from)?


as i have explained in an earlier mail, all my old generations (3 remained after the GC) got broken, but they all got repaired by a subsequent reconfigure from a chroot. it has probably installed some files that were missing.

# guix gc --verify=contents
reading the store...
checking path existence...
checking hashes...
#

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Self knowledge is always bad news.”
— John Barth (1930–)
Attila Lendvai wrote on 18 Dec 2022 14:19
(address . 57838@debbugs.gnu.org)
CAE4vfcLDt1+4QwWGmXfxFM=D=2fOdryx_d5PmcqESuBsySc81Q@mail.gmail.com
close 57838
--

i just found out that this is a bug in that codebase that we don't
talk about here.

sorry about the noise!


--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
In the end, we only regret the chances we didn’t take, relationships
we were afraid to have, and the decisions we waited too long to make.
Attila Lendvai wrote on 18 Dec 2022 14:46
(address . control@debbugs.gnu.org)
87ili8omd9.fsf@lendvai.name
close 57838
quit