%guile-static-stripped crashes with a sigsegv (i.e. the guile used in the initrd (?))

  • Done
Details
3 participants
  • Attila Lendvai
  • jbranso
  • Ludovic Courtès
Owner
unassigned
Submitted by
Attila Lendvai
Severity
normal
Attila Lendvai wrote on 26 May 17:26 +0200
(name . bug-guix@gnu.org)(address . bug-guix@gnu.org)
aSL_8-HxtZW0rvbLbh7iyyoJ4I01TU5bXlI_NKx0wpDdq1FrvjGRXV-I28_qCCL9Iu9shxE_xZkG32FvvSHSfnLB0FDd117Ji_4GMkaZ1yc=@lendvai.name
root symptom:
-------------

i think this is the guile binary that is used in the initrd. it's been a while since i noted this bug, but if that's so, then when an error happens early in the boot, it just dies with a sigsegv; i.e. it's a major hindrance to debuggability.


reproducer:
-----------

guix shell -e '(begin (use-modules (gnu packages make-bootstrap)) %guile-static-stripped)'

and then start guile. or alternatively:

guile -c '(use-modules (ice-9 readline))'


analysis:
---------

make-guile-static-stripped calls (remove-store-references guile2) without checking the return code.

if i remove that call, then building fails with: "... is not allowed to refer to path `/gnu/store/3zl03prdg07ax4dny78hrzykillr6vcy-glibc-2.35'"
i.e. there's some reference in the binary to glibc, which is corrupted by remove-store-references.

i'm not sure this is the cause, but i suspect it is.

note that the `guile --version` test in make-guile-static-stripped runs fine; i.e. it's an insufficient test.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“How much truth can a spirit bear, how much truth can a spirit dare? […] that became for me more and more the real measure of value.”
— Friedrich Nietzsche (1844–1900)
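
The remark above that `guile --version` is an insufficient test can be illustrated with a small shell sketch. It is not the check used in make-guile-static-stripped; `smoke_check` is a hypothetical helper name, and the binary to test is whatever the caller passes in. It runs the version check and then the module-loading reproducer from the report, printing each exit status so the two can be compared.

```shell
#!/bin/sh
# smoke_check GUILE: run two checks against the given guile command and
# print the exit status of each.  GUILE is supplied by the caller; the
# helper name is hypothetical and only serves this illustration.
smoke_check() {
    guile=$1

    # The check that, according to the report, already passes.
    "$guile" --version >/dev/null 2>&1
    echo "--version: $?"

    # The reproducer from the report: load a module.
    "$guile" -c '(use-modules (ice-9 readline))' >/dev/null 2>&1
    echo "use-modules: $?"
}

# Usage, from inside a shell where the guile under test is on PATH:
#   smoke_check guile
```

If the second status differs from the first, the version check alone did not exercise the failing code path.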
Attila Lendvai wrote on 29 May 22:51 +0200
(No Subject)
(name . 71211@debbugs.gnu.org)(address . 71211@debbugs.gnu.org)
b-zIVxs2cjSTkgwiGqQUWmKtNtv1Euoo3-7-HwFuNt_k86gR_ccVC8dvq2mKRxzqrTtdwRLn8zidinXbIF8KsWO8WC-x6HyzDRrTE-l8Ous=@lendvai.name
the reproducer still crashes on a recent x86_64, but i originally noticed this long ago (maybe a year even). back then i investigated an early crash in the boot, and i reached %GUILE-STATIC-STRIPPED, and made a TODO note to further investigate. then i forgot most of what happened, and recently i opened this bug report based on my note.

since then EXPRESSION->INITRD may have changed, because it now uses %GUILE-STATIC-INITRD, but it's created with the same MAKE-GUILE-STATIC-STRIPPED that produces the faulty %GUILE-STATIC-STRIPPED, so they're essentially the same.

in short: the reproducer crashes both %GUILE-STATIC-STRIPPED and %GUILE-STATIC-INITRD on x86_64, and i believe that it crashes the same way in the early phase of the boot when/if it tries to enter the debugger.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“If instead of teaching other people what government should be and should do, you'd teach yourself what government actually is and does do, you'd be a libertarian.”
— François-René Rideau
jbranso wrote on 30 May 21:52 +0200
Re: bug#71211: %guile-static-stripped crashes with a sigsegv (i.e. the guile used in the initrd (?))
9b136882cb3dec8f306179b62cc95bf761197714@dismail.de
May 26, 2024 at 11:26 AM, "Attila Lendvai" <attila@lendvai.name> wrote:



>
> root symptom:
>
> -------------
>
> i think this is the guile binary that is used in initrd. it's been a while i noted this bug. but if it's so, then if error happens early in the boot, then it just dies with a sigsegv; i.e. it's a major hindrance to debuggability.
>
> reproducer:
>
> -----------
>
> guix shell -e '(begin (use-modules (gnu packages make-bootstrap)) %guile-static-stripped)'
>
> and then start guile. or alternatively:
>
> guile -c '(use-modules (ice-9 readline))'
>
> analysis:
>
> ---------

If it's that easy to crash guile, then I'll try it on my T400. I'll respond in a few
if it crashes for me.

>
> make-guile-static-stripped calls (remove-store-references guile2) without checking the return code.
>
> if i remove that call, then building fails with: "... is not allowed to refer to path `/gnu/store/3zl03prdg07ax4dny78hrzykillr6vcy-glibc-2.35'"
>
>
>
> i.e. there's some reference in the binary to glibc, which is corrupted by remove-store-references.
>
> i'm not sure this is the cause, but i suspect.
>
> note that the `guile --version` test in make-guile-static-stripped runs fine; i.e. it's an insufficient test.
>
Ludovic Courtès wrote on 22 Jul 09:20 +0200
(name . Attila Lendvai)(address . attila@lendvai.name)(address . 71211@debbugs.gnu.org)
87r0blki4c.fsf@gnu.org
Hi,

Attila Lendvai <attila@lendvai.name> skribis:

> guix shell -e '(begin (use-modules (gnu packages make-bootstrap)) %guile-static-stripped)'
>
> and then start guile. or alternatively:
>
> guile -c '(use-modules (ice-9 readline))'

Indeed:

$ guix shell guile-static-stripped -- guile -c '(use-modules (ice-9 readline))'
$ echo $?
139

It’s hard to see exactly where it segfaults, but it’s not surprising:
it’s built with a different libc than the ‘guile-readline’ package I’ve
installed, and it’s not capable of dlopening things. That’s a
restriction we have to be aware of, but that’s ok.

Thanks,
Ludo’.
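
For readers unfamiliar with the `139` shown above: shells conventionally report a process killed by a signal as 128 plus the signal number, and signal 11 is SIGSEGV on Linux. A minimal sketch of decoding such a status follows; `describe_status` is a hypothetical helper name, and the 128+N rule is a shell convention rather than a guarantee for every program.

```shell
#!/bin/sh
# describe_status STATUS: print a human-readable reading of a shell exit
# status.  Statuses above 128 are treated as "terminated by signal
# (STATUS - 128)", following the usual shell convention.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # `kill -l N` prints the name of signal number N, e.g. SEGV for 11.
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

describe_status 139
```

So the status in the snippet above is consistent with the segfault described in the report.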
Ludovic Courtès wrote on 22 Jul 09:20 +0200
control message for bug #71211
(address . control@debbugs.gnu.org)
87plr5ki45.fsf@gnu.org
tags 71211 notabug
close 71211
quit
This issue is archived.
