guix-data-service build failure, segfault probably related to guile@3.0.8

  • Open
Details
  • Participants (2): Ludovic Courtès, Christopher Baines
  • Owner: unassigned
  • Submitted by: Christopher Baines
  • Severity: normal
Christopher Baines wrote on 18 Feb 2022 16:08
(address . bug-guix@gnu.org)
87pmnkb44i.fsf@cbaines.net
The recent guix-data-service derivation, built with Guile 3.0.8, often
seems to segfault while running the tests (see the failed builds
here [1]).


With some help from IRC, I managed to get a core dump:

Core was generated by `/gnu/store/jjl6sa1bhjpj9cssi80yr4h8ihdgk34z-guile-3.0.8/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007ffff0720496 in ?? ()
[Current thread is 1 (LWP 2505)]
(gdb) bt
#0 0x00007ffff0720496 in ?? ()
#1 0x0000000000436488 in ?? ()
#2 0x00007ffff19eb180 in ?? ()
#3 0x00007fffea6ac858 in ?? ()
#4 0x00007ffff7ee7ccc in scm_jit_enter_mcode (thread=0x7ffff75c8d80, mcode=0x439154 <incomplete sequence \340>) at jit.c:6038
#5 0x00007ffff7f3cf3c in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:360
#6 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=4) at vm.c:1608
#7 0x00007ffff7eb2144 in scm_call_4 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=<optimized out>) at eval.c:517
#8 0x00007ffff7eaedd0 in error_during_backtrace (data=0x7ffff1c2b080, tag=out-of-range, throw_args=<error reading variable: ERROR: Cannot access memory at address 0x0>0x7fffe5a9bb80) at backtrace.c:252
#9 0x00007ffff7f68821 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7fffffff98b0, thunk_data=thunk_data@entry=0x7fffffff98b0, thunk=<optimized out>, handler=<optimized out>) at exceptions.c:167
#10 0x00007ffff7f3a88f in scm_c_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>, pre_unwind_handler=pre_unwind_handler@entry=0x0, pre_unwind_handler_data=0x0) at throw.c:168
#11 0x00007ffff7f3a8ae in scm_internal_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>) at throw.c:177
#12 0x00007ffff7eaf005 in scm_display_backtrace_with_highlights (stack=stack@entry="#<struct stack>" = {...}, port=port@entry=#<port #<port-type file 7ffff1bf3b40> 7ffff1c2b080>, first=first@entry=#f, depth=depth@entry=#f, highlights=highlights@entry=()) at backtrace.c:277
#13 0x00007ffff7eaf080 in scm_backtrace_with_highlights (highlights=()) at backtrace.c:310
#14 0x00007ffff7f3d336 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:972
#15 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=4) at vm.c:1608
#16 0x00007ffff7eb6571 in scm_apply_0 (proc=#<program 7ffff1c55700>, args=()) at eval.c:603
#17 0x00007ffff7f3bc8d in scm_throw (key=match-error, args=("match" "no matching pattern" x86_64-linux)) at throw.c:262
#18 0x00007ffff7edb239 in throw_ (key=<optimized out>, args=<optimized out>) at intrinsics.c:396
#19 0x00007ffff7f3f137 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:1183
#20 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=1) at vm.c:1608
#21 0x00007ffff7eb2457 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#22 0x00007ffff7eeb239 in scm_primitive_load (filename=filename@entry="/tmp/guix-build-guix-data-service-0.0.1-29.4a1088c.drv-0/source/tests/jobs-load-new-guix-revision.scm") at load.c:131
#23 0x00007ffff7eedff0 in scm_primitive_load_path (args=<optimized out>) at load.c:1267
#24 0x00007ffff7f3d336 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:972
#25 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=1) at vm.c:1608
#26 0x00007ffff7eb2457 in scm_primitive_eval (exp=<optimized out>, exp@entry=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "./build-aux/test-driver.scm") (main (command-line)) (quit)))) at eval.c:671
#27 0x00007ffff7eb84b6 in scm_eval (exp=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "./build-aux/test-driver.scm") (main (command-line)) (quit))), module_or_state="#<struct module>" = {...}) at eval.c:705
#28 0x00007ffff7f1c3b6 in scm_shell (argc=19, argv=0x7fffffffa6d8) at script.c:357
#29 0x00007ffff7ec749c in invoke_main_func (body_data=0x7fffffffa590) at init.c:312
#30 0x00007ffff7eb085a in c_body (d=0x7fffffffa4b0) at continuations.c:430
#31 0x00007ffff7f3d336 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:972
#32 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=2) at vm.c:1608
#33 0x00007ffff7eb209a in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>) at eval.c:503
#34 0x00007ffff7f68752 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7fffffffa440, thunk_data=thunk_data@entry=0x7fffffffa440, thunk=<optimized out>, handler=<optimized out>) at exceptions.c:170
#35 0x00007ffff7f3a88f in scm_c_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>, pre_unwind_handler=<optimized out>, pre_unwind_handler_data=0x7ffff1c2b040) at throw.c:168
#36 0x00007ffff7eb2e66 in scm_i_with_continuation_barrier (pre_unwind_handler=0x7ffff7eb2b80 <pre_unwind_handler>, pre_unwind_handler_data=0x7ffff1c2b040, handler_data=0x7fffffffa4b0, handler=0x7ffff7eb98b0 <c_handler>, body_data=0x7fffffffa4b0, body=0x7ffff7eb0850 <c_body>) at continuations.c:368
#37 scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:464
#38 0x00007ffff7f39b39 in with_guile (base=0x7fffffffa538, data=0x7fffffffa560) at threads.c:645
#39 0x00007ffff7e100ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#40 0x00007ffff7f328b8 in scm_i_with_guile (dynamic_state=<optimized out>, data=<optimized out>, func=<optimized out>) at threads.c:688
#41 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:694
#42 0x00007ffff7ed0025 in scm_boot_guile (argc=argc@entry=19, argv=argv@entry=0x7fffffffa6d8, main_func=main_func@entry=0x401230 <inner_main>, closure=closure@entry=0x0) at init.c:295
#43 0x00000000004010f7 in main (argc=19, argv=0x7fffffffa6d8) at guile.c:94
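
For reference, a backtrace like the one above can be pulled from a core file non-interactively. The following is a minimal sketch, not the procedure actually used here: it assumes gdb is on PATH, and the executable and core file paths passed in are hypothetical placeholders.

```python
import subprocess


def gdb_backtrace_command(executable, core_file):
    """Build a gdb command line that prints a backtrace and exits.

    --batch makes gdb exit after running its commands; -ex runs one command.
    """
    return ["gdb", "--batch", "-ex", "bt", executable, core_file]


def backtrace_from_core(executable, core_file):
    """Run gdb on a core file and return the backtrace text.

    Both paths are supplied by the caller; nothing here is specific to the
    core dump discussed in this report.
    """
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Frames that show up as `?? ()` stay unresolved unless debugging symbols for the relevant code are available to gdb.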

Ludovic Courtès wrote on 2 Mar 2022 17:11
(name . Christopher Baines)(address . mail@cbaines.net)(address . 54056@debbugs.gnu.org)
87fso01ghk.fsf@gnu.org
Hi,

Christopher Baines <mail@cbaines.net> skribis:

> Core was generated by `/gnu/store/jjl6sa1bhjpj9cssi80yr4h8ihdgk34z-guile-3.0.8/bin/guile --no-auto-com'.
> Program terminated with signal SIGSEGV, Segmentation fault.

This segfault seems to come from a bug (an out-of-range exception is
raised) while walking the VM stack to display the backtrace:

> #8 0x00007ffff7eaedd0 in error_during_backtrace (data=0x7ffff1c2b080, tag=out-of-range, throw_args=<error reading variable: ERROR: Cannot access memory at address 0x0>0x7fffe5a9bb80) at backtrace.c:252
> #9 0x00007ffff7f68821 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7fffffff98b0, thunk_data=thunk_data@entry=0x7fffffff98b0, thunk=<optimized out>, handler=<optimized out>) at exceptions.c:167
> #10 0x00007ffff7f3a88f in scm_c_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>, pre_unwind_handler=pre_unwind_handler@entry=0x0, pre_unwind_handler_data=0x0) at throw.c:168
> #11 0x00007ffff7f3a8ae in scm_internal_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>) at throw.c:177
> #12 0x00007ffff7eaf005 in scm_display_backtrace_with_highlights (stack=stack@entry="#<struct stack>" = {...}, port=port@entry=#<port #<port-type file 7ffff1bf3b40> 7ffff1c2b080>, first=first@entry=#f, depth=depth@entry=#f, highlights=highlights@entry=()) at backtrace.c:277

(Of course, both the out-of-range exception and subsequent segfault are
genuine Guile bugs.)

The real cause of the error though seems to be a ‘match-error’ in
application code:

> #17 0x00007ffff7f3bc8d in scm_throw (key=match-error, args=("match" "no matching pattern" x86_64-linux)) at throw.c:262
> #18 0x00007ffff7edb239 in throw_ (key=<optimized out>, args=<optimized out>) at intrinsics.c:396
> #19 0x00007ffff7f3f137 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:1183
> #20 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=1) at vm.c:1608
> #21 0x00007ffff7eb2457 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
> #22 0x00007ffff7eeb239 in scm_primitive_load (filename=filename@entry="/tmp/guix-build-guix-data-service-0.0.1-29.4a1088c.drv-0/source/tests/jobs-load-new-guix-revision.scm") at load.c:131
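
In a long backtrace like this one, the application-level exception is easiest to spot by looking for `scm_throw` frames. The sketch below shows one way to pull them out of gdb output; the regular expression and the function name `find_guile_throws` are illustrative assumptions, not part of Guile or gdb, and the sample lines are copied from the backtrace in this report.

```python
import re

# Matches a gdb frame such as:
#   #17 0x... in scm_throw (key=match-error, args=(...)) at throw.c:262
THROW_FRAME = re.compile(
    r"#(?P<frame>\d+)\s+0x[0-9a-f]+ in scm_throw "
    r"\(key=(?P<key>[^,]+), args=(?P<args>.*)\) at (?P<location>\S+)$"
)


def find_guile_throws(backtrace):
    """Return (frame number, key, args, location) for each scm_throw frame."""
    throws = []
    for line in backtrace.splitlines():
        found = THROW_FRAME.search(line.rstrip())
        if found:
            throws.append(
                (
                    int(found.group("frame")),
                    found.group("key"),
                    found.group("args"),
                    found.group("location"),
                )
            )
    return throws


sample = """\
#16 0x00007ffff7eb6571 in scm_apply_0 (proc=#<program 7ffff1c55700>, args=()) at eval.c:603
#17 0x00007ffff7f3bc8d in scm_throw (key=match-error, args=("match" "no matching pattern" x86_64-linux)) at throw.c:262
#18 0x00007ffff7edb239 in throw_ (key=<optimized out>, args=<optimized out>) at intrinsics.c:396
"""

for frame, key, args, location in find_guile_throws(sample):
    print(frame, key, args, location)
```

Because the pattern is searched for rather than anchored at the start of the line, it also works on email-quoted backtraces whose lines begin with `> `. It only reports frames where gdb could still read the key and arguments; frames showing `<optimized out>` for `key` would need a looser pattern.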

Thoughts?

Ludo’.
Christopher Baines wrote on 2 Mar 2022 20:32
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 54056@debbugs.gnu.org)
87ilswb126.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> The real cause of the error though seems to be a ‘match-error’ in
> application code:
>
>> #17 0x00007ffff7f3bc8d in scm_throw (key=match-error, args=("match" "no matching pattern" x86_64-linux)) at throw.c:262
>> #18 0x00007ffff7edb239 in throw_ (key=<optimized out>, args=<optimized out>) at intrinsics.c:396
>> #19 0x00007ffff7f3f137 in vm_regular_engine (thread=0x7ffff75c8d80) at vm-engine.c:1183
>> #20 0x00007ffff7f4a5e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=1) at vm.c:1608
>> #21 0x00007ffff7eb2457 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
>> #22 0x00007ffff7eeb239 in scm_primitive_load (filename=filename@entry="/tmp/guix-build-guix-data-service-0.0.1-29.4a1088c.drv-0/source/tests/jobs-load-new-guix-revision.scm") at load.c:131
>
> Thoughts?

These are the tests, so some exceptions are intended. In this case,
though, the test in question was still passing even though the failure
whose handling it exercises wasn't happening as expected. I've fixed
this in [1].


I haven't updated the Guix package yet, but maybe this change will help.
