Shepherd crash on berlin
(address . bug-guix@gnu.org)
Earlier today, berlin was unreachable. From /var/log/messages, this
appears to be due to a crash of shepherd (PID 1); here are its last
words:
Oct 19 06:32:36 localhost shepherd[1]: Respawning mumi-worker.
Oct 19 06:32:37 localhost shepherd[1]: Service mumi-worker has been started.
Oct 19 06:32:52 localhost ntpd[1837]: Soliciting pool server 193.141.27.6
Oct 19 06:33:56 localhost ntpd[1837]: Soliciting pool server 193.158.22.13
Oct 19 06:35:02 localhost ntpd[1837]: Soliciting pool server 62.75.236.38
Oct 19 06:35:43 localhost vmunix: [1586150.310354] shepherd[87089]: segfault at 0 ip 00007ff3018dc264 sp 00007ffeae066810 error 6 in crash-handler.so[7ff3018dc000+1000]
Oct 19 06:35:43 localhost vmunix: [1586150.322247] Code: ff ff ff e8 6e fe ff ff 48 8d 3d b7 0d 00 00 e8 12 fe ff ff bf 27 00 00 00 31 c0 e8 26 fe ff ff 89 ee 48 89 c7 e8 2c fe ff ff <c7> 04 25 00 00 00 00 00 00 00 00 0f 0b 48 89 c3 be 01 00 00 00 89
Oct 19 06:36:09 localhost ntpd[1837]: Soliciting pool server 51.75.67.47
Oct 19 06:36:40 localhost postgres[1389]: [73-1] 2022-10-19 04:36:40.036 GMT [1389] LOG: using stale statistics instead of current ones because stats collector is not responding
Oct 19 06:36:50 localhost postgres[87362]: [6-1] 2022-10-19 04:36:50.101 GMT [87362] LOG: using stale statistics instead of current ones because stats collector is not responding
Oct 19 06:37:00 localhost postgres[1389]: [74-1] 2022-10-19 04:37:00.062 GMT [1389] LOG: using stale statistics instead of current ones because stats collector is not responding
Oct 19 06:37:10 localhost postgres[87368]: [6-1] 2022-10-19 04:37:10.128 GMT [87368] LOG: using stale statistics instead of current ones because stats collector is not responding
Oct 19 06:37:16 localhost ntpd[1837]: Soliciting pool server 3.64.117.201
Oct 19 12:48:58 localhost syslogd (GNU inetutils 2.0): restart
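As an aside, the kernel's segfault line can be decoded mechanically. The sketch below is not part of shepherd or of any tool used on berlin; the function name and the dictionary keys are made up for illustration. It assumes the usual x86 page-fault error code layout (bit 0: page present, bit 1: write, bit 2: user mode) and that the kernel prints the code in hexadecimal.

```python
import re

# Kernel log line copied from the excerpt above.
LINE = ("Oct 19 06:35:43 localhost vmunix: [1586150.310354] shepherd[87089]: "
        "segfault at 0 ip 00007ff3018dc264 sp 00007ffeae066810 error 6 "
        "in crash-handler.so[7ff3018dc000+1000]")

# Fields of the x86 Linux "segfault at ..." message; all numbers are hex
# except the PID.
PATTERN = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]")


def decode_segfault(line):
    """Return a dict describing a kernel segfault line (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"], 16)
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction inside the mapped object.
        "offset": ip - base,
        # x86 page-fault error code bits (assumed layout, see lead-in).
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        print(key, "=", hex(value) if key in ("fault_address", "offset") else value)
```

For the line above, this gives a user-mode write to address 0 on a non-present page, at offset 0x264 into crash-handler.so. That would be consistent with the bytes marked `<c7>` in the "Code:" line, which look like a store of 0 to absolute address 0, i.e. a deliberate null write rather than a bug in the handler itself.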
The interesting part is that we have a core dump—see ‘crash-handler.so’
above, which is shepherd’s mechanism to ensure there’s a core dump
before the machine stops. Here’s what we get:
ludo@berlin ~$ gdb -ix ~/.gdbinit /gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile /core.shepherd-20221019
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile...
Reading symbols from /gnu/store/4ws3vh3zrs1yi9lfaibha64chf3vn2rm-guile-3.0.7-debug/lib/debug//gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile.debug...
warning: core file may not match specified executable file.
[New LWP 87089]
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Core was generated by `/gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007ff3018dc264 in ?? ()
from /gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/lib/shepherd/crash-handler.so
(gdb) bt
#0 0x00007ff3018dc264 in ?? ()
from /gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/lib/shepherd/crash-handler.so
#1 <signal handler called>
#2 0x00007ff3018dc264 in ?? ()
from /gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/lib/shepherd/crash-handler.so
#3 <signal handler called>
#4 0x00007ff30b2d2030 in raise ()
from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libc.so.6
#5 0x00007ff30b2bc526 in abort ()
from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libc.so.6
#6 0x00007ff30b7bb263 in ?? () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#7 0x00007ff30b7c2232 in ?? () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#8 0x00007ff30b7c25a8 in ?? () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#9 0x00007ff30b7c290f in ?? () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#10 0x00007ff30b7c78da in GC_generic_malloc_many ()
from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#11 0x00007ff30b87fb45 in scm_inline_gc_alloc (kind=SCM_INLINE_GC_KIND_NORMAL, idx=6,
freelist=0x7ff30b246e48) at gc-inline.h:79
#12 allocate_words_with_freelist (thread=0x7ff30b246d80, freelist_idx=6) at intrinsics.c:470
#13 0x00007ff300039f95 in ?? ()
#14 0x00007ff30b246d80 in ?? ()
#15 0x00007ff302a2f7e8 in ?? ()
#16 0x0000000000000040 in ?? ()
#17 0x00007ff30b88bb1c in scm_jit_enter_mcode (thread=0x7ff30b246d80,
mcode=0x7ff30231e75c "B", <incomplete sequence \340>) at jit.c:6038
#18 0x00007ff30b8e80c5 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffeae067878,
nargs=nargs@entry=1) at vm.c:1602
#19 0x00007ff30b862ea7 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#20 0x00007ff30b88d729 in scm_primitive_load (filename=<optimized out>) at load.c:131
#21 0x00007ff30b8e5915 in vm_regular_engine (thread=0x7ff30b246d80) at vm-engine.c:972
#22 0x00007ff30b8e8029 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffeae067a58,
nargs=nargs@entry=1) at vm.c:1608
#23 0x00007ff30b862ea7 in scm_primitive_eval (exp=<optimized out>,
exp@entry=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "/gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/bin/shepherd") (quit)))) at eval.c:671
#24 0x00007ff30b862f06 in scm_eval (
exp=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "/gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/bin/shepherd") (quit))),
module_or_state=module_or_state@entry="#<struct module>" = {...}) at eval.c:705
#25 0x00007ff30b8bde76 in scm_shell (argc=5, argv=0x7ffeae0680e8) at script.c:357
#26 0x00007ff30b87b36d in invoke_main_func (body_data=0x7ffeae067f80) at init.c:313
#27 0x00007ff30b85cbea in c_body (d=0x7ffeae067ec0) at continuations.c:430
#28 0x00007ff30b8e5915 in vm_regular_engine (thread=0x7ff30b246d80) at vm-engine.c:972
#29 0x00007ff30b8e8029 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffeae067c80,
nargs=nargs@entry=2) at vm.c:1608
#30 0x00007ff30b861dfa in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>)
at eval.c:503
#31 0x00007ff30b863529 in scm_c_with_exception_handler (type=type@entry=#t,
handler=handler@entry=0x7ff30b8dd750 <catch_post_unwind_handler>,
handler_data=handler_data@entry=0x7ffeae067df0, thunk=thunk@entry=0x7ff30b8dd890 <catch_body>,
thunk_data=thunk_data@entry=0x7ffeae067df0) at exceptions.c:170
#32 0x00007ff30b8dda8d in scm_c_catch (tag=tag@entry=#t, body=body@entry=0x7ff30b85cbe0 <c_body>,
body_data=body_data@entry=0x7ffeae067ec0, handler=handler@entry=0x7ff30b85ce80 <c_handler>,
handler_data=handler_data@entry=0x7ffeae067ec0,
pre_unwind_handler=pre_unwind_handler@entry=0x7ff30b85ccd0 <pre_unwind_handler>,
pre_unwind_handler_data=0x7ff3038c4b00) at throw.c:168
#33 0x00007ff30b85d238 in scm_i_with_continuation_barrier (body=0x7ff30b85cbe0 <c_body>,
body_data=0x7ffeae067ec0, handler=0x7ff30b85ce80 <c_handler>, handler_data=0x7ffeae067ec0,
pre_unwind_handler=0x7ff30b85ccd0 <pre_unwind_handler>, pre_unwind_handler_data=0x7ff3038c4b00)
at continuations.c:368
#34 0x00007ff30b85d295 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>)
at continuations.c:464
#35 0x00007ff30b8dc549 in with_guile (base=0x7ffeae067f28, data=0x7ffeae067f50) at threads.c:645
#36 0x00007ff30b7b90ba in GC_call_with_stack_base ()
from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#37 0x00007ff30b8dc848 in scm_i_with_guile (dynamic_state=<optimized out>, data=data@entry=0x7ffeae067f30,
func=func@entry=0x7ff30b87b350 <invoke_main_func>) at threads.c:688
#38 scm_with_guile (func=func@entry=0x7ff30b87b350 <invoke_main_func>, data=data@entry=0x7ffeae067f80)
at threads.c:694
#39 0x00007ff30b87b4e2 in scm_boot_guile (argc=argc@entry=5, argv=argv@entry=0x7ffeae0680e8,
main_func=main_func@entry=0x401230 <inner_main>, closure=closure@entry=0x0) at init.c:296
#40 0x00000000004010f7 in main (argc=5, argv=0x7ffeae0680e8) at guile.c:94
(gdb) info proc status
unable to handle request
(gdb) info proc
exe = '/gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile --no-auto-com'
Ludo’.