[3.0.0] Segfault while building on ARMv7

Ludovic Courtès wrote on 20 Jan 17:33 +0100
(address . bug-Guile@gnu.org)
Building 3.0.0 with Guix on ARMv7 reproducibly fails:
  BOOTSTRAP GUILEC language/cps/loop-instrumentation.go
  wrote `language/cps/loop-instrumentation.go'
  BOOTSTRAP GUILEC language/cps/peel-loops.go
  wrote `language/cps/effects-analysis.go'
  BOOTSTRAP GUILEC language/cps/prune-top-level-scopes.go
  wrote `language/cps/licm.go'
  BOOTSTRAP GUILEC language/cps/reify-primitives.go
  wrote `language/cps/prune-top-level-scopes.go'
  BOOTSTRAP GUILEC language/cps/renumber.go
  wrote `language/cps/peel-loops.go'
  BOOTSTRAP GUILEC language/cps/rotate-loops.go
  wrote `language/cps/reify-primitives.go'
  BOOTSTRAP GUILEC language/cps/optimize.go
  wrote `language/cps/renumber.go'
  BOOTSTRAP GUILEC language/cps/simplify.go
  wrote `language/cps/rotate-loops.go'
  BOOTSTRAP GUILEC language/cps/self-references.go
  wrote `language/cps/optimize.go'
  BOOTSTRAP GUILEC language/cps/slot-allocation.go
  wrote `language/cps/self-references.go'
  BOOTSTRAP GUILEC language/cps/spec.go
  wrote `language/cps/simplify.go'
  BOOTSTRAP GUILEC language/cps/specialize-primcalls.go
  wrote `language/cps/spec.go'
  BOOTSTRAP GUILEC language/cps/specialize-numbers.go
  /gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 23019 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/specialize-primcalls.go" "../module/language/cps/specialize-primcalls.scm"
  make[2]: *** [Makefile:1931: language/cps/specialize-primcalls.go] Error 139
  make[2]: *** Waiting for unfinished jobs....
  wrote `language/cps/slot-allocation.go'
  wrote `language/cps/specialize-numbers.go'
  make[2]: Leaving directory '/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/bootstrap'
  make[1]: *** [Makefile:1849: all-recursive] Error 1
It seems to always happen while building ‘specialize-primcalls.go’.
The backtrace is unfortunately not all that readable:
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0xf5c67b74 in ?? ()
  [Current thread is 1 (Thread 0xf7fe8010 (LWP 23019))]
  (gdb) bt
  #0  0xf5c67b74 in ?? ()
  #1  0xf7f3ffcc in scm_jit_enter_mcode (thread=0xdedc20, mcode=0xf5c67a00 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
  #2  0xf7093a40 in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)
  (gdb) info threads
    Id   Target Id                     Frame
  * 1    Thread 0xf7fe8010 (LWP 23019) 0xf5c67b74 in ?? ()
    2    Thread 0xf7894460 (LWP 23042) 0xf7e8f034 in __libc_do_syscall () from /gnu/store/n7c20pjm6q1xq1gqjqzzys1yk9fy7n1k-glibc-2.29/lib/libpthread.so.0
    3    Thread 0xf69a5460 (LWP 23045) 0xf7e8f034 in __libc_do_syscall () from /gnu/store/n7c20pjm6q1xq1gqjqzzys1yk9fy7n1k-glibc-2.29/lib/libpthread.so.0
  (gdb) frame 0
  #0  0xf5c67b74 in ?? ()
  (gdb) disassemble 0xf5c67a00,+500
  Dump of assembler code from 0xf5c67a00 to 0xf5c67bf4:
     0xf5c67a00: subs r0, #32
     0xf5c67a02: subs r5, r1, r0
     0xf5c67a04: cmp r5, #40 ; 0x28
     0xf5c67a06: bne.w 0xf5c680d0
     0xf5c67a0a: ldr r5, [r4, #16]
     0xf5c67a0c: cmp r0, r5
     0xf5c67a0e: blt.w 0xf5c680d0
     0xf5c67a12: str r0, [r4, #8]
     0xf5c67a14: ldr r5, [r4, #88] ; 0x58
     0xf5c67a16: cmp r5, #0
     0xf5c67a18: beq.w 0xf5c68102
     0xf5c67a1c: ldrt r6, [r5]
     0xf5c67a20: str r6, [r4, #88] ; 0x58
     0xf5c67a22: str r5, [r0, #24]
     0xf5c67a24: movw r5, #1293 ; 0x50d
     0xf5c67a28: movs r6, #0
     0xf5c67a2a: str r5, [r0, #16]
     0xf5c67a2c: str r6, [r0, #20]
     0xf5c67a2e: ldr r5, [r0, #24]
     0xf5c67a30: ldr r6, [r0, #16]
     0xf5c67a32: str r6, [r5, #0]
     0xf5c67a34: ldr r5, [r0, #32]
     0xf5c67a36: ldr r5, [r5, #4]
     0xf5c67a38: str r5, [r0, #16]
  […]
     0xf5c67b5e: cmp r5, #0
     0xf5c67b60: ble.w 0xf5c67fce
     0xf5c67b64: ldr r5, [r0, #32]
     0xf5c67b66: ldr r5, [r5, #20]
     0xf5c67b68: str r5, [r0, #16]
     0xf5c67b6a: ldr r5, [r0, #16]
     0xf5c67b6c: ldr r5, [r5, #4]
     0xf5c67b6e: str r5, [r0, #16]
     0xf5c67b70: mov.w r12, #0
  => 0xf5c67b74: ldrt r5, [r12]
     0xf5c67b78: str r5, [r0, #8]
     0xf5c67b7a: ldr r5, [r0, #8]
     0xf5c67b7c: ldr r6, [r0, #16]
     0xf5c67b7e: cmp r5, r6
     0xf5c67b80: bne.w 0xf5c67f80
  […]
  (gdb) info registers
  r0             0xf7093a20          4144577056
  r1             0xf7093a48          4144577096
  r2             0x0                 0
  r3             0xf7a24001          4154605569
  r4             0x74e00             478720
  r5             0xdedc20            14605344
  r6             0x0                 0
  r7             0xf5c67a00          4123425280
  r8             0x0                 0
  r9             0x0                 0
  r10            0xf7fc4bdc          4160506844
  r11            0xf7fb5000          4160442368
  r12            0x0                 0
  sp             0xfffedc50          0xfffedc50
  lr             0xf7f3ffcd          -135004211
  pc             0xf5c67b74          0xf5c67b74
  cpsr           0x200f0030          537854000
  fpscr          0x60000000          1610612736
  (gdb) frame 1
  #1  0xf7f3ffcc in scm_jit_enter_mcode (thread=0xdedc20, mcode=0xf5c67a00 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
  5725      enter_mcode (thread, mcode);
  (gdb) info locals
  No locals.
  (gdb) p *thread
  $2 = {next_thread = 0x5, vm = {ip = 0xdecd50, sp = 0x324602ae, fp = 0xdebc50, stack_limit = 0x30d,
      compare_result = 72 'H', apply_hook_enabled = 130 '\202', return_hook_enabled = 82 'R',
      next_hook_enabled = 1 '\001', abort_hook_enabled = 192 '\300', disable_mcode = 159 '\237',
      engine = 166 '\246', unused = 1 '\001', stack_size = 15218, stack_bottom = 0x20d,
      apply_hook = 0x1528240, return_hook = 0x1a69fc0, next_hook = 0x0, abort_hook = 0x5,
      stack_top = 0xdecd68, overflow_handler_stack = 0x28f45b3e, registers = 0xdebc58,
      mra_after_abort = 0x20045 "", trace_level = -140048256},
    pending_asyncs = 0x116ab78, block_asyncs = 22184512,
    freelists = {0x20045, 0xf7a707ec, 0x1162bf8, 0x1528248, 0x5, 0x248360, 0x30999e00, 0xdebc60,
      0x20d, 0x1528250, 0x1a69fc0, 0x0, 0x20045, 0xf7a70880, 0x1167440, 0x1528250,
      0x5, 0xdecd80, 0x181aface, 0xdebc68, 0x20d, 0x1528258, 0x1a69fc0, 0x0,
      0x20045, 0xf7a70880, 0x1167460, 0x1528258, 0x5, 0x248370, 0x91c2f1f, 0xdebc70},
    pointerless_freelists = {0x30d, 0x7f1720, 0x1a69fc0, 0x1496, 0x30d, 0xdedd50, 0x96, 0x16,
      0x5, 0x789740, 0x3275b29e, 0xdebc78, 0x30d, 0xdedd50, 0x16, 0x3fffffe,
      0x30d, 0xdedd50, 0x96, 0x3fe, 0x5, 0xdecd98, 0x3afba8fc, 0xdebc80,
      0x30d, 0x7f1720, 0x96, 0x16, 0x20d, 0x1528260, 0x1a69fc0, 0x0},
    handle = 0x5, pthread = 14601648, result = 0x38a97220, exited = 14597256, guile_mode = 131141,
    needs_unregister = -140048256, wake = 0x1162128,
    sleep_cond = {__data = {{__wseq = 95281925316411917, __wseq32 = {__low = 525, __high = 22184552}},
        {__g1_start = 27697088, __g1_start32 = {__low = 27697088, __high = 0}},
        __g_refs = {5, 7903072}, __g_size = {67881032, 14597264}, __g1_orig_size = 131141,
        __wrefs = 4154919040, __g_signals = {18228776, 22184552}},
      __size = "\r\002\000\000h\202R\001\300\237\246\001\000\000\000\000\005\000\000\000`\227x\000H\310\v\004\220\274\336\000E\000\002\000\200\b\247\367(&\026\001h\202R\001",
      __align = 95281925316411917},
    sleep_pipe = {525, 22184560}, dynamic_state = 0xa,
    dynstack = {base = 0x0, top = 0x5, limit = 0x789780},
    continuation_root = 0x2db7da0c, continuation_base = 0xdebc98, base = 0x20d, jit_state = 0xdede20}
Unfortunately I’m unable to reproduce the bug outside Guix’s build environment, even with ASLR disabled (what guix-daemon does).
I wonder if that could be the same issue as https://issues.guix.gnu.org/issue/39118.
I’ll happily take suggestions as to what debug info would be useful and what I could bisect!
Ludovic Courtès wrote on 20 Jan 18:09 +0100
(address . 39208@debbugs.gnu.org)
Ludovic Courtès <ludo@gnu.org> skribis:
> Unfortunately I’m unable to reproduce the bug outside Guix’s build
> environment, even with ASLR disabled (what guix-daemon does).
I finally managed to reproduce it from the failed-build tree with:
  rm -vf bootstrap/language/cps/{slot-allocation,specialize-numbers,specialize-primcalls,spec}.go
  GUILE_JIT_LOG=4 /run/current-system/profile/bin/linux32 -R make
which shows:
  jit: entering mcode: 0xf7a5d1c0
  jit: exited mcode
  jit: entering mcode: 0xf7a5d1c0
  jit: exited mcode
  jit: entering mcode: 0xf7956ca0
  jit: exited mcode
  jit: entering mcode: 0xf791a9f0
  jit: exited mcode
  jit: entering mcode: 0xf78b03d0
  jit: exited mcode
  jit: entering mcode: 0xf7a5d1c0
  jit: exited mcode
  jit: entering mcode: 0xf79407bb
  jit: exited mcode
  jit: entering mcode: 0xf7a5d1c0
  jit: exited mcode
  jit: entering mcode: 0xf79407bb
  jit: exited mcode
  jit: entering mcode: 0xf7a5d1c0
  jit: exited mcode
  jit: vcode: start=0xf5fe95d4,+203 entry=+0
  jit: mcode: 0xf5c3eac0,+2288
  jit: entering mcode: 0xf5c3eac0
  jit: exited mcode
  jit: vcode: start=0xf5fe9900,+203 entry=+0
  jit: mcode: 0xf5c3f3b0,+2288
  jit: entering mcode: 0xf5c3f3b0
  jit: exited mcode
  jit: vcode: start=0xf5fe9c2c,+203 entry=+0
  jit: mcode: 0xf5c3fca0,+2288
  jit: entering mcode: 0xf5c3fca0
  jit: exited mcode
  jit: vcode: start=0xf5fe9f58,+203 entry=+0
  jit: mcode: 0xf5c40590,+2288
  jit: entering mcode: 0xf5c40590
  jit: exited mcode
  jit: vcode: start=0xf5fea284,+203 entry=+0
  jit: mcode: 0xf5c40e80,+2288
  jit: entering mcode: 0xf5c40e80
  jit: exited mcode
  jit: vcode: start=0xf5fea5b0,+203 entry=+0
  jit: mcode: 0xf5c41770,+2288
  jit: entering mcode: 0xf5c41770
  jit: exited mcode
  jit: vcode: start=0xf5fea8dc,+203 entry=+0
  jit: mcode: 0xf5c42060,+2288
  jit: entering mcode: 0xf5c42060
  jit: exited mcode
  jit: vcode: start=0xf5feac08,+203 entry=+0
  jit: mcode: 0xf5c42950,+2288
  jit: entering mcode: 0xf5c42950
  jit: exited mcode
  jit: vcode: start=0xf5feaf34,+203 entry=+0
  jit: mcode: 0xf5c43240,+2288
  jit: entering mcode: 0xf5c43240
  jit: exited mcode
  jit: vcode: start=0xf5feb260,+203 entry=+0
  jit: mcode: 0xf5c43b30,+2280
  jit: entering mcode: 0xf5c43b30
  /gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 13151 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/slot-allocation.go" "../module/language/cps/slot-allocation.scm"
  make[2]: *** [Makefile:1931: language/cps/slot-allocation.go] Error 139
  #0  0xf5c43ca4 in ?? ()
  [Current thread is 1 (Thread 0xf7fe8010 (LWP 13151))]
  (gdb) bt
  #0  0xf5c43ca4 in ?? ()
  #1  0xf7f3ffcc in scm_jit_enter_mcode (thread=0x74fe10, mcode=0xf5c43b30 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
  #2  0x00021048 in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)
  (gdb) disassemble 0xf5c43b30,+2280
  Dump of assembler code from 0xf5c43b30 to 0xf5c44418:
     0xf5c43b30: subs r0, #32
     0xf5c43b32: subs r5, r1, r0
     0xf5c43b34: cmp r5, #40 ; 0x28
     0xf5c43b36: bne.w 0xf5c44200
     0xf5c43b3a: ldr r5, [r4, #16]
     0xf5c43b3c: cmp r0, r5
     0xf5c43b3e: blt.w 0xf5c44200
     0xf5c43b42: str r0, [r4, #8]
     0xf5c43b44: ldr r5, [r4, #88] ; 0x58
     0xf5c43b46: cmp r5, #0
     0xf5c43b48: beq.w 0xf5c44232
     0xf5c43b4c: ldrt r6, [r5]
     0xf5c43b50: str r6, [r4, #88] ; 0x58
     0xf5c43b52: str r5, [r0, #24]
     0xf5c43b54: movw r5, #1293 ; 0x50d
     0xf5c43b58: movs r6, #0
  […]
     0xf5c43c52: and.w r5, r5, #127 ; 0x7f
     0xf5c43c56: cmp r5, #13
     0xf5c43c58: bne.w 0xf5c4411e
     0xf5c43c5c: ldr r5, [r0, #32]
     0xf5c43c5e: ldr r5, [r5, #20]
     0xf5c43c60: str r5, [r0, #16]
     0xf5c43c62: ldr r5, [r0, #16]
     0xf5c43c64: ldrt r5, [r5]
     0xf5c43c68: str r5, [r0, #16]
     0xf5c43c6a: eors r5, r5
     0xf5c43c6c: str r5, [r0, #20]
     0xf5c43c6e: ldr r5, [r0, #16]
     0xf5c43c70: ldr r6, [r0, #20]
     0xf5c43c72: lsls r2, r6, #24
     0xf5c43c74: lsrs r6, r6, #8
     0xf5c43c76: lsrs r5, r5, #8
     0xf5c43c78: adds r5, r5, r2
     0xf5c43c7a: str r5, [r0, #16]
     0xf5c43c7c: str r6, [r0, #20]
     0xf5c43c7e: ldr r5, [r0, #16]
     0xf5c43c80: ldr r6, [r0, #20]
     0xf5c43c82: cmp r6, #0
     0xf5c43c84: blt.w 0xf5c440fe
     0xf5c43c88: cmp r6, #0
     0xf5c43c8a: bne.w 0xf5c43c94
     0xf5c43c8e: cmp r5, #0
     0xf5c43c90: ble.w 0xf5c440fe
     0xf5c43c94: ldr r5, [r0, #32]
     0xf5c43c96: ldr r5, [r5, #20]
     0xf5c43c98: str r5, [r0, #16]
     0xf5c43c9a: ldr r5, [r0, #16]
     0xf5c43c9c: ldr r5, [r5, #4]
     0xf5c43c9e: str r5, [r0, #16]
     0xf5c43ca0: mov.w r12, #0
  => 0xf5c43ca4: ldrt r5, [r12]
     0xf5c43ca8: str r5, [r0, #8]
  […]
     0xf5c443fc: ands r0, r6
     0xf5c443fe: ; <UNDEFINED> instruction: 0xf7a24f00
     0xf5c44402: mov pc, r7
     0xf5c44404: ands r0, r6
     0xf5c44406: ; <UNDEFINED> instruction: 0xf7a24f00
     0xf5c4440a: mov pc, r7
     0xf5c4440c: ands r0, r6
     0xf5c4440e: ; <UNDEFINED> instruction: 0xf7a24f00
     0xf5c44412: mov pc, r7
     0xf5c44414: ands r0, r6
     0xf5c44416: ; <UNDEFINED> instruction: 0xf7a20000
  End of assembler dump.
  (gdb) p $r12
  $1 = 0
Apparently r12 is JIT_TMP0.
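To compare this crash with the one in the first message, the faulting pc can be expressed as an offset from the mcode start that was passed to scm_jit_enter_mcode. The Python sketch below only redoes that subtraction with the addresses taken from the two gdb sessions in this report; the dictionary keys are labels chosen here for illustration, not something Guile or gdb prints.

```python
# (mcode start, faulting pc) pairs copied from the two gdb backtraces above.
# The keys name the .go file that make reported as failing in each run.
crashes = {
    "specialize-primcalls": (0xF5C67A00, 0xF5C67B74),
    "slot-allocation": (0xF5C43B30, 0xF5C43CA4),
}

# Offset of the faulting instruction relative to the start of the JIT code.
offsets = {name: pc - start for name, (start, pc) in crashes.items()}

for name, offset in offsets.items():
    print(f"{name}: pc = mcode + {offset:#x}")
```

Both offsets come out as 0x174, and in both dumps the instruction at that offset is `ldrt r5, [r12]` immediately after `mov.w r12, #0`, so the fault is a load from address 0. That is consistent with the same JIT-compiled function failing in both runs, although it does not prove it.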
Anyway, it seems that I have an environment in which to reproduce and debug it now.
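When reproducing this way, the last `jit: entering mcode` line without a matching `jit: exited mcode` identifies the code that was running when the process died, and the earlier `jit: mcode:` line for the same address gives its size. A minimal Python sketch of that lookup follows. It assumes the GUILE_JIT_LOG=4 output keeps exactly the line format shown in the log above; the function name `last_unfinished_mcode` is hypothetical, and the sample is a short excerpt of that log.

```python
import re

# Line formats as they appear in the log quoted in this report (assumed stable).
MCODE_RE = re.compile(r"jit: mcode: (0x[0-9a-f]+),\+(\d+)")
ENTER_RE = re.compile(r"jit: entering mcode: (0x[0-9a-f]+)")


def last_unfinished_mcode(log_lines):
    """Return (start, size) of the innermost mcode entered but never exited.

    Returns None if every entry has a matching exit. The size is None when
    the log holds no "jit: mcode:" line for that start address.
    """
    sizes = {}  # mcode start address -> size in bytes
    stack = []  # start addresses of mcode entered and not yet exited
    for raw in log_lines:
        line = raw.strip()
        emitted = MCODE_RE.match(line)
        entered = ENTER_RE.match(line)
        if emitted:
            sizes[int(emitted.group(1), 16)] = int(emitted.group(2))
        elif entered:
            stack.append(int(entered.group(1), 16))
        elif line == "jit: exited mcode" and stack:
            stack.pop()
    if not stack:
        return None
    start = stack[-1]
    return start, sizes.get(start)


sample = [
    "jit: vcode: start=0xf5feaf34,+203 entry=+0",
    "jit: mcode: 0xf5c43240,+2288",
    "jit: entering mcode: 0xf5c43240",
    "jit: exited mcode",
    "jit: vcode: start=0xf5feb260,+203 entry=+0",
    "jit: mcode: 0xf5c43b30,+2280",
    "jit: entering mcode: 0xf5c43b30",
]

start, size = last_unfinished_mcode(sample)
print(f"disassemble {start:#x},+{size}")  # disassemble 0xf5c43b30,+2280
```

The printed line is the same gdb command used in the session above, so the helper is only a convenience for going from a long JIT log to the address range worth disassembling.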
Ludovic Courtès wrote on 22 Jan 15:11 +0100
control message for bug #39208
(address . control@debbugs.gnu.org)
severity 39208 important
quit
Ludovic Courtès wrote on 11 Mar 21:21 +0100
Re: bug#39208: [3.0.0] Segfault while building on ARMv7
(address . 39208-done@debbugs.gnu.org)
Ludovic Courtès <ludo@gnu.org> skribis:
> Building 3.0.0 with Guix on ARMv7 reproducibly fails:
>
> BOOTSTRAP GUILEC language/cps/loop-instrumentation.go
> wrote `language/cps/loop-instrumentation.go'
> BOOTSTRAP GUILEC language/cps/peel-loops.go
> wrote `language/cps/effects-analysis.go'
> BOOTSTRAP GUILEC language/cps/prune-top-level-scopes.go
> wrote `language/cps/licm.go'
> BOOTSTRAP GUILEC language/cps/reify-primitives.go
> wrote `language/cps/prune-top-level-scopes.go'
> BOOTSTRAP GUILEC language/cps/renumber.go
> wrote `language/cps/peel-loops.go'
> BOOTSTRAP GUILEC language/cps/rotate-loops.go
> wrote `language/cps/reify-primitives.go'
> BOOTSTRAP GUILEC language/cps/optimize.go
> wrote `language/cps/renumber.go'
> BOOTSTRAP GUILEC language/cps/simplify.go
> wrote `language/cps/rotate-loops.go'
> BOOTSTRAP GUILEC language/cps/self-references.go
> wrote `language/cps/optimize.go'
> BOOTSTRAP GUILEC language/cps/slot-allocation.go
> wrote `language/cps/self-references.go'
> BOOTSTRAP GUILEC language/cps/spec.go
> wrote `language/cps/simplify.go'
> BOOTSTRAP GUILEC language/cps/specialize-primcalls.go
> wrote `language/cps/spec.go'
> BOOTSTRAP GUILEC language/cps/specialize-numbers.go
> /gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 23019 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/specialize-primcalls.go" "../module/language/cps/specialize-primcalls.scm"
> make[2]: *** [Makefile:1931: language/cps/specialize-primcalls.go] Error 139
> make[2]: *** Waiting for unfinished jobs....
> wrote `language/cps/slot-allocation.go'
> wrote `language/cps/specialize-numbers.go'
> make[2]: Leaving directory '/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/bootstrap'
> make[1]: *** [Makefile:1849: all-recursive] Error 1
This also is fixed by commit 7c17655cd3d859bf0c5a86d9782a7788205fc05a (https://issues.guix.gnu.org/issue/39266).
Ludovic Courtès wrote on 12 Mar 17:01 +0100
control message for bug #39266
(address . control@debbugs.gnu.org)
merge 39266 39208
quit
