[3.0.0] Segfault while building on ARMv7

Done. Submitted by Ludovic Courtès.
Details
Participant: Ludovic Courtès
Owner: unassigned
Severity: important
Merged with
Ludovic Courtès wrote on 20 Jan 2020 17:33
(address . bug-Guile@gnu.org)
87a76igk9f.fsf@gnu.org
Hello,

Building 3.0.0 with Guix on ARMv7 reproducibly fails:

BOOTSTRAP GUILEC language/cps/loop-instrumentation.go
wrote `language/cps/loop-instrumentation.go'
BOOTSTRAP GUILEC language/cps/peel-loops.go
wrote `language/cps/effects-analysis.go'
BOOTSTRAP GUILEC language/cps/prune-top-level-scopes.go
wrote `language/cps/licm.go'
BOOTSTRAP GUILEC language/cps/reify-primitives.go
wrote `language/cps/prune-top-level-scopes.go'
BOOTSTRAP GUILEC language/cps/renumber.go
wrote `language/cps/peel-loops.go'
BOOTSTRAP GUILEC language/cps/rotate-loops.go
wrote `language/cps/reify-primitives.go'
BOOTSTRAP GUILEC language/cps/optimize.go
wrote `language/cps/renumber.go'
BOOTSTRAP GUILEC language/cps/simplify.go
wrote `language/cps/rotate-loops.go'
BOOTSTRAP GUILEC language/cps/self-references.go
wrote `language/cps/optimize.go'
BOOTSTRAP GUILEC language/cps/slot-allocation.go
wrote `language/cps/self-references.go'
BOOTSTRAP GUILEC language/cps/spec.go
wrote `language/cps/simplify.go'
BOOTSTRAP GUILEC language/cps/specialize-primcalls.go
wrote `language/cps/spec.go'
BOOTSTRAP GUILEC language/cps/specialize-numbers.go
/gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 23019 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/specialize-primcalls.go" "../module/language/cps/specialize-primcalls.scm"
make[2]: *** [Makefile:1931: language/cps/specialize-primcalls.go] Error 139
make[2]: *** Waiting for unfinished jobs....
wrote `language/cps/slot-allocation.go'
wrote `language/cps/specialize-numbers.go'
make[2]: Leaving directory '/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/bootstrap'
make[1]: *** [Makefile:1849: all-recursive] Error 1

It seems to always happen while building ‘specialize-primcalls.go’.
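
For reference, the `Error 139` reported by make is the shell's encoding of the crash: statuses above 128 conventionally mean the process was killed by signal (status - 128), and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch of that decoding, where the helper name `decode_status` is made up for illustration:

```shell
# Decode a shell-style exit status into a signal name.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l N prints the name of signal N, without the SIG prefix.
        kill -l "$((status - 128))"
    else
        echo "exit $status"
    fi
}

decode_status 139
```

On a GNU/Linux system this should print `SEGV`, matching the "Segmentation fault" message in the log.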

(See

The backtrace is unfortunately not all that readable:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xf5c67b74 in ?? ()
[Current thread is 1 (Thread 0xf7fe8010 (LWP 23019))]
(gdb) bt
#0 0xf5c67b74 in ?? ()
#1 0xf7f3ffcc in scm_jit_enter_mcode (thread=0xdedc20,
mcode=0xf5c67a00 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
#2 0xf7093a40 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info threads
Id Target Id Frame
* 1 Thread 0xf7fe8010 (LWP 23019) 0xf5c67b74 in ?? ()
2 Thread 0xf7894460 (LWP 23042) 0xf7e8f034 in __libc_do_syscall ()
from /gnu/store/n7c20pjm6q1xq1gqjqzzys1yk9fy7n1k-glibc-2.29/lib/libpthread.so.0
3 Thread 0xf69a5460 (LWP 23045) 0xf7e8f034 in __libc_do_syscall ()
from /gnu/store/n7c20pjm6q1xq1gqjqzzys1yk9fy7n1k-glibc-2.29/lib/libpthread.so.0
(gdb) frame 0
#0 0xf5c67b74 in ?? ()
(gdb) disassemble 0xf5c67a00,+500
Dump of assembler code from 0xf5c67a00 to 0xf5c67bf4:
0xf5c67a00: subs r0, #32
0xf5c67a02: subs r5, r1, r0
0xf5c67a04: cmp r5, #40 ; 0x28
0xf5c67a06: bne.w 0xf5c680d0
0xf5c67a0a: ldr r5, [r4, #16]
0xf5c67a0c: cmp r0, r5
0xf5c67a0e: blt.w 0xf5c680d0
0xf5c67a12: str r0, [r4, #8]
0xf5c67a14: ldr r5, [r4, #88] ; 0x58
0xf5c67a16: cmp r5, #0
0xf5c67a18: beq.w 0xf5c68102
0xf5c67a1c: ldrt r6, [r5]
0xf5c67a20: str r6, [r4, #88] ; 0x58
0xf5c67a22: str r5, [r0, #24]
0xf5c67a24: movw r5, #1293 ; 0x50d
0xf5c67a28: movs r6, #0
0xf5c67a2a: str r5, [r0, #16]
0xf5c67a2c: str r6, [r0, #20]
0xf5c67a2e: ldr r5, [r0, #24]
0xf5c67a30: ldr r6, [r0, #16]
0xf5c67a32: str r6, [r5, #0]
0xf5c67a34: ldr r5, [r0, #32]
0xf5c67a36: ldr r5, [r5, #4]
0xf5c67a38: str r5, [r0, #16]
[…]
0xf5c67b5e: cmp r5, #0
0xf5c67b60: ble.w 0xf5c67fce
0xf5c67b64: ldr r5, [r0, #32]
0xf5c67b66: ldr r5, [r5, #20]
0xf5c67b68: str r5, [r0, #16]
0xf5c67b6a: ldr r5, [r0, #16]
0xf5c67b6c: ldr r5, [r5, #4]
0xf5c67b6e: str r5, [r0, #16]
0xf5c67b70: mov.w r12, #0
=> 0xf5c67b74: ldrt r5, [r12]
0xf5c67b78: str r5, [r0, #8]
0xf5c67b7a: ldr r5, [r0, #8]
0xf5c67b7c: ldr r6, [r0, #16]
0xf5c67b7e: cmp r5, r6
0xf5c67b80: bne.w 0xf5c67f80
[…]
(gdb) info registers
r0 0xf7093a20 4144577056
r1 0xf7093a48 4144577096
r2 0x0 0
r3 0xf7a24001 4154605569
r4 0x74e00 478720
r5 0xdedc20 14605344
r6 0x0 0
r7 0xf5c67a00 4123425280
r8 0x0 0
r9 0x0 0
r10 0xf7fc4bdc 4160506844
r11 0xf7fb5000 4160442368
r12 0x0 0
sp 0xfffedc50 0xfffedc50
lr 0xf7f3ffcd -135004211
pc 0xf5c67b74 0xf5c67b74
cpsr 0x200f0030 537854000
fpscr 0x60000000 1610612736
(gdb) frame 1
#1 0xf7f3ffcc in scm_jit_enter_mcode (thread=0xdedc20,
mcode=0xf5c67a00 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
5725 enter_mcode (thread, mcode);
(gdb) info locals
No locals.
(gdb) p *thread
$2 = {next_thread = 0x5, vm = {ip = 0xdecd50, sp = 0x324602ae, fp = 0xdebc50, stack_limit = 0x30d,
compare_result = 72 'H', apply_hook_enabled = 130 '\202', return_hook_enabled = 82 'R',
next_hook_enabled = 1 '\001', abort_hook_enabled = 192 '\300', disable_mcode = 159 '\237', engine = 166 '\246',
unused = 1 '\001', stack_size = 15218, stack_bottom = 0x20d, apply_hook = 0x1528240, return_hook = 0x1a69fc0,
next_hook = 0x0, abort_hook = 0x5, stack_top = 0xdecd68, overflow_handler_stack = 0x28f45b3e,
registers = 0xdebc58, mra_after_abort = 0x20045 "", trace_level = -140048256}, pending_asyncs = 0x116ab78,
block_asyncs = 22184512, freelists = {0x20045, 0xf7a707ec, 0x1162bf8, 0x1528248, 0x5, 0x248360, 0x30999e00,
0xdebc60, 0x20d, 0x1528250, 0x1a69fc0, 0x0, 0x20045, 0xf7a70880, 0x1167440, 0x1528250, 0x5, 0xdecd80, 0x181aface,
0xdebc68, 0x20d, 0x1528258, 0x1a69fc0, 0x0, 0x20045, 0xf7a70880, 0x1167460, 0x1528258, 0x5, 0x248370, 0x91c2f1f,
0xdebc70}, pointerless_freelists = {0x30d, 0x7f1720, 0x1a69fc0, 0x1496, 0x30d, 0xdedd50, 0x96, 0x16, 0x5,
0x789740, 0x3275b29e, 0xdebc78, 0x30d, 0xdedd50, 0x16, 0x3fffffe, 0x30d, 0xdedd50, 0x96, 0x3fe, 0x5, 0xdecd98,
0x3afba8fc, 0xdebc80, 0x30d, 0x7f1720, 0x96, 0x16, 0x20d, 0x1528260, 0x1a69fc0, 0x0}, handle = 0x5,
pthread = 14601648, result = 0x38a97220, exited = 14597256, guile_mode = 131141, needs_unregister = -140048256,
wake = 0x1162128, sleep_cond = {__data = {{__wseq = 95281925316411917, __wseq32 = {__low = 525,
__high = 22184552}}, {__g1_start = 27697088, __g1_start32 = {__low = 27697088, __high = 0}}, __g_refs = {5,
7903072}, __g_size = {67881032, 14597264}, __g1_orig_size = 131141, __wrefs = 4154919040, __g_signals = {
18228776, 22184552}},
__size = "\r\002\000\000h\202R\001\300\237\246\001\000\000\000\000\005\000\000\000`\227x\000H\310\v\004\220\274\336\000E\000\002\000\200\b\247\367(&\026\001h\202R\001", __align = 95281925316411917}, sleep_pipe = {525, 22184560},
dynamic_state = 0xa, dynstack = {base = 0x0, top = 0x5, limit = 0x789780}, continuation_root = 0x2db7da0c,
continuation_base = 0xdebc98, base = 0x20d, jit_state = 0xdede20}

Unfortunately I’m unable to reproduce the bug outside Guix’s build
environment, even with ASLR disabled (what guix-daemon does).

I wonder if that could be the same issue as

I’ll happily take suggestions as to what debug info would be useful and
what I could bisect!

Ludo’.
Ludovic Courtès wrote on 20 Jan 2020 18:09
(address . 39208@debbugs.gnu.org)
87pnfef420.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

> Unfortunately I’m unable to reproduce the bug outside Guix’s build
> environment, even with ASLR disabled (what guix-daemon does).

I finally managed to reproduce it from the failed-build tree:

rm -vf bootstrap/language/cps/{slot-allocation,specialize-numbers,specialize-primcalls,spec}.go
GUILE_JIT_LOG=4 /run/current-system/profile/bin/linux32 -R make

which shows:

jit: entering mcode: 0xf7a5d1c0
jit: exited mcode
jit: entering mcode: 0xf7a5d1c0
jit: exited mcode
jit: entering mcode: 0xf7956ca0
jit: exited mcode
jit: entering mcode: 0xf791a9f0
jit: exited mcode
jit: entering mcode: 0xf78b03d0
jit: exited mcode
jit: entering mcode: 0xf7a5d1c0
jit: exited mcode
jit: entering mcode: 0xf79407bb
jit: exited mcode
jit: entering mcode: 0xf7a5d1c0
jit: exited mcode
jit: entering mcode: 0xf79407bb
jit: exited mcode
jit: entering mcode: 0xf7a5d1c0
jit: exited mcode
jit: vcode: start=0xf5fe95d4,+203 entry=+0
jit: mcode: 0xf5c3eac0,+2288
jit: entering mcode: 0xf5c3eac0
jit: exited mcode
jit: vcode: start=0xf5fe9900,+203 entry=+0
jit: mcode: 0xf5c3f3b0,+2288
jit: entering mcode: 0xf5c3f3b0
jit: exited mcode
jit: vcode: start=0xf5fe9c2c,+203 entry=+0
jit: mcode: 0xf5c3fca0,+2288
jit: entering mcode: 0xf5c3fca0
jit: exited mcode
jit: vcode: start=0xf5fe9f58,+203 entry=+0
jit: mcode: 0xf5c40590,+2288
jit: entering mcode: 0xf5c40590
jit: exited mcode
jit: vcode: start=0xf5fea284,+203 entry=+0
jit: mcode: 0xf5c40e80,+2288
jit: entering mcode: 0xf5c40e80
jit: exited mcode
jit: vcode: start=0xf5fea5b0,+203 entry=+0
jit: mcode: 0xf5c41770,+2288
jit: entering mcode: 0xf5c41770
jit: exited mcode
jit: vcode: start=0xf5fea8dc,+203 entry=+0
jit: mcode: 0xf5c42060,+2288
jit: entering mcode: 0xf5c42060
jit: exited mcode
jit: vcode: start=0xf5feac08,+203 entry=+0
jit: mcode: 0xf5c42950,+2288
jit: entering mcode: 0xf5c42950
jit: exited mcode
jit: vcode: start=0xf5feaf34,+203 entry=+0
jit: mcode: 0xf5c43240,+2288
jit: entering mcode: 0xf5c43240
jit: exited mcode
jit: vcode: start=0xf5feb260,+203 entry=+0
jit: mcode: 0xf5c43b30,+2280
jit: entering mcode: 0xf5c43b30
/gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 13151 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/slot-allocation.go" "../module/language/cps/slot-allocation.scm"
make[2]: *** [Makefile:1931: language/cps/slot-allocation.go] Error 139
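
In a log like this, the interesting line is the last "entering mcode" that has no matching "exited mcode": here 0xf5c43b30, which is the block disassembled in the backtrace. A rough sketch for pulling that address out of a longer log, assuming the line format shown above (the function name `last_unfinished_mcode` is hypothetical):

```shell
# Print the address of the last mcode block that was entered but never exited.
# Reads GUILE_JIT_LOG-style output on stdin.
last_unfinished_mcode() {
    awk '
        /^jit: entering mcode: / { pending = $4 }   # remember the latest entry
        /^jit: exited mcode/     { pending = "" }   # a matching exit was seen
        END { if (pending != "") print pending }
    '
}

# Sample input: the last few JIT lines of the log above.
last_unfinished_mcode <<'EOF'
jit: entering mcode: 0xf5c43240
jit: exited mcode
jit: vcode: start=0xf5feb260,+203 entry=+0
jit: mcode: 0xf5c43b30,+2280
jit: entering mcode: 0xf5c43b30
EOF
```

This prints `0xf5c43b30` for the sample. With a parallel make, or several threads running JIT code, lines from different processes can interleave, so the result is only a hint about where to point gdb.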

Backtrace:

#0 0xf5c43ca4 in ?? ()
[Current thread is 1 (Thread 0xf7fe8010 (LWP 13151))]
(gdb) bt
#0 0xf5c43ca4 in ?? ()
#1 0xf7f3ffcc in scm_jit_enter_mcode (thread=0x74fe10,
mcode=0xf5c43b30 " 8\r\032(-@\360c\203%i\250B\300\362_\203\240`\245m") at jit.c:5725
#2 0x00021048 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) disassemble 0xf5c43b30,+2280
Dump of assembler code from 0xf5c43b30 to 0xf5c44418:
0xf5c43b30: subs r0, #32
0xf5c43b32: subs r5, r1, r0
0xf5c43b34: cmp r5, #40 ; 0x28
0xf5c43b36: bne.w 0xf5c44200
0xf5c43b3a: ldr r5, [r4, #16]
0xf5c43b3c: cmp r0, r5
0xf5c43b3e: blt.w 0xf5c44200
0xf5c43b42: str r0, [r4, #8]
0xf5c43b44: ldr r5, [r4, #88] ; 0x58
0xf5c43b46: cmp r5, #0
0xf5c43b48: beq.w 0xf5c44232
0xf5c43b4c: ldrt r6, [r5]
0xf5c43b50: str r6, [r4, #88] ; 0x58
0xf5c43b52: str r5, [r0, #24]
0xf5c43b54: movw r5, #1293 ; 0x50d
0xf5c43b58: movs r6, #0
[…]
0xf5c43c52: and.w r5, r5, #127 ; 0x7f
0xf5c43c56: cmp r5, #13
0xf5c43c58: bne.w 0xf5c4411e
0xf5c43c5c: ldr r5, [r0, #32]
0xf5c43c5e: ldr r5, [r5, #20]
0xf5c43c60: str r5, [r0, #16]
0xf5c43c62: ldr r5, [r0, #16]
0xf5c43c64: ldrt r5, [r5]
0xf5c43c68: str r5, [r0, #16]
0xf5c43c6a: eors r5, r5
0xf5c43c6c: str r5, [r0, #20]
0xf5c43c6e: ldr r5, [r0, #16]
0xf5c43c70: ldr r6, [r0, #20]
0xf5c43c72: lsls r2, r6, #24
0xf5c43c74: lsrs r6, r6, #8
0xf5c43c76: lsrs r5, r5, #8
0xf5c43c78: adds r5, r5, r2
0xf5c43c7a: str r5, [r0, #16]
0xf5c43c7c: str r6, [r0, #20]
0xf5c43c7e: ldr r5, [r0, #16]
0xf5c43c80: ldr r6, [r0, #20]
0xf5c43c82: cmp r6, #0
0xf5c43c84: blt.w 0xf5c440fe
0xf5c43c88: cmp r6, #0
0xf5c43c8a: bne.w 0xf5c43c94
0xf5c43c8e: cmp r5, #0
0xf5c43c90: ble.w 0xf5c440fe
0xf5c43c94: ldr r5, [r0, #32]
0xf5c43c96: ldr r5, [r5, #20]
0xf5c43c98: str r5, [r0, #16]
0xf5c43c9a: ldr r5, [r0, #16]
0xf5c43c9c: ldr r5, [r5, #4]
0xf5c43c9e: str r5, [r0, #16]
0xf5c43ca0: mov.w r12, #0
=> 0xf5c43ca4: ldrt r5, [r12]
0xf5c43ca8: str r5, [r0, #8]
[…]
0xf5c443fc: ands r0, r6
0xf5c443fe: ; <UNDEFINED> instruction: 0xf7a24f00
0xf5c44402: mov pc, r7
0xf5c44404: ands r0, r6
0xf5c44406: ; <UNDEFINED> instruction: 0xf7a24f00
0xf5c4440a: mov pc, r7
0xf5c4440c: ands r0, r6
0xf5c4440e: ; <UNDEFINED> instruction: 0xf7a24f00
0xf5c44412: mov pc, r7
0xf5c44414: ands r0, r6
0xf5c44416: ; <UNDEFINED> instruction: 0xf7a20000
End of assembler dump.
(gdb) p $r12
$1 = 0

Apparently r12 is JIT_TMP0.
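
One thing that can be read off the two backtraces: the faulting pc is at the same distance from the mcode entry point in both runs, and the mcode bytes gdb prints are identical. That is consistent with the same compiled function crashing at the same instruction each time, although this is an inference from the addresses and not something confirmed here. The arithmetic, as a small sketch (`mcode_offset` is a made-up helper name):

```shell
# Offset of a faulting pc from the start of its JIT-compiled code block.
mcode_offset() {
    # $1 = faulting pc, $2 = mcode start address (both 0x-prefixed hexadecimal)
    printf '0x%x\n' "$(( $1 - $2 ))"
}

mcode_offset 0xf5c67b74 0xf5c67a00   # addresses from the first report
mcode_offset 0xf5c43ca4 0xf5c43b30   # addresses from this reproduction
```

Both calls print `0x174`, that is 372 bytes into a block of roughly 2280 bytes.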

Anyway, it seems that I have an environment in which to reproduce and
debug it now.

Ludo’.
Ludovic Courtès wrote on 22 Jan 2020 15:11
control message for bug #39208
(address . control@debbugs.gnu.org)
87tv4nbmxi.fsf@gnu.org
severity 39208 important
quit
Ludovic Courtès wrote on 11 Mar 2020 21:21
Re: bug#39208: [3.0.0] Segfault while building on ARMv7
(address . 39208-done@debbugs.gnu.org)
87y2s6fxsg.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> Building 3.0.0 with Guix on ARMv7 reproducibly fails:
>
> BOOTSTRAP GUILEC language/cps/loop-instrumentation.go
> wrote `language/cps/loop-instrumentation.go'
> BOOTSTRAP GUILEC language/cps/peel-loops.go
> wrote `language/cps/effects-analysis.go'
> BOOTSTRAP GUILEC language/cps/prune-top-level-scopes.go
> wrote `language/cps/licm.go'
> BOOTSTRAP GUILEC language/cps/reify-primitives.go
> wrote `language/cps/prune-top-level-scopes.go'
> BOOTSTRAP GUILEC language/cps/renumber.go
> wrote `language/cps/peel-loops.go'
> BOOTSTRAP GUILEC language/cps/rotate-loops.go
> wrote `language/cps/reify-primitives.go'
> BOOTSTRAP GUILEC language/cps/optimize.go
> wrote `language/cps/renumber.go'
> BOOTSTRAP GUILEC language/cps/simplify.go
> wrote `language/cps/rotate-loops.go'
> BOOTSTRAP GUILEC language/cps/self-references.go
> wrote `language/cps/optimize.go'
> BOOTSTRAP GUILEC language/cps/slot-allocation.go
> wrote `language/cps/self-references.go'
> BOOTSTRAP GUILEC language/cps/spec.go
> wrote `language/cps/simplify.go'
> BOOTSTRAP GUILEC language/cps/specialize-primcalls.go
> wrote `language/cps/spec.go'
> BOOTSTRAP GUILEC language/cps/specialize-numbers.go
> /gnu/store/nvkn00kq4x4g5wjjjvjj6rhzs0ihihxl-bash-minimal-5.0.7/bin/bash: line 6: 23019 Segmentation fault (core dumped) GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="arm-unknown-linux-gnueabihf" -O1 -Oresolve-primitives -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/module" -L "/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/guile-readline" -o "language/cps/specialize-primcalls.go" "../module/language/cps/specialize-primcalls.scm"
> make[2]: *** [Makefile:1931: language/cps/specialize-primcalls.go] Error 139
> make[2]: *** Waiting for unfinished jobs....
> wrote `language/cps/slot-allocation.go'
> wrote `language/cps/specialize-numbers.go'
> make[2]: Leaving directory '/tmp/guix-build-guile-next-3.0.0.drv-0/guile-3.0.0/bootstrap'
> make[1]: *** [Makefile:1849: all-recursive] Error 1

This is also fixed by commit 7c17655cd3d859bf0c5a86d9782a7788205fc05a.

\o/

Ludo’.
Closed
Ludovic Courtès wrote on 12 Mar 2020 17:01
control message for bug #39266
(address . control@debbugs.gnu.org)
878sk5sgt1.fsf@gnu.org
merge 39266 39208
quit

This issue is archived.

To comment on this conversation send email to 39208@debbugs.gnu.org