libsigsegv fails to build on emulated aarch64 [core-updates]

Status: Open
Participants: Efraim Flashner, Sarah Morgensen, Ludovic Courtès, Maxime Devos
Owner: unassigned
Submitted by: Maxime Devos
Severity: normal

Maxime Devos wrote on 10 Jul 2021 20:07
(address . bug-guix@gnu.org)
d8da64a0c855ce2923fc232799bf089a70704d9e.camel@telenet.be
Hi guix

I noticed a new build failure on ci.guix.gnu.org:

Relevant log output:

make[2]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13'
make[1]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13'
phase `build' succeeded after 44.7 seconds
starting phase `check'
yes
checking for working strerror function... Making check in src
make[1]: Entering directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/src'
make[1]: Nothing to be done for 'check'.
make[1]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/src'
Making check in tests
make[1]: Entering directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
make check-TESTS
make[2]: Entering directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
time.h
checking for struct tm.tm_zone... yes
checking for fake locale system (OpenBSD)... make[3]: Entering directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
yes
checking for struct tm.tm_gmtoff... FAIL: stackoverflow2
PASS: sigsegv2
../build-aux/test-driver: line 109: 6663 Segmentation fault "$@" > $log_file 2>&1
PASS: sigsegv3
PASS: sigsegv1
FAIL: stackoverflow1
yes
checking for nlink_t... (cached) yes
checking whether unlink honors trailing slashes... yes
checking for O_CLOEXEC... no
checking for Solaris 11.4 locale system... no
checking for getlocalename_l... ============================================================================
Testsuite summary for libsigsegv 2.13
============================================================================
# TOTAL: 5
# PASS: 3
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
============================================================================
See tests/test-suite.log
============================================================================
make[3]: *** [Makefile:729: test-suite.log] Error 1
make[3]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
make[2]: *** [Makefile:837: check-TESTS] Error 2
make[2]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
make[1]: *** [Makefile:966: check-am] Error 2
make[1]: Leaving directory '/tmp/guix-build-libsigsegv-2.13.drv-0/libsigsegv-2.13/tests'
make: *** [Makefile:432: check-recursive] Error 1

Test suite failed, dumping logs.

--- ./tests/test-suite.log --------------------------------------------------

===========================================
libsigsegv 2.13: tests/test-suite.log
===========================================

# TOTAL: 5
# PASS: 3
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: stackoverflow1
====================

qemu: uncaught target signal 11 (Segmentation fault) - core dumped
FAIL stackoverflow1 (exit status: 139)

FAIL: stackoverflow2
====================

Starting recursion pass 1.
Stack overflow 1 missed.
FAIL stackoverflow2 (exit status: 1)


error: in phase 'check': uncaught exception:
%exception #<&invoke-error program: "make" arguments: ("check" "-j" "16") exit-status: 2 term-signal: #f stop-signal: #f>
phase `check' failed after 3.5 seconds
command "make" "check" "-j" "16" failed with status 2
builder for `/gnu/store/ww5bf6xz13wxjs1sjvjc2kmwq5mrjdj5-libsigsegv-2.13.drv' failed with exit code 1
@ build-failed /gnu/store/ww5bf6xz13wxjs1sjvjc2kmwq5mrjdj5-libsigsegv-2.13.drv - 1 builder for `/gnu/store/ww5bf6xz13wxjs1sjvjc2kmwq5mrjdj5-libsigsegv-2.13.drv' failed with exit code 1
cannot build derivation `/gnu/store/5pfnq2666wp2gg1h6yl0c92din8n24wc-gawk-5.1.0.drv': 1 dependencies couldn't be built

Greetings,
Maxime.
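
For reference when reading the log above: the exit status 139 reported for stackoverflow1 matches the common shell convention of 128 plus the signal number, here signal 11 (SIGSEGV), which agrees with QEMU's "uncaught target signal 11" message. A minimal Python sketch of that arithmetic follows; the helper name describe_exit_status is hypothetical and is not part of any tool mentioned in this thread.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper, for illustration)."""
    # By convention, shells report "killed by signal N" as 128 + N.
    # A program can also exit with such a code directly, so this is a heuristic.
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


# The two failing tests from the log: stackoverflow1 (139) and stackoverflow2 (1).
print(describe_exit_status(139))
print(describe_exit_status(1))
```

Read this way, stackoverflow1 died from an unhandled segmentation fault, while stackoverflow2 ran to completion and reported the failure itself ("Stack overflow 1 missed.").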


Ludovic Courtès wrote on 11 Jul 2021 00:19
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 49509@debbugs.gnu.org)
875yxhbynf.fsf@gnu.org
Hi,

Maxime Devos <maximedevos@telenet.be> wrote:

> FAIL: stackoverflow1
> ====================
>
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> FAIL stackoverflow1 (exit status: 139)
>
> FAIL: stackoverflow2
> ====================
>
> Starting recursion pass 1.
> Stack overflow 1 missed.
> FAIL stackoverflow2 (exit status: 1)

For now I worked around it by offloading this to a “real” machine
(overdrive1), where it builds fine. I wonder if there’s much we can do
regarding QEMU’s behavior here.

Ludo’.

Maxime Devos wrote on 11 Jul 2021 16:11
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 49509@debbugs.gnu.org)
d6ed6b9f9275c1cb01f85fb42f069108371185f7.camel@telenet.be
Ludovic Courtès wrote on Sun 11-07-2021 at 00:19 [+0200]:
> Hi,
>
> Maxime Devos <maximedevos@telenet.be> wrote:
>
>
> > FAIL: stackoverflow1
> > ====================
> >
> > qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> > FAIL stackoverflow1 (exit status: 139)
> >
> > FAIL: stackoverflow2
> > ====================
> >
> > Starting recursion pass 1.
> > Stack overflow 1 missed.
> > FAIL stackoverflow2 (exit status: 1)
>
> For now I worked around it by offloading this to a “real” machine
> (overdrive1), where it builds fine. I wonder if there’s much we can do
> regarding QEMU’s behavior here.

Maybe detect if QEMU is used, and if so, don't run the test suite?
Not really a ‘clean’ solution though, w.r.t. reproducibility,
and I wouldn't know how to detect this.

If this is a bug in QEMU, then ideally that would be fixed in QEMU,
but I wouldn't know where to look.

Greetings,
Maxime.


Ludovic Courtès wrote on 11 Jul 2021 18:13
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 49509@debbugs.gnu.org)
87bl78akvx.fsf@gnu.org
Hello,

Maxime Devos <maximedevos@telenet.be> wrote:

> Ludovic Courtès wrote on Sun 11-07-2021 at 00:19 [+0200]:
>> Hi,
>>
>> Maxime Devos <maximedevos@telenet.be> wrote:
>>
>> > FAIL: stackoverflow1
>> > ====================
>> >
>> > qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>> > FAIL stackoverflow1 (exit status: 139)
>> >
>> > FAIL: stackoverflow2
>> > ====================
>> >
>> > Starting recursion pass 1.
>> > Stack overflow 1 missed.
>> > FAIL stackoverflow2 (exit status: 1)
>>
>> For now I worked around it by offloading this to a “real” machine
>> (overdrive1), where it builds fine. I wonder if there’s much we can do
>> regarding QEMU’s behavior here.
>
> Maybe detect if QEMU is used, and if so, don't run the test suite?
> Not really a ‘clean’ solution though, w.r.t. reproducibility,
> and I wouldn't know how to detect this.

Yeah, I’d rather avoid that.

> If this is a bug in QEMU, then ideally that would be fixed in QEMU,
> but I wouldn't know where to look.

It could be that someone else on the intertubes stumbled upon that
issue, that’d be great. It could be that libsigsegv plays tricks that
don’t fare well with QEMU’s expectations, as in
<https://bugzilla.redhat.com/show_bug.cgi?id=1493304#c5>. We should ask
on bug-libsigsegv@gnu.org.

Thanks,
Ludo’.

Sarah Morgensen wrote on 30 Sep 2021 04:40
(name . Ludovic Courtès)(address . ludo@gnu.org)
86czoqu6p7.fsf@mgsn.dev
Hi all,

Ludovic Courtès <ludo@gnu.org> writes:

> Hello,
>
> Maxime Devos <maximedevos@telenet.be> wrote:
>
>> Ludovic Courtès wrote on Sun 11-07-2021 at 00:19 [+0200]:
>>> Hi,
>>>
>>> Maxime Devos <maximedevos@telenet.be> wrote:
>>>
>>> > FAIL: stackoverflow1
>>> > ====================
>>> >
>>> > qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>>> > FAIL stackoverflow1 (exit status: 139)
>>> >
>>> > FAIL: stackoverflow2
>>> > ====================
>>> >
>>> > Starting recursion pass 1.
>>> > Stack overflow 1 missed.
>>> > FAIL stackoverflow2 (exit status: 1)
>>>
>>> For now I worked around it by offloading this to a “real” machine
>>> (overdrive1), where it builds fine. I wonder if there’s much we can do
>>> regarding QEMU’s behavior here.
>>
>> Maybe detect if QEMU is used, and if so, don't run the test suite?
>> Not really a ‘clean’ solution though, w.r.t. reproducibility,
>> and I wouldn't know how to detect this.
>
> Yeah, I’d rather avoid that.
>
>> If this is a bug in QEMU, then ideally that would be fixed in QEMU,
>> but I wouldn't know where to look.
>
> It could be that someone else on the intertubes stumbled upon that
> issue, that’d be great. It could be that libsigsegv plays tricks that
> don’t fare well with QEMU’s expectations, as in
> <https://bugzilla.redhat.com/show_bug.cgi?id=1493304#c5>. We should ask
> on bug-libsigsegv@gnu.org.
>
> Thanks,
> Ludo’.

(I just realized I never actually replied to this!)

Configuring with "--disable-stackvma" seems to fix this. Doing this
makes libsigsegv use a different heuristic for determining if a SIGSEGV
was a stack overflow. I don't think it should impact functionality.
Perhaps just apply that to aarch64 until there's a proper fix?
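
As background, and going only by the option name: a VMA-based check presumably asks whether the faulting address lies in, or just below, the memory mapping that holds the stack. The Python sketch below illustrates that general idea on a made-up /proc/self/maps excerpt. It is not libsigsegv's actual code; SAMPLE_MAPS, find_stack_vma, looks_like_stack_overflow and the slack value are all assumptions chosen for illustration.

```python
from typing import Optional, Tuple

# Made-up excerpt in the format of Linux's /proc/<pid>/maps (illustrative only).
SAMPLE_MAPS = """\
55d0c2a00000-55d0c2a21000 r-xp 00000000 08:01 1234 /usr/bin/example
7ffc1a2b3000-7ffc1a2d4000 rw-p 00000000 00:00 0 [stack]
"""


def find_stack_vma(maps_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the mapping labelled [stack], or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[stack]":
            start_hex, end_hex = fields[0].split("-")
            return int(start_hex, 16), int(end_hex, 16)
    return None


def looks_like_stack_overflow(fault_addr: int, vma: Tuple[int, int],
                              slack: int = 0x10000) -> bool:
    """Simplified test: is the fault inside or just below the stack mapping?"""
    start, end = vma
    return start - slack <= fault_addr < end


vma = find_stack_vma(SAMPLE_MAPS)
if vma is not None:
    # A fault a few bytes below the stack mapping is treated as an overflow.
    print(looks_like_stack_overflow(vma[0] - 8, vma))
```

A check of this kind depends on the mapping information the process sees being accurate, which is one place where an emulated environment could plausibly differ from real hardware.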

This is probably a QEMU bug... I will try to report this to upstream
QEMU when I can, as I can't find my notes on this right now.

--
Sarah

Efraim Flashner wrote on 30 Sep 2021 10:37
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
YVV3UJQsUIDra25h@3900XT
On Wed, Sep 29, 2021 at 07:40:20PM -0700, Sarah Morgensen wrote:
> Hi all,
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
> > Hello,
> >
> > Maxime Devos <maximedevos@telenet.be> wrote:
> >
> >> Ludovic Courtès wrote on Sun 11-07-2021 at 00:19 [+0200]:
> >>> Hi,
> >>>
> >>> Maxime Devos <maximedevos@telenet.be> wrote:
> >>>
> >>> > FAIL: stackoverflow1
> >>> > ====================
> >>> >
> >>> > qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> >>> > FAIL stackoverflow1 (exit status: 139)
> >>> >
> >>> > FAIL: stackoverflow2
> >>> > ====================
> >>> >
> >>> > Starting recursion pass 1.
> >>> > Stack overflow 1 missed.
> >>> > FAIL stackoverflow2 (exit status: 1)
> >>>
> >>> For now I worked around it by offloading this to a “real” machine
> >>> (overdrive1), where it builds fine. I wonder if there’s much we can do
> >>> regarding QEMU’s behavior here.
> >>
> >> Maybe detect if QEMU is used, and if so, don't run the test suite?
> >> Not really a ‘clean’ solution though, w.r.t. reproducibility,
> >> and I wouldn't know how to detect this.
> >
> > Yeah, I’d rather avoid that.
> >
> >> If this is a bug in QEMU, then ideally that would be fixed in QEMU,
> >> but I wouldn't know where to look.
> >
> > It could be that someone else on the intertubes stumbled upon that
> > issue, that’d be great. It could be that libsigsegv plays tricks that
> > don’t fare well with QEMU’s expectations, as in
> > <https://bugzilla.redhat.com/show_bug.cgi?id=1493304#c5>. We should ask
> > on bug-libsigsegv@gnu.org.
> >
> > Thanks,
> > Ludo’.
>
> (I just realized I never actually replied to this!)
>
> Configuring with "--disable-stackvma" seems to fix this. Doing this
> makes libsigsegv use a different heuristic for determining if a SIGSEGV
> was a stack overflow. I don't think it should impact functionality.
> Perhaps just apply that to aarch64 until there's a proper fix?
>
> This is probably a QEMU bug... I will try to report this to upstream
> QEMU when I can, as I can't find my notes on this right now.
>

I came across this on x86_64 when using our qemu-binfmt service when
building for powerpc-linux too, and I'm pretty sure powerpc64le-linux
and armhf-linux also. I haven't tried going the other direction, from
aarch64-linux and emulating x86_64/i686 to see if it happens there too.

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted


Ludovic Courtès wrote on 30 Sep 2021 22:20
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87fstl3jek.fsf@gnu.org
Hi!

Sarah Morgensen <iskarian@mgsn.dev> wrote:

> Configuring with "--disable-stackvma" seems to fix this. Doing this
> makes libsigsegv use a different heuristic for determining if a SIGSEGV
> was a stack overflow. I don't think it should impact functionality.
> Perhaps just apply that to aarch64 until there's a proper fix?

Sounds like a good interim measure, for aarch64 and powerpc at least as
Efraim writes.

If you can come up with a patch, we could apply it in the upcoming
rebuild.

Thanks,
Ludo’.
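
The interim measure discussed above amounts to passing --disable-stackvma only on the affected architectures. In Guix this would be expressed in the package definition; the Python sketch below only models the selection logic and is not a patch. The function name and the prefix list are hypothetical, taken from the systems named in this thread (aarch64 and powerpc).

```python
from typing import List

# Architecture prefixes named in the thread as affected (assumption: prefix
# matching on the Guix system string, e.g. "aarch64-linux").
AFFECTED_PREFIXES = ("aarch64", "powerpc")


def libsigsegv_configure_flags(system: str) -> List[str]:
    """Return extra configure flags for a given system string (hypothetical)."""
    if system.startswith(AFFECTED_PREFIXES):
        # Workaround suggested in the thread until there is a proper fix.
        return ["--disable-stackvma"]
    return []


for system in ("aarch64-linux", "powerpc64le-linux", "x86_64-linux"):
    print(system, libsigsegv_configure_flags(system))
```

Keying the flag on the target system rather than on whether QEMU is in use keeps the build result the same on emulated and native machines, which addresses the reproducibility concern raised earlier in the thread.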