shepherd fails tests on all systems except x86_64

  • Done
Details
6 participants
  • Andreas Enge
  • Brice Waegeneire
  • Efraim Flashner
  • Ludovic Courtès
  • Marius Bakke
  • Mark H Weaver
Owner
unassigned
Submitted by
Mark H Weaver
Severity
serious
Mark H Weaver wrote on 31 Jan 2018 04:07
(address . bug-guix@gnu.org)
87zi4uvi88.fsf@netris.org
On core-updates, Hydra has been unable to successfully build 'shepherd'
on any system except x86_64-linux. I can also report that on my
mips64el-linux GuixSD system, which is running something close to
'core-updates', I had to disable tests on shepherd in order to build it.
I don't know about aarch64-linux.

These are the tests that commonly fail:

FAIL: tests/respawn.sh
FAIL: tests/respawn-throttling.sh
FAIL: tests/basic.sh

Mark
Efraim Flashner wrote on 3 Feb 2018 20:53
(name . Mark H Weaver)(address . mhw@netris.org)(address . 30299@debbugs.gnu.org)
20180203195335.GA1003@macbook41
On Tue, Jan 30, 2018 at 10:07:35PM -0500, Mark H Weaver wrote:
> On core-updates, Hydra has been unable to successfully build 'shepherd'
> on any system except x86_64-linux. I can also report that on my
> mips64el-linux GuixSD system, which is running something close to
> 'core-updates', I had to disable tests on shepherd in order to build it.
> I don't know about aarch64-linux.
>
> These are the tests that commonly fail:
>
> FAIL: tests/respawn.sh
> FAIL: tests/respawn-throttling.sh
> FAIL: tests/basic.sh
>
> Mark

Shepherd built successfully on aarch64-linux, and again with '--check'.

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Ludovic Courtès wrote on 4 Feb 2018 23:48
(name . Mark H Weaver)(address . mhw@netris.org)(address . 30299@debbugs.gnu.org)
87k1vsfk1u.fsf@gnu.org
Hello,

Mark H Weaver <mhw@netris.org> skribis:

> On core-updates, Hydra has been unable to successfully build 'shepherd'
> on any system except x86_64-linux. I can also report that on my
> mips64el-linux GuixSD system, which is running something close to
> 'core-updates', I had to disable tests on shepherd in order to build it.
> I don't know about aarch64-linux.
>
> These are the tests that commonly fail:
>
> FAIL: tests/respawn.sh
> FAIL: tests/respawn-throttling.sh
> FAIL: tests/basic.sh

This is a non-deterministic failure. I could reproduce the
tests/basic.sh one and it is fixed by this:

https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=cc9564586729a5bb90dd5d2722b543fdde9ab821

I’ll roll a new Shepherd release soon and update the package.

The rest may be a duplicate of 23811, though I couldn’t reproduce it.

Thanks,
Ludo’.
Mark H Weaver wrote on 14 Feb 2018 09:52
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30299@debbugs.gnu.org)
87sha4osus.fsf@netris.org
ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> On core-updates, Hydra has been unable to successfully build 'shepherd'
>> on any system except x86_64-linux. I can also report that on my
>> mips64el-linux GuixSD system, which is running something close to
>> 'core-updates', I had to disable tests on shepherd in order to build it.
>> I don't know about aarch64-linux.
>>
>> These are the tests that commonly fail:
>>
>> FAIL: tests/respawn.sh
>> FAIL: tests/respawn-throttling.sh
>> FAIL: tests/basic.sh
>
> This is a non-deterministic failure. I could reproduce the
> tests/basic.sh one and it is fixed by this:
>
> https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=cc9564586729a5bb90dd5d2722b543fdde9ab821

Thank you!

> I’ll roll a new Shepherd release soon and update the package.

For now, I added your patch to the shepherd package in core-updates, in
commit f2d2ee42f168909f27c0c3b6532ef16febfd3b86.

Mark
Mark H Weaver wrote on 14 Feb 2018 10:16
(address . 30299@debbugs.gnu.org)
87efloorqz.fsf@netris.org
Mark H Weaver <mhw@netris.org> writes:

> On core-updates, Hydra has been unable to successfully build 'shepherd'
> on any system except x86_64-linux. I can also report that on my
> mips64el-linux GuixSD system, which is running something close to
> 'core-updates', I had to disable tests on shepherd in order to build it.
> I don't know about aarch64-linux.
>
> These are the tests that commonly fail:
>
> FAIL: tests/respawn.sh
> FAIL: tests/respawn-throttling.sh
> FAIL: tests/basic.sh

FYI, after 18 failed attempts, Hydra finally built shepherd on i686
successfully on the 19th try:


Now that I've added Ludovic's patch, hopefully it will require fewer
attempts this time :)
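
To put a number on how flaky a test is, one can rerun it and count the attempts until it first passes. The following is only a minimal sketch: `attempts_until_success` is a hypothetical helper, not part of Shepherd or Guix, and the invocation shown in the comment assumes a configured Shepherd source tree whose Automake test harness accepts a `TESTS` override.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it succeeds or MAX attempts
# have been made.  Prints the number of attempts made; returns 0 if the
# command eventually succeeded, 1 if every attempt failed.
attempts_until_success () {
    max=$1; shift
    n=0
    while [ "$n" -lt "$max" ]; do
        n=$((n + 1))
        # Discard the command's own output; only the count is reported.
        if "$@" >/dev/null 2>&1; then
            echo "$n"
            return 0
        fi
    done
    echo "$n"
    return 1
}

# Example (assumption: run from a configured Shepherd source tree):
#   attempts_until_success 20 make check TESTS=tests/basic.sh
```

A result of 1 means the test passed on the first try; a non-zero return status means it never passed within the allowed attempts.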

Mark
Ludovic Courtès wrote on 14 Feb 2018 14:00
(name . Mark H Weaver)(address . mhw@netris.org)(address . 30299@debbugs.gnu.org)
873723vi8t.fsf@gnu.org
Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> On core-updates, Hydra has been unable to successfully build 'shepherd'
>>> on any system except x86_64-linux. I can also report that on my
>>> mips64el-linux GuixSD system, which is running something close to
>>> 'core-updates', I had to disable tests on shepherd in order to build it.
>>> I don't know about aarch64-linux.
>>>
>>> These are the tests that commonly fail:
>>>
>>> FAIL: tests/respawn.sh
>>> FAIL: tests/respawn-throttling.sh
>>> FAIL: tests/basic.sh
>>
>> This is a non-deterministic failure. I could reproduce the
>> tests/basic.sh one and it is fixed by this:
>>
>> https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=cc9564586729a5bb90dd5d2722b543fdde9ab821
>
> Thank you!
>
>> I’ll roll a new Shepherd release soon and update the package.
>
> For now, I added your patch to the shepherd package in core-updates, in
> commit f2d2ee42f168909f27c0c3b6532ef16febfd3b86.

Thanks, and sorry for not acting earlier! (I was waiting for a reply
from the Translation Project to make the new release, but that hasn’t
happened yet.)

Ludo’.
Mark H Weaver wrote on 15 Feb 2018 20:21
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30299@debbugs.gnu.org)
87a7wam53e.fsf@netris.org
Hi Ludovic,

Mark H Weaver <mhw@netris.org> writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> On core-updates, Hydra has been unable to successfully build 'shepherd'
>>> on any system except x86_64-linux. I can also report that on my
>>> mips64el-linux GuixSD system, which is running something close to
>>> 'core-updates', I had to disable tests on shepherd in order to build it.
>>> I don't know about aarch64-linux.
>>>
>>> These are the tests that commonly fail:
>>>
>>> FAIL: tests/respawn.sh
>>> FAIL: tests/respawn-throttling.sh
>>> FAIL: tests/basic.sh
>>
>> This is a non-deterministic failure. I could reproduce the
>> tests/basic.sh one and it is fixed by this:
>>
>> https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=cc9564586729a5bb90dd5d2722b543fdde9ab821
>
> For now, I added your patch to the shepherd package in core-updates, in
> commit f2d2ee42f168909f27c0c3b6532ef16febfd3b86.

With your patch, Hydra built shepherd successfully on i686-linux on the
first try, which is much better than the 19th try :)

However, on armhf-linux, three tests failed: respawn.sh,
respawn-throttling.sh, and pid-file.sh.

https://hydra.gnu.org/build/2499835

We should probably arrange for test-suite.log to be printed when "make
check" fails. We could add this functionality to selected packages like
shepherd and guix the next time we update them, and maybe consider
adding something generic to gnu-build-system's check phase in the next
core-updates cycle. What do you think?
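
The idea can be sketched as a plain shell wrapper (this is not an actual gnu-build-system phase, and `check_and_show_log` is a hypothetical name): run the check command and, if it fails, dump `test-suite.log`, the file the Automake summary points to, before propagating the failure.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given check command.  If it fails, print
# ./test-suite.log (when it exists) and return the command's exit status.
check_and_show_log () {
    if "$@"; then
        return 0
    else
        status=$?
    fi
    if [ -f test-suite.log ]; then
        echo "---- test-suite.log ----"
        cat test-suite.log
    fi
    return "$status"
}

# Example (assumption: run from the top of the build directory):
#   check_and_show_log make check
```

Because the log is written to standard output, it would end up in the build log next to the "FAIL:" lines.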

Mark

starting phase `check'
make check-am
make[1]: Entering directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
make check-TESTS
make[2]: Entering directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
make[3]: Entering directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
PASS: tests/misbehaved-client.sh
FAIL: tests/respawn.sh
PASS: tests/no-home.sh
FAIL: tests/pid-file.sh
PASS: tests/status-sexp.sh
PASS: tests/sigint.sh
PASS: tests/basic.sh
FAIL: tests/respawn-throttling.sh
============================================================================
Testsuite summary for GNU Shepherd 0.3.2
============================================================================
# TOTAL: 8
# PASS: 5
# SKIP: 0
# XFAIL: 0
# FAIL: 3
# XPASS: 0
# ERROR: 0
============================================================================
See ./test-suite.log
Please report to bug-guix@gnu.org
============================================================================
make[3]: *** [Makefile:1220: test-suite.log] Error 1
make[3]: Leaving directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
make[2]: *** [Makefile:1328: check-TESTS] Error 2
make[2]: Leaving directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
make[1]: *** [Makefile:1527: check-am] Error 2
make[1]: Leaving directory '/tmp/guix-build-shepherd-0.3.2.drv-0/shepherd-0.3.2'
make: *** [Makefile:1529: check] Error 2
phase `check' failed after 29.6 seconds
builder for `/gnu/store/sk0l3ll4x1ddn9zwxmfqjipr578hiqx1-shepherd-0.3.2.drv' failed with exit code 1
@ build-failed /gnu/store/sk0l3ll4x1ddn9zwxmfqjipr578hiqx1-shepherd-0.3.2.drv - 1 builder for `/gnu/store/sk0l3ll4x1ddn9zwxmfqjipr578hiqx1-shepherd-0.3.2.drv' failed with exit code 1
Ludovic Courtès wrote on 15 Feb 2018 22:26
(name . Mark H Weaver)(address . mhw@netris.org)(address . 30299@debbugs.gnu.org)
87o9kq6j2t.fsf@gnu.org
Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> Mark H Weaver <mhw@netris.org> writes:

[...]

>>> This is a non-deterministic failure. I could reproduce the
>>> tests/basic.sh one and it is fixed by this:
>>>
>>> https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=cc9564586729a5bb90dd5d2722b543fdde9ab821
>>
>> For now, I added your patch to the shepherd package in core-updates, in
>> commit f2d2ee42f168909f27c0c3b6532ef16febfd3b86.
>
> With your patch, Hydra built shepherd successfully on i686-linux on the
> first try, which is much better than the 19th try :)

We’re making progress. :-)

> However, on armhf-linux, three tests failed: respawn.sh,
> respawn-throttling.sh, and pid-file.sh.
>
> https://hydra.gnu.org/build/2499835

I’ll try to reproduce the failure here.

> We should probably arrange for test-suite.log to be printed when "make
> check" fails. We could add this functionality to selected packages like
> shepherd and guix the next time we update them, and maybe consider
> adding something generic to gnu-build-system's check phase in the next
> core-updates cycle. What do you think?

Definitely. We discussed it before and I think it’s a good idea. I
wanted to add it to ‘guix’ as a starting point but never got around to
doing it.

Thanks,
Ludo’.
Ludovic Courtès wrote on 16 Feb 2018 11:28
control message for bug #30299
(address . control@debbugs.gnu.org)
871shlqlcn.fsf@gnu.org
severity 30299 important
Ludovic Courtès wrote on 17 Feb 2018 01:04
Re: bug#30299: [core-updates] shepherd fails tests on all systems except x86_64
(name . Mark H Weaver)(address . mhw@netris.org)(address . 30299@debbugs.gnu.org)
87inawlbwv.fsf@gnu.org
Hello,

Mark H Weaver <mhw@netris.org> skribis:

> However, on armhf-linux, three tests failed: respawn.sh,
> respawn-throttling.sh, and pid-file.sh.
>
> https://hydra.gnu.org/build/2499835

(Similar issue on aarch64:
passed on the 2nd and 3rd attempts…)

I was able to reproduce a tests/respawn.sh failure on hardware (ARMv7).
The issue is that a service is not respawned, and the log shows:

+ assert_killed_service_is_respawned t-service2-pid-695
++ cat t-service2-pid-695
+ old_pid=789
+ rm t-service2-pid-695
+ kill 789
+ wait_for_file t-service2-pid-695
+ i=0
+ test -f t-service2-pid-695
+ test 0 -lt 20
+ sleep 0.3
++ expr 0 + 1

[...]

2018-02-16 11:13:31 Service root has been started.
2018-02-16 11:13:32 Service test1 has been started.
2018-02-16 11:13:34 Service test2 has been started.
2018-02-16 11:13:35 Respawning test1.
2018-02-16 11:13:35 Service test1 has been started.
2018-02-16 11:13:36 Respawning test2.
2018-02-16 11:13:37 Service test2 has been started.
2018-02-16 11:13:37 Respawning test1.
2018-02-16 11:13:37 Service test1 has been started.
2018-02-16 11:13:38 Respawning test2.
2018-02-16 11:13:43 Service test2 could not be started.

So SIGCHLD was correctly delivered, but somehow restarting that service
didn’t work (its PID file didn’t show up again; the 5 seconds between
“Respawning” and “could not be started” correspond to the delay in
‘read-pid-file’ in (shepherd service)).
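
For reference, the polling helper visible in the trace can be reconstructed roughly as follows. This is a sketch inferred from the `set -x` output above (up to 20 tries, 0.3 seconds apart, so it gives up after about 6 seconds), not the actual test source; the `WAIT_TRIES` and `WAIT_DELAY` overrides are additions for illustration only.

```shell
#!/bin/sh
# Sketch of a wait_for_file helper: poll until FILE exists, sleeping
# between tries.  Returns 0 if the file showed up, non-zero otherwise.
# WAIT_TRIES and WAIT_DELAY are hypothetical knobs; the defaults (20 and
# 0.3) are the values that appear in the trace.
wait_for_file () {
    i=0
    while ! test -f "$1" && test "$i" -lt "${WAIT_TRIES:-20}"; do
        sleep "${WAIT_DELAY:-0.3}"
        i=$(expr "$i" + 1)
    done
    test -f "$1"
}

# Example: wait_for_file t-service2-pid-695 || echo "service not respawned"
```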

These test failures seem to be more frequent when the machine is loaded.

Ludo’.
Andreas Enge wrote on 24 Feb 2018 23:57
Change bug 30299
(address . control@debbugs.gnu.org)
20180224225733.GA10747@jurong
retitle 30299 shepherd fails tests on all systems except x86_64
severity 30299 serious
thanks
Andreas Enge wrote on 25 Feb 2018 00:00
Re: bug#30299: [core-updates] shepherd fails tests on all systems except x86_64
(name . Ludovic Courtès)(address . ludo@gnu.org)
20180224230033.GA10795@jurong
Hello,

I changed the severity to "serious", since this bug prevents installing
GuixSD on arm, or creating an installation image. Also, it is now present
on master instead of core-updates.

Andreas
Brice Waegeneire wrote on 19 Mar 2020 09:28
(address . 30299@debbugs.gnu.org)
de75727fb11139381c68d8e3b55047ee@waegenei.re
Hello,

Is this bug still relevant?
It was reported on core-updates 2 years ago with shepherd 0.3.2.
I can't see any CI failures[0] related to a failing test since Cuirass
was set up; the only time the build of shepherd failed[1], it wasn't due
to a test.
I wasn't able to reproduce the failing tests over several build rounds on
armhf and i686.

> I changed the severity to "serious", since this bug prevents installing
> GuixSD on arm, or creating an installation image. Also, it is now present
> on master instead of core-updates.

As far as I know this isn't the case anymore, so at least the priority
should be lowered.


Brice.
Marius Bakke wrote on 19 Mar 2020 10:16
87d098bt7t.fsf@devup.no
Brice Waegeneire <brice@waegenei.re> writes:

> Hello,
>
> Is this bug still relevant?
> It was reported on core-updates 2 years ago with shepherd 0.3.2.
> I can't see any CI failures[0] related to a failing test since Cuirass
> was set up; the only time the build of shepherd failed[1], it wasn't due
> to a test.
> I wasn't able to reproduce the failing tests over several build rounds on
> armhf and i686.

Thank you for doing bug triage Brice. I'm closing the issue as we
haven't had problems building the shepherd in a while.

Closed