Shepherd 0.8.1 tests fail on core-updates

  • Done
Details
3 participants
  • Léo Le Bouter
  • Ludovic Courtès
  • Marius Bakke
Owner
unassigned
Submitted by
Léo Le Bouter
Severity
important
Léo Le Bouter wrote on 15 Mar 2021 19:51
GNU Shepherd 0.8.1 fails on core-updates
(address . bug-guix@gnu.org)
37305bfa08faea95b45a6496623154c2ebab1f11.camel@zaclys.net
Some tests fail:

FAIL: tests/no-home.sh
FAIL: tests/status-sexp.sh
PASS: tests/misbehaved-client.sh
FAIL: tests/replacement.sh
PASS: tests/file-creation-mask.sh
PASS: tests/restart.sh
PASS: tests/one-shot.sh
FAIL: tests/basic.sh
PASS: tests/respawn-throttling.sh
PASS: tests/signals.sh
PASS: tests/respawn.sh
PASS: tests/forking-service.sh
PASS: tests/pid-file.sh

Attached bzip2 compressed full log
Ludovic Courtès wrote on 23 Mar 2021 16:11
control message for bug #47172
(address . control@debbugs.gnu.org)
877dlxj45u.fsf@gnu.org
retitle 47172 Shepherd 0.8.1 tests fail on core-updates
quit
Léo Le Bouter wrote on 24 Mar 2021 14:25
Re: bug#47172: GNU Shepherd 0.8.1 fails on core-updates
(address . 47172@debbugs.gnu.org)
80954a930aa5ea8d71e472be49a6ea118e7ba2a7.camel@zaclys.net
This seems to be due to Guile 3.0.5: GNU Shepherd 0.8.1 does not work
with it, although it works with Guile 3.0.2.

Thanks to Efraim on IRC for hints.

It would be great if people knowledgeable about Scheme, GNU Shepherd, and
Guile could fix it; it blocks GNOME upgrade work on top of core-updates.
Ludovic Courtès wrote on 23 May 2021 12:23
control message for bug #47172
(address . control@debbugs.gnu.org)
8735ud696b.fsf@gnu.org
severity 47172 important
quit
Ludovic Courtès wrote on 23 May 2021 12:25
Re: bug#47172: Shepherd 0.8.1 tests fail on core-updates
(address . 47172@debbugs.gnu.org)
87zgwl4ui0.fsf@gnu.org
Hi there,

Léo Le Bouter <lle-bout@zaclys.net> writes:

> Some tests fail:
>
> FAIL: tests/no-home.sh
> FAIL: tests/status-sexp.sh
> PASS: tests/misbehaved-client.sh

[...]

> This seems to be due to Guile 3.0.5: GNU Shepherd 0.8.1 does not work
> with it, although it works with Guile 3.0.2.

This turns out to be due to a… miscompilation bug.

In (shepherd scripts herd), ‘run-command’ has this code:

  (let ((sock (open-connection socket-file))
        (action* (if (and (eq? action 'detailed-status)
                          (memq service '(root shepherd)))
                     'status
                     action)))
    …)

Problem is that everything works as if (eq? action 'detailed-status)
was omitted, such that ‘herd stop root’ is interpreted as ‘herd status
root’.

Simply wrapping the condition in (pk …) “fixes” the problem.
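
To make the intended behavior concrete, here is a minimal sketch that isolates the conditional in a standalone procedure. The names `effective-action` and `effective-action/pk` are hypothetical and do not appear in Shepherd; `pk` is Guile's built-in debugging helper, which prints its arguments and returns the last one. Whether the isolated procedure is itself miscompiled depends on the Guile version and the surrounding code, so the comments only state what a correct compilation should yield.

```scheme
;; Hypothetical helper: the expression from 'run-command', isolated.
(define (effective-action action service)
  (if (and (eq? action 'detailed-status)
           (memq service '(root shepherd)))
      'status
      action))

;; Same logic with the condition wrapped in 'pk', as described above.
;; 'pk' prints its argument and returns it unchanged.
(define (effective-action/pk action service)
  (if (pk (and (eq? action 'detailed-status)
               (memq service '(root shepherd))))
      'status
      action))

;; Intended results under correct compilation:
;;   (effective-action 'detailed-status 'root)   should be status
;;   (effective-action 'stop 'root)              should be stop
;;   (effective-action 'detailed-status 'nginx)  should be detailed-status
```

The second case is the one the report describes as failing: with the first comparison dropped, 'stop' on 'root' would be rewritten to 'status'.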

The peval output looks good (it contains the 'detailed-status
comparison), but the assembly seems to lack the 'detailed-status
comparison altogether:

Disassembly of <unnamed function> at #x29e0:

0 (instrument-entry 15700) at shepherd/scripts/herd.scm:127:2
2 (assert-nargs-ee/locals 1 11) ;; 12 slots (0 args)
3 (static-ref 10 15369) ;; #f at shepherd/scripts/herd.scm:128:19
5 (immediate-tag=? 10 7 0) ;; heap-object?
7 (je 9) ;; -> L1
8 (static-ref 10 14166) ;; #f
10 (static-ref 9 15372) ;; open-connection
12 (call-scm<-scm-scm 10 10 9 111)
14 (static-set! 10 15358) ;; #f
L1:
16 (scm-ref/immediate 7 10 1)
17 (scm-ref/immediate 6 11 2)
18 (handle-interrupts) at shepherd/scripts/herd.scm:128:18
19 (call 4 2)
21 (receive 1 4 12)
23 (scm-ref/immediate 9 11 3)
24 (static-ref 8 15360) ;; #f at shepherd/scripts/herd.scm:134:6
26 (immediate-tag=? 8 7 0) ;; heap-object?
28 (je 9) ;; -> L2
29 (static-ref 8 14145) ;; #f
31 (static-ref 7 15363) ;; write-command
33 (call-scm<-scm-scm 8 8 7 111)
35 (static-set! 8 15349) ;; #f
L2:
37 (scm-ref/immediate 8 8 1)
38 (static-ref 7 15358) ;; #f at shepherd/scripts/herd.scm:134:21
40 (immediate-tag=? 7 7 0) ;; heap-object?
42 (je 9) ;; -> L3
43 (static-ref 7 14131) ;; #f
45 (static-ref 6 15361) ;; shepherd-command
47 (call-scm<-scm-scm 7 7 6 111)
49 (static-set! 7 15347) ;; #f
L3:
51 (scm-ref/immediate 7 7 1)
52 (scm-ref/immediate 6 11 4)
53 (static-ref 5 15363) ;; root
55 (eq? 6 5)
56 (je 5) ;; -> L4
57 (static-ref 5 13655) ;; shepherd
59 (eq? 6 5)
60 (jne 3) ;; -> L5
L4:
61 (static-ref 9 15365) ;; status at shepherd/scripts/herd.scm:131:22
L5:
63 (static-ref 1 15375) ;; #:arguments at shepherd/scripts/herd.scm:134:54
65 (scm-ref/immediate 0 11 5)
66 (mov 4 7) at shepherd/scripts/herd.scm:134:20

(This is compiled with 3.0.7 and the default optimizations, so -O2.)
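
A listing like the one above can be produced for a small test case from Scheme itself. The following is a sketch, assuming a Guile 3.0.x recent enough that `compile` accepts `#:optimization-level`, and using `disassemble-program` from `(system vm disassembler)`. The lambda is a hypothetical stand-in for the expression in 'run-command'; there is no guarantee that it triggers the same miscompilation outside its original context.

```scheme
(use-modules (system base compile)
             (system vm disassembler))

;; Hypothetical stand-in for the conditional in 'run-command',
;; compiled at -O2 like the listing above.
(define effective-action
  (compile '(lambda (action service)
              (if (and (eq? action 'detailed-status)
                       (memq service '(root shepherd)))
                  'status
                  action))
           #:optimization-level 2))

;; Capture the disassembly as a string so it can be inspected or searched.
(define listing
  (call-with-output-string
    (lambda (port)
      (disassemble-program effective-action port))))

(display listing)
```

At the REPL, the `,optimize` and `,disassemble` (or `,x`) meta-commands offer an interactive view of the optimizer output and the bytecode, respectively.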

To be continued…

Ludo’.
Ludovic Courtès wrote on 23 May 2021 15:49
(address . 47172@debbugs.gnu.org)
87v9794l2a.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> writes:

> This turns out to be due to a… miscompilation bug.
>
> In (shepherd scripts herd), ‘run-command’ has this code:
>
>   (let ((sock (open-connection socket-file))
>         (action* (if (and (eq? action 'detailed-status)
>                           (memq service '(root shepherd)))
>                      'status
>                      action)))
>     …)
>
> Problem is that everything works as if (eq? action 'detailed-status)
> was omitted, such that ‘herd stop root’ is interpreted as ‘herd status
> root’.

A workaround that works with 3.0.7 is swapping the two ‘and’
sub-expressions:

diff --git a/modules/shepherd/scripts/herd.scm b/modules/shepherd/scripts/herd.scm
index 106de1e..39d2e34 100644
--- a/modules/shepherd/scripts/herd.scm
+++ b/modules/shepherd/scripts/herd.scm
@@ -126,8 +126,8 @@ of pairs."
 the daemon via SOCKET-FILE."
   (with-system-error-handling
    (let ((sock (open-connection socket-file))
-         (action* (if (and (eq? action 'detailed-status)
-                           (memq service '(root shepherd)))
+         (action* (if (and (memq service '(root shepherd))
+                           (eq? action 'detailed-status))
                       'status
                       action)))
      ;; Send the command.

Ludo’.
Marius Bakke wrote on 23 May 2021 17:23
(address . 48368@debbugs.gnu.org)
875yz9qxrv.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> This turns out to be due to a… miscompilation bug.
>>
>> In (shepherd scripts herd), ‘run-command’ has this code:
>>
>>   (let ((sock (open-connection socket-file))
>>         (action* (if (and (eq? action 'detailed-status)
>>                           (memq service '(root shepherd)))
>>                      'status
>>                      action)))
>>     …)
>>
>> Problem is that everything works as if (eq? action 'detailed-status)
>> was omitted, such that ‘herd stop root’ is interpreted as ‘herd status
>> root’.
>
> A workaround that works with 3.0.7 is swapping the two ‘and’
> sub-expressions:
>
> diff --git a/modules/shepherd/scripts/herd.scm b/modules/shepherd/scripts/herd.scm
> index 106de1e..39d2e34 100644
> --- a/modules/shepherd/scripts/herd.scm
> +++ b/modules/shepherd/scripts/herd.scm
> @@ -126,8 +126,8 @@ of pairs."
> the daemon via SOCKET-FILE."
> (with-system-error-handling
> (let ((sock (open-connection socket-file))
> - (action* (if (and (eq? action 'detailed-status)
> - (memq service '(root shepherd)))
> + (action* (if (and (memq service '(root shepherd))
> + (eq? action 'detailed-status))
> 'status
> action)))
> ;; Send the command.

Cc'ing the relevant Guile bug:

  https://bugs.gnu.org/48368

See also commit 79be6a985799adc6d663890250f4fb7c12f015b4 on
'core-updates' that builds with -O1 as a less satisfactory workaround.
Ludovic Courtès wrote on 23 May 2021 23:43
(name . Marius Bakke)(address . marius@gnu.org)
87lf853z4i.fsf@gnu.org
Hello,

Marius Bakke <marius@gnu.org> writes:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> A workaround that works with 3.0.7 is swapping the two ‘and’
>> sub-expressions:
>>
>> diff --git a/modules/shepherd/scripts/herd.scm b/modules/shepherd/scripts/herd.scm
>> index 106de1e..39d2e34 100644
>> --- a/modules/shepherd/scripts/herd.scm
>> +++ b/modules/shepherd/scripts/herd.scm
>> @@ -126,8 +126,8 @@ of pairs."
>> the daemon via SOCKET-FILE."
>> (with-system-error-handling
>> (let ((sock (open-connection socket-file))
>> - (action* (if (and (eq? action 'detailed-status)
>> - (memq service '(root shepherd)))
>> + (action* (if (and (memq service '(root shepherd))
>> + (eq? action 'detailed-status))
>> 'status
>> action)))
>> ;; Send the command.
>
> Cc'ing the relevant Guile bug:
>
> https://bugs.gnu.org/48368

Oh nice! (It would have saved me a bit of time to catch up on email
beforehand. :-))

> See also commit 79be6a985799adc6d663890250f4fb7c12f015b4 on
> 'core-updates' that builds with -O1 as a less satisfactory workaround.

I found that ‘-O2 -Ono-resolve-primitives’ also does the trick.

If we manually replace ‘memq’ by two ‘eq?’ tests (which is what the
compiler does), the same problem is exhibited:

diff --git a/modules/shepherd/scripts/herd.scm b/modules/shepherd/scripts/herd.scm
index 106de1e..513508f 100644
--- a/modules/shepherd/scripts/herd.scm
+++ b/modules/shepherd/scripts/herd.scm
@@ -127,7 +127,8 @@ the daemon via SOCKET-FILE."
   (with-system-error-handling
    (let ((sock (open-connection socket-file))
          (action* (if (and (eq? action 'detailed-status)
-                           (memq service '(root shepherd)))
+                           (or (eq? service 'root)
+                               (eq? service 'shepherd)))
                       'status
                       action)))
      ;; Send the command.

‘-Ono-resolve-primitives’ also helps in this case.

‘-Ono-optimize-branch-chains’ has no effect.
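
Since the three forms discussed in this thread (the original, the swapped 'and' operands, and the manual 'eq?' expansion) are meant to be semantically identical, a small regression check can compare them over a grid of inputs. This is only a sketch: the procedure names and the sample actions and services are hypothetical, and a disagreement between variants would indicate a miscompilation in the running Guile rather than a logic difference.

```scheme
(use-modules (srfi srfi-1))   ; for 'every'

;; Original form from 'run-command'.
(define (original action service)
  (if (and (eq? action 'detailed-status)
           (memq service '(root shepherd)))
      'status
      action))

;; Workaround: operands of 'and' swapped.
(define (swapped action service)
  (if (and (memq service '(root shepherd))
           (eq? action 'detailed-status))
      'status
      action))

;; 'memq' expanded by hand into two 'eq?' tests.
(define (expanded action service)
  (if (and (eq? action 'detailed-status)
           (or (eq? service 'root)
               (eq? service 'shepherd)))
      'status
      action))

;; Illustrative inputs only.
(define sample-actions  '(detailed-status status stop start))
(define sample-services '(root shepherd nginx))

;; Return #t when all three variants give the same answer everywhere.
(define (variants-agree?)
  (every (lambda (action)
           (every (lambda (service)
                    (let ((expected (original action service)))
                      (and (eq? expected (swapped action service))
                           (eq? expected (expanded action service)))))
                  sample-services))
         sample-actions))
```

Compiling a file containing these definitions at different optimization levels and calling `(variants-agree?)` is one way to see which variants a given Guile handles correctly.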

So, not much progress, but at least we have a workaround.

Thanks,
Ludo’.
Ludovic Courtès wrote on 30 Sep 2021 10:57
(name . Marius Bakke)(address . marius@gnu.org)
87k0iy4f02.fsf@gnu.org
Hi,

Marius Bakke <marius@gnu.org> writes:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> A workaround that works with 3.0.7 is swapping the two ‘and’
>> sub-expressions:
>>
>> diff --git a/modules/shepherd/scripts/herd.scm b/modules/shepherd/scripts/herd.scm
>> index 106de1e..39d2e34 100644
>> --- a/modules/shepherd/scripts/herd.scm
>> +++ b/modules/shepherd/scripts/herd.scm
>> @@ -126,8 +126,8 @@ of pairs."
>> the daemon via SOCKET-FILE."
>> (with-system-error-handling
>> (let ((sock (open-connection socket-file))
>> - (action* (if (and (eq? action 'detailed-status)
>> - (memq service '(root shepherd)))
>> + (action* (if (and (memq service '(root shepherd))
>> + (eq? action 'detailed-status))
>> 'status
>> action)))
>> ;; Send the command.
>
> Cc'ing the relevant Guile bug:
>
> https://bugs.gnu.org/48368
>
> See also commit 79be6a985799adc6d663890250f4fb7c12f015b4 on
> 'core-updates' that builds with -O1 as a less satisfactory workaround.

The bug has been fixed in Guile (will be in 3.0.8), worked around in
Shepherd commit a066c5ac05037a6ffad8e4ea3e8de8150869aa8b, and worked
around in Guix on ‘core-updates’, so I think we can close it now.

Thanks,
Ludo’.
Closed