'guix publish' does not handle guix-daemon disconnections

  • Done
  • quality assurance status badge
Details
2 participants
  • Ludovic Courtès
  • Mark H Weaver
Owner
unassigned
Submitted by
Mark H Weaver
Severity
important
M
M
Mark H Weaver wrote on 29 Apr 2017 00:33
guix publish daemon on Hydra became dysfunctional; needed restart
(address . bug-guix@gnu.org)
877f24xiv9.fsf@netris.org
While trying to update my GuixSD system in the last hour, I found that
every attempt by the substituter to download NARs resulted in a 500
"Internal Server Error":

Toggle snippet (11 lines)
mhw@jojen ~$ guix package -u
substitute: ^Msubstitute: updating list of substitutes from 'https://mirror.hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://mirror.hydra.gnu.org'... 100.0%
Downloading https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...
guix substitute: error: download from 'https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"
guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
mhw@jojen ~$ guix package -u
Downloading https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...
guix substitute: error: download from 'https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"
guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source

Bypassing the mirror didn't help:

Toggle snippet (8 lines)
mhw@jojen ~$ guix package -u . --substitute-urls=https://hydra.gnu.org
substitute: ^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 100.0%
substitute: ^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 100.0%
Downloading https://hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...
guix substitute: error: download from 'https://hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"
guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source

Here's what I found in the recent output of the 'guix publish' server on
Hydra (running in root's screen session):

Toggle snippet (42 lines)
GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfo
In guix/scripts/publish.scm:
393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
In guix/store.scm:
663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
In unknown file:
0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
ERROR: In procedure fport_write: Broken pipe
GET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz
In guix/scripts/publish.scm:
491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)
In guix/store.scm:
648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")
In unknown file:
0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)
ERROR: In procedure fport_write: Broken pipe
GET /zhxab8rkbbalgrlhg86q8pzpqi3s6q36.narinfo
In guix/scripts/publish.scm:
393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
In guix/store.scm:
663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
In unknown file:
0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
ERROR: In procedure fport_write: Broken pipe
GET /z2rygpahxbpkwx8k164bi5lr5lann357.narinfo
In guix/scripts/publish.scm:
393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
In guix/store.scm:
663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
In unknown file:
0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
ERROR: In procedure fport_write: Broken pipe
GET /yw635wxpm4ac40khzd9w6j9v3n72a708.narinfo
In guix/scripts/publish.scm:
393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
In guix/store.scm:
663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
In unknown file:
0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
ERROR: In procedure fport_write: Broken pipe

I killed the process, restarted it, and now things seem to be working
again, but it would be good to know what went wrong here.

Mark
L
L
Ludovic Courtès wrote on 2 May 2017 11:01
control message for bug #26705
(address . control@debbugs.gnu.org)
87k25zljjc.fsf@gnu.org
severity 26705 serious
L
L
Ludovic Courtès wrote on 2 May 2017 11:14
Re: bug#26705: guix publish daemon on Hydra became dysfunctional; needed restart
(name . Mark H Weaver)(address . mhw@netris.org)(address . 26705@debbugs.gnu.org)
878tmflix2.fsf@gnu.org
Hello,

Mark H Weaver <mhw@netris.org> skribis:

Toggle quote (4 lines)
> While trying to update my GuixSD system in the last hour, I found that
> every attempt by the substituter to download NARs resulted in a 500
> "Internal Server Error":

[...]

Toggle quote (2 lines)
> Bypassing the mirror didn't help:

Did you try talking directly to ‘guix publish’ (localhost:9999 on
hydra.gnu.org)?

Toggle quote (20 lines)
> Here's what I found in the recent output of the 'guix publish' server on
> Hydra (running in root's screen session):
>
> GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfo
> In guix/scripts/publish.scm:
> 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
> In guix/store.scm:
> 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
> In unknown file:
> 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
> ERROR: In procedure fport_write: Broken pipe
> GET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz
> In guix/scripts/publish.scm:
> 491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)
> In guix/store.scm:
> 648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")
> In unknown file:
> 0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)
> ERROR: In procedure fport_write: Broken pipe

Ooh, the connection to the daemon was broken, hence this error.

Currently ‘guix publish’ assumes the connection opened in the
‘guix-publish’ procedure remains valid all along. That’s normally the
case unless (1) the daemon is restarted, or (2) there’s a protocol error
somewhere that leads the daemon to close the connection.

#2 is not supposed to happen, but could happen if, for instance, an
exception is thrown before we have read the RPC reply sent by the
daemon.

Perhaps ‘guix publish’ could catch EPIPE when talking to the daemon?
(Currently EPIPEs are swallowed by ‘handle-request’ in (web server) and
converted to 500.)

That said, if there really is a situation where ‘guix publish’ can
violate the daemon protocol, that’s what should be fixed.

Thoughts?

Thanks for handling this quickly!

Ludo’.
L
L
Ludovic Courtès wrote on 27 Jul 2017 13:44
control message for bug #26705
(address . control@debbugs.gnu.org)
87d18m5cla.fsf@gnu.org
retitle 26705 'guix publish' does not handle guix-daemon disconnections
L
L
Ludovic Courtès wrote on 27 Jul 2017 14:37
(address . control@debbugs.gnu.org)
87fudi3vke.fsf@gnu.org
severity 26705 important
L
L
Ludovic Courtès wrote on 9 Jan 2019 21:47
Re: bug#26705: guix publish daemon on Hydra became dysfunctional; needed restart
(name . Mark H Weaver)(address . mhw@netris.org)(address . 26705@debbugs.gnu.org)
87zhs91ru3.fsf@gnu.org
ludo@gnu.org (Ludovic Courtès) skribis:

Toggle quote (6 lines)
> Mark H Weaver <mhw@netris.org> skribis:
>
>> While trying to update my GuixSD system in the last hour, I found that
>> every attempt by the substituter to download NARs resulted in a 500
>> "Internal Server Error":

[...]

Toggle quote (24 lines)
>> GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfo
>> In guix/scripts/publish.scm:
>> 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)
>> In guix/store.scm:
>> 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)
>> In unknown file:
>> 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)
>> ERROR: In procedure fport_write: Broken pipe
>> GET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz
>> In guix/scripts/publish.scm:
>> 491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)
>> In guix/store.scm:
>> 648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")
>> In unknown file:
>> 0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)
>> ERROR: In procedure fport_write: Broken pipe
>
> Ooh, the connection to the daemon was broken, hence this error.
>
> Currently ‘guix publish’ assumes the connection opened in the
> ‘guix-publish’ procedure remains valid all along. That’s normally the
> case unless (1) the daemon is restarted, or (2) there’s a protocol error
> somewhere that leads the daemon to close the connection.

For now I’m closing this bug as “wontfix” because I’ve never seen any
occurrence of #2, and because #1 cannot happen on GuixSD (if ‘guix-daemon’
is restarted, the shepherd will also restart ‘guix-publish’.)

Ludo’.
L
L
Ludovic Courtès wrote on 9 Jan 2019 21:47
control message for bug #26705
(address . control@debbugs.gnu.org)
87y37t1rty.fsf@gnu.org
tags 26705 wontfix
close 26705
?