'guix publish' does not handle guix-daemon disconnections

DoneSubmitted by Mark HWeaver.
Details
2 participants
  • Ludovic Courtès
  • Mark HWeaver
Owner
unassigned
Severity
important
M
M
Mark HWeaver wrote on 29 Apr 2017 00:33
guix publish daemon on Hydra became dysfunctional; needed restart
(address . bug-guix@gnu.org)
877f24xiv9.fsf@netris.org
While trying to update my GuixSD system in the last hour, I found thatevery attempt by the substituter to download NARs resulted in a 500"Internal Server Error":
Toggle snippet (11 lines)mhw@jojen ~$ guix package -usubstitute: ^Msubstitute: updating list of substitutes from 'https://mirror.hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://mirror.hydra.gnu.org'... 100.0%Downloading https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...guix substitute: error: download from 'https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from sourcemhw@jojen ~$ guix package -uDownloading https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...guix substitute: error: download from 'https://mirror.hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
Bypassing the mirror didn't help:
Toggle snippet (8 lines)mhw@jojen ~$ guix package -u . --substitute-urls=https://hydra.gnu.orgsubstitute: ^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 100.0%substitute: ^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 0.0%^Msubstitute: updating list of substitutes from 'https://hydra.gnu.org'... 100.0%Downloading https://hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz (2.3MiB installed)...guix substitute: error: download from 'https://hydra.gnu.org/guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz' failed: 500, "Internal Server Error"guix package: error: build failed: some substitutes for the outputs of derivation `/gnu/store/l58cq0l2j2ikyimjvxyv1dakha2ai6rx-gnome-calendar-3.22.2.tar.xz.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
Here's what I found in the recent output of the 'guix publish' server onHydra (running in root's screen session):
Toggle snippet (42 lines)GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfoIn guix/scripts/publish.scm: 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)In guix/store.scm: 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)In unknown file: 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)ERROR: In procedure fport_write: Broken pipeGET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xzIn guix/scripts/publish.scm: 491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)In guix/store.scm: 648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")In unknown file: 0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)ERROR: In procedure fport_write: Broken pipeGET /zhxab8rkbbalgrlhg86q8pzpqi3s6q36.narinfoIn guix/scripts/publish.scm: 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)In guix/store.scm: 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)In unknown file: 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)ERROR: In procedure fport_write: Broken pipeGET /z2rygpahxbpkwx8k164bi5lr5lann357.narinfoIn guix/scripts/publish.scm: 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)In guix/store.scm: 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)In unknown file: 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)ERROR: In procedure fport_write: Broken pipeGET /yw635wxpm4ac40khzd9w6j9v3n72a708.narinfoIn guix/scripts/publish.scm: 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)In guix/store.scm: 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)In unknown file: 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)ERROR: In procedure fport_write: Broken pipe
I killed the process, restarted it, and now things seem to be workingagain, but it would be good to know what went wrong here.
Mark
L
L
Ludovic Courtès wrote on 2 May 2017 11:01
control message for bug #26705
(address . control@debbugs.gnu.org)
87k25zljjc.fsf@gnu.org
severity 26705 serious
L
L
Ludovic Courtès wrote on 2 May 2017 11:14
Re: bug#26705: guix publish daemon on Hydra became dysfunctional; needed restart
(name . Mark HWeaver)(address . mhw@netris.org)(address . 26705@debbugs.gnu.org)
878tmflix2.fsf@gnu.org
Hello,
Mark H Weaver <mhw@netris.org> skribis:
Toggle quote (4 lines)> While trying to update my GuixSD system in the last hour, I found that> every attempt by the substituter to download NARs resulted in a 500> "Internal Server Error":
[...]
Toggle quote (2 lines)> Bypassing the mirror didn't help:
Did you try talking directly to ‘guix publish’ (localhost:9999 onhydra.gnu.org)?
Toggle quote (20 lines)> Here's what I found in the recent output of the 'guix publish' server on> Hydra (running in root's screen session):>> GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfo> In guix/scripts/publish.scm:> 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)> In guix/store.scm:> 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)> In unknown file:> 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)> ERROR: In procedure fport_write: Broken pipe> GET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz> In guix/scripts/publish.scm:> 491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)> In guix/store.scm:> 648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")> In unknown file:> 0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)> ERROR: In procedure fport_write: Broken pipe
Ooh, the connection to the daemon was broken, hence this error.
Currently ‘guix publish’ assumes the connection opened in the‘guix-publish’ procedure remains valid all along. That’s normally thecase unless (1) the daemon is restarted, or (2) there’s a protocol errorsomewhere that leads the daemon to close the connection.
#2 is not supposed to happen, but could happen if, for instance, anexception is thrown before we have read the RPC reply sent by thedaemon.
Perhaps ‘guix publish’ could catch EPIPE when talking to the daemon?(Currently EPIPEs are swallowed by ‘handle-request’ in (web server) andconverted to 500.)
That said, if there really is a situation where ‘guix publish’ canviolate the daemon protocol, that’s what should be fixed.
Thoughts?
Thanks for handling this quickly!
Ludo’.
L
L
Ludovic Courtès wrote on 27 Jul 2017 13:44
control message for bug #26705
(address . control@debbugs.gnu.org)
87d18m5cla.fsf@gnu.org
retitle 26705 'guix publish' does not handle guix-daemon disconnections
L
L
Ludovic Courtès wrote on 27 Jul 2017 14:37
(address . control@debbugs.gnu.org)
87fudi3vke.fsf@gnu.org
severity 26705 important
L
L
Ludovic Courtès wrote on 9 Jan 2019 21:47
Re: bug#26705: guix publish daemon on Hydra became dysfunctional; needed restart
(name . Mark HWeaver)(address . mhw@netris.org)(address . 26705@debbugs.gnu.org)
87zhs91ru3.fsf@gnu.org
ludo@gnu.org (Ludovic Courtès) skribis:
Toggle quote (6 lines)> Mark H Weaver <mhw@netris.org> skribis:>>> While trying to update my GuixSD system in the last hour, I found that>> every attempt by the substituter to download NARs resulted in a 500>> "Internal Server Error":
[...]
Toggle quote (24 lines)>> GET /74ch6nvjfkj3i56nygwijnaghlpi01d4.narinfo>> In guix/scripts/publish.scm:>> 393:2 2 (render-narinfo/cached #<build-daemon 256.96 211f7c0> ...)>> In guix/store.scm:>> 663:9 1 (query-path-from-hash-part #<build-daemon 256.96 2...> #)>> In unknown file:>> 0 (put-bytevector #<input-output: socket 12> #vu8(# ...) ...)>> ERROR: In procedure fport_write: Broken pipe>> GET /guix/nar/qz4mg7sid6avdav158lhr6wziqswpjmx-gnome-calendar-3.22.2.tar.xz>> In guix/scripts/publish.scm:>> 491:8 2 (render-nar #<build-daemon 256.96 211f7c0> #<<requ...> ...)>> In guix/store.scm:>> 648:0 1 (valid-path? #<build-daemon 256.96 211f7c0> "/gnu/sto...")>> In unknown file:>> 0 (put-bytevector #<input-output: socket 12> #vu8(1 ...) ...)>> ERROR: In procedure fport_write: Broken pipe>> Ooh, the connection to the daemon was broken, hence this error.>> Currently ‘guix publish’ assumes the connection opened in the> ‘guix-publish’ procedure remains valid all along. That’s normally the> case unless (1) the daemon is restarted, or (2) there’s a protocol error> somewhere that leads the daemon to close the connection.
For now I’m closing this bug as “wontfix” because I’ve never seen anyoccurrence of #2, and because #1 cannot happen on GuixSD (if ‘guix-daemon’is restarted, the shepherd will also restart ‘guix-publish’.)
Ludo’.
L
L
Ludovic Courtès wrote on 9 Jan 2019 21:47
control message for bug #26705
(address . control@debbugs.gnu.org)
87y37t1rty.fsf@gnu.org
tags 26705 wontfixclose 26705
?
Your comment

This issue is archived.

To comment on this conversation send email to 26705@debbugs.gnu.org