"write_wait_fd: unimplemented" error from 'guix substitute'

  • Open
  • quality assurance status badge
Details
8 participants
  • Attila Lendvai
  • Thiago Jung Bauermann
  • Ludovic Courtès
  • Mathieu Othacehe
  • Maxim Cournoyer
  • Maxime Devos
  • Maxime Devos
  • Nathan Dehnel
Owner
unassigned
Submitted by
Thiago Jung Bauermann
Severity
important
Merged with
T
T
Thiago Jung Bauermann wrote on 16 Jun 2022 04:40
exception while downloading substitutes
(address . bug-guix@gnu.org)
87o7yt71sc.fsf@kolabnow.com
Hello,

I just ran “guix pull && guix package -u” and after downloading many
substitutes, guix package aborted with the following error:

Toggle snippet (34 lines)
substitute: atualizando substitutos de "https://ci.guix.gnu.org"... 0.0%Backtrace:
substitute: 14 (primitive-load "/gnu/store/yxh9kr0150494jf8phrf1x28mhw…")
substitute: In guix/ui.scm:
substitute: 2230:7 13 (run-guix . _)
substitute: 2193:10 12 (run-guix-command _ . _)
substitute: In ice-9/boot-9.scm:
substitute: 1752:10 11 (with-exception-handler _ _ #:unwind? _ # _)
substitute: 1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
substitute: In guix/scripts/substitute.scm:
substitute: 757:18 9 (_)
substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-urls _ #:acl _)
substitute: In guix/substitutes.scm:
substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure 7f22cf4534a0 …> …)
substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ # _ # _)
substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)
substitute: In ice-9/boot-9.scm:
substitute: 1685:16 4 (raise-exception _ #:continuable? _)
substitute: 1685:16 3 (raise-exception _ #:continuable? _)
substitute: 1780:13 2 (_ #<&compound-exception components: (#<&error> #<&orig…>)
substitute: 1685:16 1 (raise-exception _ #:continuable? _)
substitute: 1685:16 0 (raise-exception _ #:continuable? _)
substitute:
substitute: ice-9/boot-9.scm:1685:16: In procedure raise-exception:
substitute: In procedure write_wait_fd: unimplemented
guix package: erro: `/gnu/store/yxh9kr0150494jf8phrf1x28mhwnnv7f-guix-command substitute' died unexpectedly
popigai: (1) guix describe
Geração 144 15 jun 2022 22:55:58 (atual)
guix 128697d
URL do repositório: https://git.savannah.gnu.org/git/guix.git
ramo: master
commit: 128697d43c21eb229ff5413f1c4cf79ae1a9dcd4

I immediately ran “guix package -u” again, and this time the command
completed successfully.

I had a quick look at ‘fetch-narinfos’ but I couldn't figure out what
could be calling into this unimplemented function ‘write_wait_fd’…

--
Thanks
Thiago
M
M
Maxime Devos wrote on 16 Jun 2022 09:58
Re: bug#56005: "write_wait_fd: unimplemented" during downloading substitutes
(address . control@debbugs.gnu.org)
31d325076ae0c552cc04a010956faed27dc35d2c.camel@telenet.be
retitle 56006 "write_wait_fd: unimplemented" during downloading substitutes
thanks

Thiago Jung Bauermann via Bug reports for GNU Guix schreef op wo 15-06-
2022 om 23:40 [-0300]:
Toggle quote (3 lines)
> I had a quick look at ‘fetch-narinfos’ but I couldn't figure out what
> could be calling into this unimplemented function ‘write_wait_fd’…

Untested hypothesis: maybe writing to the TLS-wrapped port constructed
in (guix build download) blocked, so Guile's port code ran
'write_wait_fd', which was not set for the TLS port (in the GnuTLS
library)? Alternatively: maybe 'make-custom-binary-input/output-port'
in (guix build download) was used incorrectly?

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYqrioBccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7ub3AQCXLxSVhvcA036bcxhkQhn+Ebhd
EYRLQL+qhR7cauDhswD/VSD62gJ43EZhdv/JEHWEa23cpanMaP10G62L8zTSFgk=
=nKHx
-----END PGP SIGNATURE-----


M
M
Maxime Devos wrote on 30 Jun 2022 13:46
(address . control@debbugs.gnu.org)
b53635debe0151d44e23506d59b2270d7726b6f5.camel@telenet.be
retitle 56005 during substitution: write_wait_fd: unimplemented
thanks
A
A
Attila Lendvai wrote on 30 Jun 2022 13:48
(name . 56005@debbugs.gnu.org)(address . 56005@debbugs.gnu.org)
RVsXkQfVKVzXzCTwWdhqGEyZLP7ctm2Gn4Dd9UPvZxSrOequIWmWFDbUPB2pV_1KsUlvYH1Y5_iBfp_cLX35HdpdcJPPfPw9BD4Zgpk_euM=@lendvai.name
i'm also seeing this every once in a while.

some speculation: my router has QoS set up that limits the upstream, so that i avoid triggering my ISP's rate limiter, because it sends ping into the ballpark of seconds.

maybe because of this config i'm seeing this more regularly than others?

- attila
M
M
Maxime Devos wrote on 30 Jun 2022 13:51
Re: during substitution: write_wait_fd: unimplemented
(address . 56005@debbugs.gnu.org)
65897f5d969ca183b9505035bc60b8e047c1e582.camel@telenet.be
Toggle quote (1 lines)
> substitute: 348:26 8 (process-query #<output: file 4> _ #:cache
urls _ #:acl _)
Toggle quote (2 lines)
> substitute: In guix/substitutes.scm:
> substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure
7f22cf4534a0 …> …)
Toggle quote (1 lines)
> substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _
# _ # _)
Toggle quote (2 lines)
> substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)

For extra debugging information, you could try adding "COLUMNS=999999" to
the environment variables with which the guix daemon is started (where
exactly, depends on the init system).

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYr2OXBccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7lFqAP9sgj773Xq4KDhvl/nwlEOGLMXE
JaKLW/xGYZy+LBarFgD+K/5Zr0R/pIuoQtToGYV/l8INVi5yn+IzE8zM7as2nQU=
=V1zU
-----END PGP SIGNATURE-----


M
M
Maxime Devos wrote on 30 Jun 2022 13:55
(address . 56005@debbugs.gnu.org)
97dc0c596de2b8463436cc9acb33daffe971ea7f.camel@telenet.be
Maxime Devos schreef op do 30-06-2022 om 13:51 [+0200]:
Toggle quote (10 lines)
> > substitute: 348:26 8 (process-query #<output: file 4> _ #:cache
> urls _ #:acl _)
> > substitute: In guix/substitutes.scm:
> > substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure
> 7f22cf4534a0 …> …)
> > substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org"
> > _
> # _ # _)
> > substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ #

This is at the following ...

substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-
urls _ #:acl _)
substitute: In guix/substitutes.scm:
substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure
7f22cf4534a0 …> …)
substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ #
_ # _)
substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)

line. Looks like the error wasn't actually handled. The reason that
the backtrace was truncated, is that 'catch' was used instead of
'with-exception-handler' + #:unwind? #true. So to investigate the
error, it might be good to adjust 'call-with-connection-error-handling'
first.

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYr2PRhccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7pFWAP99W2HrasHQLq1TNyEyRKozdZDA
M2+1nIAoH3okKAq0sAEAsQWwEzvQMk1I56HmkfSozxceQwNF0L4801hLZMT50w4=
=aUtt
-----END PGP SIGNATURE-----


M
M
Maxime Devos wrote on 30 Jun 2022 13:56
(address . 56005@debbugs.gnu.org)
abbc2ba8624c9302aeff78a44e29d836cd29978a.camel@telenet.be
Maxime Devos schreef op do 30-06-2022 om 13:55 [+0200]:
Toggle quote (4 lines)
>
> the backtrace was truncated, is that 'catch' was used instead of
> 'with-exception-handler' + #:unwind? #true

Correction: #:unwind? #false.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYr2PcxccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7izwAP0Q6mmis4vEoumi2izARzPs/0dz
2dTmznagwXURw5+dnAD+Itecvcabe13XuMeZ41b9JbrFTV5BFc47CGEWR3iedws=
=si/k
-----END PGP SIGNATURE-----


M
M
Maxime Devos wrote on 1 Jul 2022 16:22
Re: 'guix system' failure due to substitute being temporarily 404 -> Implement a retry attempts?
1656685348683.68627@student.kuleuven.be
merge 56005 56320 56319
thanks

This doesn't look Guix System-specific to me and looks to have the same issue (i.e., something about write_wait_fd) as
https://issues.guix.gnu.org/56005, so merging into 56005.

Also, is the substitute server being down a hypothesis for the error but not actually known to actually have hapend, or something you have noticed the substitute server being down via other methods and believe it to be a plausible hypothesis for the error cause?

In all cases, I believe write_wait_fd: unimplemented to be a bug that needs to be investigated for the root cause and fixed, not something to accumulate work-arounds for and ignore by automatic retries.

Greetings,
Maxime.
L
L
Ludovic Courtès wrote on 3 Jul 2022 22:56
control message for bug #56320
(address . control@debbugs.gnu.org)
878rp9zzil.fsf@gnu.org
severity 56320 important
quit
M
M
Maxim Cournoyer wrote on 13 Jul 2022 14:47
control message for bug #56319
(address . control@debbugs.gnu.org)
878rox2n8y.fsf@gmail.com
close 56319
quit
T
T
Thiago Jung Bauermann wrote on 16 Jul 2022 05:34
Re: bug#56005: during substitution: write_wait_fd: unimplemented
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 56005@debbugs.gnu.org)
8735f1vi9o.fsf@kolabnow.com
Hello,

This bug was closed, but I couldn't find what was the resolution of the
problem. Could someone please clarify, for my own education and also so
that it gets documented in the bug report?

--
Thanks
Thiago
M
M
Maxime Devos wrote on 19 Jul 2022 21:50
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)
0a1b9cd5-8721-0bde-d20b-389db9b1440d@telenet.be
reopen 56005
thanks
(not sure if this works because I haven't set up plain-text instead of
HTML yet ...)
On 16-07-2022 05:34, Thiago Jung Bauermann wrote:
Toggle quote (5 lines)
> Hello,
>
> This bug was closed, but I couldn't find what was the resolution of the
> problem. Could someone please clarify, for my own education and also so
> that it gets documented in the bug report?
No reason was given at all for closing (normally it says "solved in
commit ..." or "cannot reproduce" or such ...) , so I'd assume that the
wrong bug number was targetted? Tentatively reopening ...
Greetings,
Maxime.
Attachment: OpenPGP_signature
T
T
Thiago Jung Bauermann wrote on 4 Aug 2022 16:46
Re: [bug#56867] [PATCH] download: Do not wrap TLS port on GnuTLS >= 3.7.7.
(name . Ludovic Courtès)(address . ludo@gnu.org)
87v8r86p7s.fsf@kolabnow.com
Hello Ludo,

I don't have any comment/insight on what you're doing in general, except
about one of your points below:

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (17 lines)
> First, I noticed that GnuTLS doesn’t implement ‘write_wait_fd’, only
> ‘read_wait_fd’ (not sure how problematic that is):
>
> scheme@(guile-user)> ,use(web client)
> scheme@(guile-user)> (define p (open-socket-for-uri "https://guix.gnu.org"))
> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-writable) p)
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> In procedure write_wait_fd: unimplemented
>
> Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
> scheme@(guile-user) [1]> ,q
> scheme@(guile-user)> ,use(gnutls)
> scheme@(guile-user)> (gnutls-version)
> $1 = "3.7.7"
> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-readable) p)
> $2 = 1

This occasionally causes problems when fetching substitutes, as can be
seen in bug #56005 (during substitution: write_wait_fd: unimplemented).

--
Thanks
Thiago
L
L
Ludovic Courtès wrote on 4 Aug 2022 18:19
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)
87iln8kmli.fsf@gnu.org
Hi,

Thiago Jung Bauermann <bauermann@kolabnow.com> skribis:

Toggle quote (22 lines)
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> First, I noticed that GnuTLS doesn’t implement ‘write_wait_fd’, only
>> ‘read_wait_fd’ (not sure how problematic that is):
>>
>> scheme@(guile-user)> ,use(web client)
>> scheme@(guile-user)> (define p (open-socket-for-uri "https://guix.gnu.org"))
>> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-writable) p)
>> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> In procedure write_wait_fd: unimplemented
>>
>> Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
>> scheme@(guile-user) [1]> ,q
>> scheme@(guile-user)> ,use(gnutls)
>> scheme@(guile-user)> (gnutls-version)
>> $1 = "3.7.7"
>> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-readable) p)
>> $2 = 1
>
> This occasionally causes problems when fetching substitutes, as can be
> seen in bug #56005 (during substitution: write_wait_fd: unimplemented).

Oh, I have not seen it but it’s weird: (guix scripts substitute) doesn’t
use O_NONBLOCK sockets, so I don’t get how it can hit that. Needs
investigation…

Thanks,
Ludo’.
M
M
Mathieu Othacehe wrote on 22 Sep 2022 16:35
control message for bug #57983
(address . control@debbugs.gnu.org)
87illf1np4.fsf@meije.mail-host-address-is-not-set
block 57983 by 56005
quit
L
L
Ludovic Courtès wrote on 14 Nov 2022 17:33
control message for bug #56320
(address . control@debbugs.gnu.org)
87pmdpo5qm.fsf@gnu.org
retitle 56320 "write_wait_fd: unimplemented" error from 'guix substitute'
quit
N
N
Nathan Dehnel wrote on 24 Apr 2023 00:27
(address . 56005@debbugs.gnu.org)
CAEEhgEsVzvQZQ-Yc8wmoDdQBqiPc2PqTfV0GPShnFV171gXNeA@mail.gmail.com
Can we please do a workaround for this until it is fixed? I can't
leave my machines while they're updating and have to keep babying
them, which is a massive waste of my time.

Specific suggestion, this flag:
--fallback fall back to building when the substituter fails
This "unimplemented" error is a substituter failure, so if it fails it
should fall back to building the package, no? Instead the entire
command fails and the upgrade is aborted.
L
L
Ludovic Courtès wrote on 3 May 2023 22:19
(name . Christopher Baines)(address . mail@cbaines.net)
87ild9yxq2.fsf@gnu.org
Christopher Baines <mail@cbaines.net> skribis:

Toggle quote (17 lines)
> Simon Tournier <zimon.toutoune@gmail.com> writes:
>
>> Hi,
>>
>> On Mon, 20 Feb 2023 at 11:46, Christopher Baines <mail@cbaines.net> wrote:
>>
>>> It's not, since it relates to code in the (guix substitutes) module.
>>
>> Do you mean that if "https://substitutes.nonguix.org" is incorrectly
>> configured, then the code in (guix substitutes) should handle the
>> error instead of crash with a backtrace?
>
> No, but to answer your question, yes.
>
> I don't think this is a server side code/configuration issue. Also see
> this older bug for the same issue https://issues.guix.gnu.org/56005

The Guile-GnuTLS change you submitted in
issue.

We have yet to put out a new Guile-GnuTLS release, but we should keep an
eye on it.

Ludo’.
L
L
Ludovic Courtès wrote on 23 May 2023 14:41
control message for bug #56319
(address . control@debbugs.gnu.org)
87mt1vusnm.fsf@gnu.org
merge 56319 56320
quit
M
M
Maxim Cournoyer wrote on 29 Aug 2023 22:26
control message for bug #56320
(address . control@debbugs.gnu.org)
87zg298uid.fsf@gmail.com
merge 56320 56319
quit
?