"write_wait_fd: unimplemented" error from 'guix substitute'

  • Open
Details
8 participants
  • Attila Lendvai
  • Thiago Jung Bauermann
  • Ludovic Courtès
  • Mathieu Othacehe
  • Maxim Cournoyer
  • Maxime Devos
  • Maxime Devos
  • Nathan Dehnel
Owner
unassigned
Submitted by
Thiago Jung Bauermann
Severity
important
Merged with
Thiago Jung Bauermann wrote on 16 Jun 2022 04:40
exception while downloading substitutes
(address . bug-guix@gnu.org)
87o7yt71sc.fsf@kolabnow.com
Hello,

I just ran “guix pull && guix package -u” and after downloading many
substitutes, guix package aborted with the following error:

substitute: updating substitutes from "https://ci.guix.gnu.org"... 0.0%Backtrace:
substitute: 14 (primitive-load "/gnu/store/yxh9kr0150494jf8phrf1x28mhw…")
substitute: In guix/ui.scm:
substitute: 2230:7 13 (run-guix . _)
substitute: 2193:10 12 (run-guix-command _ . _)
substitute: In ice-9/boot-9.scm:
substitute: 1752:10 11 (with-exception-handler _ _ #:unwind? _ # _)
substitute: 1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
substitute: In guix/scripts/substitute.scm:
substitute: 757:18 9 (_)
substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-urls _ #:acl _)
substitute: In guix/substitutes.scm:
substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure 7f22cf4534a0 …> …)
substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ # _ # _)
substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)
substitute: In ice-9/boot-9.scm:
substitute: 1685:16 4 (raise-exception _ #:continuable? _)
substitute: 1685:16 3 (raise-exception _ #:continuable? _)
substitute: 1780:13 2 (_ #<&compound-exception components: (#<&error> #<&orig…>)
substitute: 1685:16 1 (raise-exception _ #:continuable? _)
substitute: 1685:16 0 (raise-exception _ #:continuable? _)
substitute:
substitute: ice-9/boot-9.scm:1685:16: In procedure raise-exception:
substitute: In procedure write_wait_fd: unimplemented
guix package: error: `/gnu/store/yxh9kr0150494jf8phrf1x28mhwnnv7f-guix-command substitute' died unexpectedly

popigai: (1) guix describe
Generation 144 Jun 15 2022 22:55:58 (current)
guix 128697d
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 128697d43c21eb229ff5413f1c4cf79ae1a9dcd4

I immediately ran “guix package -u” again, and this time the command
completed successfully.

I had a quick look at ‘fetch-narinfos’ but I couldn't figure out what
could be calling into this unimplemented function ‘write_wait_fd’…

--
Thanks
Thiago
Maxime Devos wrote on 16 Jun 2022 09:58
Re: bug#56005: "write_wait_fd: unimplemented" during downloading substitutes
(address . control@debbugs.gnu.org)
31d325076ae0c552cc04a010956faed27dc35d2c.camel@telenet.be
retitle 56006 "write_wait_fd: unimplemented" during downloading substitutes
thanks

Thiago Jung Bauermann via Bug reports for GNU Guix wrote on Wed 15-06-2022 at 23:40 [-0300]:
> I had a quick look at ‘fetch-narinfos’ but I couldn't figure out what
> could be calling into this unimplemented function ‘write_wait_fd’…

Untested hypothesis: maybe writing to the TLS-wrapped port constructed
in (guix build download) blocked, so Guile's port code ran
'write_wait_fd', which was not set for the TLS port (in the GnuTLS
library)? Alternatively: maybe 'make-custom-binary-input/output-port'
in (guix build download) was used incorrectly?

Greetings,
Maxime.


Maxime Devos wrote on 30 Jun 2022 13:46
(address . control@debbugs.gnu.org)
b53635debe0151d44e23506d59b2270d7726b6f5.camel@telenet.be
retitle 56005 during substitution: write_wait_fd: unimplemented
thanks
Attila Lendvai wrote on 30 Jun 2022 13:48
(name . 56005@debbugs.gnu.org)(address . 56005@debbugs.gnu.org)
RVsXkQfVKVzXzCTwWdhqGEyZLP7ctm2Gn4Dd9UPvZxSrOequIWmWFDbUPB2pV_1KsUlvYH1Y5_iBfp_cLX35HdpdcJPPfPw9BD4Zgpk_euM=@lendvai.name
i'm also seeing this every once in a while.

some speculation: my router has QoS set up that limits the upstream, so that i avoid triggering my ISP's rate limiter, because it sends ping into the ballpark of seconds.

maybe because of this config i'm seeing this more regularly than others?

- attila
Maxime Devos wrote on 30 Jun 2022 13:51
Re: during substitution: write_wait_fd: unimplemented
(address . 56005@debbugs.gnu.org)
65897f5d969ca183b9505035bc60b8e047c1e582.camel@telenet.be
> substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-urls _ #:acl _)
> substitute: In guix/substitutes.scm:
> substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure 7f22cf4534a0 …> …)
> substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ # _ # _)
> substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)

For extra debugging information, you could try adding "COLUMNS=999999" to
the environment variables with which the guix daemon is started (where
exactly, depends on the init system).

Greetings,
Maxime.


Maxime Devos wrote on 30 Jun 2022 13:55
(address . 56005@debbugs.gnu.org)
97dc0c596de2b8463436cc9acb33daffe971ea7f.camel@telenet.be
Maxime Devos wrote on Thu 30-06-2022 at 13:51 [+0200]:
> > substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-urls _ #:acl _)
> > substitute: In guix/substitutes.scm:
> > substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure 7f22cf4534a0 …> …)
> > substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ # _ # _)
> > substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ #

This is at the following ...

substitute: 348:26 8 (process-query #<output: file 4> _ #:cache-urls _ #:acl _)
substitute: In guix/substitutes.scm:
substitute: 365:27 7 (lookup-narinfos/diverse _ _ #<procedure 7f22cf4534a0 …> …)
substitute: 322:31 6 (lookup-narinfos "https://ci.guix.gnu.org" _ # _ # _)
substitute: 245:26 5 (fetch-narinfos _ _ #:open-connection _ # _)

line. Looks like the error wasn't actually handled. The reason that
the backtrace was truncated, is that 'catch' was used instead of
'with-exception-handler' + #:unwind? #true. So to investigate the
error, it might be good to adjust 'call-with-connection-error-handling'
first.

Greetings,
Maxime.


Maxime Devos wrote on 30 Jun 2022 13:56
(address . 56005@debbugs.gnu.org)
abbc2ba8624c9302aeff78a44e29d836cd29978a.camel@telenet.be
Maxime Devos wrote on Thu 30-06-2022 at 13:55 [+0200]:
>
> the backtrace was truncated, is that 'catch' was used instead of
> 'with-exception-handler' + #:unwind? #true

Correction: #:unwind? #false.
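
For illustration, the difference between the two handling styles can be sketched in Guile as follows. This is a minimal sketch under assumptions, not the code of 'call-with-connection-error-handling': `simulated-failure`, `run-with-catch` and `run-with-handler` are hypothetical names, the error is raised locally instead of by a TLS port, and Guile 3.0 is assumed (where 'with-exception-handler' accepts #:unwind?).

```scheme
(use-modules (ice-9 control))   ; provides call/ec

;; Hypothetical stand-in for the failing network call.
(define (simulated-failure)
  (error "write_wait_fd: unimplemented (simulated)"))

;; Style 1: 'catch' with only a post-unwind handler.  The handler runs
;; after the stack has been unwound back to the 'catch' form.
(define (run-with-catch thunk)
  (catch #t
    thunk
    (lambda (key . args)
      (list 'caught key))))

;; Style 2: 'with-exception-handler' with #:unwind? #f.  The handler runs
;; in the dynamic context of the raise, so the frames leading up to the
;; error are still on the stack when 'backtrace' is called.  The handler
;; then leaves through an escape continuation instead of returning.
(define (run-with-handler thunk)
  (call/ec
   (lambda (return)
     (with-exception-handler
      (lambda (exn)
        (backtrace)
        (return 'handled))
      thunk
      #:unwind? #f))))

(run-with-catch simulated-failure)
(run-with-handler simulated-failure)
```

In this sketch only the second style prints a backtrace, and that backtrace should include the frame of `simulated-failure`, which is the kind of detail missing from the truncated output above.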


Maxime Devos wrote on 1 Jul 2022 16:22
Re: 'guix system' failure due to substitute being temporarily 404 -> Implement a retry attempts?
1656685348683.68627@student.kuleuven.be
merge 56005 56320 56319
thanks

This doesn't look Guix System-specific to me and looks to have the same issue (i.e., something about write_wait_fd) as
https://issues.guix.gnu.org/56005, so merging into 56005.

Also, is the substitute server being down only a hypothesis for the error, not something known to have actually happened? Or did you notice by other means that the substitute server was down, and consider that a plausible cause of the error?

In any case, I believe "write_wait_fd: unimplemented" to be a bug whose root cause needs to be investigated and fixed, not something to accumulate work-arounds for and ignore by automatic retries.

Greetings,
Maxime.
Ludovic Courtès wrote on 3 Jul 2022 22:56
control message for bug #56320
(address . control@debbugs.gnu.org)
878rp9zzil.fsf@gnu.org
severity 56320 important
quit
Maxim Cournoyer wrote on 13 Jul 2022 14:47
control message for bug #56319
(address . control@debbugs.gnu.org)
878rox2n8y.fsf@gmail.com
close 56319
quit
Thiago Jung Bauermann wrote on 16 Jul 2022 05:34
Re: bug#56005: during substitution: write_wait_fd: unimplemented
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 56005@debbugs.gnu.org)
8735f1vi9o.fsf@kolabnow.com
Hello,

This bug was closed, but I couldn't find what was the resolution of the
problem. Could someone please clarify, for my own education and also so
that it gets documented in the bug report?

--
Thanks
Thiago
Maxime Devos wrote on 19 Jul 2022 21:50
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)
0a1b9cd5-8721-0bde-d20b-389db9b1440d@telenet.be
reopen 56005
thanks
(not sure if this works because I haven't set up plain-text instead of
HTML yet ...)
On 16-07-2022 05:34, Thiago Jung Bauermann wrote:
> Hello,
>
> This bug was closed, but I couldn't find what was the resolution of the
> problem. Could someone please clarify, for my own education and also so
> that it gets documented in the bug report?
No reason was given at all for closing (normally it says "solved in
commit ..." or "cannot reproduce" or such ...), so I'd assume that the
wrong bug number was targeted? Tentatively reopening ...
Greetings,
Maxime.
Thiago Jung Bauermann wrote on 4 Aug 2022 16:46
Re: [bug#56867] [PATCH] download: Do not wrap TLS port on GnuTLS >= 3.7.7.
(name . Ludovic Courtès)(address . ludo@gnu.org)
87v8r86p7s.fsf@kolabnow.com
Hello Ludo,

I don't have any comment/insight on what you're doing in general, except
about one of your points below:

Ludovic Courtès <ludo@gnu.org> writes:

> First, I noticed that GnuTLS doesn’t implement ‘write_wait_fd’, only
> ‘read_wait_fd’ (not sure how problematic that is):
>
> scheme@(guile-user)> ,use(web client)
> scheme@(guile-user)> (define p (open-socket-for-uri "https://guix.gnu.org"))
> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-writable) p)
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> In procedure write_wait_fd: unimplemented
>
> Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
> scheme@(guile-user) [1]> ,q
> scheme@(guile-user)> ,use(gnutls)
> scheme@(guile-user)> (gnutls-version)
> $1 = "3.7.7"
> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-readable) p)
> $2 = 1

This occasionally causes problems when fetching substitutes, as can be
seen in bug #56005 (during substitution: write_wait_fd: unimplemented).

--
Thanks
Thiago
Ludovic Courtès wrote on 4 Aug 2022 18:19
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)
87iln8kmli.fsf@gnu.org
Hi,

Thiago Jung Bauermann <bauermann@kolabnow.com> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> First, I noticed that GnuTLS doesn’t implement ‘write_wait_fd’, only
>> ‘read_wait_fd’ (not sure how problematic that is):
>>
>> scheme@(guile-user)> ,use(web client)
>> scheme@(guile-user)> (define p (open-socket-for-uri "https://guix.gnu.org"))
>> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-writable) p)
>> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> In procedure write_wait_fd: unimplemented
>>
>> Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
>> scheme@(guile-user) [1]> ,q
>> scheme@(guile-user)> ,use(gnutls)
>> scheme@(guile-user)> (gnutls-version)
>> $1 = "3.7.7"
>> scheme@(guile-user)> ((@@ (ice-9 suspendable-ports) wait-for-readable) p)
>> $2 = 1
>
> This occasionally causes problems when fetching substitutes, as can be
> seen in bug #56005 (during substitution: write_wait_fd: unimplemented).

Oh, I have not seen it but it’s weird: (guix scripts substitute) doesn’t
use O_NONBLOCK sockets, so I don’t get how it can hit that. Needs
investigation…

Thanks,
Ludo’.
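
Whether a given socket is in non-blocking mode can be checked from Guile with 'fcntl'. The sketch below is only an illustration on a freshly created socket: `non-blocking?` and `set-non-blocking!` are hypothetical helper names, not procedures from Guix, and for a TLS-wrapped port the check would have to be applied to the underlying socket, which is not shown here.

```scheme
;; Return #t if the file descriptor behind PORT has O_NONBLOCK set.
(define (non-blocking? port)
  (not (zero? (logand O_NONBLOCK (fcntl port F_GETFL)))))

;; Switch PORT to non-blocking mode, preserving its other status flags.
(define (set-non-blocking! port)
  (fcntl port F_SETFL (logior O_NONBLOCK (fcntl port F_GETFL))))

;; A new socket starts out in blocking mode.
(define sock (socket PF_INET SOCK_STREAM 0))
(define before (non-blocking? sock))
(set-non-blocking! sock)
(define after (non-blocking? sock))
(close-port sock)

(format #t "before: ~a, after: ~a~%" before after)
```

Running a check like `non-blocking?` on the socket used by 'guix substitute' at the time of the failure would be one way to test the assumption that O_NONBLOCK is never set there.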
Mathieu Othacehe wrote on 22 Sep 2022 16:35
control message for bug #57983
(address . control@debbugs.gnu.org)
87illf1np4.fsf@meije.mail-host-address-is-not-set
block 57983 by 56005
quit
Ludovic Courtès wrote on 14 Nov 2022 17:33
control message for bug #56320
(address . control@debbugs.gnu.org)
87pmdpo5qm.fsf@gnu.org
retitle 56320 "write_wait_fd: unimplemented" error from 'guix substitute'
quit
Nathan Dehnel wrote on 24 Apr 2023 00:27
(address . 56005@debbugs.gnu.org)
CAEEhgEsVzvQZQ-Yc8wmoDdQBqiPc2PqTfV0GPShnFV171gXNeA@mail.gmail.com
Can we please do a workaround for this until it is fixed? I can't
leave my machines while they're updating and have to keep babying
them, which is a massive waste of my time.

Specific suggestion, this flag:
--fallback fall back to building when the substituter fails
This "unimplemented" error is a substituter failure, so if it fails it
should fall back to building the package, no? Instead the entire
command fails and the upgrade is aborted.
Ludovic Courtès wrote on 3 May 2023 22:19
(name . Christopher Baines)(address . mail@cbaines.net)
87ild9yxq2.fsf@gnu.org
Christopher Baines <mail@cbaines.net> writes:

> Simon Tournier <zimon.toutoune@gmail.com> writes:
>
>> Hi,
>>
>> On Mon, 20 Feb 2023 at 11:46, Christopher Baines <mail@cbaines.net> wrote:
>>
>>> It's not, since it relates to code in the (guix substitutes) module.
>>
>> Do you mean that if "https://substitutes.nonguix.org" is incorrectly
>> configured, then the code in (guix substitutes) should handle the
>> error instead of crash with a backtrace?
>
> No, but to answer your question, yes.
>
> I don't think this is a server side code/configuration issue. Also see
> this older bug for the same issue https://issues.guix.gnu.org/56005

The Guile-GnuTLS change you submitted in
issue.

We have yet to put out a new Guile-GnuTLS release, but we should keep an
eye on it.

Ludo’.
Ludovic Courtès wrote on 23 May 2023 14:41
control message for bug #56319
(address . control@debbugs.gnu.org)
87mt1vusnm.fsf@gnu.org
merge 56319 56320
quit
Maxim Cournoyer wrote on 29 Aug 2023 22:26
control message for bug #56320
(address . control@debbugs.gnu.org)
87zg298uid.fsf@gmail.com
merge 56320 56319
quit
To comment on this conversation send an email to 56005@debbugs.gnu.org