[PATCH] scripts: substitute: Add back some error handling.

  • Done
Details
2 participants
  • Ludovic Courtès
  • Christopher Baines
Owner
unassigned
Submitted by
Christopher Baines
Severity
normal
Christopher Baines wrote on 15 Mar 2021 16:11
(address . guix-patches@gnu.org)
20210315151133.16282-1-mail@cbaines.net
In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
process-substitution was changed. As with-cached-connection actually includes
important error handling for the opening of a HTTP request (when using a
cached connection), this change removed some error handling.

This commit adds that error handling back,
with-cached-connection/call-with-cached-connection is back, rebranded as
call-with-fresh-connection-retry.

* guix/scripts/substitute.scm (process-substitution): Retry once for some
errors when making HTTP requests to fetch substitutes.
---
guix/scripts/substitute.scm | 38 ++++++++++++++++++++++++++++++++-----
1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 6892aa999b..2c9b45023f 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -45,6 +45,7 @@
#:select (uri-abbreviation nar-uri-abbreviation
(open-connection-for-uri
. guix:open-connection-for-uri)))
+ #:autoload (gnutls) (error/invalid-session)
#:use-module (guix progress)
#:use-module ((guix build syscalls)
#:select (set-thread-name))
@@ -401,6 +402,31 @@ the current output port."
(apply dump-file/deduplicate
(append args (list #:store (%store-prefix)))))
+ (define (call-with-fresh-connection-retry uri proc)
+ (define (get-port)
+ (open-connection-for-uri/cached uri
+ #:verify-certificate? #f))
+
+ (let ((port (get-port)))
+ (catch #t
+ (lambda ()
+ (proc port))
+ (lambda (key . args)
+ ;; If PORT was cached and the server closed the connection in the
+ ;; meantime, we get EPIPE. In that case, open a fresh connection
+ ;; and retry. We might also get 'bad-response or a similar
+ ;; exception from (web response) later on, once we've sent the
+ ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
+ (if (or (and (eq? key 'system-error)
+ (= EPIPE (system-error-errno `(,key ,@args))))
+ (and (eq? key 'gnutls-error)
+ (eq? (first args) error/invalid-session))
+ (memq key '(bad-response bad-header bad-header-component)))
+ (begin
+ (close-port port) ; close the port to get a fresh one
+ (proc (get-port)))
+ (apply throw key args))))))
+
(define (fetch uri)
(case (uri-scheme uri)
((file)
@@ -424,11 +450,13 @@ the current output port."
(call-with-connection-error-handling
uri
(lambda ()
- (http-fetch uri #:text? #f
- #:open-connection open-connection-for-uri/cached
- #:keep-alive? #t
- #:buffered? #f
- #:verify-certificate? #f))))))
+ (call-with-fresh-connection-retry
+ uri
+ (lambda (port)
+ (http-fetch uri #:text? #f
+ #:port port
+ #:keep-alive? #t
+ #:buffered? #f))))))))
(else
(leave (G_ "unsupported substitute URI scheme: ~a~%")
(uri->string uri)))))
--
2.30.1
Ludovic Courtès wrote on 15 Mar 2021 16:20
(name . Christopher Baines)(address . mail@cbaines.net)(address . 47160@debbugs.gnu.org)
8735wwh29g.fsf@gnu.org
Hi,

Christopher Baines <mail@cbaines.net> skribis:

> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
> process-substitution was changed. As with-cached-connection actually includes
> important error handling for the opening of a HTTP request (when using a
> cached connection), this change removed some error handling.
>
> This commit adds that error handling back,
> with-cached-connection/call-with-cached-connection is back, rebranded as
> call-with-fresh-connection-retry.
>
> * guix/scripts/substitute.scm (process-substitution): Retry once for some
> errors when making HTTP requests to fetch substitutes.

[...]

> + (define (call-with-fresh-connection-retry uri proc)
> + (define (get-port)
> + (open-connection-for-uri/cached uri
> + #:verify-certificate? #f))
> +
> + (let ((port (get-port)))
> + (catch #t
> + (lambda ()
> + (proc port))
> + (lambda (key . args)
> + ;; If PORT was cached and the server closed the connection in the
> + ;; meantime, we get EPIPE. In that case, open a fresh connection
> + ;; and retry. We might also get 'bad-response or a similar
> + ;; exception from (web response) later on, once we've sent the
> + ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
> + (if (or (and (eq? key 'system-error)
> + (= EPIPE (system-error-errno `(,key ,@args))))
> + (and (eq? key 'gnutls-error)
> + (eq? (first args) error/invalid-session))
> + (memq key '(bad-response bad-header bad-header-component)))
> + (begin
> + (close-port port) ; close the port to get a fresh one
> + (proc (get-port)))
> + (apply throw key args))))))

I think this should be at the top level for clarity. It used to have
‘cached’ in its name because catching all these exceptions is something
you wouldn’t normally do; it only makes sense in the context of cached
connections.

> (define (fetch uri)
> (case (uri-scheme uri)
> ((file)
> @@ -424,11 +450,13 @@ the current output port."
> (call-with-connection-error-handling
> uri
> (lambda ()
> - (http-fetch uri #:text? #f
> - #:open-connection open-connection-for-uri/cached
> - #:keep-alive? #t
> - #:buffered? #f
> - #:verify-certificate? #f))))))
> + (call-with-fresh-connection-retry
> + uri
> + (lambda (port)
> + (http-fetch uri #:text? #f
> + #:port port
> + #:keep-alive? #t
> + #:buffered? #f))))))))

Does ‘call-with-connection-error-handling’ still make sense here?
There’s already ‘with-networking’ at the top level to do proper
networking error reporting.

Regarding https://issues.guix.gnu.org/47157, I would lean towards
perhaps reverting the connection/error-handling patch series and
starting anew from that “known state”.

This area is unfortunately quite tedious to test and to get right, so I’d
err on the side of conservative, incremental changes.

Thoughts?

Ludo’.
Christopher Baines wrote on 15 Mar 2021 17:15
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 47160@debbugs.gnu.org)
874khcfl6j.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
>> process-substitution was changed. As with-cached-connection actually includes
>> important error handling for the opening of a HTTP request (when using a
>> cached connection), this change removed some error handling.
>>
>> This commit adds that error handling back,
>> with-cached-connection/call-with-cached-connection is back, rebranded as
>> call-with-fresh-connection-retry.
>>
>> * guix/scripts/substitute.scm (process-substitution): Retry once for some
>> errors when making HTTP requests to fetch substitutes.
>
> [...]
>
>> + (define (call-with-fresh-connection-retry uri proc)
>> + (define (get-port)
>> + (open-connection-for-uri/cached uri
>> + #:verify-certificate? #f))
>> +
>> + (let ((port (get-port)))
>> + (catch #t
>> + (lambda ()
>> + (proc port))
>> + (lambda (key . args)
>> + ;; If PORT was cached and the server closed the connection in the
>> + ;; meantime, we get EPIPE. In that case, open a fresh connection
>> + ;; and retry. We might also get 'bad-response or a similar
>> + ;; exception from (web response) later on, once we've sent the
>> + ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
>> + (if (or (and (eq? key 'system-error)
>> + (= EPIPE (system-error-errno `(,key ,@args))))
>> + (and (eq? key 'gnutls-error)
>> + (eq? (first args) error/invalid-session))
>> + (memq key '(bad-response bad-header bad-header-component)))
>> + (begin
>> + (close-port port) ; close the port to get a fresh one
>> + (proc (get-port)))
>> + (apply throw key args))))))
>
> I think this should be at the top level for clarity. It used to have
> ‘cached’ in its name because catching all these exceptions is something
> you wouldn’t normally do; it only makes sense in the context of cached
> connections.

I initially tried to put the error handling just where it's
needed, but that was difficult since the http-fetch call needs to happen
again when there's a relevant error.

The procedure does two things: it gets a port, which may be a cached
connection, and it handles some errors, potentially re-running proc.
That is difficult to capture in a name, but
"call-with-cached-connection-and-error-handling" is an improvement over
"with-cached-connection", I think.

>> (define (fetch uri)
>> (case (uri-scheme uri)
>> ((file)
>> @@ -424,11 +450,13 @@ the current output port."
>> (call-with-connection-error-handling
>> uri
>> (lambda ()
>> - (http-fetch uri #:text? #f
>> - #:open-connection open-connection-for-uri/cached
>> - #:keep-alive? #t
>> - #:buffered? #f
>> - #:verify-certificate? #f))))))
>> + (call-with-fresh-connection-retry
>> + uri
>> + (lambda (port)
>> + (http-fetch uri #:text? #f
>> + #:port port
>> + #:keep-alive? #t
>> + #:buffered? #f))))))))
>
> Does ‘call-with-connection-error-handling’ still make sense here?
> There’s already ‘with-networking’ at the top level to do proper
> networking error reporting.

So, looking back, the call-with-connection-error-handling error handling
was related to (call-)with-cached-connection, but it was only relevant
inside of fetch-narinfos, as that's when open-connection-for-uri/maybe
was passed in to call-with-cached-connection.

Which means no, I think it can be removed, at least that's more
consistent with the older behaviour.

I'll send some updated patches.

> Regarding https://issues.guix.gnu.org/47157, I would lean towards
> perhaps reverting the connection/error-handling patch series and
> starting anew from that “known state”.
>
> This area is unfortunately quite tedious to test and to get right so I’d
> err on the path of conservative, incremental changes.
>
> Thought?

My preference is still to try and move forward and to make the error
handling easier to see in the code.

Particularly with this change, I think the problem was introduced in
this commit [1], but I think it's hard to tell from the diff, since the
error handling and retrying are within with-cached-connection.

1: https://git.savannah.gnu.org/cgit/guix.git/commit/?id=f50f5751fff4cfc6d5abba9681054569694b7a5c

That commit was one of the commits where I was making small incremental
changes prior to actually getting to the changes I was looking at
making, but a breakage was still introduced.

What I was thinking about with this patch was how to make the error
handling being added back here easier to see, and thus harder to
break/remove.
Christopher Baines wrote on 15 Mar 2021 17:15
[PATCH v2 2/2] scripts: substitute: Tweak error reporting in process-substitution.
(address . 47160@debbugs.gnu.org)
20210315161532.1716-2-mail@cbaines.net
The call-with-connection-error-handling was added in
20c08a8a45d0f137ead7c05e720456b2aea44402, but that error handling was
previously inside of open-connection-for-uri/maybe, which is related
to (call-)with-cached-connection which was used in process-substitution, but
only actually used with call-with-cached-connection when used in
fetch-narinfos.

There's some handling for similar errors within with-networking, which is used
within process-substitution.

* guix/scripts/substitute.scm (process-substitution): Remove
call-with-connection-error-handling call.
---
guix/scripts/substitute.scm | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 16ba28455f..997e2565e0 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -447,16 +447,13 @@ the current output port."
(warning (G_ "while fetching ~a: server is somewhat slow~%")
(uri->string uri))
(warning (G_ "try `--no-substitutes' if the problem persists~%")))
- (call-with-connection-error-handling
+ (call-with-cached-connection-and-error-handling
uri
- (lambda ()
- (call-with-cached-connection-and-error-handling
- uri
- (lambda (port)
- (http-fetch uri #:text? #f
- #:port port
- #:keep-alive? #t
- #:buffered? #f))))))))
+ (lambda (port)
+ (http-fetch uri #:text? #f
+ #:port port
+ #:keep-alive? #t
+ #:buffered? #f))))))
(else
(leave (G_ "unsupported substitute URI scheme: ~a~%")
(uri->string uri)))))
--
2.30.1
Christopher Baines wrote on 15 Mar 2021 17:15
[PATCH v2 1/2] scripts: substitute: Add back some error handling.
(address . 47160@debbugs.gnu.org)
20210315161532.1716-1-mail@cbaines.net
In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
process-substitution was changed. As with-cached-connection actually includes
important error handling for the opening of a HTTP request (when using a
cached connection), this change removed some error handling.

This commit adds that error handling back,
(call-)with-cached-connection is back, rebranded as
call-with-cached-connection-and-error-handling.

* guix/scripts/substitute.scm (process-substitution): Retry once for some
errors when making HTTP requests to fetch substitutes.
---
guix/scripts/substitute.scm | 38 ++++++++++++++++++++++++++++++++-----
1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 6892aa999b..16ba28455f 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -45,6 +45,7 @@
#:select (uri-abbreviation nar-uri-abbreviation
(open-connection-for-uri
. guix:open-connection-for-uri)))
+ #:autoload (gnutls) (error/invalid-session)
#:use-module (guix progress)
#:use-module ((guix build syscalls)
#:select (set-thread-name))
@@ -377,6 +378,31 @@ server certificates."
(drain-input socket)
socket))))))))
+(define (call-with-cached-connection-and-error-handling uri proc)
+ (define (get-port)
+ (open-connection-for-uri/cached uri
+ #:verify-certificate? #f))
+
+ (let ((port (get-port)))
+ (catch #t
+ (lambda ()
+ (proc port))
+ (lambda (key . args)
+ ;; If PORT was cached and the server closed the connection in the
+ ;; meantime, we get EPIPE. In that case, open a fresh connection
+ ;; and retry. We might also get 'bad-response or a similar
+ ;; exception from (web response) later on, once we've sent the
+ ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
+ (if (or (and (eq? key 'system-error)
+ (= EPIPE (system-error-errno `(,key ,@args))))
+ (and (eq? key 'gnutls-error)
+ (eq? (first args) error/invalid-session))
+ (memq key '(bad-response bad-header bad-header-component)))
+ (begin
+ (close-port port) ; close the port to get a fresh one
+ (proc (get-port)))
+ (apply throw key args))))))
+
(define* (process-substitution store-item destination
#:key cache-urls acl
deduplicate? print-build-trace?)
@@ -424,11 +450,13 @@ the current output port."
(call-with-connection-error-handling
uri
(lambda ()
- (http-fetch uri #:text? #f
- #:open-connection open-connection-for-uri/cached
- #:keep-alive? #t
- #:buffered? #f
- #:verify-certificate? #f))))))
+ (call-with-cached-connection-and-error-handling
+ uri
+ (lambda (port)
+ (http-fetch uri #:text? #f
+ #:port port
+ #:keep-alive? #t
+ #:buffered? #f))))))))
(else
(leave (G_ "unsupported substitute URI scheme: ~a~%")
(uri->string uri)))))
--
2.30.1
Ludovic Courtès wrote on 15 Mar 2021 21:51
Re: bug#47160: [PATCH] scripts: substitute: Add back some error handling.
(name . Christopher Baines)(address . mail@cbaines.net)(address . 47160@debbugs.gnu.org)
87pn00b0p5.fsf_-_@gnu.org
Christopher Baines <mail@cbaines.net> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> Regarding https://issues.guix.gnu.org/47157, I would lean towards
>> perhaps reverting the connection/error-handling patch series and
>> starting anew from that “known state”.
>>
>> This area is unfortunately quite tedious to test and to get right so I’d
>> err on the path of conservative, incremental changes.
>>
>> Thought?
>
> My preference is still to try and move forward and to make the error
> handling easier to see in the code.
>
> Particularly with this change, I think the problem was introduced in
> this commit [1], but I think it's hard to tell from the diff, since the
> error handling and retrying is within with-cached-connection.
>
> 1: https://git.savannah.gnu.org/cgit/guix.git/commit/?id=f50f5751fff4cfc6d5abba9681054569694b7a5c
>
> That commit was one of the commits where I was making small incremental
> changes prior to actually getting to the changes I was looking at
> making, but a breakage was still introduced.
>
> What I was thinking about with this patch was how to make the error
> handling being added back here easier to see, and thus harder to
> break/remove.

OK.

Though I’m still unsure what the patch series starting at
7b812f7c84c43455cdd68a0e51b6ded018afcc8e was about. What was the end
goal?

I also wonder if it introduced other issues. For
example, 7b812f7c84c43455cdd68a0e51b6ded018afcc8e replaced a reference
to ‘open-connection-for-uri/cached’ by one to
‘open-connection-for-uri/maybe’. Are we still using cached connections?

Commit f50f5751fff4cfc6d5abba9681054569694b7a5c no longer passes the
#:port parameter to ‘http-fetch’.

Commit 20c08a8a45d0f137ead7c05e720456b2aea44402 does other things but at
first sight I’m not sure what the effect is.

If you’re confident we can move forward to fix the bug, that’s great
(though we’ll need a good deal of testing), but I’d still like to
clarify these points later on.

Ludo’.
Christopher Baines wrote on 15 Mar 2021 22:33
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 47160@debbugs.gnu.org)
87k0q8drv8.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> Regarding <https://issues.guix.gnu.org/47157>, I would lean towards
>>> perhaps reverting the connection/error-handling patch series and
>>> starting anew from that “known state”.
>>>
>>> This area is unfortunately quite tedious to test and to get right so I’d
>>> err on the path of conservative, incremental changes.
>>>
>>> Thought?
>>
>> My preference is still to try and move forward and to make the error
>> handling easier to see in the code.
>>
>> Particularly with this change, I think the problem was introduced in
>> this commit [1], but I think it's hard to tell from the diff, since the
>> error handling and retrying is within with-cached-connection.
>>
>> 1: https://git.savannah.gnu.org/cgit/guix.git/commit/?id=f50f5751fff4cfc6d5abba9681054569694b7a5c
>>
>> That commit was one of the commits where I was making small incremental
>> changes prior to actually getting to the changes I was looking at
>> making, but a breakage was still introduced.
>>
>> What I was thinking about with this patch was how to make the error
>> handling being added back here easier to see, and thus harder to
>> break/remove.
>
> OK.
>
> Though I’m still unsure what the patch series starting at
> 7b812f7c84c43455cdd68a0e51b6ded018afcc8e was about. What was the end
> goal?

So that was part of the creation of the (guix substitutes) module;
unpicking the code in the script to separate out some of the connection
caching was a prerequisite (discussion starts here:
https://issues.guix.gnu.org/45409#5 ).

I think separating out that module is still a good thing. It has allowed
for improvements in Guix: the weather script no longer calls into the
substitute script code, for example. I'd also like the separation for
things like the Guix Build Coordinator, which currently attempts to use
the substitute code from Guix.

> I also wonder if it introduced other issues. For
> example, 7b812f7c84c43455cdd68a0e51b6ded018afcc8e replaced a reference
> to ‘open-connection-for-uri/cached’ by one to
> ‘open-connection-for-uri/maybe’. Are we still using cached
> connections?

At least on that commit, open-connection-for-uri/maybe calls
open-connection-for-uri/cached, so yes, still using cached connections.

> Commit f50f5751fff4cfc6d5abba9681054569694b7a5c no longer passes the
> #:port parameter to ‘http-fetch’.

Yeah, that change is sort of fine if you're just looking at how the
port/connection is handled, but that area is being fixed up here, and
because closing the port is something that happens, it's better to also
pass the port in.

> Commit 20c08a8a45d0f137ead7c05e720456b2aea44402 does other things but at
> first sight I’m not sure what the effect is.

So, open-connection-for-uri/maybe is like
open-connection-for-uri/cached, but it catches a couple of exceptions
relating to not being able to connect to a substitute server, it also
remembers about showing the messages.

The second commit here is changing that slightly, to not apply to
process-substitution, however I do think that code might have applied in
the past (as open-connection-for-uri/maybe was used I believe). But I
think you're right in saying there's probably some overlap between the
error handling here and done by with-networking.

> If you’re confident we can move forward to fix the bug, that’s great
> (though we’ll need a good deal of testing), but I’d still like to
> clarify these points later on.

Well, the changes I'm suggesting here seem reasonable to me. As for
testing, checking things basically work is easy enough, but I don't
currently have many ideas for how to test for when fetching things
doesn't go to plan (which can of course happen).
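
One way to exercise the retry path without a real server might be to drive
a retry wrapper with a procedure that fails on its first call. The
following is a minimal Guile sketch under stated assumptions, not the code
from the patch: `call-with-retry-once' and `flaky-proc' are hypothetical
names, only the EPIPE case is handled, and a string port stands in for a
socket.

```scheme
;; Sketch only: `call-with-retry-once' and `flaky-proc' are hypothetical
;; stand-ins, not procedures from (guix scripts substitute).
(define (call-with-retry-once get-port proc)
  ;; Call PROC with a port from GET-PORT.  On EPIPE, close that port and
  ;; call PROC once more with a new port; re-throw any other system error.
  (let ((port (get-port)))
    (catch 'system-error
      (lambda ()
        (proc port))
      (lambda args
        (if (= EPIPE (system-error-errno args))
            (begin
              (close-port port)
              (proc (get-port)))
            (apply throw args))))))

;; Number of times FLAKY-PROC has been called.
(define attempts 0)

(define (flaky-proc port)
  ;; Fail with EPIPE on the first call, as a write to a stale cached
  ;; connection might, and succeed afterwards.
  (set! attempts (+ attempts 1))
  (if (= attempts 1)
      (throw 'system-error "write" "~A"
             (list (strerror EPIPE)) (list EPIPE))
      'fetched))

;; A string port stands in for a network connection here.
(define result
  (call-with-retry-once (lambda () (open-input-string ""))
                        flaky-proc))
```

Checking that `result' is the value returned by the second call and that
`attempts' is 2 confirms the wrapper retried exactly once. This does not
cover the GnuTLS or (web response) errors, which would need their own
simulated failures.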
Ludovic Courtès wrote on 16 Mar 2021 21:34
(name . Christopher Baines)(address . mail@cbaines.net)(address . 47160@debbugs.gnu.org)
87zgz27s7j.fsf_-_@gnu.org
Howdy!

Christopher Baines <mail@cbaines.net> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> Though I’m still unsure what the patch series starting at
>> 7b812f7c84c43455cdd68a0e51b6ded018afcc8e was about. What was the end
>> goal?
>
> So that was part of the creation of the (guix substitutes) module,
> unpicking the code in the script to separate out some of the connection
> caching was a prerequisite (discussion starts here
> https://issues.guix.gnu.org/45409#5 ).
>
> I think separating out that module is still a good thing. It's allowed
> for improvements in guix, the weather script doesn't now call in to the
> substitute script code for example. I'd also like the separation for
> things like the Guix Build Coordinator, which currently attempts to use
> the substitute code from Guix.

Right, I agree this is a worthy goal. Untangling the stateful bits is
the hard part, as we see. :-)

>> I also wonder if it introduced other issues. For
>> example, 7b812f7c84c43455cdd68a0e51b6ded018afcc8e replaced a reference
>> to ‘open-connection-for-uri/cached’ by one to
>> ‘open-connection-for-uri/maybe’. Are we still using cached
>> connections?
>
> At least on that commit, open-connection-for-uri/maybe calls
> open-connection-for-uri/cached, so yes, still using cached connections.

OK.

>> Commit f50f5751fff4cfc6d5abba9681054569694b7a5c no longer passes the
>> #:port parameter to ‘http-fetch’.
>
> Yeah, that change is sort of fine if you're just looking at how the
> port/connection is handled, but that area is being fixed up here, and
> because closing the port is something that happens, it's better to also
> pass the port in.

OK.

>> Commit 20c08a8a45d0f137ead7c05e720456b2aea44402 does other things but at
>> first sight I’m not sure what the effect is.
>
> So, open-connection-for-uri/maybe is like
> open-connection-for-uri/cached, but it catches a couple of exceptions
> relating to not being able to connect to a substitute server, it also
> remembers about showing the messages.
>
> The second commit here is changing that slightly, to not apply to
> process-substitution, however I do think that code might have applied in
> the past (as open-connection-for-uri/maybe was used I believe). But I
> think you're right in saying there's probably some overlap between the
> error handling here and done by with-networking.

Alright.

>> If you’re confident we can move forward to fix the bug, that’s great
>> (though we’ll need a good deal of testing), but I’d still like to
>> clarify these points later on.
>
> Well, the changes I'm suggesting here seem reasonable to me. As for
> testing, checking things basically work is easy enough, but I don't
> currently have many ideas for how to test for when fetching things
> doesn't go to plan (which can of course happen).

I’ll do some testing of v2 on my end and report back.

Thanks for the explanations!

Ludo’.
Ludovic Courtès wrote on 16 Mar 2021 22:30
(name . Christopher Baines)(address . mail@cbaines.net)(address . 47160@debbugs.gnu.org)
87k0q67pn8.fsf_-_@gnu.org
Christopher Baines <mail@cbaines.net> skribis:

> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
> process-substitution was changed. As with-cached-connection actually includes
> important error handling for the opening of a HTTP request (when using a
> cached connection), this change removed some error handling.
>
> This commit adds that error handling back,
> (call-)with-cached-connection is back, rebranded as
> call-with-cached-connection-and-error-handling.
>
> * guix/scripts/substitute.scm (process-substitution): Retry once for some
> errors when making HTTP requests to fetch substitutes.

Please also mention the new procedure, and add a
“Fixes <https://bugs.gnu.org/47157>.” line.

> +(define (call-with-cached-connection-and-error-handling uri proc)
> + (define (get-port)
> + (open-connection-for-uri/cached uri
> + #:verify-certificate? #f))
> +
> + (let ((port (get-port)))
> + (catch #t
> + (lambda ()
> + (proc port))
> + (lambda (key . args)
> + ;; If PORT was cached and the server closed the connection in the
> + ;; meantime, we get EPIPE. In that case, open a fresh connection
> + ;; and retry. We might also get 'bad-response or a similar
> + ;; exception from (web response) later on, once we've sent the
> + ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
> + (if (or (and (eq? key 'system-error)
> + (= EPIPE (system-error-errno `(,key ,@args))))
> + (and (eq? key 'gnutls-error)
> + (eq? (first args) error/invalid-session))
> + (memq key '(bad-response bad-header bad-header-component)))
> + (begin
> + (close-port port) ; close the port to get a fresh one
> + (proc (get-port)))

I find it marginally clearer to pass #:fresh? #t (as was done in
the code removed in 7c85877fdf964694061e3192eac35723ebc047bf) than to
rely on the closed-port side effect.

I think it’s OK to remove ‘-and-error-handling’ because that doesn’t
really tell much and because too many words obscure the message IMO, but
that’s a detail. I also like the helper macro as was removed in
7c85877fdf964694061e3192eac35723ebc047bf.

Apart from that LGTM.

My limited testing suggests it’s working as intended.

Thank you!

Ludo’.
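
The two suggestions above (asking for a fresh connection explicitly with
#:fresh? #t, and wrapping the procedure in a helper macro) could look
roughly like the Guile sketch below. It is an illustration under stated
assumptions, not the code removed in 7c85877fdf964694061e3192eac35723ebc047bf:
`%cache' and `open-connection/cached' are hypothetical stand-ins for the
real connection cache, a string port stands in for a socket, and only the
(web response) error keys are retried.

```scheme
;; Sketch only: a toy cache standing in for the real connection cache.
(define %cache (make-hash-table))

(define* (open-connection/cached uri #:key fresh?)
  ;; Return the cached port for URI.  When FRESH? is true or nothing is
  ;; cached yet, open a new port, replacing (and closing) any cached one.
  (let ((cached (hash-ref %cache uri)))
    (if (and cached (not fresh?))
        cached
        (let ((port (open-input-string uri)))
          (when cached
            (close-port cached))
          (hash-set! %cache uri port)
          port))))

(define (call-with-cached-connection uri proc)
  ;; Call PROC with a possibly cached port.  If it fails in a way that
  ;; suggests a stale connection, retry once with an explicitly fresh port.
  (catch #t
    (lambda ()
      (proc (open-connection/cached uri)))
    (lambda (key . args)
      (if (memq key '(bad-response bad-header bad-header-component))
          (proc (open-connection/cached uri #:fresh? #t))
          (apply throw key args)))))

;; Helper macro: bind PORT in the body EXP ...
(define-syntax-rule (with-cached-connection uri port exp ...)
  (call-with-cached-connection uri (lambda (port) exp ...)))

;; Usage: the body fails once, then succeeds with the fresh port.
(define calls 0)

(define outcome
  (with-cached-connection "https://example.org" port
    (set! calls (+ calls 1))
    (if (= calls 1)
        (throw 'bad-response "stale connection")
        (port? port))))
```

Requesting the fresh port through a keyword argument makes the retry
explicit at the call site, rather than depending on the cache noticing
that the old port was closed.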
Christopher Baines wrote on 17 Mar 2021 00:11
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 47160@debbugs.gnu.org)
87eegeelsj.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
>> process-substitution was changed. As with-cached-connection actually includes
>> important error handling for the opening of a HTTP request (when using a
>> cached connection), this change removed some error handling.
>>
>> This commit adds that error handling back,
>> (call-)with-cached-connection is back, rebranded as
>> call-with-cached-connection-and-error-handling.
>>
>> * guix/scripts/substitute.scm (process-substitution): Retry once for some
>> errors when making HTTP requests to fetch substitutes.
>
> Please mention also the new procedure, and a
> “Fixes <https://bugs.gnu.org/47157>.” line
>
>> +(define (call-with-cached-connection-and-error-handling uri proc)
>> + (define (get-port)
>> + (open-connection-for-uri/cached uri
>> + #:verify-certificate? #f))
>> +
>> + (let ((port (get-port)))
>> + (catch #t
>> + (lambda ()
>> + (proc port))
>> + (lambda (key . args)
>> + ;; If PORT was cached and the server closed the connection in the
>> + ;; meantime, we get EPIPE. In that case, open a fresh connection
>> + ;; and retry. We might also get 'bad-response or a similar
>> + ;; exception from (web response) later on, once we've sent the
>> + ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
>> + (if (or (and (eq? key 'system-error)
>> + (= EPIPE (system-error-errno `(,key ,@args))))
>> + (and (eq? key 'gnutls-error)
>> + (eq? (first args) error/invalid-session))
>> + (memq key '(bad-response bad-header bad-header-component)))
>> + (begin
>> + (close-port port) ; close the port to get a fresh one
>> + (proc (get-port)))
>
> I find it marginally clearer to pass #:fresh? #t (as was done in
> the code removed in 7c85877fdf964694061e3192eac35723ebc047bf) than to
> rely on the closed-port side effect.
>
> I think it’s OK to remove ‘-and-error-handling’ because that doesn’t
> really tell much and because too many words obscure the message IMO, but
> that’s a detail. I also like the helper macro as was removed in
> 7c85877fdf964694061e3192eac35723ebc047bf.

Sure, I'll send a v3 set of patches shortly.

Christopher Baines wrote on 17 Mar 2021 00:46
[PATCH 2/2] scripts: substitute: Tweak error reporting in process-substitution.
(address . 47160@debbugs.gnu.org)
20210316234628.24479-2-mail@cbaines.net
The call-with-connection-error-handling was added in
20c08a8a45d0f137ead7c05e720456b2aea44402, but that error handling was
previously inside of open-connection-for-uri/maybe, which is related
to (call-)with-cached-connection which was used in process-substitution, but
only actually used with call-with-cached-connection when used in
fetch-narinfos.

There's some handling for similar errors within with-networking, which is used
within process-substitution.

* guix/scripts/substitute.scm (process-substitution): Remove
call-with-connection-error-handling call.
---
guix/scripts/substitute.scm | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 812f2999ab..2bbbafe204 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -448,14 +448,11 @@ the current output port."
(warning (G_ "while fetching ~a: server is somewhat slow~%")
(uri->string uri))
(warning (G_ "try `--no-substitutes' if the problem persists~%")))
- (call-with-connection-error-handling
- uri
- (lambda ()
- (with-cached-connection uri port
- (http-fetch uri #:text? #f
- #:port port
- #:keep-alive? #t
- #:buffered? #f)))))))
+ (with-cached-connection uri port
+ (http-fetch uri #:text? #f
+ #:port port
+ #:keep-alive? #t
+ #:buffered? #f)))))
(else
(leave (G_ "unsupported substitute URI scheme: ~a~%")
(uri->string uri)))))
--
2.30.1
Christopher Baines wrote on 17 Mar 2021 00:46
[PATCH 1/2] scripts: substitute: Add back some error handling.
(address . 47160@debbugs.gnu.org)
20210316234628.24479-1-mail@cbaines.net
In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
process-substitution was changed. As call-with-cached-connection actually
includes important error handling for the opening of a HTTP request, this
change removed some error handling. This commit adds that back.


* guix/scripts/substitute.scm (call-with-cached-connection): New procedure.
(with-cached-connection): New syntax rule.
(process-substitution): Retry once for some errors when making HTTP requests
to fetch substitutes.
---
guix/scripts/substitute.scm | 39 ++++++++++++++++++++++++++++++++-----
1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 6892aa999b..812f2999ab 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -45,6 +45,7 @@
#:select (uri-abbreviation nar-uri-abbreviation
(open-connection-for-uri
. guix:open-connection-for-uri)))
+ #:autoload (gnutls) (error/invalid-session)
#:use-module (guix progress)
#:use-module ((guix build syscalls)
#:select (set-thread-name))
@@ -377,6 +378,32 @@ server certificates."
(drain-input socket)
socket))))))))
+(define (call-with-cached-connection uri proc)
+ (let ((port (open-connection-for-uri/cached uri
+ #:verify-certificate? #f)))
+ (catch #t
+ (lambda ()
+ (proc port))
+ (lambda (key . args)
+ ;; If PORT was cached and the server closed the connection in the
+ ;; meantime, we get EPIPE. In that case, open a fresh connection
+ ;; and retry. We might also get 'bad-response or a similar
+ ;; exception from (web response) later on, once we've sent the
+ ;; request, or a ERROR/INVALID-SESSION from GnuTLS.
+ (if (or (and (eq? key 'system-error)
+ (= EPIPE (system-error-errno `(,key ,@args))))
+ (and (eq? key 'gnutls-error)
+ (eq? (first args) error/invalid-session))
+ (memq key '(bad-response bad-header bad-header-component)))
+ (proc (open-connection-for-uri/cached uri
+ #:verify-certificate? #f
+ #:fresh? #t))
+ (apply throw key args))))))
+
+(define-syntax-rule (with-cached-connection uri port exp ...)
+ "Bind PORT with EXP... to a socket connected to URI."
+ (call-with-cached-connection uri (lambda (port) exp ...)))
+
(define* (process-substitution store-item destination
#:key cache-urls acl
deduplicate? print-build-trace?)
@@ -424,11 +451,11 @@ the current output port."
(call-with-connection-error-handling
uri
(lambda ()
- (http-fetch uri #:text? #f
- #:open-connection open-connection-for-uri/cached
- #:keep-alive? #t
- #:buffered? #f
- #:verify-certificate? #f))))))
+ (with-cached-connection uri port
+ (http-fetch uri #:text? #f
+ #:port port
+ #:keep-alive? #t
+ #:buffered? #f)))))))
(else
(leave (G_ "unsupported substitute URI scheme: ~a~%")
(uri->string uri)))))
@@ -715,6 +742,8 @@ if needed, as expected by the daemon's agent."
;;; Local Variables:
;;; eval: (put 'with-timeout 'scheme-indent-function 1)
;;; eval: (put 'with-redirected-error-port 'scheme-indent-function 0)
+;;; eval: (put 'with-cached-connection 'scheme-indent-function 2)
+;;; eval: (put 'call-with-cached-connection 'scheme-indent-function 1)
;;; End:
;;; substitute.scm ends here
--
2.30.1
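
For readers skimming the diff, the retry-once pattern that call-with-cached-connection implements can be sketched in isolation. This is a minimal sketch and not the code from the patch: open-connection, fetch, call-with-retry, and the 'stale-port / 'fresh-port symbols are hypothetical stand-ins for open-connection-for-uri/cached, http-fetch, call-with-cached-connection, and real sockets. The GnuTLS check is left out so that the example depends on nothing beyond Guile itself.

```scheme
;; Count how many "connections" were opened, so the retry is observable.
(define %opened 0)

;; Hypothetical stand-in for open-connection-for-uri/cached: the cached
;; "port" is stale, a fresh one works.
(define* (open-connection #:key fresh?)
  (set! %opened (+ 1 %opened))
  (if fresh? 'fresh-port 'stale-port))

;; Call PROC with a possibly cached port.  If PROC fails with one of the
;; errors that suggest the server closed the connection, call PROC once
;; more with a fresh port; re-throw any other exception unchanged.
(define (call-with-retry proc)
  (let ((port (open-connection)))
    (catch #t
      (lambda ()
        (proc port))
      (lambda (key . args)
        (if (or (and (eq? key 'system-error)
                     (= EPIPE (system-error-errno `(,key ,@args))))
                (memq key '(bad-response bad-header bad-header-component)))
            (proc (open-connection #:fresh? #t))
            (apply throw key args))))))

;; Hypothetical stand-in for http-fetch: fails on the stale port only.
(define (fetch port)
  (if (eq? port 'stale-port)
      (throw 'bad-response "connection closed by server")
      'ok))

(define result (call-with-retry fetch))
```

In this sketch the first call to fetch throws 'bad-response, the handler opens a second connection with #:fresh? #t, and the second call succeeds. An error raised by the second attempt is not caught again, which matches the "retry once" wording in the commit message.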
Ludovic Courtès wrote on 17 Mar 2021 21:18
Re: bug#47160: [PATCH] scripts: substitute: Add back some error handling.
(name . Christopher Baines)(address . mail@cbaines.net)(address . 47160@debbugs.gnu.org)
8735wt1qm2.fsf_-_@gnu.org
Howdy!

Christopher Baines <mail@cbaines.net> skribis:

> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
> process-substitution was changed. As call-with-cached-connection actually
> includes important error handling for the opening of a HTTP request, this
> change removed some error handling. This commit adds that back.
>
> Fixes <https://bugs.gnu.org/47157>.
>
> * guix/scripts/substitute.scm (call-with-cached-connection): New procedure.
> (with-cached-connection): New syntax rule.
> (process-substitution): Retry once for some errors when making HTTP requests
> to fetch substitutes.

[...]

> The call-with-connection-error-handling was added in
> 20c08a8a45d0f137ead7c05e720456b2aea44402, but that error handling was
> previously inside of open-connection-for-uri/maybe, which is related
> to (call-)with-cached-connection which was used in process-substitution, but
> only actually used with call-with-cached-connection when used in
> fetch-narinfos.
>
> There's some handling for similar errors within with-networking, which is used
> within process-substitution.
>
> * guix/scripts/substitute.scm (process-substitution): Remove
> call-with-connection-error-handling call.

Both LGTM, thank you!

Ludo’.
Christopher Baines wrote on 17 Mar 2021 21:46
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 47160-done@debbugs.gnu.org)
87zgz1cxv6.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Howdy!
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> In f50f5751fff4cfc6d5abba9681054569694b7a5c, the way fetch was called within
>> process-substitution was changed. As call-with-cached-connection actually
>> includes important error handling for the opening of a HTTP request, this
>> change removed some error handling. This commit adds that back.
>>
>> Fixes <https://bugs.gnu.org/47157>.
>>
>> * guix/scripts/substitute.scm (call-with-cached-connection): New procedure.
>> (with-cached-connection): New syntax rule.
>> (process-substitution): Retry once for some errors when making HTTP requests
>> to fetch substitutes.
>
> [...]
>
>> The call-with-connection-error-handling was added in
>> 20c08a8a45d0f137ead7c05e720456b2aea44402, but that error handling was
>> previously inside of open-connection-for-uri/maybe, which is related
>> to (call-)with-cached-connection which was used in process-substitution, but
>> only actually used with call-with-cached-connection when used in
>> fetch-narinfos.
>>
>> There's some handling for similar errors within with-networking, which is used
>> within process-substitution.
>>
>> * guix/scripts/substitute.scm (process-substitution): Remove
>> call-with-connection-error-handling call.
>
> Both LGTM, thank you!

Great, pushed as b48204259aa9cad80c5b23a4060e2d796007ec7a.

Note that this won't have any effect on the substitute script for most
users until the guix package is updated to include these changes.

Chris

Closed