Guile's "sleep pipe" can leak into processes created by 'spawn'

  • Done
  • quality assurance status badge
Details
6 participants
  • Greg Hogan
  • Josselin Poiret
  • Travis Zimmerman
  • Ludovic Courtès
  • Maxim Cournoyer
  • zimoun
Owner
unassigned
Submitted by
Greg Hogan
Severity
important
Merged with
G
G
Greg Hogan wrote on 21 Mar 2023 15:17
Network is unreachable only for recursive pypi import
(address . bug-guix@gnu.org)
CA+3U0ZmV+B3mqut5QVU55Bv_Y8q-dBf8fFekNZAYKra+vFT5GA@mail.gmail.com
The following package import fails when importing the dependency, but
succeeds when directly importing that dependency. I can also manually
download the tar without issue.

Toggle snippet (8 lines)
$ guix describe
Generation 38 Mar 21 2023 13:38:52 (current)
guix 38b64d4
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 38b64d47ed3dfaeb63b859e7a8834e477ffed3a1

Toggle snippet (46 lines)
$ guix import pypi -r cdlib

Starting download of /tmp/guix-file.oeSQon
From https://files.pythonhosted.org/packages/eb/6d/97167dce848b65023a272e2ffd04b2e462612efdb3538d16e2b8b2221a15/cdlib-0.2.6.tar.gz...
….6.tar.gz 191KiB 41.0MiB/s 00:00 ???????????????????? 100.0%

Starting download of /tmp/guix-file.H0dsHq
From https://files.pythonhosted.org/packages/56/37/80bdc21fbb88d87bca352cdf1ea0b9e2492737dd5775f32a04ea242f842e/cdlib-0.2.6-py3-none-any.whl...
…3-none-any.whl 223KiB 74.0MiB/s 00:00 ???????????????????? 100.0%

Starting download of /tmp/guix-file.xReGb4
From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
In procedure connect: Network is unreachable

Starting download of /tmp/guix-file.xReGb4
From https://web.archive.org/web/20230321140958/https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
In procedure connect*: Connection timed out
Trying to use Disarchive to assemble /tmp/guix-file.xReGb4...
could not find its Disarchive specification
failed to download "/tmp/guix-file.xReGb4" from
"https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz"
Backtrace:
10 (primitive-load "/home/ec2-user/.config/guix/current/bi…")
In guix/ui.scm:
2300:7 9 (run-guix . _)
2263:10 8 (run-guix-command _ . _)
In guix/scripts/import.scm:
89:11 7 (guix-import . _)
In ice-9/boot-9.scm:
1752:10 6 (with-exception-handler _ _ #:unwind? _ # _)
In guix/scripts/import/pypi.scm:
97:21 5 (_)
In guix/import/utils.scm:
638:3 4 (recursive-import _ #:repo->guix-package _ #:guix-name . #)
600:31 3 (topological-sort _ #<procedure 7faf595b7270 at guix/i…> …)
In srfi/srfi-1.scm:
586:17 2 (map1 (("demon" #f) ("nf1" #f) ("eva-lcd" #f) ("b…" …) …))
In guix/import/utils.scm:
630:33 1 (lookup-node "demon" #f)
In ice-9/boot-9.scm:
1685:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong number of values returned to continuation (expected 2)

Toggle snippet (25 lines)
$ guix import pypi demon

Starting download of /tmp/guix-file.MOR39A
From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
….6.tar.gz 7KiB 17.3MiB/s 00:00 ???????????????????? 100.0%

Starting download of /tmp/guix-file.RuwPtL
From https://files.pythonhosted.org/packages/96/85/5f9146e3f8d7324d8741cd3bea9b99926d5ee75571b9610c661b1bbd06f5/demon-2.0.6-py3-none-any.whl...
…3-none-any.whl 7KiB 17.6MiB/s 00:00 ???????????????????? 100.0%
(package
(name "python-demon")
(version "2.0.6")
(source (origin
(method url-fetch)
(uri (pypi-uri "demon" version))
(sha256
(base32
"0nd370yjfb3jirmxd0wa03c75dl7jac4xfsq1rvp5483mxddirb0"))))
(build-system python-build-system)
(propagated-inputs (list python-networkx python-tqdm))
(home-page "https://github.com/GiulioRossetti/DEMON")
(synopsis "Community Discovery algorithm")
(description "Community Discovery algorithm")
(license license:bsd-2))
J
J
Josselin Poiret wrote on 21 Mar 2023 22:45
87jzz9rd7b.fsf@jpoiret.xyz
Hi Greg,

Greg Hogan <code@greghogan.com> writes:

Toggle quote (4 lines)
> The following package import fails when importing the dependency, but
> succeeds when directly importing that dependency. I can also manually
> download the tar without issue.

I cannot reproduce on my side :( were you able to reproduce this
multiple times? Can you still reproduce it now?

Best,
--
Josselin Poiret
-----BEGIN PGP SIGNATURE-----

iQHEBAEBCAAuFiEEOSSM2EHGPMM23K8vUF5AuRYXGooFAmQaJWgQHGRldkBqcG9p
cmV0Lnh5egAKCRBQXkC5Fhcaiv/UDACoW5AYgcwVNNCE63lZ2ol2vo0Boun/nmXP
cn6RwB+2UucGbz8XrRAH1/0g0MguPQvj+QsRVnv6RomNuwK/o2IKxRS5Xz2LXuBA
9OaovxKWft5FfghRqEF7gwICrgsmx7z5nH5F5pie1knw7Kpf98sQ88NufN21jC+i
9zjQtkFpVukXymJQ3q83MyPdJitYges+2HeQnmRCBB2wIyIaxXJxfny7Grri1SvB
wEkmCxnn2qv2JHlzGSXqC0w9rdnZVl3bdCsTycUVwdzT2UQpF4/e6TRxzrnQhbaz
G4r3AcAcaHH1bpC26xnhOTF/M+AKQzjie/LyPV0dcCH8QiEggBd38PAGlqJOKYZF
k0oGJPL9eJ8kJqgZHyorwonR+Hcqi7dw5HLBOeFHrgDXeZXKJrPtD2eyzk/SsVwk
MCA+zYXciuqjNik17P4z2P8Mk+a3fwm18QVlpoOq6Ii9LFDK6utSWIsRUUSnWkC4
s2RoIzNXNTYb0RxCC4hf8ZFL2mWNu+s=
=sbjs
-----END PGP SIGNATURE-----

M
M
Maxim Cournoyer wrote on 24 Mar 2023 13:19
(name . Josselin Poiret)(address . dev@jpoiret.xyz)
87mt421gvi.fsf@gmail.com
Hi,

Josselin Poiret <dev@jpoiret.xyz> writes:

Toggle quote (13 lines)
> Hi Greg,
>
> Greg Hogan <code@greghogan.com> writes:
>
>> The following package import fails when importing the dependency, but
>> succeeds when directly importing that dependency. I can also manually
>> download the tar without issue.
>
> I cannot reproduce on my side :( were you able to reproduce this
> multiple times? Can you still reproduce it now?
>
> Best,

I can't reproduce it either, so it seems a genuine network error on your
side.

What we should do though is catch it and report it nicely, instead of
crashing.

--
Thanks,
Maxim
T
T
Travis Zimmerman wrote on 28 Mar 2023 22:37
Network
(address . 62334@debbugs.gnu.org)
CA+TSwwNXZ83mUhok3+zph8b2jSwVXKtQX0wbuzhuX7HLT00pNg@mail.gmail.com
Having the same issue.
As far as I can tell, it seems to happen for me when there are nested
recursions?
For example: a depends on b, b depends on c
`guix import pypi -r package-c` - works
`guix import pypi -r package-b` - works, importing both b and c
`guix import pypi -r package-a` - downloads a and b, but fails at c.
Attachment: file
Z
Z
zimoun wrote on 24 Apr 2023 10:42
Re: bug#62334: Network is unreachable only for recursive pypi import
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
86354pwu1j.fsf@gmail.com
Hi,

On Fri, 24 Mar 2023 at 08:19, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

Toggle quote (3 lines)
> I can't reproduce it either, so it seems a genuine network error on your
> side.

As I report in #62765 [1], it seems a regression on our side:

which passes with 29efa27. And indeed, using 86d580c, “guix import pypi
num2words -r” passes without any error, downloading from the exact same
URL. Hum?!

Well, I do not know which change impacts this regression.



Cheers,
simon

PS: Well, I just merged 2 other issues reporting the same behaviour.
Z
Z
zimoun wrote on 24 Apr 2023 10:28
control message for bug #63024
(address . control@debbugs.gnu.org)
86bkjdwuoz.fsf@gmail.com
merge 63024 62334
quit
Z
Z
zimoun wrote on 24 Apr 2023 10:36
control message for bug #62765
(address . control@debbugs.gnu.org)
865y9lwubt.fsf@gmail.com
merge 62765 62334
quit
S
S
Simon Tournier wrote on 25 Apr 2023 14:23
Re: bug#62334: Network is unreachable only for recursive pypi import
86ildkb174.fsf@gmail.com
Hi,

On Tue, 21 Mar 2023 at 10:17, Greg Hogan <code@greghogan.com> wrote:

Toggle quote (9 lines)
> --8<---------------cut here---------------start------------->8---
> $ guix describe
> Generation 38 Mar 21 2023 13:38:52 (current)
> guix 38b64d4
> repository URL: https://git.savannah.gnu.org/git/guix.git
> branch: master
> commit: 38b64d47ed3dfaeb63b859e7a8834e477ffed3a1
> --8<---------------cut here---------------end--------------->8---

I can reproduce.

Toggle quote (3 lines)
> --8<---------------cut here---------------start------------->8---
> $ guix import pypi -r cdlib

[...]

Toggle quote (8 lines)
> Starting download of /tmp/guix-file.xReGb4
> From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
> In procedure connect: Network is unreachable
>
> Starting download of /tmp/guix-file.xReGb4
> From https://web.archive.org/web/20230321140958/https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
> In procedure connect*: Connection timed out

[...]

Toggle quote (10 lines)
> 586:17 2 (map1 (("demon" #f) ("nf1" #f) ("eva-lcd" #f) ("b…" …) …))
> In guix/import/utils.scm:
> 630:33 1 (lookup-node "demon" #f)
> In ice-9/boot-9.scm:
> 1685:16 0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Wrong number of values returned to continuation (expected 2)
> --8<---------------cut here---------------end--------------->8---

Well, tracking the origin of the failure “In procedure connect*:
Connection timed out”, it comes from ’open-socket-for-uri’, see (guix
build download).

Using the diff below, for some mysterious reasons, ’connect*’ throws the
error:

Toggle snippet (25 lines)
$ ./pre-inst-env guix import pypi -r cdlib
[...]
Starting download of /tmp/guix-file.Ujg3KY
From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...

;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")

;;; (before #<input-output: socket 14> #(2 2454407487 443) 10)

;;; (failed #<input-output: socket 14>)

;;; (closed)

;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")

;;; (before #<input-output: socket 14> #(10 55849932267565577100763269369304187673 443 0 0) 10)

;;; (failed #<input-output: socket 14>)

;;; (closed)

;;; (boum)
In procedure connect: Network is unreachable

and note it is an issue on our side:

Toggle snippet (51 lines)
$ ./pre-inst-env guix import pypi -r demon
;;; (uri "https://pypi.org/pypi/demon/json")

;;; (before #<input-output: socket 13> #(2 2539978975 443) #f)

;;; (connect* #t)

;;; (done)

Starting download of /tmp/guix-file.G5r627
From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...

;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")

;;; (before #<input-output: socket 14> #(2 2454407487 443) 10)

;;; (connect* #<unspecified>)

;;; (done)
….6.tar.gz 7KiB 4.2MiB/s 00:00 ???????????????????? 100.0%

Starting download of /tmp/guix-file.4L0Phu
From https://files.pythonhosted.org/packages/96/85/5f9146e3f8d7324d8741cd3bea9b99926d5ee75571b9610c661b1bbd06f5/demon-2.0.6-py3-none-any.whl...

;;; (uri "https://files.pythonhosted.org/packages/96/85/5f9146e3f8d7324d8741cd3bea9b99926d5ee75571b9610c661b1bbd06f5/demon-2.0.6-py3-none-any.whl")

;;; (before #<input-output: socket 15> #(2 2454407487 443) 10)

;;; (connect* #<unspecified>)

;;; (done)
…3-none-any.whl 7KiB 5.5MiB/s 00:00 ???????????????????? 100.0%
(define-public python-demon
(package
(name "python-demon")
(version "2.0.6")
(source (origin
(method url-fetch)
(uri (pypi-uri "demon" version))
(sha256
(base32
"0nd370yjfb3jirmxd0wa03c75dl7jac4xfsq1rvp5483mxddirb0"))))
(build-system python-build-system)
(propagated-inputs (list python-networkx python-tqdm))
(home-page "https://github.com/GiulioRossetti/DEMON")
(synopsis "Community Discovery algorithm")
(description "Community Discovery algorithm")
(license license:bsd-2)))


Maybe, it is a regression from some recent updates of Guile?

I do not know… Any idea?


Cheers,
simon
S
S
Simon Tournier wrote on 27 Apr 2023 15:32
874jp1cuxj.fsf@gmail.com
Hi,

On mar., 25 avril 2023 at 14:23, Simon Tournier <zimon.toutoune@gmail.com> wrote:

Toggle quote (1 lines)
> Using the diff below,
Toggle diff (40 lines)
diff --git a/guix/build/download.scm b/guix/build/download.scm
index db0a39084b..2eee984b31 100644
--- a/guix/build/download.scm
+++ b/guix/build/download.scm
@@ -409,18 +409,24 @@ (define addresses
(socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
(catch 'system-error
(lambda ()
- (connect* s (addrinfo:addr ai) timeout)
-
+ (pk 'uri (uri->string uri))
+ (pk 'before s (addrinfo:addr ai) timeout)
+ (pk 'connect* (connect* s (addrinfo:addr ai) timeout))
+ (pk 'done)
;; Buffer input and output on this port.
(setvbuf s 'block)
;; If we're using a proxy, make a note of that.
(when http-proxy (set-http-proxy-port?! s #t))
s)
(lambda args
+ (pk 'failed s)
;; Connection failed, so try one of the other addresses.
(close s)
+ (pk 'closed)
(if (null? (cdr addresses))
- (apply throw args)
+ (begin
+ (pk 'boum)
+ (apply throw args))
(loop (cdr addresses))))))))
(define (setup-http-tunnel port uri)
@@ -465,6 +471,7 @@ (define https-proxy (let ((proxy (getenv "https_proxy")))
(parameterize ((current-http-proxy https-proxy))
(thunk))
(thunk)))))))
+
(with-https-proxy
(let ((s (open-socket-for-uri uri #:timeout timeout)))
;; Buffer input and output on this port.
Toggle quote (85 lines)
> for some mysterious reasons, ’connect*’ throws the
> error:
>
> --8<---------------cut here---------------start------------->8---
> $ ./pre-inst-env guix import pypi -r cdlib
> [...]
> Starting download of /tmp/guix-file.Ujg3KY
> From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
>
> ;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")
>
> ;;; (before #<input-output: socket 14> #(2 2454407487 443) 10)
>
> ;;; (failed #<input-output: socket 14>)
>
> ;;; (closed)
>
> ;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")
>
> ;;; (before #<input-output: socket 14> #(10 55849932267565577100763269369304187673 443 0 0) 10)
>
> ;;; (failed #<input-output: socket 14>)
>
> ;;; (closed)
>
> ;;; (boum)
> In procedure connect: Network is unreachable
> --8<---------------cut here---------------end--------------->8---
>
> and note it is an issue on our side:
>
> --8<---------------cut here---------------start------------->8---
> $ ./pre-inst-env guix import pypi -r demon
> ;;; (uri "https://pypi.org/pypi/demon/json")
>
> ;;; (before #<input-output: socket 13> #(2 2539978975 443) #f)
>
> ;;; (connect* #t)
>
> ;;; (done)
>
> Starting download of /tmp/guix-file.G5r627
> From https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
>
> ;;; (uri "https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz")
>
> ;;; (before #<input-output: socket 14> #(2 2454407487 443) 10)
>
> ;;; (connect* #<unspecified>)
>
> ;;; (done)
> ….6.tar.gz 7KiB 4.2MiB/s 00:00 ???????????????????? 100.0%
>
> Starting download of /tmp/guix-file.4L0Phu
> From https://files.pythonhosted.org/packages/96/85/5f9146e3f8d7324d8741cd3bea9b99926d5ee75571b9610c661b1bbd06f5/demon-2.0.6-py3-none-any.whl...
>
> ;;; (uri "https://files.pythonhosted.org/packages/96/85/5f9146e3f8d7324d8741cd3bea9b99926d5ee75571b9610c661b1bbd06f5/demon-2.0.6-py3-none-any.whl")
>
> ;;; (before #<input-output: socket 15> #(2 2454407487 443) 10)
>
> ;;; (connect* #<unspecified>)
>
> ;;; (done)
> …3-none-any.whl 7KiB 5.5MiB/s 00:00 ???????????????????? 100.0%
> (define-public python-demon
> (package
> (name "python-demon")
> (version "2.0.6")
> (source (origin
> (method url-fetch)
> (uri (pypi-uri "demon" version))
> (sha256
> (base32
> "0nd370yjfb3jirmxd0wa03c75dl7jac4xfsq1rvp5483mxddirb0"))))
> (build-system python-build-system)
> (propagated-inputs (list python-networkx python-tqdm))
> (home-page "https://github.com/GiulioRossetti/DEMON")
> (synopsis "Community Discovery algorithm")
> (description "Community Discovery algorithm")
> (license license:bsd-2)))
> --8<---------------cut here---------------end--------------->8---
>
>
> Maybe, it is a regression from some recent updates of Guile?

Anyone else is able to reproduce the same error?

Cheers,
simon
L
L
Ludovic Courtès wrote on 2 May 2023 23:47
Re: bug#63024: Crash during `guix import pypi -r'
(name . Greg Hogan)(address . code@greghogan.com)
87jzxq5rtf.fsf@gnu.org
Hello!

Greg Hogan <code@greghogan.com> skribis:

Toggle quote (4 lines)
> Starting download of /tmp/guix-file.xReGb4
>>From https://web.archive.org/web/20230321140958/https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
> In procedure connect*: Connection timed out

I can very much reproduce the bug with “guix import pypi -r cdlib” on
39ba8a10971f15264966823e8696d63c2995df86 and thereabouts. Turns out
it’s a fun story! (Josselin, I’m sure you’ll love it.)

First, the person who wrote ‘connect*’ *cough* hadn’t carefully read the
Guile manual, which reads (info "(guile) Ports and File Descriptors"):

Note that ‘select’ may return early for other reasons, for example due
to pending interrupts.

Clearly, ‘select’ is returning early, so we should check that and loop.
So far, so good.

Now, why is ‘select’ returning immediately? Because the underlying
select(2) call is passed an extra file descriptor, the “sleep pipe”, as
you can see in ‘scm_std_select’ in libguile, and it’s that file
descriptor, number 3, that’s active at each select(2) call:

Toggle snippet (4 lines)
28825 23:06:04 pselect6(15, [3], [14], [14], {tv_sec=10, tv_nsec=0}, NULL) = 1 (in [3], left {tv_sec=9, tv_nsec=999991063}) <0.000079>
28825 23:06:04 read(3, "A", 1) = 1 <0.000044>

Why so much activity on that “sleep pipe”, you ask? Well, once we’ve
filtered the read(3, …) calls, a pattern appears:

Toggle snippet (22 lines)
14335:28825 23:06:04 read(3, "A", 1) = 1 <0.000044>
14344:28825 23:06:04 read(3, "r", 1) = 1 <0.000121>
14353:28825 23:06:04 read(3, "c", 1) = 1 <0.000107>
14362:28825 23:06:04 read(3, "h", 1) = 1 <0.000040>
14436:28825 23:06:04 read(3, "i", 1) = 1 <0.000049>
14445:28825 23:06:04 read(3, "v", 1) = 1 <0.000029>
14454:28825 23:06:04 read(3, "e", 1) = 1 <0.000019>
14463:28825 23:06:04 read(3, ":", 1) = 1 <0.000017>
14472:28825 23:06:04 read(3, " ", 1) = 1 <0.000019>
14481:28825 23:06:04 read(3, " ", 1) = 1 <0.000020>
14490:28825 23:06:04 read(3, "/", 1) = 1 <0.000019>
14499:28825 23:06:04 read(3, "t", 1) = 1 <0.000015>
14508:28825 23:06:04 read(3, "m", 1) = 1 <0.000018>
15001:28825 23:06:04 read(3, "p", 1) = 1 <0.000015>
15010:28825 23:06:04 read(3, "/", 1) = 1 <0.000016>
15019:28825 23:06:04 read(3, "g", 1) = 1 <0.000016>
15028:28825 23:06:04 read(3, "u", 1) = 1 <0.000015>
15037:28825 23:06:04 read(3, "i", 1) = 1 <0.000015>
15046:28825 23:06:04 read(3, "x", 1) = 1 <0.000015>
15055:28825 23:06:04 read(3, "-", 1) = 1 <0.000015>

See? Who’s chatting over our sleep pipe?

Toggle snippet (24 lines)
4806:28831 23:06:03 dup2(0, 3) = 3 <0.000011>
4807:28831 23:06:03 dup2(3, 0) = 0 <0.000011>
4808:28831 23:06:03 dup2(4, 1) = 1 <0.000012>
4809:28831 23:06:03 dup2(5, 2) = 2 <0.000010>
4810:28831 23:06:03 close(3) = 0 <0.000010>
4811:28831 23:06:03 close(4) = 0 <0.000011>
4812:28831 23:06:03 close(5) = 0 <0.000010>
4813:28831 23:06:03 close(6) = 0 <0.000010>
4814:28831 23:06:03 close(7) = 0 <0.000010>
4815:28831 23:06:03 close(8) = 0 <0.000011>
4816:28831 23:06:03 close(9) = 0 <0.000010>
4817:28831 23:06:03 close(10) = 0 <0.000010>
4818:28831 23:06:03 close(11) = 0 <0.000011>
4819:28831 23:06:03 close(12) = 0 <0.000010>
4820:28831 23:06:03 close(13) = 0 <0.000010>
4821:28831 23:06:03 close(14) = 0 <0.000011>
4822:28831 23:06:03 close(15) = -1 EBADF (Bad file descriptor) <0.000011>
4823:28831 23:06:03 prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=4*1024}) = 0 <0.000011>
4824:28831 23:06:03 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000010>
4825:28831 23:06:03 execve("/home/ludo/src/guix/scripts/unzip", ["unzip", "/tmp/guix-file.bmholm", "-d", "/tmp/guix-directory.Ok2DAA", "cdlib-0.2.6.dist-info/METADATA"], 0x18a7db0 /* 85 v
[…]
4938:28831 23:06:03 write(1, "Archive: /tmp/guix-file.bmholm\n", 32) = 32 <0.000018>

‘unzip’! And all its friends invoked from (guix import pypi):

Toggle snippet (10 lines)
(call-with-temporary-directory
(lambda (dir)
(parameterize ((current-error-port (%make-void-port "rw+"))
(current-output-port (%make-void-port "rw+")))
(if (string=? "zip" (file-extension source-url))
(invoke "unzip" archive "-d" dir)
(invoke "tar" "xf" archive "-C" dir)))
…))

Looks like we shoudn’t dup(4, 1) in the child process, because 4 is the
other end of our sleep pipe. :-)

The problem exists both in 3.0.9 and Guile ‘main’. Looks related to the
file descriptor shuffling code in ‘do_spawn’. WDYT, Josselin?

Ludo’.
L
L
Ludovic Courtès wrote on 2 May 2023 23:48
(name . Greg Hogan)(address . code@greghogan.com)
87fs8e5rrp.fsf@gnu.org
Hello!

Greg Hogan <code@greghogan.com> skribis:

Toggle quote (4 lines)
> Starting download of /tmp/guix-file.xReGb4
>>From https://web.archive.org/web/20230321140958/https://files.pythonhosted.org/packages/49/01/21337857631a97ab551bb2e3b5691ed6b1b9586011aa6a5355b9694a37fa/demon-2.0.6.tar.gz...
> In procedure connect*: Connection timed out

I can very much reproduce the bug with “guix import pypi -r cdlib” on
39ba8a10971f15264966823e8696d63c2995df86 and thereabouts. Turns out
it’s a fun story! (Josselin, I’m sure you’ll love it.)

First, the person who wrote ‘connect*’ *cough* hadn’t carefully read the
Guile manual, which reads (info "(guile) Ports and File Descriptors"):

Note that ‘select’ may return early for other reasons, for example due
to pending interrupts.

Clearly, ‘select’ is returning early, so we should check that and loop.
So far, so good.

Now, why is ‘select’ returning immediately? Because the underlying
select(2) call is passed an extra file descriptor, the “sleep pipe”, as
you can see in ‘scm_std_select’ in libguile, and it’s that file
descriptor, number 3, that’s active at each select(2) call:

Toggle snippet (4 lines)
28825 23:06:04 pselect6(15, [3], [14], [14], {tv_sec=10, tv_nsec=0}, NULL) = 1 (in [3], left {tv_sec=9, tv_nsec=999991063}) <0.000079>
28825 23:06:04 read(3, "A", 1) = 1 <0.000044>

Why so much activity on that “sleep pipe”, you ask? Well, once we’ve
filtered the read(3, …) calls, a pattern appears:

Toggle snippet (22 lines)
14335:28825 23:06:04 read(3, "A", 1) = 1 <0.000044>
14344:28825 23:06:04 read(3, "r", 1) = 1 <0.000121>
14353:28825 23:06:04 read(3, "c", 1) = 1 <0.000107>
14362:28825 23:06:04 read(3, "h", 1) = 1 <0.000040>
14436:28825 23:06:04 read(3, "i", 1) = 1 <0.000049>
14445:28825 23:06:04 read(3, "v", 1) = 1 <0.000029>
14454:28825 23:06:04 read(3, "e", 1) = 1 <0.000019>
14463:28825 23:06:04 read(3, ":", 1) = 1 <0.000017>
14472:28825 23:06:04 read(3, " ", 1) = 1 <0.000019>
14481:28825 23:06:04 read(3, " ", 1) = 1 <0.000020>
14490:28825 23:06:04 read(3, "/", 1) = 1 <0.000019>
14499:28825 23:06:04 read(3, "t", 1) = 1 <0.000015>
14508:28825 23:06:04 read(3, "m", 1) = 1 <0.000018>
15001:28825 23:06:04 read(3, "p", 1) = 1 <0.000015>
15010:28825 23:06:04 read(3, "/", 1) = 1 <0.000016>
15019:28825 23:06:04 read(3, "g", 1) = 1 <0.000016>
15028:28825 23:06:04 read(3, "u", 1) = 1 <0.000015>
15037:28825 23:06:04 read(3, "i", 1) = 1 <0.000015>
15046:28825 23:06:04 read(3, "x", 1) = 1 <0.000015>
15055:28825 23:06:04 read(3, "-", 1) = 1 <0.000015>

See? Who’s chatting over our sleep pipe?

Toggle snippet (24 lines)
4806:28831 23:06:03 dup2(0, 3) = 3 <0.000011>
4807:28831 23:06:03 dup2(3, 0) = 0 <0.000011>
4808:28831 23:06:03 dup2(4, 1) = 1 <0.000012>
4809:28831 23:06:03 dup2(5, 2) = 2 <0.000010>
4810:28831 23:06:03 close(3) = 0 <0.000010>
4811:28831 23:06:03 close(4) = 0 <0.000011>
4812:28831 23:06:03 close(5) = 0 <0.000010>
4813:28831 23:06:03 close(6) = 0 <0.000010>
4814:28831 23:06:03 close(7) = 0 <0.000010>
4815:28831 23:06:03 close(8) = 0 <0.000011>
4816:28831 23:06:03 close(9) = 0 <0.000010>
4817:28831 23:06:03 close(10) = 0 <0.000010>
4818:28831 23:06:03 close(11) = 0 <0.000011>
4819:28831 23:06:03 close(12) = 0 <0.000010>
4820:28831 23:06:03 close(13) = 0 <0.000010>
4821:28831 23:06:03 close(14) = 0 <0.000011>
4822:28831 23:06:03 close(15) = -1 EBADF (Bad file descriptor) <0.000011>
4823:28831 23:06:03 prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=4*1024}) = 0 <0.000011>
4824:28831 23:06:03 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000010>
4825:28831 23:06:03 execve("/home/ludo/src/guix/scripts/unzip", ["unzip", "/tmp/guix-file.bmholm", "-d", "/tmp/guix-directory.Ok2DAA", "cdlib-0.2.6.dist-info/METADATA"], 0x18a7db0 /* 85 v
[…]
4938:28831 23:06:03 write(1, "Archive: /tmp/guix-file.bmholm\n", 32) = 32 <0.000018>

‘unzip’! And all its friends invoked from (guix import pypi):

Toggle snippet (10 lines)
(call-with-temporary-directory
(lambda (dir)
(parameterize ((current-error-port (%make-void-port "rw+"))
(current-output-port (%make-void-port "rw+")))
(if (string=? "zip" (file-extension source-url))
(invoke "unzip" archive "-d" dir)
(invoke "tar" "xf" archive "-C" dir)))
…))

Looks like we shoudn’t dup(4, 1) in the child process, because 4 is the
other end of our sleep pipe. :-)

The problem exists both in 3.0.9 and Guile ‘main’. Looks related to the
file descriptor shuffling code in ‘do_spawn’. WDYT, Josselin?

Ludo’.
L
L
Ludovic Courtès wrote on 2 May 2023 23:49
control message for bug #63024
(address . control@debbugs.gnu.org)
87a5ym5rq4.fsf@gnu.org
severity 63024 important
quit
L
L
Ludovic Courtès wrote on 2 May 2023 23:50
(address . control@debbugs.gnu.org)
878re65row.fsf@gnu.org
retitle 63024 Guile's "sleep pipe" can leak into processes created by 'spawn'
quit
L
L
Ludovic Courtès wrote on 3 May 2023 11:04
Re: bug#63024: Crash during `guix import pypi -r'
(name . Josselin Poiret)(address . dev@jpoiret.xyz)
87mt2l4whd.fsf@gnu.org
Hey,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (3 lines)
> Looks like we shoudn’t dup(4, 1) in the child process, because 4 is the
> other end of our sleep pipe. :-)

How about this patch, Josselin?

Ludo’.
Toggle diff (59 lines)
diff --git a/libguile/posix.c b/libguile/posix.c
index 3adc743c4..2d55d985c 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1388,11 +1388,27 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
}
/* Move the fds out of the way, so that duplicate fds or fds equal
- to 0, 1, 2 don't trample each other */
+ to 0, 1, 2 don't trample each other. Since 'system*' might give
+ us -1 for IN, OUT, or ERR, open /dev/null when that's the case. */
+
+ if (in < 0)
+ posix_spawn_file_actions_addopen (&actions, fd_slot[0],
+ "/dev/null", O_RDONLY | O_CLOEXEC, 0);
+ else
+ posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
+
+ if (out < 0)
+ posix_spawn_file_actions_addopen (&actions, fd_slot[1],
+ "/dev/null", O_WRONLY | O_CLOEXEC, 0);
+ else
+ posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
+
+ if (err < 0)
+ posix_spawn_file_actions_addopen (&actions, fd_slot[2],
+ "/dev/null", O_WRONLY | O_CLOEXEC, 0);
+ else
+ posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
- posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
- posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
- posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
posix_spawn_file_actions_adddup2 (&actions, fd_slot[0], 0);
posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
diff --git a/test-suite/tests/posix.test b/test-suite/tests/posix.test
index d5cf47cda..18dad8902 100644
--- a/test-suite/tests/posix.test
+++ b/test-suite/tests/posix.test
@@ -374,7 +374,17 @@
(system* "sh" "-c" "echo bong >&2"))))))))
(and (zero? (status:exit-val status))
- (call-with-input-file file get-string-all)))))
+ (call-with-input-file file get-string-all))))
+
+ (pass-if-equal "https://bugs.gnu.org/63024"
+ 0
+ (if (file-exists? "/proc/self/fd/0") ;on GNU/Linux?
+ (parameterize ((current-output-port (%make-void-port "w0")))
+ (system* "guile" "-c"
+ (object->string
+ '(exit (string=? "/dev/null"
+ (readlink "/proc/self/fd/1"))))))
+ (throw 'unresolved))))
;;
;; spawn
S
S
Simon Tournier wrote on 3 May 2023 12:08
Re: bug#62334: bug#63024: Crash during `guix import pypi -r'
86354d20e6.fsf@gmail.com
Hi Ludo,

On Wed, 03 May 2023 at 11:04, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (5 lines)
> diff --git a/libguile/posix.c b/libguile/posix.c
> index 3adc743c4..2d55d985c 100644
> --- a/libguile/posix.c
> +++ b/libguile/posix.c

Does it mean patch the current Guile or a new release of Guile?


Cheers,
simon
L
L
Ludovic Courtès wrote on 4 May 2023 13:10
Re: bug#63024: Crash during `guix import pypi -r'
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
877ctotkr9.fsf_-_@gnu.org
Hi,

Simon Tournier <zimon.toutoune@gmail.com> skribis:

Toggle quote (9 lines)
> On Wed, 03 May 2023 at 11:04, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> diff --git a/libguile/posix.c b/libguile/posix.c
>> index 3adc743c4..2d55d985c 100644
>> --- a/libguile/posix.c
>> +++ b/libguile/posix.c
>
> Does it mean patch the current Guile or a new release of Guile?

It’s a patch against Guile ‘main’. That gives an incentive to push a
new Guile release.

Ludo’.
L
L
Ludovic Courtès wrote on 4 May 2023 13:13
(name . Greg Hogan)(address . code@greghogan.com)
87mt2ks62g.fsf_-_@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (8 lines)
> First, the person who wrote ‘connect*’ *cough* hadn’t carefully read the
> Guile manual, which reads (info "(guile) Ports and File Descriptors"):
>
> Note that ‘select’ may return early for other reasons, for example due
> to pending interrupts.
>
> Clearly, ‘select’ is returning early, so we should check that and loop.

Pushed a fix for that in fc6c96c88a0e4ad0b9e48272e5f97ffaa6eec36e.

Ludo’.
L
L
Ludovic Courtès wrote on 5 May 2023 10:54
(name . Greg Hogan)(address . code@greghogan.com)
87h6srnop6.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (12 lines)
> Ludovic Courtès <ludo@gnu.org> skribis:
>
>> First, the person who wrote ‘connect*’ *cough* hadn’t carefully read the
>> Guile manual, which reads (info "(guile) Ports and File Descriptors"):
>>
>> Note that ‘select’ may return early for other reasons, for example due
>> to pending interrupts.
>>
>> Clearly, ‘select’ is returning early, so we should check that and loop.
>
> Pushed a fix for that in fc6c96c88a0e4ad0b9e48272e5f97ffaa6eec36e.

… and updated the ‘guix’ package in
3c91f4ca490a7ac56dc0aebbca9c4bf4df201877, which should address
‘connect*’ timeout issues that were coming from ‘guix substitute’.

Ludo’.
J
J
Josselin Poiret wrote on 5 May 2023 15:39
[PATCH 2/3] Use /dev/null in piped-process if port is not backed by a fdes
269750b29a104e35a6bbd92e267130cfcf35c33e.1683293896.git.dev@jpoiret.xyz
From: Josselin Poiret <dev@jpoiret.xyz>

* libguile/posix.c (piped_process): Open /dev/null to use as in/out/err
if the corresponding port is not backed by a file descriptor.
---
libguile/posix.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)

Toggle diff (33 lines)
diff --git a/libguile/posix.c b/libguile/posix.c
index 2969f1f24..f41a99efe 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1558,10 +1558,22 @@ piped_process (pid_t *pid, SCM prog, SCM args, SCM from, SCM to)
if (SCM_OPOUTFPORTP ((port = scm_current_error_port ())))
err = SCM_FPORT_FDES (port);
- if (out == -1 && SCM_OPOUTFPORTP ((port = scm_current_output_port ())))
- out = SCM_FPORT_FDES (port);
- if (in == -1 && SCM_OPINFPORTP ((port = scm_current_input_port ())))
- in = SCM_FPORT_FDES (port);
+ else
+ err = open ("/dev/null", O_WRONLY | O_CLOEXEC);
+ if (out == -1)
+ {
+ if (SCM_OPOUTFPORTP ((port = scm_current_output_port ())))
+ out = SCM_FPORT_FDES (port);
+ else
+ out = open ("/dev/null", O_WRONLY | O_CLOEXEC);
+ }
+ if (in == -1)
+ {
+ if (SCM_OPINFPORTP ((port = scm_current_input_port ())))
+ in = SCM_FPORT_FDES (port);
+ else
+ in = open ("/dev/null", O_RDONLY | O_CLOEXEC);
+ }
}
*pid = do_spawn (exec_file, exec_argv, exec_env, in, out, err, 1);
--
2.39.2
J
J
Josselin Poiret wrote on 5 May 2023 15:39
[PATCH 3/3] tests: Test that system* works if stdin/out/err isn't backed by fdes
19baf4a40b5ce28624a7bcb3f9381fce0f318cf3.1683293896.git.dev@jpoiret.xyz
From: Ludovic Courtès <ludo@gnu.org>

* test-suite/tests/posix.test: New test for https://bugs.gnu.org/63024.
---
test-suite/tests/posix.test | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

Toggle diff (25 lines)
diff --git a/test-suite/tests/posix.test b/test-suite/tests/posix.test
index d5cf47cda..18dad8902 100644
--- a/test-suite/tests/posix.test
+++ b/test-suite/tests/posix.test
@@ -374,7 +374,17 @@
(system* "sh" "-c" "echo bong >&2"))))))))
(and (zero? (status:exit-val status))
- (call-with-input-file file get-string-all)))))
+ (call-with-input-file file get-string-all))))
+
+ (pass-if-equal "https://bugs.gnu.org/63024"
+ 0
+ (if (file-exists? "/proc/self/fd/0") ;on GNU/Linux?
+ (parameterize ((current-output-port (%make-void-port "w0")))
+ (system* "guile" "-c"
+ (object->string
+ '(exit (string=? "/dev/null"
+ (readlink "/proc/self/fd/1"))))))
+ (throw 'unresolved))))
;;
;; spawn
--
2.39.2
J
J
Josselin Poiret wrote on 5 May 2023 15:39
[PATCH 1/3] Add error handling for spawn's posix_spawn_file_actions_adddup2
f7db538ea324545f2e297b6a6cdcd937f889ef63.1683293896.git.dev@jpoiret.xyz
From: Josselin Poiret <dev@jpoiret.xyz>

* libguile/posix.c (do_spawn): Add error handling if
posix_spawn_file_actions_adddup2 fails.
---
Hi Ludo,

Sorry for not noticing this thread sooner! Here's my take on it, a minor
variation but that's what I'd prefer tbh.

LMKWYT,
Josselin

libguile/posix.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

Toggle diff (30 lines)
diff --git a/libguile/posix.c b/libguile/posix.c
index 3adc743c4..2969f1f24 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1390,12 +1390,15 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
/* Move the fds out of the way, so that duplicate fds or fds equal
to 0, 1, 2 don't trample each other */
- posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
- posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
- posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[0], 0);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
+ int dup2_action_from[] = {in, out, err,
+ fd_slot[0], fd_slot[1], fd_slot[2]};
+ int dup2_action_to [] = {fd_slot[0], fd_slot[1], fd_slot[2],
+ 0, 1, 2};
+
+ errno = 0;
+ for (int i = 0;i < sizeof (dup2_action_from) / sizeof (int);i++)
+ if ((errno = posix_spawn_file_actions_adddup2 (&actions, dup2_action_from[i], dup2_action_to[i])))
+ return -1;
#ifdef HAVE_ADDCLOSEFROM
/* This function appears in glibc 2.34. It's both free from race

base-commit: fe6cc6d04ab094ea802907bdc4f728416c0e97ba
--
2.39.2
L
L
Ludovic Courtès wrote on 8 May 2023 16:08
Re: bug#63024: Guile's "sleep pipe" can leak into processes created by 'spawn'
(name . Josselin Poiret)(address . dev@jpoiret.xyz)
87mt2edifn.fsf_-_@gnu.org
Hi Josselin,

Josselin Poiret <dev@jpoiret.xyz> skribis:

Toggle quote (3 lines)
> Sorry for not noticing this thread sooner! Here's my take on it, a minor
> variation but that's what I'd prefer tbh.

Even nicer! Pushed to ‘main’ as two patches (I folded the test along
with the fix, for clarity) and with cosmetic changes to match GNU coding
style for C.

36fd2b492 main origin/main Use /dev/null in 'piped-process' if port is not backed by a fdes.
ccd7400fd Add error handling for spawn's posix_spawn_file_actions_adddup2.

Thanks!

Ludo’.
L
L
Ludovic Courtès wrote on 6 Sep 22:57 +0200
control message for bug #63024
(address . control@debbugs.gnu.org)
8734mcxzye.fsf@gnu.org
close 63024
quit
?
Your comment

This issue is archived.

To comment on this conversation send an email to 62334@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it
mumi current 62334
Then, you may apply the latest patchset in this issue (with sign off)
mumi am -- -s
Or, compose a reply to this issue
mumi compose
Or, send patches to this issue
mumi send-email *.patch