pypi importer outputs strange character series in optional dependency case.

DoneSubmitted by ng0.
Details
5 participants
  • Ludovic Courtès
  • T460s laptop
  • ng0
  • Ricardo Wurmus
  • swedebugia
Owner
unassigned
Severity
normal
Merged with
N
(address . bug-guix@gnu.org)
87h99fipj1.fsf@we.make.ritual.n0.is
I think this should not happen with pypi import:

(inputs
`(("python-certifi==2016.2.28"
,python-certifi==2016.2.28)
("python-dateutil==2.5.3"
,python-dateutil==2.5.3)
("python-flask-babel==0.11.1"
,python-flask-babel==0.11.1)
("python-flask==0.11.1" ,python-flask==0.11.1)
("python-lxml==3.6.0" ,python-lxml==3.6.0)
("python-ndg-httpsclient==0.4.1"
,python-ndg-httpsclient==0.4.1)
("python-pyasn1-modules==0.0.8"
,python-pyasn1-modules==0.0.8)
("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
("python-pygments==2.1.3"
,python-pygments==2.1.3)
("python-pyopenssl==0.15.1"
,python-pyopenssl==0.15.1)
("python-pyyaml==3.11" ,python-pyyaml==3.11)
("python-requests[socks]==2.10.0"
,#{python-requests\x5b;socks\x5d;==2.10.0}#)
("python-setuptools" ,python-setuptools)))


I can understand the version numbers, I can also understand the optional
socks building/module of the python-requests, but why does it read like
Gobbledygook? Can't we improve the output here?

For version numbers, this is not a format which happened recently which
is exclusive for python build system right? This is just bad formated
because of the pypi query.
I will first try and not pin the application to these version numbers,
maybe itjustworks™.


To reproduce: "guix import pypi searx"
--
ng0
T
T
T460s laptop wrote on 28 Mar 2019 05:32
control message for bug #33047
(address . control@debbugs.gnu.org)
871s2r7h68.fsf@kwak.i-did-not-set--mail-host-address--so-tickle-me
merge 33047 24450
M
M
Maxim Cournoyer wrote on 28 Mar 2019 05:37
control message for bug #33569
(address . control@debbugs.gnu.org)
875zs362dt.fsf@gmail.com
forcemerge 33569 24450
M
M
Maxim Cournoyer wrote on 29 Mar 2019 05:19
Re: bug#34266: pypi importer cannot handle [ and ] correctly
(name . swedebugia)(address . swedebugia@riseup.net)
87va022tz3.fsf@gmail.com
swedebugia <swedebugia@riseup.net> writes:

Toggle quote (53 lines)
> $ ./pre-inst-env guix import pypi beaker
>
> following redirection to `https://pypi.org/pypi/Beaker/json'...
>
> Starting download of /tmp/guix-file.p15GJZ
> From
> https://files.pythonhosted.org/packages/c2/21/b052b2fbfee3def06670923d5d34b0d353d4c278013e4a714c3fb663f150/Beaker-1.10.0.tar.gz...
> ...0.0.tar.gz 40KiB 521KiB/s 00:00
> [##################] 100.0%
> (package
> (name "python-beaker")
> (version "1.10.0")
> (source
> (origin
> (method url-fetch)
> (uri (pypi-uri "beaker" version))
> (sha256
> (base32
> "0l047yl3n9b3w7ba0wrqdb5fpww5y8pjy20kah2mlpr230lqjwk0"))))
> (build-system python-build-system)
> (propagated-inputs
> `(("python-[crypto]" ,#{python-\x5b;crypto\x5d;}#)
> ("python-[cryptography]"
> ,#{python-\x5b;cryptography\x5d;}#)
> ("python-[pycrypto]"
> ,#{python-\x5b;pycrypto\x5d;}#)
> ("python-[pycryptodome]"
> ,#{python-\x5b;pycryptodome\x5d;}#)
> ("python-[testsuite]"
> ,#{python-\x5b;testsuite\x5d;}#)
> ("python-coverage" ,python-coverage)
> ("python-cryptography" ,python-cryptography)
> ("python-cryptography" ,python-cryptography)
> ("python-funcsigs" ,python-funcsigs)
> ("python-memcached" ,python-memcached)
> ("python-mock" ,python-mock)
> ("python-nose" ,python-nose)
> ("python-pycrypto" ,python-pycrypto)
> ("python-pycryptodome" ,python-pycryptodome)
> ("python-pycryptodome" ,python-pycryptodome)
> ("python-pycryptopp" ,python-pycryptopp)
> ("python-pylibmc" ,python-pylibmc)
> ("python-pymongo" ,python-pymongo)
> ("python-redis" ,python-redis)
> ("python-sqlalchemy" ,python-sqlalchemy)
> ("python-webtest" ,python-webtest)))
> (home-page "https://beaker.readthedocs.io/")
> (synopsis
> "A Session and Caching library with WSGI Middleware")
> (description
> "A Session and Caching library with WSGI Middleware")
> (license license:bsd-3))

Testing with my soon-to-be sent for review changes:

Toggle snippet (37 lines)
./pre-inst-env guix import pypi beaker
following redirection to `https://pypi.org/pypi/Beaker/json'...

Starting download of /tmp/guix-file.0MWu4B
From https://files.pythonhosted.org/packages/76/87/ecc1a222f0caaa7ba7b8928737e89b2e91b8c22450c12b8a51ee625a4d87/Beaker-1.10.1.tar.gz...
…0.1.tar.gz 40KiB 487KiB/s 00:00 [##################] 100.0%
(package
(name "python-beaker")
(version "1.10.1")
(source
(origin
(method url-fetch)
(uri (pypi-uri "beaker" version))
(sha256
(base32
"16zdjfl8v73yl1capph0n371vd26c7zpzb48n505ip32ffgmvc4f"))))
(build-system python-build-system)
(native-inputs
`(("python-coverage" ,python-coverage)
("python-cryptography" ,python-cryptography)
("python-memcached" ,python-memcached)
("python-mock" ,python-mock)
("python-nose" ,python-nose)
("python-pycryptodome" ,python-pycryptodome)
("python-pylibmc" ,python-pylibmc)
("python-pymongo" ,python-pymongo)
("python-redis" ,python-redis)
("python-sqlalchemy" ,python-sqlalchemy)
("python-webtest" ,python-webtest)))
(home-page "https://beaker.readthedocs.io/")
(synopsis
"A Session and Caching library with WSGI Middleware")
(description
"A Session and Caching library with WSGI Middleware")
(license license:bsd-3))

Looking better?

Maxim
M
M
Maxim Cournoyer wrote on 29 Mar 2019 05:20
Re: bug#33569: Missing sanitizing of '[]' in pypi-importer
(name . swedebugia)(address . swedebugia@riseup.net)
87r2aq2twa.fsf@gmail.com
swedebugia <swedebugia@riseup.net> writes:

Toggle quote (10 lines)
> E.g.
> sdb@komputilo ~/guix-tree$ ~/guix-tree/pre-inst-env guix import pypi
> snakemake
> ...
> (propagated-inputs
> `(("python-[reports]"
> ,#{python-\x5b;reports\x5d;}#)
> ("python-appdirs" ,python-appdirs)
> ...

This one now gives (local branch):

Toggle snippet (34 lines)
./pre-inst-env guix import pypi snakemake

Starting download of /tmp/guix-file.4XvWMX
From https://files.pythonhosted.org/packages/4a/aa/aab1515d220be06fbdccf3c89335d9585b08ac6be74b8e3c9e8c3c32798e/snakemake-5.4.4.tar.gz...
….4.4.tar.gz 169KiB 723KiB/s 00:00 [##################] 100.0%
(package
(name "python-snakemake")
(version "5.4.4")
(source
(origin
(method url-fetch)
(uri (pypi-uri "snakemake" version))
(sha256
(base32
"0prpr5qajqwr8sh4gzggpj8l4np2rcm9nfdzvcp30d5yw7h26wqm"))))
(build-system python-build-system)
(propagated-inputs
`(("python-appdirs" ,python-appdirs)
("python-configargparse" ,python-configargparse)
("python-datrie" ,python-datrie)
("python-docutils" ,python-docutils)
("python-gitpython" ,python-gitpython)
("python-jsonschema" ,python-jsonschema)
("python-pyyaml" ,python-pyyaml)
("python-ratelimiter" ,python-ratelimiter)
("python-requests" ,python-requests)
("python-wrapt" ,python-wrapt)))
(home-page "http://snakemake.bitbucket.io")
(synopsis
"Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in python style. Snakemake workflows are essentially Python scripts extended by declarative code to define rules. Rules describe how to create output files from input files.")
(description
"Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in python style. Snakemake workflows are essentially Python scripts extended by declarative code to define rules. Rules describe how to create output files from input files.")
(license license:expat))
M
M
Maxim Cournoyer wrote on 29 Mar 2019 05:23
Re: bug#33047: pypi importer uses incorrect package names
(name . Julien Lepiller)(address . julien@lepiller.eu)
87mule2tso.fsf@gmail.com
Julien Lepiller <julien@lepiller.eu> writes:

Toggle quote (22 lines)
> Hi,
>
> I found that sometimes the pypi importer had trouble importing
> packages correctly. For instance, running "guix import pypi txaio"
> gave me this list of dependencies:
>
> (propagated-inputs
> `(("python-[all]" ,#{python-\x5b;all\x5d;}#)
> ("python-[asyncio]"
> ,#{python-\x5b;asyncio\x5d;}#)
> ...))
>
> guix import pypi magic-wormhole had this:
>
> (propagated-inputs
> ("python-autobahn[twisted]"
> ,#{python-autobahn\x5b;twisted\x5d;}#)
> ...))
>
> Of course, they break the recursive importer, which makes it difficult
> to import these packages correctly.

Testing local branch:

./pre-inst-env guix import pypi txaio

Starting download of /tmp/guix-file.jTNBQz
…8.1.tar.gz 50KiB 894KiB/s 00:00 [##################] 100.0%

Starting download of /tmp/guix-file.ZB3Q2n
….py3-none-any.whl 27KiB 746KiB/s 00:00 [##################] 100.0%
(package
(name "python-txaio")
(version "18.8.1")
(source
(origin
(method url-fetch)
(uri (pypi-uri "txaio" version))
(sha256
(base32
"1zmpdph6zddgrnkkcykh6qk5s46l7s5mzfqrh82m4b5iffn61qv7"))))
(build-system python-build-system)
(propagated-inputs `(("python-six" ,python-six)))
(native-inputs
`(("python-mock" ,python-mock)
("python-pep8" ,python-pep8)
("python-pyenchant" ,python-pyenchant)
("python-pytest" ,python-pytest)
("python-pytest-cov" ,python-pytest-cov)
("python-sphinx" ,python-sphinx)
("python-sphinx-rtd-theme"
,python-sphinx-rtd-theme)
("python-sphinxcontrib-spelling"
,python-sphinxcontrib-spelling)
("python-tox" ,python-tox)
("python-twine" ,python-twine)
("python-wheel" ,python-wheel)))
(synopsis
"Compatibility API between asyncio/Twisted/Trollius")
(description
"Compatibility API between asyncio/Twisted/Trollius")
(license #f))

and

./pre-inst-env guix import pypi magic-wormhole

Starting download of /tmp/guix-file.80RRqk
…-0.11.2.tar.gz 193KiB 911KiB/s 00:00 [##################] 100.0%

Starting download of /tmp/guix-file.mRGAx3
…-py2.py3-none-any.whl 128KiB 1009KiB/s 00:00 [##################] 100.0%
(package
(name "python-magic-wormhole")
(version "0.11.2")
(source
(origin
(method url-fetch)
(uri (pypi-uri "magic-wormhole" version))
(sha256
(base32
"01fr4bi6kc6fz9n3c4qq892inrc3nf6p2djy65yvm7xkvdxncydf"))))
(build-system python-build-system)
(propagated-inputs
`(("python-attrs" ,python-attrs)
("python-autobahn" ,python-autobahn)
("python-automat" ,python-automat)
("python-click" ,python-click)
("python-hkdf" ,python-hkdf)
("python-humanize" ,python-humanize)
("python-pynacl" ,python-pynacl)
("python-pywin32" ,python-pywin32)
("python-six" ,python-six)
("python-spake2" ,python-spake2)
("python-tqdm" ,python-tqdm)
("python-twisted" ,python-twisted)
("python-txtorcon" ,python-txtorcon)))
(native-inputs
`(("python-magic-wormhole-mailbox-server"
,python-magic-wormhole-mailbox-server)
("python-magic-wormhole-transit-relay"
,python-magic-wormhole-transit-relay)
("python-mock" ,python-mock)
("python-pyflakes" ,python-pyflakes)
("python-tox" ,python-tox)))
(home-page
(synopsis
"Securely transfer data between computers")
(description
"Securely transfer data between computers")
(license license:expat))
M
M
Maxim Cournoyer wrote on 29 Mar 2019 05:24
Re: bug#24450: pypi importer outputs strange character series in optional dependency case.
(name . ng0)(address . ng0@we.make.ritual.n0.is)(address . 24450@debbugs.gnu.org)
87imw22tqf.fsf@gmail.com
ng0 <ng0@we.make.ritual.n0.is> writes:

Toggle quote (39 lines)
> I think this should not happen with pypi import:
>
> (inputs
> `(("python-certifi==2016.2.28"
> ,python-certifi==2016.2.28)
> ("python-dateutil==2.5.3"
> ,python-dateutil==2.5.3)
> ("python-flask-babel==0.11.1"
> ,python-flask-babel==0.11.1)
> ("python-flask==0.11.1" ,python-flask==0.11.1)
> ("python-lxml==3.6.0" ,python-lxml==3.6.0)
> ("python-ndg-httpsclient==0.4.1"
> ,python-ndg-httpsclient==0.4.1)
> ("python-pyasn1-modules==0.0.8"
> ,python-pyasn1-modules==0.0.8)
> ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
> ("python-pygments==2.1.3"
> ,python-pygments==2.1.3)
> ("python-pyopenssl==0.15.1"
> ,python-pyopenssl==0.15.1)
> ("python-pyyaml==3.11" ,python-pyyaml==3.11)
> ("python-requests[socks]==2.10.0"
> ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
> ("python-setuptools" ,python-setuptools)))
>
>
> I can understand the version numbers, I can also understand the optional
> socks building/module of the python-requests, but why does it read like
> Gobbledygook? Can't we improve the output here?
>
> For version numbers, this is not a format which happened recently which
> is exclusive for python build system right? This is just bad formated
> because of the pypi query.
> I will first try and not pin the application to these version numbers,
> maybe itjustworks™.
>
>
> To reproduce: "guix import pypi searx"

This would now give (change to be sent for review soon):

Toggle snippet (48 lines)
./pre-inst-env guix import pypi searx

Starting download of /tmp/guix-file.1wD8K4
From https://files.pythonhosted.org/packages/75/3f/5941ad2d500ff7cf6f8da1022c78013dcd2207941d533586a8e7bfe699d3/searx-0.15.0.tar.gz...
…5.0.tar.gz 1.6MiB 729KiB/s 00:02 [##################] 100.0%
(package
(name "python-searx")
(version "0.15.0")
(source
(origin
(method url-fetch)
(uri (pypi-uri "searx" version))
(sha256
(base32
"1gmww73q7wydkvlyz73wnr3sybpjn40wha7avnz9ak9m365zcjxf"))))
(build-system python-build-system)
(propagated-inputs
`(("python-certifi" ,python-certifi)
("python-dateutil" ,python-dateutil)
("python-flask" ,python-flask)
("python-flask-babel" ,python-flask-babel)
("python-idna" ,python-idna)
("python-lxml" ,python-lxml)
("python-pygments" ,python-pygments)
("python-pyopenssl" ,python-pyopenssl)
("python-pyyaml" ,python-pyyaml)
("python-requests" ,python-requests)))
(native-inputs
`(("python-babel" ,python-babel)
("python-cov-core" ,python-cov-core)
("python-mock" ,python-mock)
("python-nose2" ,python-nose2)
("python-pep8" ,python-pep8)
("python-plone.testing" ,python-plone.testing)
("python-selenium" ,python-selenium)
("python-splinter" ,python-splinter)
("python-transifex-client"
,python-transifex-client)
("python-unittest2" ,python-unittest2)
("python-zope.testrunner"
,python-zope.testrunner)))
(home-page "https://github.com/asciimoo/searx")
(synopsis
"A privacy-respecting, hackable metasearch engine")
(description
"A privacy-respecting, hackable metasearch engine")
(license #f))
M
M
Maxim Cournoyer wrote on 29 Mar 2019 05:34
[PATCH] bug#24450: pypi importer outputs strange character series in optional dependency case.
(name . ng0)(address . ng0@we.make.ritual.n0.is)(address . 24450@debbugs.gnu.org)
87tvfm1eos.fsf@gmail.com
Hello,

ng0 <ng0@we.make.ritual.n0.is> writes:

Toggle quote (39 lines)
> I think this should not happen with pypi import:
>
> (inputs
> `(("python-certifi==2016.2.28"
> ,python-certifi==2016.2.28)
> ("python-dateutil==2.5.3"
> ,python-dateutil==2.5.3)
> ("python-flask-babel==0.11.1"
> ,python-flask-babel==0.11.1)
> ("python-flask==0.11.1" ,python-flask==0.11.1)
> ("python-lxml==3.6.0" ,python-lxml==3.6.0)
> ("python-ndg-httpsclient==0.4.1"
> ,python-ndg-httpsclient==0.4.1)
> ("python-pyasn1-modules==0.0.8"
> ,python-pyasn1-modules==0.0.8)
> ("python-pyasn1==0.1.9" ,python-pyasn1==0.1.9)
> ("python-pygments==2.1.3"
> ,python-pygments==2.1.3)
> ("python-pyopenssl==0.15.1"
> ,python-pyopenssl==0.15.1)
> ("python-pyyaml==3.11" ,python-pyyaml==3.11)
> ("python-requests[socks]==2.10.0"
> ,#{python-requests\x5b;socks\x5d;==2.10.0}#)
> ("python-setuptools" ,python-setuptools)))
>
>
> I can understand the version numbers, I can also understand the optional
> socks building/module of the python-requests, but why does it read like
> Gobbledygook? Can't we improve the output here?
>
> For version numbers, this is not a format which happened recently which
> is exclusive for python build system right? This is just bad formated
> because of the pypi query.
> I will first try and not pin the application to these version numbers,
> maybe itjustworks™.
>
>
> To reproduce: "guix import pypi searx"

The following patches fix this, and more!
From 54e44b7397f17910d95dbdb233d23e5c97c095aa Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 1/7] import: pypi: Do not consider requirements.txt files.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt. Streamline logic.
---
guix/import/pypi.scm | 35 +++++++++++++----------------------
tests/pypi.scm | 23 +++++++++++------------
2 files changed, 24 insertions(+), 34 deletions(-)

Toggle diff (113 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (req-files (list (string-append dirname "/requirements.txt")
-                                     (string-append dirname "/" pypi-name ".egg-info"
-                                                    "/requires.txt")))
-                    (exit-codes (map (lambda (file-name)
-                                       (parameterize ((current-error-port (%make-void-port "rw+"))
-                                                      (current-output-port (%make-void-port "rw+")))
-                                         (system* "tar" "xf" tarball "-C" dir file-name)))
-                                     req-files)))
-               ;; Only one of these files needs to exist.
-               (if (any zero? exit-codes)
-                   (match (find-files dir)
-                     ((file . _)
-                      (read-requirements file))
-                     (()
-                      (warning (G_ "No requirements file found.\n"))))
+                    (requires.txt (string-append dirname "/" pypi-name
+                                                 ".egg-info" "/requires.txt"))
+                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+                                              (current-output-port (%make-void-port "rw+")))
+                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+               (if (zero? exit-code)
+                   (read-requirements (string-append dir "/" requires.txt))
                    (begin
-                     (warning (G_ "Failed to extract requirements files\n"))
+                     (warning
+                      (G_ "Failed to extract file: ~a from source.~%")
+                      requires.txt)
                      '())))))
           '())))
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..335be42644 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,10 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
- # A comment after a space
+(define test-requires.txt "\
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +106,10 @@ baz > 13.37")
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir "foo-1.0.0")
-                  (with-output-to-file "foo-1.0.0/requirements.txt"
+                  (mkdir-p "foo-1.0.0/foo.egg-info/")
+                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
-                      (display test-requirements)))
+                      (display test-requires.txt)))
                   (system* "tar" "czvf" file-name "foo-1.0.0/")
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
@@ -157,11 +156,11 @@ baz > 13.37")
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
-               (begin
-                 (mkdir "foo-1.0.0")
-                 (with-output-to-file "foo-1.0.0/requirements.txt"
+              (begin
+                (mkdir-p "foo-1.0.0/foo.egg-info/")
+                (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requirements)))
+                     (display test-requires.txt)))
                  (system* "tar" "czvf" file-name "foo-1.0.0/")
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
-- 
2.20.1
From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 2/7] import: pypi: Do not parse optional requirements from
source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COMMENT?
functions inside the READ-REQUIREMENTS procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
("pypi->guix-package"): Mute tar output to stdout.
---
guix/import/pypi.scm | 76 +++++++++++++++++++++++++++-----------------
tests/pypi.scm | 21 ++++++++++--
2 files changed, 65 insertions(+), 32 deletions(-)

Toggle diff (163 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..91e987e9f1 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,49 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (string-take s (or (string-index s (lambda (chr)
+                                       (member chr '(#\space #\> #\= #\<))))
+                     (string-length s))))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, which job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (eq? (string-ref (string-trim line) 0) #\#))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (let ((trimmed-line (string-trim line)))
+      (and (not (string-null? trimmed-line))
+           (eq? (string-ref trimmed-line 0) #\[))))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contains optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +183,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +228,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 335be42644..e4b7142311 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -60,6 +60,15 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+# A comment
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -99,6 +108,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -110,7 +125,8 @@ baz > 13.37
                   (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
-                  (system* "tar" "czvf" file-name "foo-1.0.0/")
+                  (parameterize ((current-output-port (%make-void-port "rw+")))
+                    (system* "tar" "czvf" file-name "foo-1.0.0/"))
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
                     (call-with-input-file file-name port-sha256))))
@@ -161,7 +177,8 @@ baz > 13.37
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
                      (display test-requires.txt)))
-                 (system* "tar" "czvf" file-name "foo-1.0.0/")
+                (parameterize ((current-output-port (%make-void-port "rw+")))
+                  (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
                        (call-with-input-file file-name port-sha256))))
-- 
2.20.1
From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 3/7] import: pypi: Improve parsing of requirement
specifications.

The previous solution was fragile and could leave unwanted characters in a
requirement name, such as '[' or ']'.

Partially fixes issue #33047 (see:

* guix/import/pypi.scm (use-modules): Export SPECIFICATION->REQUIREMENT-NAME
(%requirement-name-regexp): New variable.
(clean-requirement): Rename to...
(specification->requirement-name): this, which now uses
%requirement-name-regexp to select the requirement name from the requirement
specification.
(parse-requires.txt): Adapt.
---
guix/import/pypi.scm | 43 ++++++++++++++++++++++++++++++++++---------
tests/pypi.scm | 12 ++++++++++++
2 files changed, 46 insertions(+), 9 deletions(-)

Toggle diff (107 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 91e987e9f1..efb5939c78 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -48,6 +48,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
@@ -118,13 +119,37 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
-(define (clean-requirement s)
-  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
-  ;; file, remove everything other than the actual name of the required
-  ;; package, and return it.
-  (string-take s (or (string-index s (lambda (chr)
-                                       (member chr '(#\space #\> #\= #\<))))
-                     (string-length s))))
+(define %requirement-name-regexp
+  ;; Regexp to match the requirement name in a requirement specification.
+
+  ;; Some grammar, taken from PEP-0508 (see:
+  ;; https://www.python.org/dev/peps/pep-0508/).
+
+  ;; The unified rule can be expressed as:
+  ;; specification = wsp* ( url_req | name_req ) wsp*
+
+  ;; where url_req is:
+  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
+
+  ;; and where name_req is:
+  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
+
+  ;; Thus, we need only matching NAME, which is expressed as:
+  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
+  ;; identifier    = letterOrDigit identifier_end*
+  ;; name          = identifier
+  (let* ((letter-or-digit "[A-Za-z0-9]")
+         (identifier-end (string-append "(" letter-or-digit "|"
+                                        "[-_.]*" letter-or-digit ")"))
+         (identifier (string-append "^" letter-or-digit identifier-end "*"))
+         (name identifier))
+    (make-regexp name)))
+
+(define (specification->requirement-name spec)
+  "Given a specification SPEC, return the requirement name."
+  (match:substring
+   (or (regexp-exec %requirement-name-regexp spec)
+       (error (G_ "Could not extract requirement name in spec:") spec))))
 
 (define (parse-requires.txt requires.txt)
   "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
@@ -158,7 +183,7 @@ requirement names."
                ((or (string-null? line) (comment? line))
                 (loop result))
                (else
-                (loop (cons (clean-requirement line)
+                (loop (cons (specification->requirement-name line)
                             result))))))))))
 
 (define (guess-requirements source-url wheel-url tarball)
@@ -200,7 +225,7 @@ cannot determine package dependencies"))
                                             (hash-ref (list-ref run_requires 0)
                                                        "requires")
                                             '())))
-                     (map clean-requirement requirements)))))
+                     (map specification->requirement-name requirements)))))
              (lambda ()
                (delete-file json-file)
                (rmdir dirname))))))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index e4b7142311..82d6bba8dd 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -55,6 +55,14 @@
 (define test-source-hash
   "")
 
+(define test-specifications
+  '("Fizzy [foo, bar]"
+    "PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1"
+    "SomethingWithMarker[foo]>1.0;python_version<\"2.7\""
+    "requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < \"2.7\""
+    "pip @ https://github.com/pypa/pip/archive/1.3.1.zip#\
+sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
+
 (define test-requires.txt "\
 bar
 baz > 13.37
@@ -108,6 +116,10 @@ pytest (>=2.5.0)
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "specification->requirement-name"
+  '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
+  (map specification->requirement-name test-specifications))
+
 (test-equal "parse-requires.txt, with sections"
   '("foo" "bar")
   (mock ((ice-9 ports) call-with-input-file
-- 
2.20.1
From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 4/7] import: pypi: Deduplicate requirements.

* guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.
---
guix/import/pypi.scm | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

Toggle diff (19 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index efb5939c78..a90be67bb0 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -178,7 +178,11 @@ requirement names."
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
           (if (or (eof-object? line) (section-header? line))
-              (reverse result)
+              ;; Duplicates can occur, since the same requirement can be
+              ;; listed multiple times with different conditional markers, e.g.
+              ;; pytest >= 3 ; python_version >= "3.3"
+              ;; pytest < 3 ; python_version < "3.3"
+              (reverse (delete-duplicates result))
               (cond
                ((or (string-null? line) (comment? line))
                 (loop result))
-- 
2.20.1
From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:02 -0400
Subject: [PATCH 5/7] import: pypi: Support more types of archives.

This change enables the PyPI importer to look for requirements in a source
archive of a different type than "tar.gz" or "tar.bz2".

* guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
[archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
archive is supported or not.
[guess-requirements-from-source]: Adapt to use the new method, and use unzip
to extract ZIP archives.
(guess-requirements): Rename the TARBALL argument to ARCHIVE, to denote the
archive format is no longer bound specifically to the Tar format.
---
guix/import/pypi.scm | 47 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 23 deletions(-)

Toggle diff (89 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a90be67bb0..8e93653717 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -190,27 +190,24 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
-(define (guess-requirements source-url wheel-url tarball)
-  "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
-of the required packages specified in the requirements.txt file.  TARBALL will
+(define (guess-requirements source-url wheel-url archive)
+  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (tarball-directory url)
-    ;; Given the URL of the package's tarball, return the name of the directory
+  (define (archive-root-directory url)
+    ;; Given the URL of the package's archive, return the name of the directory
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
-    ;; TODO: Support more archive formats.
-    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
-      (cond
-       ((string-suffix? ".tar.gz" basename)
-        (string-drop-right basename 7))
-       ((string-suffix? ".tar.bz2" basename)
-        (string-drop-right basename 8))
-       (else
+    (if (compressed-file? url)
+        (let ((root-directory (file-sans-extension (basename url))))
+          (if (string=? "tar" (file-extension root-directory))
+              (file-sans-extension root-directory)
+              root-directory))
         (begin
-          (warning (G_ "Unsupported archive format: \
-cannot determine package dependencies"))
-          #f)))))
+          (warning (G_ "Unsupported archive format (~a): \
+cannot determine package dependencies") (file-extension url))
+          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -246,16 +243,20 @@ cannot determine package dependencies"))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (tarball-directory source-url)))
+    (let ((dirname (archive-root-directory source-url))
+          (extension (file-extension source-url)))
       (if (string? dirname)
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
                     (requires.txt (string-append dirname "/" pypi-name
                                                  ".egg-info" "/requires.txt"))
-                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
-                                              (current-output-port (%make-void-port "rw+")))
-                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+                    (exit-code
+                     (parameterize ((current-error-port (%make-void-port "rw+"))
+                                    (current-output-port (%make-void-port "rw+")))
+                       (if (string=? "zip" extension)
+                           (system* "unzip" archive "-d" dir requires.txt)
+                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
                (if (zero? exit-code)
                    (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
@@ -271,13 +272,13 @@ cannot determine package dependencies"))
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-(define (compute-inputs source-url wheel-url tarball)
-  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
+(define (compute-inputs source-url wheel-url archive)
+  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
 name/variable pairs describing the required inputs of this package.  Also
 return the unaltered list of upstream dependency names."
   (let ((dependencies
          (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url tarball))))
+                 (guess-requirements source-url wheel-url archive))))
     (values (sort
              (map (lambda (input)
                     (let ((guix-name (python->package-name input)))
-- 
2.20.1
From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:03 -0400
Subject: [PATCH 6/7] import: pypi: Parse wheel METADATA instead of
metadata.json.

With newer Wheel releases, there is no more metadata.json file; the METADATA
file should be used instead (see: https://github.com/pypa/wheel/issues/195).

This change updates our PyPI importer so that it uses the later.

* guix/import/pypi.scm (define-module): Remove unnecessary modules and export
the PARSE-WHEEL-METADATA method.
(parse-wheel-metadata): Add method.
(guess-requirements): Use it.
* tests/pypi.scm (test-metadata): Test it.
---
guix/import/pypi.scm | 66 +++++++++++++++++++++++++++++---------------
tests/pypi.scm | 60 ++++++++++++++++++++++++++++++----------
2 files changed, 89 insertions(+), 37 deletions(-)

Toggle diff (220 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8e93653717..c520213b6a 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -21,9 +21,7 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import pypi)
-  #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 pretty-print)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
@@ -31,9 +29,6 @@
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
-  #:use-module (rnrs bytevectors)
-  #:use-module (json)
-  #:use-module (web uri)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module ((guix build utils)
@@ -48,6 +43,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            parse-wheel-metadata
             specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
@@ -190,6 +186,37 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
+(define (parse-wheel-metadata metadata)
+  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  ;; METADATA is a RFC-2822-like, header based file.
+
+  (define (requires-dist-header? line)
+    ;; Return #t if the given LINE is a Requires-Dist header.
+    (regexp-match? (string-match "^Requires-Dist: " line)))
+
+  (define (requires-dist-value line)
+    (string-drop line (string-length "Requires-Dist: ")))
+
+  (define (extra? line)
+    ;; Return #t if the given LINE is an "extra" requirement.
+    (regexp-match? (string-match "extra == " line)))
+
+  (call-with-input-file metadata
+    (lambda (port)
+      (let loop ((requirements '()))
+        (let ((line (read-line port)))
+          ;; Stop at the first 'Provides-Extra' section: the non-optional
+          ;; requirements appear before the optional ones.
+          (if (eof-object? line)
+              (reverse (delete-duplicates requirements))
+              (cond
+               ((and (requires-dist-header? line) (not (extra? line)))
+                (loop (cons (specification->requirement-name
+                             (requires-dist-value line))
+                            requirements)))
+               (else
+                (loop requirements)))))))))
+
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
@@ -211,25 +238,18 @@ cannot determine package dependencies") (file-extension url))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
-    ;; requirements.
+    ;; requirements, or #f if the metadata file contained therein couldn't be
+    ;; extracted.
     (let* ((dirname (wheel-url->extracted-directory wheel-url))
-           (json-file (string-append dirname "/metadata.json")))
-      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
-           (dynamic-wind
-             (const #t)
-             (lambda ()
-               (call-with-input-file json-file
-                 (lambda (port)
-                   (let* ((metadata (json->scm port))
-                          (run_requires (hash-ref metadata "run_requires"))
-                          (requirements (if run_requires
-                                            (hash-ref (list-ref run_requires 0)
-                                                       "requires")
-                                            '())))
-                     (map specification->requirement-name requirements)))))
-             (lambda ()
-               (delete-file json-file)
-               (rmdir dirname))))))
+           (metadata (string-append dirname "/METADATA")))
+      (call-with-temporary-directory
+       (lambda (dir)
+         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+             (parse-wheel-metadata (string-append dir "/" metadata))
+             (begin
+               (warning
+                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
+               #f))))))
 
   (define (guess-requirements-from-wheel)
     ;; Return the package's requirements using the wheel, or #f if an error
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 82d6bba8dd..ca8cb5f6de 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -21,6 +21,7 @@
   #:use-module (guix import pypi)
   #:use-module (guix base32)
   #:use-module (gcrypt hash)
+  #:use-module (guix memoization)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
   #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
@@ -77,17 +78,33 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
-(define test-metadata
-  "{
-  \"run_requires\": [
-    {
-      \"requires\": [
-        \"bar\",
-        \"baz (>13.37)\"
-      ]
-    }
-  ]
-}")
+(define test-metadata "\
+Classifier: Programming Language :: Python :: 3.7
+Requires-Dist: baz ~= 3
+Requires-Dist: bar != 2
+Provides-Extra: test
+pytest (>=2.5.0)
+")
+
+(define test-metadata-with-extras "
+Classifier: Programming Language :: Python :: 3.7
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Requires-Dist: wrapt (<2,>=1)
+Requires-Dist: bar
+
+Provides-Extra: dev
+Requires-Dist: tox ; extra == 'dev'
+Requires-Dist: bumpversion (<1) ; extra == 'dev'
+")
+
+;;; Provides-Extra can appear before Requires-Dist.
+(define test-metadata-with-extras-jedi "\
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Provides-Extra: testing
+Requires-Dist: parso (>=0.3.0)
+Provides-Extra: testing
+Requires-Dist: pytest (>=3.1.0); extra == 'testing'
+")
 
 (test-begin "pypi")
 
@@ -126,6 +143,18 @@ pytest (>=2.5.0)
          call-with-input-string)
         (parse-requires.txt test-requires-with-sections)))
 
+(test-equal "parse-wheel-metadata, with extras"
+  '("wrapt" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras)))
+
+(test-equal "parse-wheel-metadata, with extras - Jedi"
+  '("parso")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras-jedi)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -188,7 +217,7 @@ pytest (>=2.5.0)
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requires.txt)))
+                     (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
@@ -197,13 +226,13 @@ pytest (>=2.5.0)
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
                (begin
                  (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/metadata.json"
+                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
                    (lambda ()
                      (display test-metadata)))
                  (let ((zip-file (string-append file-name ".zip")))
                    ;; zip always adds a "zip" extension to the file it creates,
                    ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/metadata.json")
+                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
                    (rename-file zip-file file-name))
                  (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
@@ -215,6 +244,9 @@ pytest (>=2.5.0)
                             (string-length test-json)))
                    ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
                    (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
               (match (pypi->guix-package "foo")
                 (('package
                    ('name "python-foo")
-- 
2.20.1
From ea0f24eb7b19c57ebb24ec48ba776b240bccfc99 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH 7/7] import: pypi: Include optional test inputs as
native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
guix/import/pypi.scm | 158 ++++++++++++++++++++++++++++---------------
tests/pypi.scm | 123 +++++++++++++++++++++++++--------
2 files changed, 199 insertions(+), 82 deletions(-)

Toggle diff (482 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..f84ad88e44 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,38 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
-return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively."
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (map process-requirements (guess-requirements source-url wheel-url archive)))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,15 +371,13 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
-              (compute-inputs source-url wheel-url temp)
-            (values
+          (match (compute-inputs source-url wheel-url temp)
+            ((required-inputs test-inputs)
              `(package
                 (name ,(python->package-name name))
                 (version ,version)
                 (source (origin
                           (method url-fetch)
-
                           ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
                           ;; cases in NAME, for instance, as is the case with
                           ;; "uwsgi".  In that case, fall back to a full URL.
@@ -334,12 +386,12 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
                            (base32
                             ,(guix-hash-url temp)))))
                 (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
+                ,@(maybe-inputs required-inputs 'propagated-inputs)
+                ,@(maybe-inputs test-inputs 'native-inputs)
                 (home-page ,home-page)
                 (synopsis ,synopsis)
                 (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+                (license ,(license->symbol license)))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.20.1
Thanks,

Maxim
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJ9WGpPiQCFQyn/CfEmDkZILmNWIFAlydoGMACgkQEmDkZILm
NWJMOA/+Nk3rtALS/r5Su8aNXIZfC5z0p5Zk3WBc/FSOkZ0e+56qdiiLLQ6bEBrT
xEmr23LqIiHfwU2H3H+mpZCKWDmqBb0edLW0u6cKRRO0eU5T55pEA71CyQj8c+mu
UHNyFGm6y/hmaMKPVsyP8p3rhNl8TIPJe2vFXnN4LWaCJaj3E0X6Ge2fsnNTgzXk
zpe5eU1vAFNfYOaf7uc658cfIPFzTEWgBfHASsfuKTLE34ZNB9HOxwskTNd1CvlN
USjs3kSFRGQl2q0tXQzrxt9eNwQzcXWMk61bCYtDrZ4d87dhENKd2EAivMgGLGq/
0poA4ijhb/zfmK7RNtHItivipzaZ+xAOpIEhcfwlmz2WxIZfV4eKkpLEB1q6QGAQ
0OKUmzAPHdYvNcHkhhAZIZl195JZ11HlrpTX5gUMqjL11d4ReuhlqNlAnynt8+tJ
8WgjVemJ7dhn8oZVLcXELBdFH01DZnbvF/8kheAyO6/dhTzKbxua+mPbNTpMn0w4
Ipe94BiF5DlaDffvWfmTebR9j+x7j8WSQeezc0GiNQuV8wOXKdVGsm+B66MdCpOH
ioQ8B/62Tu/9ul4L+hkaZCBrVM/JI8W+SAeAn+CMQ0Zye20POpwn1xXK5SCn1FRU
wXoFrEGE1L7yG+B406fRiABr711AItzTO9MvZQQmEquZBCN0qqI=
=2rRk
-----END PGP SIGNATURE-----

T
T
T460s laptop wrote on 30 Mar 2019 03:12
[PATCHv2] bug#24450: pypi importer outputs strange character series in optional dependency case.
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
877ech5cvd.fsf_-_@kwak.i-did-not-set--mail-host-address--so-tickle-me
The previous 0007 patch had broken the recursive importer. This
reworked version fixes this. The rest of the patches stack sent in the
previous message is still good as is.
From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH] import: pypi: Include optional test inputs as native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
guix/import/pypi.scm | 195 ++++++++++++++++++++++++++++---------------
tests/pypi.scm | 123 ++++++++++++++++++++-------
2 files changed, 222 insertions(+), 96 deletions(-)

Toggle diff (506 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..099768f0c8 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively.  Also
 return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (let ((dependencies (guess-requirements source-url wheel-url archive)))
+    (values (map process-requirements dependencies)
+            (concatenate dependencies))))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,29 +374,33 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
+          (receive (guix-dependencies upstream-dependencies)
               (compute-inputs source-url wheel-url temp)
-            (values
-             `(package
-                (name ,(python->package-name name))
-                (version ,version)
-                (source (origin
-                          (method url-fetch)
-
-                          ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                          ;; cases in NAME, for instance, as is the case with
-                          ;; "uwsgi".  In that case, fall back to a full URL.
-                          (uri (pypi-uri ,(string-downcase name) version))
-                          (sha256
-                           (base32
-                            ,(guix-hash-url temp)))))
-                (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
-                (home-page ,home-page)
-                (synopsis ,synopsis)
-                (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+            (match guix-dependencies
+              ((required-inputs test-inputs)
+               (values
+                `(package
+                   (name ,(python->package-name name))
+                   (version ,version)
+                   (source (origin
+                             (method url-fetch)
+                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
+                             ;; cases in NAME, for instance, as is the case with
+                             ;; "uwsgi".  In that case, fall back to a full URL.
+                             (uri (pypi-uri ,(string-downcase name) version))
+                             (sha256
+                              (base32
+                               ,(guix-hash-url temp)))))
+                   (build-system python-build-system)
+                   ,@(maybe-inputs required-inputs 'propagated-inputs)
+                   ,@(maybe-inputs test-inputs 'native-inputs)
+                   (home-page ,home-page)
+                   (synopsis ,synopsis)
+                   (description ,description)
+                   (license ,(license->symbol license)))
+                ;; Flatten the nested lists and return the upstream
+                ;; dependencies.
+                upstream-dependencies))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.20.1
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJ9WGpPiQCFQyn/CfEmDkZILmNWIFAlye0JYACgkQEmDkZILm
NWJGEg/9HElFyj+z1M2Z6ffboYKW3WJ0D7q+vV0fvkacd3X/8XkC6DPEjI8veFYf
3qiygovcK6tgI5sAVzRWZ4WJ7oW284YoUURGjfax1ilvJl3kGkzqNy45nD3CQo3A
JK/rfW432Pu2UgC/Ewu+4V1NZqU5gCoHnz/rVNt0gRwWANBCLQG9fia/0u7zq8WK
E0UIlKJZeOi6/1xr+QYcSYtMV9Nr3mb7OOC3TleHrz+lrT8YbmrsqdxXfZuJw3iA
FY4jMX6SGAlfhKc7/+2F2z+iwkev7tTg1AAVhLWyqeR+zqYwBOCj5TOEAFMBcO8e
FMUr6GONxdJfHwSapnzzEq6HcTqhx+MutMCiynap3hBVxONDA2uXmGTM/pe0zrZm
Dqdr4XgFgaSJcaAtc3Ar/v6jwRy3muNKHBl6opQjd7zbxIju9hgzsiXvTbC/RUgg
9Oc7Yd2PT2kCBvJ1GgCk996EHcY1ACOeHHI0Yg0XHO6JCx1Dqi9i0vImGqcoOg/o
OmA0cxRntaCgiCSyGaSyBzcyHjsfieKwZlRQk5DaeMfguzGHGFS9b6OHuJ7Ofdb7
QESgYGuD5YlQhAsNc64A0GTm6+JDh9v84MwUqyR0EcU3toPO3r2zAWeGoi8k3FDZ
IaPuNC3qZ9XY2Pp7kHsHcTCRs+lXcoIDxZElkOgNMWCvW5wKpr4=
=2WjY
-----END PGP SIGNATURE-----

T
T
T460s laptop wrote on 30 Mar 2019 03:15
control message for bug #24450
(address . control@debbugs.gnu.org)
875zs15cq0.fsf@kwak.i-did-not-set--mail-host-address--so-tickle-me
forcemerge 24450 24557
S
S
swedebugia wrote on 30 Mar 2019 08:59
Re: bug#34266: pypi importer cannot handle [ and ] correctly
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
991a432d-e5b8-88cd-ee47-bb60af1a8dbe@riseup.net
On 2019-03-29 05:19, Maxim Cournoyer wrote:
Toggle quote (97 lines)
> swedebugia <swedebugia@riseup.net> writes:
>
>> $ ./pre-inst-env guix import pypi beaker
>>
>> following redirection to `https://pypi.org/pypi/Beaker/json'...
>>
>> Starting download of /tmp/guix-file.p15GJZ
>> From
>> https://files.pythonhosted.org/packages/c2/21/b052b2fbfee3def06670923d5d34b0d353d4c278013e4a714c3fb663f150/Beaker-1.10.0.tar.gz...
>> ...0.0.tar.gz 40KiB 521KiB/s 00:00
>> [##################] 100.0%
>> (package
>> (name "python-beaker")
>> (version "1.10.0")
>> (source
>> (origin
>> (method url-fetch)
>> (uri (pypi-uri "beaker" version))
>> (sha256
>> (base32
>> "0l047yl3n9b3w7ba0wrqdb5fpww5y8pjy20kah2mlpr230lqjwk0"))))
>> (build-system python-build-system)
>> (propagated-inputs
>> `(("python-[crypto]" ,#{python-\x5b;crypto\x5d;}#)
>> ("python-[cryptography]"
>> ,#{python-\x5b;cryptography\x5d;}#)
>> ("python-[pycrypto]"
>> ,#{python-\x5b;pycrypto\x5d;}#)
>> ("python-[pycryptodome]"
>> ,#{python-\x5b;pycryptodome\x5d;}#)
>> ("python-[testsuite]"
>> ,#{python-\x5b;testsuite\x5d;}#)
>> ("python-coverage" ,python-coverage)
>> ("python-cryptography" ,python-cryptography)
>> ("python-cryptography" ,python-cryptography)
>> ("python-funcsigs" ,python-funcsigs)
>> ("python-memcached" ,python-memcached)
>> ("python-mock" ,python-mock)
>> ("python-nose" ,python-nose)
>> ("python-pycrypto" ,python-pycrypto)
>> ("python-pycryptodome" ,python-pycryptodome)
>> ("python-pycryptodome" ,python-pycryptodome)
>> ("python-pycryptopp" ,python-pycryptopp)
>> ("python-pylibmc" ,python-pylibmc)
>> ("python-pymongo" ,python-pymongo)
>> ("python-redis" ,python-redis)
>> ("python-sqlalchemy" ,python-sqlalchemy)
>> ("python-webtest" ,python-webtest)))
>> (home-page "https://beaker.readthedocs.io/")
>> (synopsis
>> "A Session and Caching library with WSGI Middleware")
>> (description
>> "A Session and Caching library with WSGI Middleware")
>> (license license:bsd-3))
>
> Testing with my soon-to-be sent for review changes:
>
> --8<---------------cut here---------------start------------->8---
> ./pre-inst-env guix import pypi beaker
> following redirection to `https://pypi.org/pypi/Beaker/json'...
>
> Starting download of /tmp/guix-file.0MWu4B
> From https://files.pythonhosted.org/packages/76/87/ecc1a222f0caaa7ba7b8928737e89b2e91b8c22450c12b8a51ee625a4d87/Beaker-1.10.1.tar.gz...
> …0.1.tar.gz 40KiB 487KiB/s 00:00 [##################] 100.0%
> (package
> (name "python-beaker")
> (version "1.10.1")
> (source
> (origin
> (method url-fetch)
> (uri (pypi-uri "beaker" version))
> (sha256
> (base32
> "16zdjfl8v73yl1capph0n371vd26c7zpzb48n505ip32ffgmvc4f"))))
> (build-system python-build-system)
> (native-inputs
> `(("python-coverage" ,python-coverage)
> ("python-cryptography" ,python-cryptography)
> ("python-memcached" ,python-memcached)
> ("python-mock" ,python-mock)
> ("python-nose" ,python-nose)
> ("python-pycryptodome" ,python-pycryptodome)
> ("python-pylibmc" ,python-pylibmc)
> ("python-pymongo" ,python-pymongo)
> ("python-redis" ,python-redis)
> ("python-sqlalchemy" ,python-sqlalchemy)
> ("python-webtest" ,python-webtest)))
> (home-page "https://beaker.readthedocs.io/")
> (synopsis
> "A Session and Caching library with WSGI Middleware")
> (description
> "A Session and Caching library with WSGI Middleware")
> (license license:bsd-3))
> --8<---------------cut here---------------end--------------->8---
>
> Looking better?

Perfect!

--
Cheers Swedebugia
Attachment: signature.asc
M
M
Maxim Cournoyer wrote on 31 Mar 2019 16:40
[PATCH] bug#24450: pypi importer outputs strange character series in optional dependency case.
(name . ng0)(address . ng0@we.make.ritual.n0.is)(address . 24450@debbugs.gnu.org)
87ftr32jkt.fsf_-_@gmail.com
Hello!

I've yet another commit to add on the top of the current stack of
patches already sent. This one makes it so that the source archive is
searched for the '[...].egg-info/requires.txt' file rather than check
the more common, fixed location.

This makes the importer more useful, as some packages such as
robotframework-sshlibrary use a "src" directory in their hierarchy which
would cause the previous scheme to not find requires.txt.
From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 23:13:26 -0400
Subject: [PATCH] import: pypi: Scan source archive to find requires.txt file.

* guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
(guess-requirements)[archive-root-directory]: Remove procedure.
[guess-requirements-from-wheel]: Re-ident.
[guess-requirements-from-source]: Search for the requires.txt file in the
archive instead of using a static, expected location.
* tests/pypi.scm ("pypi->guix-package, no wheel"): Mock the requires.txt at a
non-standard location to test the new feature.
("pypi->guix-package, no usable requirement file."): Adapt.
---
guix/import/pypi.scm | 65 ++++++++++++++++++--------------------------
tests/pypi.scm | 17 +++++++-----
2 files changed, 37 insertions(+), 45 deletions(-)

Toggle diff (137 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 099768f0c8..a2ce14b192 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -36,7 +36,8 @@
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
                            . hyphen-package-name->name+version)
-                          find-files))
+                          find-files
+                          invoke))
   #:use-module (guix import utils)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix import json)
@@ -267,19 +268,6 @@ omitted since these can be difficult or expensive to satisfy."
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (archive-root-directory url)
-    ;; Given the URL of the package's archive, return the name of the directory
-    ;; that will be created upon decompressing it. If the filetype is not
-    ;; supported, return #f.
-    (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
-        (begin
-          (warning (G_ "Unsupported archive format (~a): \
-cannot determine package dependencies") (file-extension url))
-          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -305,33 +293,34 @@ cannot determine package dependencies") (file-extension url))
     (call-with-temporary-output-file
      (lambda (temp port)
        (if wheel-url
-         (and (url-fetch wheel-url temp)
-              (read-wheel-metadata temp))
-         #f))))
+           (and (url-fetch wheel-url temp)
+                (read-wheel-metadata temp))
+           #f))))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (archive-root-directory source-url))
-          (extension (file-extension source-url)))
-      (if (string? dirname)
-          (call-with-temporary-directory
-           (lambda (dir)
-             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (requires.txt (string-append dirname "/" pypi-name
-                                                 ".egg-info" "/requires.txt"))
-                    (exit-code
-                     (parameterize ((current-error-port (%make-void-port "rw+"))
-                                    (current-output-port (%make-void-port "rw+")))
-                       (if (string=? "zip" extension)
-                           (system* "unzip" archive "-d" dir requires.txt)
-                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
-               (if (zero? exit-code)
-                   (parse-requires.txt (string-append dir "/" requires.txt))
-                   (begin
-                     (warning
-                      (G_ "Failed to extract file: ~a from source.~%")
-                      requires.txt)
-                     (list '() '()))))))
+    (if (compressed-file? source-url)
+        (call-with-temporary-directory
+         (lambda (dir)
+           (parameterize ((current-error-port (%make-void-port "rw+"))
+                          (current-output-port (%make-void-port "rw+")))
+             (if (string=? "zip" (file-extension source-url))
+                 (invoke "unzip" archive "-d" dir)
+                 (invoke "tar" "xf" archive "-C" dir)))
+           (let ((requires.txt-files
+                  (find-files dir (lambda (abs-file-name _)
+		                    (string-match "\\.egg-info/requires.txt$"
+                                                  abs-file-name)))))
+             (if (> (length requires.txt-files) 0)
+                 (begin
+                   (parse-requires.txt (first requires.txt-files)))
+                 (begin (warning (G_ "Cannot guess requirements from source archive:\
+ no requires.txt file found.~%"))
+                        (list '() '()))))))
+        (begin
+          (warning (G_ "Unsupported archive format; \
+cannot determine package dependencies from source archive: ~a~%")
+                   (basename source-url))
           (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
diff --git a/tests/pypi.scm b/tests/pypi.scm
index aa08e2cb54..ad188df16c 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -177,8 +177,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir-p "foo-1.0.0/foo.egg-info/")
-                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
+                  ;; Unusual requires.txt location should still be found.
+                  (mkdir-p "foo-1.0.0/src/bizarre.egg-info")
+                  (with-output-to-file "foo-1.0.0/src/bizarre.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
                   (parameterize ((current-output-port (%make-void-port "rw+")))
@@ -299,10 +300,13 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
+              (mkdir-p "foo-1.0.0/foo.egg-info/")
+              (parameterize ((current-output-port (%make-void-port "rw+")))
+                (system* "tar" "czvf" file-name "foo-1.0.0/"))
+              (delete-file-recursively "foo-1.0.0")
               (set! test-source-hash
-                (call-with-input-file file-name port-sha256))
-              #t)
-             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+                (call-with-input-file file-name port-sha256)))
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -334,7 +338,6 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                             test-source-hash)
                            hash))
                 (x
-                 (pk 'fail x #f)))
-              )))
+                 (pk 'fail x #f))))))
 
 (test-end "pypi")
-- 
2.20.1
Thanks,

Maxim
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJ9WGpPiQCFQyn/CfEmDkZILmNWIFAlyg0WIACgkQEmDkZILm
NWIEMA/9HsMlrdzTwtllz/jPaAM13pPbyvI1sx1uwQi6AYessiZYBr/vGiFEBBTB
FJJUQ9Sz0XQOTW3wOitgTRz1kr55c17cv80IvwWYhTZLGZ+O2F29hFpp38OucRS7
d4XeZSJwBSn9ccYgytebd0Xtkp340HHeLOHnNTnA600TLwTq/6YXP4hLi5hiU7p1
zIOHe3Nze9dk1B30tuivqwvIVkSMnnPQyCmPSE+1vhzczVQ+m+f5PxxmSbP6Kznx
KF7Rlm+Ltjup0JfiKeglNiqLYYOKRpYuWM+oMzWSJgl1zA/MzDRZCnbs30TD/XKd
dkixB0Hmazer2BBRJoNSgCKu/Mx8hZg6ML/NcgrbmJnUd4Sw4r/M2My/Iq1D6RLA
Kkj7gzoqqowy+PcsU+/l+XrLdvT0wd6VR2x+DwY+y3Fiv8DSQRWI0mXxl9V4M4St
gF27p1RhfyUhIQo801AtJ+Tin/N5cYik8orRnARHnWNXVy7Iqz1RGr4MIqGheS0l
8GMOMVeLskAqagLynhWY4fz0teGCnaGAjVyzda7R1tYBkWzsioLugzsDoIftpEA8
p0TBiVEu0/BAWlL/BYdWyOIbkPQ05xV+Yad+QvECS3uu4/MckAcEwLMQpPjqKP8O
J5tRiRMRkEP1Dxl7oBuKC1IOHDhGvUOuIqFKqmyGRcBInwTC5Is=
=H7lm
-----END PGP SIGNATURE-----

M
M
Maxim Cournoyer wrote on 31 Mar 2019 16:41
control message for bug #24450
(address . control@debbugs.gnu.org)
87ef6n2jiz.fsf@gmail.com
tags 24450 patch
L
L
Ludovic Courtès wrote on 1 Apr 2019 17:28
Re: bug#24450: [PATCHv2] bug#24450: pypi importer outputs strange character series in optional dependency case.
(name . T460s laptop)(address . maxim.cournoyer@gmail.com)
87sgv1lp76.fsf@gnu.org
Hi Maxim,

T460s laptop <maxim.cournoyer@gmail.com> skribis:

Toggle quote (26 lines)
> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH] import: pypi: Include optional test inputs as native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.
> (parse-wheel-metadata): Likewise.
> (guess-requirements): Adapt, and hide unzip output.
> (make-pypi-sexp): Likewise, and include the test inputs requirements as native
> inputs in the returned package expression.
>
> * tests/pypi.scm (test-requires.txt): Include a test section in the
> test-requires.txt data.
> (test-requires.txt-beaker): New variable.
> ("parse-requires.txt"): Adapt.
> ("parse-requires.txt - Beaker"): New test.
> ("parse-wheel-metadata, with extras"): Adapt.
> ("parse-wheel-metadata, with extras - Jedi"): Adapt.
> ("pypi->guix-package, no wheel"): Re-indent, and add the expected
> native-inputs.
> ("pypi->guix-package, wheels"): Likewise.
> ("pypi->guix-package, no usable requirement file."): New test.

It seems that the patch does several unrelated things, such as silencing
‘unzip’, improving docstrings, handling inputs other than
‘propagated-inputs’, and correctly parsing wheels.

Could you try to separate these as much as possible to simplify review?

Thank you!

Ludo’.
R
R
Ricardo Wurmus wrote on 15 May 2019 13:06
pypi importer outputs strange character series in optional dependency case.
(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj4l5w9dtu.fsf@bimsb-sys02.mdc-berlin.net
Hi Maxim,

I would very much like to see your improvements to the pypi importer to
be merged. Have you been able to separate the independent changes as
suggested by Ludo?

--
Ricardo
M
M
Maxim Cournoyer wrote on 20 May 2019 06:05
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87pnod7ot4.fsf@gmail.com
Hi Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (6 lines)
> Hi Maxim,
>
> I would very much like to see your improvements to the pypi importer to
> be merged. Have you been able to separate the independent changes as
> suggested by Ludo?

I'm thrilled that someone has an interest in this :-)

I took my time, but finally got around to restructure the changes a
bit. I hope it'll be easier to review this time around!

Thank you!

Maxim
From 54e44b7397f17910d95dbdb233d23e5c97c095aa Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt. Streamline logic.
---
guix/import/pypi.scm | 35 +++++++++++++----------------------
tests/pypi.scm | 23 +++++++++++------------
2 files changed, 24 insertions(+), 34 deletions(-)

Toggle diff (113 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (req-files (list (string-append dirname "/requirements.txt")
-                                     (string-append dirname "/" pypi-name ".egg-info"
-                                                    "/requires.txt")))
-                    (exit-codes (map (lambda (file-name)
-                                       (parameterize ((current-error-port (%make-void-port "rw+"))
-                                                      (current-output-port (%make-void-port "rw+")))
-                                         (system* "tar" "xf" tarball "-C" dir file-name)))
-                                     req-files)))
-               ;; Only one of these files needs to exist.
-               (if (any zero? exit-codes)
-                   (match (find-files dir)
-                     ((file . _)
-                      (read-requirements file))
-                     (()
-                      (warning (G_ "No requirements file found.\n"))))
+                    (requires.txt (string-append dirname "/" pypi-name
+                                                 ".egg-info" "/requires.txt"))
+                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+                                              (current-output-port (%make-void-port "rw+")))
+                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+               (if (zero? exit-code)
+                   (read-requirements (string-append dir "/" requires.txt))
                    (begin
-                     (warning (G_ "Failed to extract requirements files\n"))
+                     (warning
+                      (G_ "Failed to extract file: ~a from source.~%")
+                      requires.txt)
                      '())))))
           '())))
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..335be42644 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,10 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
- # A comment after a space
+(define test-requires.txt "\
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +106,10 @@ baz > 13.37")
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir "foo-1.0.0")
-                  (with-output-to-file "foo-1.0.0/requirements.txt"
+                  (mkdir-p "foo-1.0.0/foo.egg-info/")
+                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
-                      (display test-requirements)))
+                      (display test-requires.txt)))
                   (system* "tar" "czvf" file-name "foo-1.0.0/")
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
@@ -157,11 +156,11 @@ baz > 13.37")
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
-               (begin
-                 (mkdir "foo-1.0.0")
-                 (with-output-to-file "foo-1.0.0/requirements.txt"
+              (begin
+                (mkdir-p "foo-1.0.0/foo.egg-info/")
+                (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requirements)))
+                     (display test-requires.txt)))
                  (system* "tar" "czvf" file-name "foo-1.0.0/")
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
-- 
2.21.0
From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
source.

* guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
(guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COMMENT?
functions inside the READ-REQUIREMENTS procedure.
(parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
parsing optional requirements.

* tests/pypi.scm (test-requires-with-sections): New variable.
("parse-requires.txt, with sections"): New test.
("pypi->guix-package"): Mute tar output to stdout.
---
guix/import/pypi.scm | 76 +++++++++++++++++++++++++++-----------------
tests/pypi.scm | 21 ++++++++++--
2 files changed, 65 insertions(+), 32 deletions(-)

Toggle diff (163 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8269aa61d7..91e987e9f1 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -47,7 +47,8 @@
   #:use-module (guix upstream)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
-  #:export (guix-package->pypi-name
+  #:export (parse-requires.txt
+            guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
             %pypi-updater))
@@ -117,6 +118,49 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
+(define (clean-requirement s)
+  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
+  ;; file, remove everything other than the actual name of the required
+  ;; package, and return it.
+  (string-take s (or (string-index s (lambda (chr)
+                                       (member chr '(#\space #\> #\= #\<))))
+                     (string-length s))))
+
+(define (parse-requires.txt requires.txt)
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
+requirement names."
+  ;; This is a very incomplete parser, which job is to select the non-optional
+  ;; dependencies and strip them out of any version information.
+  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
+  ;; library and the requirements grammar defined by PEP-0508
+  ;; (https://www.python.org/dev/peps/pep-0508/).
+
+  (define (comment? line)
+    ;; Return #t if the given LINE is a comment, #f otherwise.
+    (eq? (string-ref (string-trim line) 0) #\#))
+
+  (define (section-header? line)
+    ;; Return #t if the given LINE is a section header, #f otherwise.
+    (let ((trimmed-line (string-trim line)))
+      (and (not (string-null? trimmed-line))
+           (eq? (string-ref trimmed-line 0) #\[))))
+
+  (call-with-input-file requires.txt
+    (lambda (port)
+      (let loop ((result '()))
+        (let ((line (read-line port)))
+          ;; Stop when a section is encountered, as sections contains optional
+          ;; (extra) requirements.  Non-optional requirements must appear
+          ;; before any section is defined.
+          (if (or (eof-object? line) (section-header? line))
+              (reverse result)
+              (cond
+               ((or (string-null? line) (comment? line))
+                (loop result))
+               (else
+                (loop (cons (clean-requirement line)
+                            result))))))))))
+
 (define (guess-requirements source-url wheel-url tarball)
   "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
 of the required packages specified in the requirements.txt file.  TARBALL will
@@ -139,34 +183,6 @@ be extracted in a temporary directory."
 cannot determine package dependencies"))
           #f)))))
 
-  (define (clean-requirement s)
-    ;; Given a requirement LINE, as can be found in a Python requirements.txt
-    ;; file, remove everything other than the actual name of the required
-    ;; package, and return it.
-    (string-take s
-      (or (string-index s (lambda (chr) (member chr '(#\space #\> #\= #\<))))
-          (string-length s))))
-
-  (define (comment? line)
-    ;; Return #t if the given LINE is a comment, #f otherwise.
-    (eq? (string-ref (string-trim line) 0) #\#))
-
-  (define (read-requirements requirements-file)
-    ;; Given REQUIREMENTS-FILE, a Python requirements.txt file, return a list
-    ;; of name/variable pairs describing the requirements.
-    (call-with-input-file requirements-file
-      (lambda (port)
-        (let loop ((result '()))
-          (let ((line (read-line port)))
-            (if (eof-object? line)
-                result
-                (cond
-                 ((or (string-null? line) (comment? line))
-                  (loop result))
-                 (else
-                  (loop (cons (clean-requirement line)
-                              result))))))))))
-
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
     ;; requirements.
@@ -212,7 +228,7 @@ cannot determine package dependencies"))
                                               (current-output-port (%make-void-port "rw+")))
                                  (system* "tar" "xf" tarball "-C" dir requires.txt))))
                (if (zero? exit-code)
-                   (read-requirements (string-append dir "/" requires.txt))
+                   (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 335be42644..e4b7142311 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -60,6 +60,15 @@ bar
 baz > 13.37
 ")
 
+(define test-requires-with-sections "\
+# A comment
+foo ~= 3
+bar != 2
+
+[test]
+pytest (>=2.5.0)
+")
+
 (define test-metadata
   "{
   \"run_requires\": [
@@ -99,6 +108,12 @@ baz > 13.37
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "parse-requires.txt, with sections"
+  '("foo" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires-with-sections)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -110,7 +125,8 @@ baz > 13.37
                   (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
-                  (system* "tar" "czvf" file-name "foo-1.0.0/")
+                  (parameterize ((current-output-port (%make-void-port "rw+")))
+                    (system* "tar" "czvf" file-name "foo-1.0.0/"))
                   (delete-file-recursively "foo-1.0.0")
                   (set! test-source-hash
                     (call-with-input-file file-name port-sha256))))
@@ -161,7 +177,8 @@ baz > 13.37
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
                      (display test-requires.txt)))
-                 (system* "tar" "czvf" file-name "foo-1.0.0/")
+                (parameterize ((current-output-port (%make-void-port "rw+")))
+                  (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
                  (set! test-source-hash
                        (call-with-input-file file-name port-sha256))))
-- 
2.21.0
From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
specifications.

The previous solution was fragile and could leave unwanted characters in a
requirement name, such as '[' or ']'.

Partially fixes issue #33047 (see:

* guix/import/pypi.scm (use-modules): Export SPECIFICATION->REQUIREMENT-NAME
(%requirement-name-regexp): New variable.
(clean-requirement): Rename to...
(specification->requirement-name): this, which now uses
%requirement-name-regexp to select the requirement name from the requirement
specification.
(parse-requires.txt): Adapt.
---
guix/import/pypi.scm | 43 ++++++++++++++++++++++++++++++++++---------
tests/pypi.scm | 12 ++++++++++++
2 files changed, 46 insertions(+), 9 deletions(-)

Toggle diff (107 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 91e987e9f1..efb5939c78 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -48,6 +48,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
             pypi->guix-package
@@ -118,13 +119,37 @@ package definition."
     ((package-inputs ...)
      `((propagated-inputs (,'quasiquote ,package-inputs))))))
 
-(define (clean-requirement s)
-  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
-  ;; file, remove everything other than the actual name of the required
-  ;; package, and return it.
-  (string-take s (or (string-index s (lambda (chr)
-                                       (member chr '(#\space #\> #\= #\<))))
-                     (string-length s))))
+(define %requirement-name-regexp
+  ;; Regexp to match the requirement name in a requirement specification.
+
+  ;; Some grammar, taken from PEP-0508 (see:
+  ;; https://www.python.org/dev/peps/pep-0508/).
+
+  ;; The unified rule can be expressed as:
+  ;; specification = wsp* ( url_req | name_req ) wsp*
+
+  ;; where url_req is:
+  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
+
+  ;; and where name_req is:
+  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
+
+  ;; Thus, we need only matching NAME, which is expressed as:
+  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
+  ;; identifier    = letterOrDigit identifier_end*
+  ;; name          = identifier
+  (let* ((letter-or-digit "[A-Za-z0-9]")
+         (identifier-end (string-append "(" letter-or-digit "|"
+                                        "[-_.]*" letter-or-digit ")"))
+         (identifier (string-append "^" letter-or-digit identifier-end "*"))
+         (name identifier))
+    (make-regexp name)))
+
+(define (specification->requirement-name spec)
+  "Given a specification SPEC, return the requirement name."
+  (match:substring
+   (or (regexp-exec %requirement-name-regexp spec)
+       (error (G_ "Could not extract requirement name in spec:") spec))))
 
 (define (parse-requires.txt requires.txt)
   "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
@@ -158,7 +183,7 @@ requirement names."
                ((or (string-null? line) (comment? line))
                 (loop result))
                (else
-                (loop (cons (clean-requirement line)
+                (loop (cons (specification->requirement-name line)
                             result))))))))))
 
 (define (guess-requirements source-url wheel-url tarball)
@@ -200,7 +225,7 @@ cannot determine package dependencies"))
                                             (hash-ref (list-ref run_requires 0)
                                                        "requires")
                                             '())))
-                     (map clean-requirement requirements)))))
+                     (map specification->requirement-name requirements)))))
              (lambda ()
                (delete-file json-file)
                (rmdir dirname))))))
diff --git a/tests/pypi.scm b/tests/pypi.scm
index e4b7142311..82d6bba8dd 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -55,6 +55,14 @@
 (define test-source-hash
   "")
 
+(define test-specifications
+  '("Fizzy [foo, bar]"
+    "PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1"
+    "SomethingWithMarker[foo]>1.0;python_version<\"2.7\""
+    "requests [security,tests] >= 2.8.1, == 2.8.* ; python_version < \"2.7\""
+    "pip @ https://github.com/pypa/pip/archive/1.3.1.zip#\
+sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
+
 (define test-requires.txt "\
 bar
 baz > 13.37
@@ -108,6 +116,10 @@ pytest (>=2.5.0)
                     (uri (list "https://bitheap.org/cram/cram-0.7.tar.gz"
                                (pypi-uri "cram" "0.7"))))))))
 
+(test-equal "specification->requirement-name"
+  '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
+  (map specification->requirement-name test-specifications))
+
 (test-equal "parse-requires.txt, with sections"
   '("foo" "bar")
   (mock ((ice-9 ports) call-with-input-file
-- 
2.21.0
From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:01 -0400
Subject: [PATCH 4/9] import: pypi: Deduplicate requirements.

* guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.
---
guix/import/pypi.scm | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

Toggle diff (19 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index efb5939c78..a90be67bb0 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -178,7 +178,11 @@ requirement names."
           ;; (extra) requirements.  Non-optional requirements must appear
           ;; before any section is defined.
           (if (or (eof-object? line) (section-header? line))
-              (reverse result)
+              ;; Duplicates can occur, since the same requirement can be
+              ;; listed multiple times with different conditional markers, e.g.
+              ;; pytest >= 3 ; python_version >= "3.3"
+              ;; pytest < 3 ; python_version < "3.3"
+              (reverse (delete-duplicates result))
               (cond
                ((or (string-null? line) (comment? line))
                 (loop result))
-- 
2.21.0
From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:02 -0400
Subject: [PATCH 5/9] import: pypi: Support more types of archives.

This change enables the PyPI importer to look for requirements in a source
archive of a different type than "tar.gz" or "tar.bz2".

* guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
[archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
archive is supported or not.
[guess-requirements-from-source]: Adapt to use the new method, and use unzip
to extract ZIP archives.
(guess-requirements): Rename the TARBALL argument to ARCHIVE, to denote the
archive format is no longer bound specifically to the Tar format.
---
guix/import/pypi.scm | 47 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 23 deletions(-)

Toggle diff (89 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a90be67bb0..8e93653717 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -190,27 +190,24 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
-(define (guess-requirements source-url wheel-url tarball)
-  "Given SOURCE-URL, WHEEL-URL and a TARBALL of the package, return a list
-of the required packages specified in the requirements.txt file.  TARBALL will
+(define (guess-requirements source-url wheel-url archive)
+  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (tarball-directory url)
-    ;; Given the URL of the package's tarball, return the name of the directory
+  (define (archive-root-directory url)
+    ;; Given the URL of the package's archive, return the name of the directory
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
-    ;; TODO: Support more archive formats.
-    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
-      (cond
-       ((string-suffix? ".tar.gz" basename)
-        (string-drop-right basename 7))
-       ((string-suffix? ".tar.bz2" basename)
-        (string-drop-right basename 8))
-       (else
+    (if (compressed-file? url)
+        (let ((root-directory (file-sans-extension (basename url))))
+          (if (string=? "tar" (file-extension root-directory))
+              (file-sans-extension root-directory)
+              root-directory))
         (begin
-          (warning (G_ "Unsupported archive format: \
-cannot determine package dependencies"))
-          #f)))))
+          (warning (G_ "Unsupported archive format (~a): \
+cannot determine package dependencies") (file-extension url))
+          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -246,16 +243,20 @@ cannot determine package dependencies"))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (tarball-directory source-url)))
+    (let ((dirname (archive-root-directory source-url))
+          (extension (file-extension source-url)))
       (if (string? dirname)
           (call-with-temporary-directory
            (lambda (dir)
              (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
                     (requires.txt (string-append dirname "/" pypi-name
                                                  ".egg-info" "/requires.txt"))
-                    (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
-                                              (current-output-port (%make-void-port "rw+")))
-                                 (system* "tar" "xf" tarball "-C" dir requires.txt))))
+                    (exit-code
+                     (parameterize ((current-error-port (%make-void-port "rw+"))
+                                    (current-output-port (%make-void-port "rw+")))
+                       (if (string=? "zip" extension)
+                           (system* "unzip" archive "-d" dir requires.txt)
+                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
                (if (zero? exit-code)
                    (parse-requires.txt (string-append dir "/" requires.txt))
                    (begin
@@ -271,13 +272,13 @@ cannot determine package dependencies"))
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
-(define (compute-inputs source-url wheel-url tarball)
-  "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
+(define (compute-inputs source-url wheel-url archive)
+  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
 name/variable pairs describing the required inputs of this package.  Also
 return the unaltered list of upstream dependency names."
   (let ((dependencies
          (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url tarball))))
+                 (guess-requirements source-url wheel-url archive))))
     (values (sort
              (map (lambda (input)
                     (let ((guix-name (python->package-name input)))
-- 
2.21.0
From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 00:26:03 -0400
Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
metadata.json.

With newer Wheel releases, there is no more metadata.json file; the METADATA
file should be used instead (see: https://github.com/pypa/wheel/issues/195).

This change updates our PyPI importer so that it uses the later.

* guix/import/pypi.scm (define-module): Remove unnecessary modules and export
the PARSE-WHEEL-METADATA method.
(parse-wheel-metadata): Add method.
(guess-requirements): Use it.
* tests/pypi.scm (test-metadata): Test it.
---
guix/import/pypi.scm | 66 +++++++++++++++++++++++++++++---------------
tests/pypi.scm | 60 ++++++++++++++++++++++++++++++----------
2 files changed, 89 insertions(+), 37 deletions(-)

Toggle diff (220 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 8e93653717..c520213b6a 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -21,9 +21,7 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import pypi)
-  #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 pretty-print)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
@@ -31,9 +29,6 @@
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
-  #:use-module (rnrs bytevectors)
-  #:use-module (json)
-  #:use-module (web uri)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module ((guix build utils)
@@ -48,6 +43,7 @@
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system python)
   #:export (parse-requires.txt
+            parse-wheel-metadata
             specification->requirement-name
             guix-package->pypi-name
             pypi-recursive-import
@@ -190,6 +186,37 @@ requirement names."
                 (loop (cons (specification->requirement-name line)
                             result))))))))))
 
+(define (parse-wheel-metadata metadata)
+  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  ;; METADATA is a RFC-2822-like, header based file.
+
+  (define (requires-dist-header? line)
+    ;; Return #t if the given LINE is a Requires-Dist header.
+    (regexp-match? (string-match "^Requires-Dist: " line)))
+
+  (define (requires-dist-value line)
+    (string-drop line (string-length "Requires-Dist: ")))
+
+  (define (extra? line)
+    ;; Return #t if the given LINE is an "extra" requirement.
+    (regexp-match? (string-match "extra == " line)))
+
+  (call-with-input-file metadata
+    (lambda (port)
+      (let loop ((requirements '()))
+        (let ((line (read-line port)))
+          ;; Stop at the first 'Provides-Extra' section: the non-optional
+          ;; requirements appear before the optional ones.
+          (if (eof-object? line)
+              (reverse (delete-duplicates requirements))
+              (cond
+               ((and (requires-dist-header? line) (not (extra? line)))
+                (loop (cons (specification->requirement-name
+                             (requires-dist-value line))
+                            requirements)))
+               (else
+                (loop requirements)))))))))
+
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
@@ -211,25 +238,18 @@ cannot determine package dependencies") (file-extension url))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
-    ;; requirements.
+    ;; requirements, or #f if the metadata file contained therein couldn't be
+    ;; extracted.
     (let* ((dirname (wheel-url->extracted-directory wheel-url))
-           (json-file (string-append dirname "/metadata.json")))
-      (and (zero? (system* "unzip" "-q" wheel-archive json-file))
-           (dynamic-wind
-             (const #t)
-             (lambda ()
-               (call-with-input-file json-file
-                 (lambda (port)
-                   (let* ((metadata (json->scm port))
-                          (run_requires (hash-ref metadata "run_requires"))
-                          (requirements (if run_requires
-                                            (hash-ref (list-ref run_requires 0)
-                                                       "requires")
-                                            '())))
-                     (map specification->requirement-name requirements)))))
-             (lambda ()
-               (delete-file json-file)
-               (rmdir dirname))))))
+           (metadata (string-append dirname "/METADATA")))
+      (call-with-temporary-directory
+       (lambda (dir)
+         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+             (parse-wheel-metadata (string-append dir "/" metadata))
+             (begin
+               (warning
+                (G_ "Failed to extract file: ~a from wheel.~%") metadata)
+               #f))))))
 
   (define (guess-requirements-from-wheel)
     ;; Return the package's requirements using the wheel, or #f if an error
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 82d6bba8dd..ca8cb5f6de 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -21,6 +21,7 @@
   #:use-module (guix import pypi)
   #:use-module (guix base32)
   #:use-module (gcrypt hash)
+  #:use-module (guix memoization)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
   #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
@@ -77,17 +78,33 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
-(define test-metadata
-  "{
-  \"run_requires\": [
-    {
-      \"requires\": [
-        \"bar\",
-        \"baz (>13.37)\"
-      ]
-    }
-  ]
-}")
+(define test-metadata "\
+Classifier: Programming Language :: Python :: 3.7
+Requires-Dist: baz ~= 3
+Requires-Dist: bar != 2
+Provides-Extra: test
+pytest (>=2.5.0)
+")
+
+(define test-metadata-with-extras "
+Classifier: Programming Language :: Python :: 3.7
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Requires-Dist: wrapt (<2,>=1)
+Requires-Dist: bar
+
+Provides-Extra: dev
+Requires-Dist: tox ; extra == 'dev'
+Requires-Dist: bumpversion (<1) ; extra == 'dev'
+")
+
+;;; Provides-Extra can appear before Requires-Dist.
+(define test-metadata-with-extras-jedi "\
+Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
+Provides-Extra: testing
+Requires-Dist: parso (>=0.3.0)
+Provides-Extra: testing
+Requires-Dist: pytest (>=3.1.0); extra == 'testing'
+")
 
 (test-begin "pypi")
 
@@ -126,6 +143,18 @@ pytest (>=2.5.0)
          call-with-input-string)
         (parse-requires.txt test-requires-with-sections)))
 
+(test-equal "parse-wheel-metadata, with extras"
+  '("wrapt" "bar")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras)))
+
+(test-equal "parse-wheel-metadata, with extras - Jedi"
+  '("parso")
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-wheel-metadata test-metadata-with-extras-jedi)))
+
 (test-assert "pypi->guix-package"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
@@ -188,7 +217,7 @@ pytest (>=2.5.0)
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
                    (lambda ()
-                     (display test-requires.txt)))
+                     (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
                  (delete-file-recursively "foo-1.0.0")
@@ -197,13 +226,13 @@ pytest (>=2.5.0)
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
                (begin
                  (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/metadata.json"
+                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
                    (lambda ()
                      (display test-metadata)))
                  (let ((zip-file (string-append file-name ".zip")))
                    ;; zip always adds a "zip" extension to the file it creates,
                    ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/metadata.json")
+                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
                    (rename-file zip-file file-name))
                  (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
@@ -215,6 +244,9 @@ pytest (>=2.5.0)
                             (string-length test-json)))
                    ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
                    (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
               (match (pypi->guix-package "foo")
                 (('package
                    ('name "python-foo")
-- 
2.21.0
From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Thu, 28 Mar 2019 23:12:26 -0400
Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
native-inputs.

* guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
(test-section?): New predicate.
(parse-requires.txt): Collect the optional test inputs, and return them as the
second element of the returned list.
(parse-wheel-metadata): Likewise.
(guess-requirements): Adapt, and hide unzip output.
(make-pypi-sexp): Likewise, and include the test inputs requirements as native
inputs in the returned package expression.

* tests/pypi.scm (test-requires.txt): Include a test section in the
test-requires.txt data.
(test-requires.txt-beaker): New variable.
("parse-requires.txt"): Adapt.
("parse-requires.txt - Beaker"): New test.
("parse-wheel-metadata, with extras"): Adapt.
("parse-wheel-metadata, with extras - Jedi"): Adapt.
("pypi->guix-package, no wheel"): Re-indent, and add the expected
native-inputs.
("pypi->guix-package, wheels"): Likewise.
("pypi->guix-package, no usable requirement file."): New test.
---
guix/import/pypi.scm | 195 ++++++++++++++++++++++++++++---------------
tests/pypi.scm | 123 ++++++++++++++++++++-------
2 files changed, 222 insertions(+), 96 deletions(-)

Toggle diff (506 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index c520213b6a..099768f0c8 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@
   #:use-module (ice-9 receive)
   #:use-module ((ice-9 rdelim) #:select (read-line))
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (srfi srfi-35)
@@ -106,14 +108,15 @@ package on PyPI."
     ((name version _ ...)
      (string-append name "-" version ".dist-info"))))
 
-(define (maybe-inputs package-inputs)
+(define (maybe-inputs package-inputs input-type)
   "Given a list of PACKAGE-INPUTS, tries to generate the 'inputs' field of a
-package definition."
+package definition.  INPUT-TYPE, a symbol, is used to populate the name of
+the input field."
   (match package-inputs
     (()
      '())
     ((package-inputs ...)
-     `((propagated-inputs (,'quasiquote ,package-inputs))))))
+     `((,input-type (,'quasiquote ,package-inputs))))))
 
 (define %requirement-name-regexp
   ;; Regexp to match the requirement name in a requirement specification.
@@ -147,11 +150,21 @@ package definition."
    (or (regexp-exec %requirement-name-regexp spec)
        (error (G_ "Could not extract requirement name in spec:") spec))))
 
+(define (test-section? name)
+  "Return #t if the section name contains 'test' or 'dev'."
+  (any (cut string-contains-ci name <>)
+       '("test" "dev")))
+
 (define (parse-requires.txt requires.txt)
-  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
-requirement names."
-  ;; This is a very incomplete parser, which job is to select the non-optional
-  ;; dependencies and strip them out of any version information.
+  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the
+second the optional test dependencies.  Note that currently, optional,
+non-test dependencies are omitted since these can be difficult or expensive to
+satisfy."
+
+  ;; This is a very incomplete parser, which job is to read in the requirement
+  ;; specification lines, and strip them out of any version information.
   ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
   ;; library and the requirements grammar defined by PEP-0508
   ;; (https://www.python.org/dev/peps/pep-0508/).
@@ -168,57 +181,89 @@ requirement names."
 
   (call-with-input-file requires.txt
     (lambda (port)
-      (let loop ((result '()))
+      (let loop ((required-deps '())
+                 (test-deps '())
+                 (inside-test-section? #f)
+                 (optional? #f))
         (let ((line (read-line port)))
-          ;; Stop when a section is encountered, as sections contains optional
-          ;; (extra) requirements.  Non-optional requirements must appear
-          ;; before any section is defined.
-          (if (or (eof-object? line) (section-header? line))
+          (if (eof-object? line)
               ;; Duplicates can occur, since the same requirement can be
               ;; listed multiple times with different conditional markers, e.g.
               ;; pytest >= 3 ; python_version >= "3.3"
               ;; pytest < 3 ; python_version < "3.3"
-              (reverse (delete-duplicates result))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((or (string-null? line) (comment? line))
-                (loop result))
-               (else
+                (loop required-deps test-deps inside-test-section? optional?))
+               ((section-header? line)
+                ;; Encountering a section means that all the requirements
+                ;; listed below are optional. Since we want to pick only the
+                ;; test dependencies from the optional dependencies, we must
+                ;; track those separately.
+                (loop required-deps test-deps (test-section? line) #t))
+               (inside-test-section?
+                (loop required-deps
+                      (cons (specification->requirement-name line)
+                            test-deps)
+                      inside-test-section? optional?))
+               ((not optional?)
                 (loop (cons (specification->requirement-name line)
-                            result))))))))))
+                            required-deps)
+                      test-deps inside-test-section? optional?))
+               (optional?
+                ;; Skip optional items.
+                (loop required-deps test-deps inside-test-section? optional?))
+               (else
+                (warning (G_ "parse-requires.txt reached an unexpected \
+condition on line ~a~%") line)))))))))
 
 (define (parse-wheel-metadata metadata)
-  "Given METADATA, a Wheel metadata file, return a list of requirement names."
+  "Given METADATA, a Wheel metadata file, return a pair of requirements.
+
+The first element of the pair contains the required dependencies while the second the optional
+test dependencies.  Note that currently, optional, non-test dependencies are
+omitted since these can be difficult or expensive to satisfy."
   ;; METADATA is a RFC-2822-like, header based file.
 
   (define (requires-dist-header? line)
     ;; Return #t if the given LINE is a Requires-Dist header.
-    (regexp-match? (string-match "^Requires-Dist: " line)))
+    (string-match "^Requires-Dist: " line))
 
   (define (requires-dist-value line)
     (string-drop line (string-length "Requires-Dist: ")))
 
   (define (extra? line)
     ;; Return #t if the given LINE is an "extra" requirement.
-    (regexp-match? (string-match "extra == " line)))
+    (string-match "extra == '(.*)'" line))
+
+  (define (test-requirement? line)
+    (let ((extra-label (match:substring (extra? line) 1)))
+      (and extra-label (test-section? extra-label))))
 
   (call-with-input-file metadata
     (lambda (port)
-      (let loop ((requirements '()))
+      (let loop ((required-deps '())
+                 (test-deps '()))
         (let ((line (read-line port)))
-          ;; Stop at the first 'Provides-Extra' section: the non-optional
-          ;; requirements appear before the optional ones.
           (if (eof-object? line)
-              (reverse (delete-duplicates requirements))
+              (map (compose reverse delete-duplicates)
+                   (list required-deps test-deps))
               (cond
                ((and (requires-dist-header? line) (not (extra? line)))
                 (loop (cons (specification->requirement-name
                              (requires-dist-value line))
-                            requirements)))
+                            required-deps)
+                      test-deps))
+               ((and (requires-dist-header? line) (test-requirement? line))
+                (loop required-deps
+                      (cons (specification->requirement-name (requires-dist-value line))
+                            test-deps)))
                (else
-                (loop requirements)))))))))
+                (loop required-deps test-deps))))))))) ;skip line
 
 (define (guess-requirements source-url wheel-url archive)
-  "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
+  "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
@@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
            (metadata (string-append dirname "/METADATA")))
       (call-with-temporary-directory
        (lambda (dir)
-         (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
+         (if (zero?
+              (parameterize ((current-error-port (%make-void-port "rw+"))
+                             (current-output-port (%make-void-port "rw+")))
+                (system* "unzip" wheel-archive "-d" dir metadata)))
              (parse-wheel-metadata (string-append dir "/" metadata))
              (begin
                (warning
@@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
                      (warning
                       (G_ "Failed to extract file: ~a from source.~%")
                       requires.txt)
-                     '())))))
-          '())))
+                     (list '() '()))))))
+          (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
   ;; reading the "requires.txt" from the egg-info directory from the source
-  ;; tarball.
+  ;; archive.
   (or (guess-requirements-from-wheel)
       (guess-requirements-from-source)))
 
 (define (compute-inputs source-url wheel-url archive)
-  "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
-name/variable pairs describing the required inputs of this package.  Also
+  "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
+a pair of lists, each consisting of a list of name/variable pairs, for the
+propagated inputs and the native inputs, respectively.  Also
 return the unaltered list of upstream dependency names."
-  (let ((dependencies
-         (remove (cut string=? "argparse" <>)
-                 (guess-requirements source-url wheel-url archive))))
-    (values (sort
-             (map (lambda (input)
-                    (let ((guix-name (python->package-name input)))
-                      (list guix-name (list 'unquote (string->symbol guix-name)))))
-                  dependencies)
-             (lambda args
-               (match args
-                 (((a _ ...) (b _ ...))
-                  (string-ci<? a b)))))
-            dependencies)))
+
+  (define (strip-argparse deps)
+    (remove (cut string=? "argparse" <>) deps))
+
+  (define (requirement->package-name/sort deps)
+    (sort
+     (map (lambda (input)
+            (let ((guix-name (python->package-name input)))
+              (list guix-name (list 'unquote (string->symbol guix-name)))))
+          deps)
+     (lambda args
+       (match args
+         (((a _ ...) (b _ ...))
+          (string-ci<? a b))))))
+
+  (define process-requirements
+    (compose requirement->package-name/sort strip-argparse))
+
+  (let ((dependencies (guess-requirements source-url wheel-url archive)))
+    (values (map process-requirements dependencies)
+            (concatenate dependencies))))
 
 (define (make-pypi-sexp name version source-url wheel-url home-page synopsis
                         description license)
@@ -317,29 +374,33 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch source-url temp)
-          (receive (input-package-names upstream-dependency-names)
+          (receive (guix-dependencies upstream-dependencies)
               (compute-inputs source-url wheel-url temp)
-            (values
-             `(package
-                (name ,(python->package-name name))
-                (version ,version)
-                (source (origin
-                          (method url-fetch)
-
-                          ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                          ;; cases in NAME, for instance, as is the case with
-                          ;; "uwsgi".  In that case, fall back to a full URL.
-                          (uri (pypi-uri ,(string-downcase name) version))
-                          (sha256
-                           (base32
-                            ,(guix-hash-url temp)))))
-                (build-system python-build-system)
-                ,@(maybe-inputs input-package-names)
-                (home-page ,home-page)
-                (synopsis ,synopsis)
-                (description ,description)
-                (license ,(license->symbol license)))
-             upstream-dependency-names))))))
+            (match guix-dependencies
+              ((required-inputs test-inputs)
+               (values
+                `(package
+                   (name ,(python->package-name name))
+                   (version ,version)
+                   (source (origin
+                             (method url-fetch)
+                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
+                             ;; cases in NAME, for instance, as is the case with
+                             ;; "uwsgi".  In that case, fall back to a full URL.
+                             (uri (pypi-uri ,(string-downcase name) version))
+                             (sha256
+                              (base32
+                               ,(guix-hash-url temp)))))
+                   (build-system python-build-system)
+                   ,@(maybe-inputs required-inputs 'propagated-inputs)
+                   ,@(maybe-inputs test-inputs 'native-inputs)
+                   (home-page ,home-page)
+                   (synopsis ,synopsis)
+                   (description ,description)
+                   (license ,(license->symbol license)))
+                ;; Flatten the nested lists and return the upstream
+                ;; dependencies.
+                upstream-dependencies))))))))
 
 (define pypi->guix-package
   (memoize
diff --git a/tests/pypi.scm b/tests/pypi.scm
index ca8cb5f6de..aa08e2cb54 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2014 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -65,11 +66,6 @@
 sha1=da9234ee9982d4bbb3c72346a6de940a148ea686"))
 
 (define test-requires.txt "\
-bar
-baz > 13.37
-")
-
-(define test-requires-with-sections "\
 # A comment
 foo ~= 3
 bar != 2
@@ -78,12 +74,25 @@ bar != 2
 pytest (>=2.5.0)
 ")
 
+;; Beaker contains only optional dependencies.
+(define test-requires.txt-beaker "\
+[crypto]
+pycryptopp>=0.5.12
+
+[cryptography]
+cryptography
+
+[testsuite]
+Mock
+coverage
+")
+
 (define test-metadata "\
 Classifier: Programming Language :: Python :: 3.7
 Requires-Dist: baz ~= 3
 Requires-Dist: bar != 2
 Provides-Extra: test
-pytest (>=2.5.0)
+Requires-Dist: pytest (>=2.5.0) ; extra == 'test'
 ")
 
 (define test-metadata-with-extras "
@@ -137,25 +146,31 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
   '("Fizzy" "PickyThing" "SomethingWithMarker" "requests" "pip")
   (map specification->requirement-name test-specifications))
 
-(test-equal "parse-requires.txt, with sections"
-  '("foo" "bar")
+(test-equal "parse-requires.txt"
+  (list '("foo" "bar") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
-        (parse-requires.txt test-requires-with-sections)))
+        (parse-requires.txt test-requires.txt)))
+
+(test-equal "parse-requires.txt - Beaker"
+  (list '() '("Mock" "coverage"))
+  (mock ((ice-9 ports) call-with-input-file
+         call-with-input-string)
+        (parse-requires.txt test-requires.txt-beaker)))
 
 (test-equal "parse-wheel-metadata, with extras"
-  '("wrapt" "bar")
+  (list '("wrapt" "bar") '("tox" "bumpversion"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras)))
 
 (test-equal "parse-wheel-metadata, with extras - Jedi"
-  '("parso")
+  (list '("parso") '("pytest"))
   (mock ((ice-9 ports) call-with-input-file
          call-with-input-string)
         (parse-wheel-metadata test-metadata-with-extras-jedi)))
 
-(test-assert "pypi->guix-package"
+(test-assert "pypi->guix-package, no wheel"
   ;; Replace network resources with sample data.
     (mock ((guix import utils) url-fetch
            (lambda (url file-name)
@@ -195,7 +210,10 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                      ('propagated-inputs
                       ('quasiquote
                        (("python-bar" ('unquote 'python-bar))
-                        ("python-baz" ('unquote 'python-baz)))))
+                        ("python-foo" ('unquote 'python-foo)))))
+                     ('native-inputs
+                      ('quasiquote
+                       (("python-pytest" ('unquote 'python-pytest)))))
                      ('home-page "http://example.com")
                      ('synopsis "summary")
                      ('description "summary")
@@ -216,25 +234,25 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
               (begin
                 (mkdir-p "foo-1.0.0/foo.egg-info/")
                 (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
-                   (lambda ()
-                     (display "wrong data to make sure we're testing wheels ")))
+                  (lambda ()
+                    (display "wrong data to make sure we're testing wheels ")))
                 (parameterize ((current-output-port (%make-void-port "rw+")))
                   (system* "tar" "czvf" file-name "foo-1.0.0/"))
-                 (delete-file-recursively "foo-1.0.0")
-                 (set! test-source-hash
-                       (call-with-input-file file-name port-sha256))))
+                (delete-file-recursively "foo-1.0.0")
+                (set! test-source-hash
+                  (call-with-input-file file-name port-sha256))))
              ("https://example.com/foo-1.0.0-py2.py3-none-any.whl"
-               (begin
-                 (mkdir "foo-1.0.0.dist-info")
-                 (with-output-to-file "foo-1.0.0.dist-info/METADATA"
-                   (lambda ()
-                     (display test-metadata)))
-                 (let ((zip-file (string-append file-name ".zip")))
-                   ;; zip always adds a "zip" extension to the file it creates,
-                   ;; so we need to rename it.
-                   (system* "zip" zip-file "foo-1.0.0.dist-info/METADATA")
-                   (rename-file zip-file file-name))
-                 (delete-file-recursively "foo-1.0.0.dist-info")))
+              (begin
+                (mkdir "foo-1.0.0.dist-info")
+                (with-output-to-file "foo-1.0.0.dist-info/METADATA"
+                  (lambda ()
+                    (display test-metadata)))
+                (let ((zip-file (string-append file-name ".zip")))
+                  ;; zip always adds a "zip" extension to the file it creates,
+                  ;; so we need to rename it.
+                  (system* "zip" "-q" zip-file "foo-1.0.0.dist-info/METADATA")
+                  (rename-file zip-file file-name))
+                (delete-file-recursively "foo-1.0.0.dist-info")))
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -262,6 +280,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                     ('quasiquote
                      (("python-bar" ('unquote 'python-bar))
                       ("python-baz" ('unquote 'python-baz)))))
+                   ('native-inputs
+                    ('quasiquote
+                     (("python-pytest" ('unquote 'python-pytest)))))
                    ('home-page "http://example.com")
                    ('synopsis "summary")
                    ('description "summary")
@@ -272,4 +293,48 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                 (x
                  (pk 'fail x #f))))))
 
+(test-assert "pypi->guix-package, no usable requirement file."
+  ;; Replace network resources with sample data.
+  (mock ((guix import utils) url-fetch
+         (lambda (url file-name)
+           (match url
+             ("https://example.com/foo-1.0.0.tar.gz"
+              (set! test-source-hash
+                (call-with-input-file file-name port-sha256))
+              #t)
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+             (_ (error "Unexpected URL: " url)))))
+        (mock ((guix http-client) http-fetch
+               (lambda (url . rest)
+                 (match url
+                   ("https://pypi.org/pypi/foo/json"
+                    (values (open-input-string test-json)
+                            (string-length test-json)))
+                   ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
+                   (_ (error "Unexpected URL: " url)))))
+              ;; Not clearing the memoization cache here would mean returning the value
+              ;; computed in the previous test.
+              (invalidate-memoization! pypi->guix-package)
+              (match (pypi->guix-package "foo")
+                (('package
+                   ('name "python-foo")
+                   ('version "1.0.0")
+                   ('source ('origin
+                              ('method 'url-fetch)
+                              ('uri ('pypi-uri "foo" 'version))
+                              ('sha256
+                               ('base32
+                                (? string? hash)))))
+                   ('build-system 'python-build-system)
+                   ('home-page "http://example.com")
+                   ('synopsis "summary")
+                   ('description "summary")
+                   ('license 'license:lgpl2.0))
+                 (string=? (bytevector->nix-base32-string
+                            test-source-hash)
+                           hash))
+                (x
+                 (pk 'fail x #f)))
+              )))
+
 (test-end "pypi")
-- 
2.21.0
From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 23:13:26 -0400
Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
file.

* guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
(guess-requirements)[archive-root-directory]: Remove procedure.
[guess-requirements-from-wheel]: Re-ident.
[guess-requirements-from-source]: Search for the requires.txt file in the
archive instead of using a static, expected location.
* tests/pypi.scm ("pypi->guix-package, no wheel"): Mock the requires.txt at a
non-standard location to test the new feature.
("pypi->guix-package, no usable requirement file."): Adapt.
---
guix/import/pypi.scm | 65 ++++++++++++++++++--------------------------
tests/pypi.scm | 17 +++++++-----
2 files changed, 37 insertions(+), 45 deletions(-)

Toggle diff (137 lines)
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 099768f0c8..a2ce14b192 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -36,7 +36,8 @@
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
                            . hyphen-package-name->name+version)
-                          find-files))
+                          find-files
+                          invoke))
   #:use-module (guix import utils)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix import json)
@@ -267,19 +268,6 @@ omitted since these can be difficult or expensive to satisfy."
 of the required packages specified in the requirements.txt file.  ARCHIVE will
 be extracted in a temporary directory."
 
-  (define (archive-root-directory url)
-    ;; Given the URL of the package's archive, return the name of the directory
-    ;; that will be created upon decompressing it. If the filetype is not
-    ;; supported, return #f.
-    (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
-        (begin
-          (warning (G_ "Unsupported archive format (~a): \
-cannot determine package dependencies") (file-extension url))
-          #f)))
 
   (define (read-wheel-metadata wheel-archive)
     ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
@@ -305,33 +293,34 @@ cannot determine package dependencies") (file-extension url))
     (call-with-temporary-output-file
      (lambda (temp port)
        (if wheel-url
-         (and (url-fetch wheel-url temp)
-              (read-wheel-metadata temp))
-         #f))))
+           (and (url-fetch wheel-url temp)
+                (read-wheel-metadata temp))
+           #f))))
 
   (define (guess-requirements-from-source)
     ;; Return the package's requirements by guessing them from the source.
-    (let ((dirname (archive-root-directory source-url))
-          (extension (file-extension source-url)))
-      (if (string? dirname)
-          (call-with-temporary-directory
-           (lambda (dir)
-             (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-                    (requires.txt (string-append dirname "/" pypi-name
-                                                 ".egg-info" "/requires.txt"))
-                    (exit-code
-                     (parameterize ((current-error-port (%make-void-port "rw+"))
-                                    (current-output-port (%make-void-port "rw+")))
-                       (if (string=? "zip" extension)
-                           (system* "unzip" archive "-d" dir requires.txt)
-                           (system* "tar" "xf" archive "-C" dir requires.txt)))))
-               (if (zero? exit-code)
-                   (parse-requires.txt (string-append dir "/" requires.txt))
-                   (begin
-                     (warning
-                      (G_ "Failed to extract file: ~a from source.~%")
-                      requires.txt)
-                     (list '() '()))))))
+    (if (compressed-file? source-url)
+        (call-with-temporary-directory
+         (lambda (dir)
+           (parameterize ((current-error-port (%make-void-port "rw+"))
+                          (current-output-port (%make-void-port "rw+")))
+             (if (string=? "zip" (file-extension source-url))
+                 (invoke "unzip" archive "-d" dir)
+                 (invoke "tar" "xf" archive "-C" dir)))
+           (let ((requires.txt-files
+                  (find-files dir (lambda (abs-file-name _)
+		                    (string-match "\\.egg-info/requires.txt$"
+                                                  abs-file-name)))))
+             (if (> (length requires.txt-files) 0)
+                 (begin
+                   (parse-requires.txt (first requires.txt-files)))
+                 (begin (warning (G_ "Cannot guess requirements from source archive:\
+ no requires.txt file found.~%"))
+                        (list '() '()))))))
+        (begin
+          (warning (G_ "Unsupported archive format; \
+cannot determine package dependencies from source archive: ~a~%")
+                   (basename source-url))
           (list '() '()))))
 
   ;; First, try to compute the requirements using the wheel, else, fallback to
diff --git a/tests/pypi.scm b/tests/pypi.scm
index aa08e2cb54..ad188df16c 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -177,8 +177,9 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
              (match url
                ("https://example.com/foo-1.0.0.tar.gz"
                 (begin
-                  (mkdir-p "foo-1.0.0/foo.egg-info/")
-                  (with-output-to-file "foo-1.0.0/foo.egg-info/requires.txt"
+                  ;; Unusual requires.txt location should still be found.
+                  (mkdir-p "foo-1.0.0/src/bizarre.egg-info")
+                  (with-output-to-file "foo-1.0.0/src/bizarre.egg-info/requires.txt"
                     (lambda ()
                       (display test-requires.txt)))
                   (parameterize ((current-output-port (%make-void-port "rw+")))
@@ -299,10 +300,13 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
          (lambda (url file-name)
            (match url
              ("https://example.com/foo-1.0.0.tar.gz"
+              (mkdir-p "foo-1.0.0/foo.egg-info/")
+              (parameterize ((current-output-port (%make-void-port "rw+")))
+                (system* "tar" "czvf" file-name "foo-1.0.0/"))
+              (delete-file-recursively "foo-1.0.0")
               (set! test-source-hash
-                (call-with-input-file file-name port-sha256))
-              #t)
-             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #t)
+                (call-with-input-file file-name port-sha256)))
+             ("https://example.com/foo-1.0.0-py2.py3-none-any.whl" #f)
              (_ (error "Unexpected URL: " url)))))
         (mock ((guix http-client) http-fetch
                (lambda (url . rest)
@@ -334,7 +338,6 @@ Requires-Dist: pytest (>=3.1.0); extra == 'testing'
                             test-source-hash)
                            hash))
                 (x
-                 (pk 'fail x #f)))
-              )))
+                 (pk 'fail x #f))))))
 
 (test-end "pypi")
-- 
2.21.0
From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sat, 30 Mar 2019 20:27:35 -0400
Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
pypi-uri.

Fixes issue: #33046.

* guix/build-system/python.scm (pypi-uri): Update the host URI to
"files.pythonhosted.org".
* guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
the source URL calls for it.
---
guix/build-system/python.scm | 2 +-
guix/import/pypi.scm | 23 ++++++++++++++---------
2 files changed, 15 insertions(+), 10 deletions(-)

Toggle diff (49 lines)
diff --git a/guix/build-system/python.scm b/guix/build-system/python.scm
index b753940bad..e39c06528e 100644
--- a/guix/build-system/python.scm
+++ b/guix/build-system/python.scm
@@ -50,7 +50,7 @@
   "Return a URI string for the Python package hosted on the Python Package
 Index (PyPI) corresponding to NAME and VERSION.  EXTENSION is the file name
 extension, such as '.tar.gz'."
-  (string-append "https://pypi.org/packages/source/"
+  (string-append "https://files.pythonhosted.org/packages/source/"
                  (string-take name 1) "/" name "/"
                  name "-" version extension))
 
diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index a2ce14b192..fecf95d0a7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -371,15 +371,20 @@ VERSION, SOURCE-URL, HOME-PAGE, SYNOPSIS, DESCRIPTION, and LICENSE."
                 `(package
                    (name ,(python->package-name name))
                    (version ,version)
-                   (source (origin
-                             (method url-fetch)
-                             ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
-                             ;; cases in NAME, for instance, as is the case with
-                             ;; "uwsgi".  In that case, fall back to a full URL.
-                             (uri (pypi-uri ,(string-downcase name) version))
-                             (sha256
-                              (base32
-                               ,(guix-hash-url temp)))))
+                   (source
+                    (origin
+                      (method url-fetch)
+                      ;; PyPI URL are case sensitive, but sometimes a project
+                      ;; named using mixed case has a URL using lower case, so
+                      ;; we must work around this inconsistency.  For actual
+                      ;; examples, compare the URLs of the "Deprecated" and
+                      ;; "uWSGI" PyPI packages.
+                      (uri ,(if (string-contains source-url name)
+                                `(pypi-uri ,name version)
+                                `(pypi-uri ,(string-downcase name) version)))
+                      (sha256
+                       (base32
+                        ,(guix-hash-url temp)))))
                    (build-system python-build-system)
                    ,@(maybe-inputs required-inputs 'propagated-inputs)
                    ,@(maybe-inputs test-inputs 'native-inputs)
-- 
2.21.0
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJ9WGpPiQCFQyn/CfEmDkZILmNWIFAlziJ6cACgkQEmDkZILm
NWKgRA//WC0iRN9CECjP4VuWTah5C8jEWI8F9tjZqdpwHu0fwQ5Ip5I6Y1vxw4iv
yCqPSsbLevqNrosl16k6KgIJoj9JKFm6egNKNZIuHSftZC+hC/xaN3z2rPTd0Vys
CkpjSnjP1aJS/Zn5OdMuSIn/W+isbgaq4eQ8cR5sBP2QjFI79w5E+b+Ww3NXH46u
UFrmILhQXjRZ3XOjHHDFychluhU1EHi97e35WUCAn+j8oSIwfsVkVLMPF8YvGVMq
OkOk9q9BKv8eJH0a00edcflu1YwhU5xAV8g9lBxkZjGLO6lDKFTe+aq/03UDQ5eR
vBHGQTMuykTdiVIOeNH7xLV2X12WIO22neEM6Cj+6IMSThqSoZKn99XDuC1rYqCO
1yRPf4TM2wXHwhaOatYeokwb+t3Ozztc2gT5RiZnnKNgYJlatoElJi0XuQj9Qrpc
fh3fE/iccvszQuBF4ZQ0z9FoWQiDXblvZ1uy5/upQVnUMfal4I4td5C58laYef+Z
zGzjw5Z/RWB2ssuztMQKOPrUoohx3Ne/s7QMndzhmBvEbfPqkQ7ROUXjO7Jf3kQl
1A6toSrO7ImsaiUsPJhM8dOqAoMXRvJfiG8viisPclN/aVrAItweiNZRH+oCc6iL
lWYd8Lg+Yn6LUG9HS81bZ0Czi7fdYfY4T3rvctxv3fKcjc7YKZE=
=cn9M
-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote on 20 May 2019 17:05
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87k1elrwrt.fsf@gnu.org
Hello!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (13 lines)
> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
>
>> Hi Maxim,
>>
>> I would very much like to see your improvements to the pypi importer to
>> be merged. Have you been able to separate the independent changes as
>> suggested by Ludo?
>
> I'm thrilled that someone has an interest in this :-)
>
> I took my time, but finally got around to restructure the changes a
> bit. I hope it'll be easier to review this time around!

I’ll let Ricardo comment. For my part, I see that it has tests, and
that’s enough to make me happy. :-)

Thanks for improving the importer!

Ludo’.
M
M
Maxim Cournoyer wrote on 22 May 2019 03:13
(name . Ludovic Courtès)(address . ludo@gnu.org)
87lfyz5m17.fsf@gmail.com
Greetings!

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (20 lines)
> Hello!
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
>>
>>> Hi Maxim,
>>>
>>> I would very much like to see your improvements to the pypi importer to
>>> be merged. Have you been able to separate the independent changes as
>>> suggested by Ludo?
>>
>> I'm thrilled that someone has an interest in this :-)
>>
>> I took my time, but finally got around to restructure the changes a
>> bit. I hope it'll be easier to review this time around!
>
> I’ll let Ricardo comment. For my part, I see that it has tests, and
> that’s enough to make me happy. :-)

OK!

Toggle quote (2 lines)
> Thanks for improving the importer!

My pleasure! One less itch to scratch ;-)

Maxim
R
R
Ricardo Wurmus wrote on 27 May 2019 16:48
Re: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idja7f89cms.fsf@bimsb-sys02.mdc-berlin.net
Hi Maxim,

Toggle quote (6 lines)
> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>
> * guix/import/pypi.scm (guess-requirements): Update comment.
> [guess-requirements-from-source]: Do not attempt to parse the file
> requirements.txt. Streamline logic.

Why remove the handling of the requirements.txt? Is it no longer
popular enough to expect its availability in the source archives?

Please also mention in the commit message that and how you adjusted the
tests. You removed the comments from the example requires.txt — are
comments no longer permitted in these files? If they are, please don’t
include those changes.

--
Ricardo
R
R
Ricardo Wurmus wrote on 27 May 2019 17:11
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj8sus9bkq.fsf@bimsb-sys02.mdc-berlin.net
Hi Maxim,

on to patch number 2!

Toggle quote (17 lines)
> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:00 -0400
> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
> source.
>
> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
> and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COMMENT?
> functions inside the READ-REQUIREMENTS procedure.
> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
> parsing optional requirements.
>
> * tests/pypi.scm (test-requires-with-sections): New variable.
> ("parse-requires.txt, with sections"): New test.
> ("pypi->guix-package"): Mute tar output to stdout.

The commit log does not match the changes. CLEAN-REQUIREMENT is now a
top-level procedure, not a local procedure inside of READ-REQUIREMENTS
as reported in the commit message. Which is correct?

Toggle quote (6 lines)
> + (call-with-input-file requires.txt
> + (lambda (port)
> + (let loop ((result '()))
> + (let ((line (read-line port)))
> + ;; Stop when a section is encountered, as sections contains optional

Should be “contain”.

Toggle quote (12 lines)
> + ;; (extra) requirements. Non-optional requirements must appear
> + ;; before any section is defined.
> + (if (or (eof-object? line) (section-header? line))
> + (reverse result)
> + (cond
> + ((or (string-null? line) (comment? line))
> + (loop result))
> + (else
> + (loop (cons (clean-requirement line)
> + result))))))))))
> +

I think it would be better to use “match” here instead of nested “let”,
“if” and “cond”. At least you can drop the “if” and just use cond.

The loop let and the inner let can be merged.


Toggle quote (5 lines)
> +(define (parse-requires.txt requires.txt)
> + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> +requirement names."
> + ;; This is a very incomplete parser, which job is to select the non-optional

“which” –> “whose”

Toggle quote (5 lines)
> + ;; dependencies and strip them out of any version information.
> + ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
> + ;; library and the requirements grammar defined by PEP-0508
> + ;; (https://www.python.org/dev/peps/pep-0508/).

Let’s remove the sentence starting with “Alternatively…”. We could do
that but we didn’t :)

Toggle quote (7 lines)
> + (define (section-header? line)
> + ;; Return #t if the given LINE is a section header, #f otherwise.
> + (let ((trimmed-line (string-trim line)))
> + (and (not (string-null? trimmed-line))
> + (eq? (string-ref trimmed-line 0) #\[))))
> +

How about using string-prefix? instead? This looks more complicated
than it deserves. You can get rid of string-null? and eq? and string-ref
and all that.

Same here:

Toggle quote (4 lines)
> + (define (comment? line)
> + ;; Return #t if the given LINE is a comment, #f otherwise.
> + (eq? (string-ref (string-trim line) 0) #\#))

I’d just use string-prefix? here.

Toggle quote (8 lines)
> +(define (clean-requirement s)
> + ;; Given a requirement LINE, as can be found in a setuptools requires.txt
> + ;; file, remove everything other than the actual name of the required
> + ;; package, and return it.
> + (string-take s (or (string-index s (lambda (chr)
> + (member chr '(#\space #\> #\= #\<))))
> + (string-length s))))

“string-take” with “string-length” is not very elegant. The char
predicate in string-index could better be a char set:

Toggle snippet (7 lines)
(define (clean-requirement s)
(cond
((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
(else s)))


Toggle quote (2 lines)
> ("pypi->guix-package"): Mute tar output to stdout.

Finally, I think it would be better to keep this separate because it’s
really orthogonal to the other changes in this patch.

What do you think?

--
Ricardo
R
R
Ricardo Wurmus wrote on 27 May 2019 17:54
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj7eabao5c.fsf@bimsb-sys02.mdc-berlin.net
Patch number 3!

Toggle quote (9 lines)
> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
> specifications.
>
> The previous solution was fragile and could leave unwanted characters in a
> requirement name, such as '[' or ']'.

Wouldn’t it be sufficient to add [ and ] to the list of forbidden
characters? The tests pass with this implementation of
clean-requirements:

(define (clean-requirements s)
(cond
((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
(else s)))

Toggle quote (26 lines)
> +(define %requirement-name-regexp
> + ;; Regexp to match the requirement name in a requirement specification.
> +
> + ;; Some grammar, taken from PEP-0508 (see:
> + ;; https://www.python.org/dev/peps/pep-0508/).
> +
> + ;; The unified rule can be expressed as:
> + ;; specification = wsp* ( url_req | name_req ) wsp*
> +
> + ;; where url_req is:
> + ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
> +
> + ;; and where name_req is:
> + ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
> +
> + ;; Thus, we need only matching NAME, which is expressed as:
> + ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
> + ;; identifier = letterOrDigit identifier_end*
> + ;; name = identifier
> + (let* ((letter-or-digit "[A-Za-z0-9]")
> + (identifier-end (string-append "(" letter-or-digit "|"
> + "[-_.]*" letter-or-digit ")"))
> + (identifier (string-append "^" letter-or-digit identifier-end "*"))
> + (name identifier))
> + (make-regexp name)))

This seems a little too complicated. Translating a grammar into a
regexp is probably not a good idea in general. Since we don’t care
about anything other than the name it seems easier to just chop off
the string tail as soon as we find an invalid character.

--
Ricardo
R
R
Ricardo Wurmus wrote on 27 May 2019 17:58
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj5zpvanyg.fsf@bimsb-sys02.mdc-berlin.net
Toggle quote (7 lines)
> From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 4/9] import: pypi: Deduplicate requirements.
>
> * guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.

This looks fine to me, but it is subject to changes that I requested to
the procedure in my comments to an earlier patch.

--
Ricardo
R
R
Ricardo Wurmus wrote on 28 May 2019 12:23
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj36kyand2.fsf@bimsb-sys02.mdc-berlin.net
On to the next:

Toggle quote (8 lines)
> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:02 -0400
> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>
> This change enables the PyPI importer to look for requirements in a source
> archive of a different type than "tar.gz" or "tar.bz2".

Okay.

Toggle quote (4 lines)
> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
> archive is supported or not.

Nitpick: please use “...this.” and leave two spaces between sentences.

Typo: it should be COMPRESSED-FILE?

Toggle quote (3 lines)
> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
> to extract ZIP archives.

s/method/procedure/

Please also mention that “compute-inputs” has been adjusted.

Toggle quote (27 lines)
> - (define (tarball-directory url)
> - ;; Given the URL of the package's tarball, return the name of the directory
> + (define (archive-root-directory url)
> + ;; Given the URL of the package's archive, return the name of the directory
> ;; that will be created upon decompressing it. If the filetype is not
> ;; supported, return #f.
> - ;; TODO: Support more archive formats.
> - (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
> - (cond
> - ((string-suffix? ".tar.gz" basename)
> - (string-drop-right basename 7))
> - ((string-suffix? ".tar.bz2" basename)
> - (string-drop-right basename 8))
> - (else
> + (if (compressed-file? url)
> + (let ((root-directory (file-sans-extension (basename url))))
> + (if (string=? "tar" (file-extension root-directory))
> + (file-sans-extension root-directory)
> + root-directory))
> (begin
> - (warning (G_ "Unsupported archive format: \
> -cannot determine package dependencies"))
> - #f)))))
> + (warning (G_ "Unsupported archive format (~a): \
> +cannot determine package dependencies") (file-extension url))
> + #f)))

I think the double application of file-sans-extension and the
intermediate variable name “root-directory” for something that is a file
is a little confusing, but I don’t have a better proposal (other than to
replace file-extension and file-sans-extension with a match expression).

Toggle quote (25 lines)
> (define (read-wheel-metadata wheel-archive)
> ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> @@ -246,16 +243,20 @@ cannot determine package dependencies"))

> (define (guess-requirements-from-source)
> ;; Return the package's requirements by guessing them from the source.
> - (let ((dirname (tarball-directory source-url)))
> + (let ((dirname (archive-root-directory source-url))
> + (extension (file-extension source-url)))
> (if (string? dirname)
> (call-with-temporary-directory
> (lambda (dir)
> (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
> (requires.txt (string-append dirname "/" pypi-name
> ".egg-info" "/requires.txt"))
> - (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
> - (current-output-port (%make-void-port "rw+")))
> - (system* "tar" "xf" tarball "-C" dir requires.txt))))
> + (exit-code
> + (parameterize ((current-error-port (%make-void-port "rw+"))
> + (current-output-port (%make-void-port "rw+")))
> + (if (string=? "zip" extension)
> + (system* "unzip" archive "-d" dir requires.txt)
> + (system* "tar" "xf" archive "-C" dir requires.txt)))))

I guess this is why I’m not too happy with this: we’re checking in
multiple places if the format is supported but then forget about this
again until the next time we need to do something to the file.

I wonder if we could do better and answer the question just once.

--
Ricardo
R
R
Ricardo Wurmus wrote on 28 May 2019 13:04
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idj1s0ialgz.fsf@bimsb-sys02.mdc-berlin.net
Patch number 6:

Toggle quote (11 lines)
> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 00:26:03 -0400
> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
> metadata.json.

> With newer Wheel releases, there is no more metadata.json file; the METADATA
> file should be used instead (see: https://github.com/pypa/wheel/issues/195).

> This change updates our PyPI importer so that it uses the later.

Typo: should be “latter” instead of “later”.

Toggle quote (3 lines)
> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
> the PARSE-WHEEL-METADATA method.

Please remove the indentation here. Also, please don’t use “method”
(because it’s not); use “procedure” instead.

Toggle quote (2 lines)
> (parse-wheel-metadata): Add method.

Same here.

Toggle quote (11 lines)
> + (define (requires-dist-header? line)
> + ;; Return #t if the given LINE is a Requires-Dist header.
> + (regexp-match? (string-match "^Requires-Dist: " line)))
> +
> + (define (requires-dist-value line)
> + (string-drop line (string-length "Requires-Dist: ")))
> +
> + (define (extra? line)
> + ;; Return #t if the given LINE is an "extra" requirement.
> + (regexp-match? (string-match "extra == " line)))

The use of “regexp-match?” here isn’t strictly necessary as the return
value is true-ish anyway.

Toggle quote (17 lines)
> + (call-with-input-file metadata
> + (lambda (port)
> + (let loop ((requirements '()))
> + (let ((line (read-line port)))
> + ;; Stop at the first 'Provides-Extra' section: the non-optional
> + ;; requirements appear before the optional ones.
> + (if (eof-object? line)
> + (reverse (delete-duplicates requirements))
> + (cond
> + ((and (requires-dist-header? line) (not (extra? line)))
> + (loop (cons (specification->requirement-name
> + (requires-dist-value line))
> + requirements)))
> + (else
> + (loop requirements)))))))))
> +

As before you can simplify the nested let and merge “if” and "cond“.

Toggle quote (33 lines)
> (define (read-wheel-metadata wheel-archive)
> ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> - ;; requirements.
> + ;; requirements, or #f if the metadata file contained therein couldn't be
> + ;; extracted.
> (let* ((dirname (wheel-url->extracted-directory wheel-url))
> - (json-file (string-append dirname "/metadata.json")))
> - (and (zero? (system* "unzip" "-q" wheel-archive json-file))
> - (dynamic-wind
> - (const #t)
> - (lambda ()
> - (call-with-input-file json-file
> - (lambda (port)
> - (let* ((metadata (json->scm port))
> - (run_requires (hash-ref metadata "run_requires"))
> - (requirements (if run_requires
> - (hash-ref (list-ref run_requires 0)
> - "requires")
> - '())))
> - (map specification->requirement-name requirements)))))
> - (lambda ()
> - (delete-file json-file)
> - (rmdir dirname))))))
> + (metadata (string-append dirname "/METADATA")))
> + (call-with-temporary-directory
> + (lambda (dir)
> + (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> + (parse-wheel-metadata (string-append dir "/" metadata))
> + (begin
> + (warning
> + (G_ "Failed to extract file: ~a from wheel.~%") metadata)
> + #f))))))

The old approach took care of removing the unpacked archive no matter
what happened. The new code doesn’t do that.

Toggle quote (3 lines)
> --- a/tests/pypi.scm
> +++ b/tests/pypi.scm

Thanks for the tests!

--
Ricardo
R
R
Ricardo Wurmus wrote on 28 May 2019 15:21
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idjzhn690jv.fsf@bimsb-sys02.mdc-berlin.net
Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

Toggle quote (11 lines)
> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
> native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.

AFAICT parse-requires.txt now returns a list of pairs, but used to
return a plain list before. Is this correct?

Toggle quote (18 lines)
> (define (parse-requires.txt requires.txt)
> - "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> -requirement names."
> - ;; This is a very incomplete parser, which job is to select the non-optional
> - ;; dependencies and strip them out of any version information.
> + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the
> +second the optional test dependencies. Note that currently, optional,
> +non-test dependencies are omitted since these can be difficult or expensive to
> +satisfy."
> +
> + ;; This is a very incomplete parser, which job is to read in the requirement
> + ;; specification lines, and strip them out of any version information.
> ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
> ;; library and the requirements grammar defined by PEP-0508
> ;; (https://www.python.org/dev/peps/pep-0508/).

Does it really return a pair? Or a list of pairs? Or is it a
two-element list of lists?

Toggle quote (21 lines)
> (call-with-input-file requires.txt
> (lambda (port)
> - (let loop ((result '()))
> + (let loop ((required-deps '())
> + (test-deps '())
> + (inside-test-section? #f)
> + (optional? #f))
> (let ((line (read-line port)))
> - ;; Stop when a section is encountered, as sections contains optional
> - ;; (extra) requirements. Non-optional requirements must appear
> - ;; before any section is defined.
> - (if (or (eof-object? line) (section-header? line))
> + (if (eof-object? line)
> ;; Duplicates can occur, since the same requirement can be
> ;; listed multiple times with different conditional markers, e.g.
> ;; pytest >= 3 ; python_version >= "3.3"
> ;; pytest < 3 ; python_version < "3.3"
> - (reverse (delete-duplicates result))
> + (map (compose reverse delete-duplicates)
> + (list required-deps test-deps))

Looks like a list of lists to me. “delete-duplicates” now won’t delete
a name that is in both “required-deps” as well as in “test-deps”. Is
this acceptable?

Personally, I’m not a fan of using data structures for returning
multiple values, because we can simply return multiple values.

Or we could have more than just strings. The meaning of these strings
is provided by the bin into which they are thrown — either
“required-deps” or “test-deps”. It could be an option to collect tagged
values instead and have the caller deal with filtering.

Toggle quote (9 lines)
> (define (parse-wheel-metadata metadata)
> - "Given METADATA, a Wheel metadata file, return a list of requirement names."
> + "Given METADATA, a Wheel metadata file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the second the optional
> +test dependencies. Note that currently, optional, non-test dependencies are
> +omitted since these can be difficult or expensive to satisfy."
> ;; METADATA is a RFC-2822-like, header based file.

This sounds like this is going to duplicate the previous procedures.

Toggle quote (13 lines)
> (define (requires-dist-header? line)
> ;; Return #t if the given LINE is a Requires-Dist header.
> - (regexp-match? (string-match "^Requires-Dist: " line)))
> + (string-match "^Requires-Dist: " line))
>
> (define (requires-dist-value line)
> (string-drop line (string-length "Requires-Dist: ")))
>
> (define (extra? line)
> ;; Return #t if the given LINE is an "extra" requirement.
> - (regexp-match? (string-match "extra == " line)))
> + (string-match "extra == '(.*)'" line))

These hunks should be part of the previous patch where they were
introduced. (See my comments there about “regexp-match?”.)

Toggle quote (4 lines)
> + (define (test-requirement? line)
> + (let ((extra-label (match:substring (extra? line) 1)))
> + (and extra-label (test-section? extra-label))))

You can use “and=>” instead of binding a name:

(and=> (match:substring (extra? line) 1) test-section?)

Toggle quote (27 lines)
> (call-with-input-file metadata
> (lambda (port)
> - (let loop ((requirements '()))
> + (let loop ((required-deps '())
> + (test-deps '()))
> (let ((line (read-line port)))
> - ;; Stop at the first 'Provides-Extra' section: the non-optional
> - ;; requirements appear before the optional ones.
> (if (eof-object? line)
> - (reverse (delete-duplicates requirements))
> + (map (compose reverse delete-duplicates)
> + (list required-deps test-deps))
> (cond
> ((and (requires-dist-header? line) (not (extra? line)))
> (loop (cons (specification->requirement-name
> (requires-dist-value line))
> - requirements)))
> + required-deps)
> + test-deps))
> + ((and (requires-dist-header? line) (test-requirement? line))
> + (loop required-deps
> + (cons (specification->requirement-name (requires-dist-value line))
> + test-deps)))
> (else
> - (loop requirements)))))))))
> + (loop required-deps test-deps))))))))) ;skip line

Could you refactor this so that the other parser can be reused? If not,
the same comments about “if” and “cond” and the use of pairs/lists
instead of “values” applies here.

Toggle quote (27 lines)
> (define (guess-requirements source-url wheel-url archive)
> - "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> + "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
> of the required packages specified in the requirements.txt file. ARCHIVE will
> be extracted in a temporary directory."
>
> @@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
> (metadata (string-append dirname "/METADATA")))
> (call-with-temporary-directory
> (lambda (dir)
> - (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> + (if (zero?
> + (parameterize ((current-error-port (%make-void-port "rw+"))
> + (current-output-port (%make-void-port "rw+")))
> + (system* "unzip" wheel-archive "-d" dir metadata)))
> (parse-wheel-metadata (string-append dir "/" metadata))
> (begin
> (warning
> @@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
> (warning
> (G_ "Failed to extract file: ~a from source.~%")
> requires.txt)
> - '())))))
> - '())))
> + (list '() '()))))))
> + (list '() '()))))

I would like to see cosmetic changes like these three hunks in separate
commits.

Toggle quote (42 lines)
> (define (compute-inputs source-url wheel-url archive)
> - "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
> -name/variable pairs describing the required inputs of this package. Also
> + "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
> +a pair of lists, each consisting of a list of name/variable pairs, for the
> +propagated inputs and the native inputs, respectively. Also
> return the unaltered list of upstream dependency names."
> - (let ((dependencies
> - (remove (cut string=? "argparse" <>)
> - (guess-requirements source-url wheel-url archive))))
> - (values (sort
> - (map (lambda (input)
> - (let ((guix-name (python->package-name input)))
> - (list guix-name (list 'unquote (string->symbol guix-name)))))
> - dependencies)
> - (lambda args
> - (match args
> - (((a _ ...) (b _ ...))
> - (string-ci<? a b)))))
> - dependencies)))
> +
> + (define (strip-argparse deps)
> + (remove (cut string=? "argparse" <>) deps))
> +
> + (define (requirement->package-name/sort deps)
> + (sort
> + (map (lambda (input)
> + (let ((guix-name (python->package-name input)))
> + (list guix-name (list 'unquote (string->symbol guix-name)))))
> + deps)
> + (lambda args
> + (match args
> + (((a _ ...) (b _ ...))
> + (string-ci<? a b))))))
> +
> + (define process-requirements
> + (compose requirement->package-name/sort strip-argparse))
> +
> + (let ((dependencies (guess-requirements source-url wheel-url archive)))
> + (values (map process-requirements dependencies)
> + (concatenate dependencies))))

Giving names to these processing steps is fine and improves clarity, but
I’m not so comfortable about returning ad-hoc pairs *and* multiple
values (like above).

Toggle quote (26 lines)
> + (match guix-dependencies
> + ((required-inputs test-inputs)
> + (values
> + `(package
> + (name ,(python->package-name name))
> + (version ,version)
> + (source (origin
> + (method url-fetch)
> + ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
> + ;; cases in NAME, for instance, as is the case with
> + ;; "uwsgi". In that case, fall back to a full URL.
> + (uri (pypi-uri ,(string-downcase name) version))
> + (sha256
> + (base32
> + ,(guix-hash-url temp)))))
> + (build-system python-build-system)
> + ,@(maybe-inputs required-inputs 'propagated-inputs)
> + ,@(maybe-inputs test-inputs 'native-inputs)
> + (home-page ,home-page)
> + (synopsis ,synopsis)
> + (description ,description)
> + (license ,(license->symbol license)))
> + ;; Flatten the nested lists and return the upstream
> + ;; dependencies.
> + upstream-dependencies))))))))

I don’t see anything being flattened here?

--
Ricardo
R
R
Ricardo Wurmus wrote on 28 May 2019 16:48
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idjy32q8wiu.fsf@bimsb-sys02.mdc-berlin.net
Toggle quote (9 lines)
> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 30 Mar 2019 23:13:26 -0400
> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
> file.

> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
> (guess-requirements)[archive-root-directory]: Remove procedure.

Oh, I guess I reviewed this procedure in vain :(

Please modify the commits so that added procedures are not removed in
later commits. This is easier on the reviewer and makes for a clearer
commit history.

Toggle quote (37 lines)
> (define (guess-requirements-from-source)
> ;; Return the package's requirements by guessing them from the source.
> - (let ((dirname (archive-root-directory source-url))
> - (extension (file-extension source-url)))
> - (if (string? dirname)
> - (call-with-temporary-directory
> - (lambda (dir)
> - (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
> - (requires.txt (string-append dirname "/" pypi-name
> - ".egg-info" "/requires.txt"))
> - (exit-code
> - (parameterize ((current-error-port (%make-void-port "rw+"))
> - (current-output-port (%make-void-port "rw+")))
> - (if (string=? "zip" extension)
> - (system* "unzip" archive "-d" dir requires.txt)
> - (system* "tar" "xf" archive "-C" dir requires.txt)))))
> - (if (zero? exit-code)
> - (parse-requires.txt (string-append dir "/" requires.txt))
> - (begin
> - (warning
> - (G_ "Failed to extract file: ~a from source.~%")
> - requires.txt)
> - (list '() '()))))))
> + (if (compressed-file? source-url)
> + (call-with-temporary-directory
> + (lambda (dir)
> + (parameterize ((current-error-port (%make-void-port "rw+"))
> + (current-output-port (%make-void-port "rw+")))
> + (if (string=? "zip" (file-extension source-url))
> + (invoke "unzip" archive "-d" dir)
> + (invoke "tar" "xf" archive "-C" dir)))
> + (let ((requires.txt-files
> + (find-files dir (lambda (abs-file-name _)
> + (string-match "\\.egg-info/requires.txt$"
> + abs-file-name)))))
> + (if (> (length requires.txt-files) 0)

Let’s work on the empty list directly. Here “match” would be better.

Toggle quote (3 lines)
> + (begin
> + (parse-requires.txt (first requires.txt-files)))

No need for “begin” here.

Toggle quote (4 lines)
> + (begin (warning (G_ "Cannot guess requirements from source archive:\
> + no requires.txt file found.~%"))
> + (list '() '()))))))

I know that this is from an earlier commit, but I don’t like the look of
“(list '() '())” at all :)

Toggle quote (6 lines)
> + (begin
> + (warning (G_ "Unsupported archive format; \
> +cannot determine package dependencies from source archive: ~a~%")
> + (basename source-url))
> (list '() '()))))

Same here. Certainly there’s a better return value.

--
Ricardo
R
R
Ricardo Wurmus wrote on 28 May 2019 16:53
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
idjwoia8wbm.fsf@bimsb-sys02.mdc-berlin.net
And finally: Number 9!

Toggle quote (8 lines)
> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 30 Mar 2019 20:27:35 -0400
> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
> pypi-uri.
>
> Fixes issue: #33046.

Please change this to:


Toggle quote (5 lines)
> * guix/build-system/python.scm (pypi-uri): Update the host URI to
> "files.pythonhosted.org".
> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
> the source URL calls for it.

Is the first change to use files.pythonhosted.org required to fix this?
Or is this unrelated?

If it is required this looks fine to me.

Thank you!

--
Ricardo
M
M
Maxim Cournoyer wrote on 30 May 2019 04:24
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87pno03cj6.fsf@gmail.com
Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (26 lines)
> And finally: Number 9!
>
>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>> pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes <https://bugs.gnu.org/33046>.
>
>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

Thank you for this thorough review! I'll need some time
to go through it, and an upcoming travel will delay it some more, so
don't fret if you don't hear back from me before a couple days. Sorry!

I'll keep you posted.

Thanks a bunch!

Maxim
M
M
Maxim Cournoyer wrote on 10 Jun 2019 04:10
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87sgsi5gwd.fsf@gmail.com
Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (14 lines)
> Hi Maxim,
>
>> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>>
>> * guix/import/pypi.scm (guess-requirements): Update comment.
>> [guess-requirements-from-source]: Do not attempt to parse the file
>> requirements.txt. Streamline logic.
>
> Why remove the handling of the requirements.txt? Is it no longer
> popular enough to expect its availability in the source archives?
>
> Please also mention in the commit message that and how you adjusted the
> tests.

The commit message now explains the above:

import: pypi: Do not consider requirements.txt files.

PyPI packages are mandated to have a setup.py file, which contains a listing
of the required dependencies. The setuptools/distutils machinery embed
metadata in the archives they produce, which contains this information. There
is no need nor gain to collect the requirements from a "requirements.txt"
file, as it is not the true record of dependencies for PyPI packages and may
contain extraneous requirements or not exist at all.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt. Streamline logic.
* tests/pypi.scm (test-requires.txt): Rename from test-requirements, to hint
at the file being tested.
("pypi->guix-package"): Adapt so that the fake package contains a requires.txt
file rather than a requirements.txt file.
("pypi->guix-package, wheels"): Likewise.

Toggle quote (4 lines)
> You removed the comments from the example requires.txt — are
> comments no longer permitted in these files? If they are, please don’t
> include those changes.

The comments are now preserved. Thank you!

Maxim
M
M
Maxim Cournoyer wrote on 10 Jun 2019 05:30
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87muiq5d7c.fsf@gmail.com
Hello again!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (4 lines)
> Hi Maxim,
>
> on to patch number 2!

Yay!

Toggle quote (21 lines)
>> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:00 -0400
>> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
>> source.
>>
>> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
>> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
>> and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COMMENT?
>> functions inside the READ-REQUIREMENTS procedure.
>> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
>> parsing optional requirements.
>>
>> * tests/pypi.scm (test-requires-with-sections): New variable.
>> ("parse-requires.txt, with sections"): New test.
>> ("pypi->guix-package"): Mute tar output to stdout.
>
> The commit log does not match the changes. CLEAN-REQUIREMENT is now a
> top-level procedure, not a local procedure inside of READ-REQUIREMENTS
> as reported in the commit message. Which is correct?

Fixed.

Toggle quote (8 lines)
>> + (call-with-input-file requires.txt
>> + (lambda (port)
>> + (let loop ((result '()))
>> + (let ((line (read-line port)))
>> + ;; Stop when a section is encountered, as sections contains optional
>
> Should be “contain”.

Fixed.

Toggle quote (17 lines)
>> + ;; (extra) requirements. Non-optional requirements must appear
>> + ;; before any section is defined.
>> + (if (or (eof-object? line) (section-header? line))
>> + (reverse result)
>> + (cond
>> + ((or (string-null? line) (comment? line))
>> + (loop result))
>> + (else
>> + (loop (cons (clean-requirement line)
>> + result))))))))))
>> +
>
> I think it would be better to use “match” here instead of nested “let”,
> “if” and “cond”. At least you can drop the “if” and just use cond.
>
> The loop let and the inner let can be merged.

I'm not sure I understand; wouldn't merging the named let with the plain
let mean adding an extra LINE argument to my LOOP procedure? I don't
want that.

Also, how could the above code be expressed using "match"? I'm using
predicates which tests for (special) characters in a string; I don't see
how the more primitive pattern language of "match" will enable me to do
the same.

Toggle quote (7 lines)
>> +(define (parse-requires.txt requires.txt)
>> + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> +requirement names."
>> + ;; This is a very incomplete parser, which job is to select the non-optional
>
> “which” –> “whose”

Fixed, with due diligence reading on English grammar ;-)

Toggle quote (8 lines)
>> + ;; dependencies and strip them out of any version information.
>> + ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>> + ;; library and the requirements grammar defined by PEP-0508
>> + ;; (https://www.python.org/dev/peps/pep-0508/).
>
> Let’s remove the sentence starting with “Alternatively…”. We could do
> that but we didn’t :)

Alright; done!

Toggle quote (19 lines)
>> + (define (section-header? line)
>> + ;; Return #t if the given LINE is a section header, #f otherwise.
>> + (let ((trimmed-line (string-trim line)))
>> + (and (not (string-null? trimmed-line))
>> + (eq? (string-ref trimmed-line 0) #\[))))
>> +
>
> How about using string-prefix? instead? This looks more complicated
> than it deserves. You can get rid of string-null? and eq? and string-ref
> and all that.
>
> Same here:
>
>> + (define (comment? line)
>> + ;; Return #t if the given LINE is a comment, #f otherwise.
>> + (eq? (string-ref (string-trim line) 0) #\#))
>
> I’d just use string-prefix? here.

Neat! Adjusted, for both counts.

Toggle quote (16 lines)
>> +(define (clean-requirement s)
>> + ;; Given a requirement LINE, as can be found in a setuptools requires.txt
>> + ;; file, remove everything other than the actual name of the required
>> + ;; package, and return it.
>> + (string-take s (or (string-index s (lambda (chr)
>> + (member chr '(#\space #\> #\= #\<))))
>> + (string-length s))))
>
> “string-take” with “string-length” is not very elegant. The char
> predicate in string-index could better be a char set:
>
> (define (clean-requirement s)
> (cond
> ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
> (else s)))

That's nicer, thanks!

Toggle quote (5 lines)
>> ("pypi->guix-package"): Mute tar output to stdout.
>
> Finally, I think it would be better to keep this separate because it’s
> really orthogonal to the other changes in this patch.

OK, done!

Toggle quote (2 lines)
> What do you think?

All good points; I just don't understand the one about using a match
and/or merging the regular "let" with the named "let".

Thanks,

Maxim
M
M
Maxim Cournoyer wrote on 10 Jun 2019 10:32
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87imtd6dtq.fsf@gmail.com
Hello!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (2 lines)
> Patch number 3!

Yay!

Toggle quote (18 lines)
>> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:01 -0400
>> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
>> specifications.
>>
>> The previous solution was fragile and could leave unwanted characters in a
>> requirement name, such as '[' or ']'.
>
> Wouldn’t it be sufficient to add [ and ] to the list of forbidden
> characters? The tests pass with this implementation of
> clean-requirements:
>
> (define (clean-requirements s)
> (cond
> ((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
> (else s)))

Indeed this would be sufficient to make the tests pass, but the tests
don't cover all the cases; as an example, consider:

Toggle snippet (3 lines)
argparse;python_version<"2.7"

While we could make it work with the current logic by adding more
invalid characters (such as ';' here) to the character set, it seems
less error prone to use the upstream provided regex to match a package
name. [0]

Toggle quote (31 lines)
>> +(define %requirement-name-regexp
>> + ;; Regexp to match the requirement name in a requirement specification.
>> +
>> + ;; Some grammar, taken from PEP-0508 (see:
>> + ;; https://www.python.org/dev/peps/pep-0508/).
>> +
>> + ;; The unified rule can be expressed as:
>> + ;; specification = wsp* ( url_req | name_req ) wsp*
>> +
>> + ;; where url_req is:
>> + ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
>> +
>> + ;; and where name_req is:
>> + ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
>> +
>> + ;; Thus, we need only matching NAME, which is expressed as:
>> + ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
>> + ;; identifier = letterOrDigit identifier_end*
>> + ;; name = identifier
>> + (let* ((letter-or-digit "[A-Za-z0-9]")
>> + (identifier-end (string-append "(" letter-or-digit "|"
>> + "[-_.]*" letter-or-digit ")"))
>> + (identifier (string-append "^" letter-or-digit identifier-end "*"))
>> + (name identifier))
>> + (make-regexp name)))
>
> This seems a little too complicated. Translating a grammar into a
> regexp is probably not a good idea in general. Since we don’t care
> about anything other than the name it seems easier to just chop off
> the string tail as soon as we find an invalid character.

While I agree that a regexp is a bigger hammer than basic string
manipulation, I see some merit to it here:

1) We can be assured of conformance with upstream, again, per PEP-0508.
2) It is easier to extend; we might want to add parsing for the version
spec in order to disregard dependencies specified for Python < 3, for
example.

The use of the PEP-0508 grammar to define the regexp is useful to detail
in a more human-friendly language the components of the regexp. We
could have otherwise used the more cryptic regexp for Python
distribution names:

Toggle snippet (3 lines)
^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$

So I guess that what I'm saying is that I prefer this approach to using
string-index with invalid characters, for the reasons above.


Thanks!

Maxim
R
R
Ricardo Wurmus wrote on 10 Jun 2019 11:12
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
87d0jldct8.fsf@mdc-berlin.de
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

Toggle quote (22 lines)
> While I agree that a regexp is a bigger hammer than basic string
> manipulation, I see some merit to it here:
>
> 1) We can be assured of conformance with upstream, again, per PEP-0508.
> 2) It is easier to extend; we might want to add parsing for the version
> spec in order to disregard dependencies specified for Python < 3, for
> example.
>
> The use of the PEP-0508 grammar to define the regexp is useful to detail
> in a more human-friendly language the components of the regexp. We
> could have otherwise used the more cryptic regexp for Python
> distribution names:
>
> --8<---------------cut here---------------start------------->8---
> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
> --8<---------------cut here---------------end--------------->8---
>
> So I guess that what I'm saying is that I prefer this approach to using
> string-index with invalid characters, for the reasons above.
>
> [0] https://www.python.org/dev/peps/pep-0508/

Okay, sounds good. Please make sure to note this in a comment, so that
I won’t be asking myself this same question in a year :)

--
Ricardo
R
R
Ricardo Wurmus wrote on 10 Jun 2019 11:23
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
87blz5dcap.fsf@mdc-berlin.de
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

Toggle quote (21 lines)
>>> + ;; (extra) requirements. Non-optional requirements must appear
>>> + ;; before any section is defined.
>>> + (if (or (eof-object? line) (section-header? line))
>>> + (reverse result)
>>> + (cond
>>> + ((or (string-null? line) (comment? line))
>>> + (loop result))
>>> + (else
>>> + (loop (cons (clean-requirement line)
>>> + result))))))))))
>>> +
>>
>> I think it would be better to use “match” here instead of nested “let”,
>> “if” and “cond”. At least you can drop the “if” and just use cond.
>>
>> The loop let and the inner let can be merged.
>
> I'm not sure I understand; wouldn't merging the named let with the plain
> let mean adding an extra LINE argument to my LOOP procedure? I don't
> want that.

Let’s forget about merging the nested “let”, because you would indeed
need to change a few more things. It’s fine to keep that as it is. But
(if … (cond …)) is not pretty. At least it could be done in one “cond”:

(cond
((or (eof-object? line) (section-header? line))
(reverse result))
((or (string-null? line) (comment? line))
(loop result))
(else
(loop (cons (clean-requirement line)
result))))

Toggle quote (5 lines)
> Also, how could the above code be expressed using "match"? I'm using
> predicates which tests for (special) characters in a string; I don't see
> how the more primitive pattern language of "match" will enable me to do
> the same.

“match” has support for predicates, so you could do something like this:

(match line
((or (eof-object) (? section-header?))
(reverse result))
((or '() (? comment?))
(loop result))
(_ (loop (cons (clean-requirement line) result))))

This allows you to match “eof-object” and '() directly. Whenever I see
“string-null?” I think it might be better to “match” on the empty list
directly.

But really, that’s up to you. I only feel strongly about avoiding “(if
… (cond …))”.

--
Ricardo
M
M
Maxim Cournoyer wrote on 10 Jun 2019 15:30
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87blz55zzu.fsf@gmail.com
Hello again!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (18 lines)
> On to the next:
>
>> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:02 -0400
>> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>>
>> This change enables the PyPI importer to look for requirements in a source
>> archive of a different type than "tar.gz" or "tar.bz2".
>
> Okay.
>
>> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
>> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
>> archive is supported or not.
>
> Nitpick: please use “...this.” and leave two spaces between sentences.

Done.

Toggle quote (2 lines)
> Typo: it should be COMPRESSED-FILE?

Fixed.

Toggle quote (5 lines)
>> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
>> to extract ZIP archives.
>
> s/method/procedure/

Done.

Toggle quote (2 lines)
> Please also mention that “compute-inputs” has been adjusted.

Done.

Toggle quote (32 lines)
>> - (define (tarball-directory url)
>> - ;; Given the URL of the package's tarball, return the name of the directory
>> + (define (archive-root-directory url)
>> + ;; Given the URL of the package's archive, return the name of the directory
>> ;; that will be created upon decompressing it. If the filetype is not
>> ;; supported, return #f.
>> - ;; TODO: Support more archive formats.
>> - (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
>> - (cond
>> - ((string-suffix? ".tar.gz" basename)
>> - (string-drop-right basename 7))
>> - ((string-suffix? ".tar.bz2" basename)
>> - (string-drop-right basename 8))
>> - (else
>> + (if (compressed-file? url)
>> + (let ((root-directory (file-sans-extension (basename url))))
>> + (if (string=? "tar" (file-extension root-directory))
>> + (file-sans-extension root-directory)
>> + root-directory))
>> (begin
>> - (warning (G_ "Unsupported archive format: \
>> -cannot determine package dependencies"))
>> - #f)))))
>> + (warning (G_ "Unsupported archive format (~a): \
>> +cannot determine package dependencies") (file-extension url))
>> + #f)))
>
> I think the double application of file-sans-extension and the
> intermediate variable name “root-directory” for something that is a file
> is a little confusing, but I don’t have a better proposal (other than to
> replace file-extension and file-sans-extension with a match expression).

Done, w.r.t. using "match":

Toggle snippet (20 lines)
@@ -198,10 +198,12 @@ be extracted in a temporary directory."
;; that will be created upon decompressing it. If the filetype is not
;; supported, return #f.
(if (compressed-file? url)
- (let ((root-directory (file-sans-extension (basename url))))
- (if (string=? "tar" (file-extension root-directory))
- (file-sans-extension root-directory)
- root-directory))
+ (match (file-sans-extension (basename url))
+ (root-directory
+ (match (file-extension root-directory)
+ ("tar"
+ (file-sans-extension root-directory))
+ (_ root-directory))))
(begin
(warning (G_ "Unsupported archive format (~a): \
cannot determine package dependencies") (file-extension url))


Toggle quote (29 lines)
>> (define (read-wheel-metadata wheel-archive)
>> ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> @@ -246,16 +243,20 @@ cannot determine package dependencies"))
>
>> (define (guess-requirements-from-source)
>> ;; Return the package's requirements by guessing them from the source.
>> - (let ((dirname (tarball-directory source-url)))
>> + (let ((dirname (archive-root-directory source-url))
>> + (extension (file-extension source-url)))
>> (if (string? dirname)
>> (call-with-temporary-directory
>> (lambda (dir)
>> (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>> (requires.txt (string-append dirname "/" pypi-name
>> ".egg-info" "/requires.txt"))
>> - (exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
>> - (current-output-port (%make-void-port "rw+")))
>> - (system* "tar" "xf" tarball "-C" dir requires.txt))))
>> + (exit-code
>> + (parameterize ((current-error-port (%make-void-port "rw+"))
>> + (current-output-port (%make-void-port "rw+")))
>> + (if (string=? "zip" extension)
>> + (system* "unzip" archive "-d" dir requires.txt)
>> + (system* "tar" "xf" archive "-C" dir requires.txt)))))
>
> I guess this is why I’m not too happy with this: we’re checking in
> multiple places if the format is supported but then forget about this
> again until the next time we need to do something to the file.

I don't see much of a problem with the current design since there are
two questions being answered:

1) What should be the directory name of the extracted package (retrieved
from the base name of the archive).
2) What extractor should be used (zip vs tar).

These two questions are orthogonal, and that the same primitive get used
to answer both is an implementation, or rather, an optimization detail.

Toggle quote (2 lines)
> I wonder if we could do better and answer the question just once.

The questions are different :-). We could optimize, but that would be at
the price of expressiveness (squash the two questions into one solving
space).

What do you think?

Maxim
R
R
Ricardo Wurmus wrote on 10 Jun 2019 22:13
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
878su9ci6l.fsf@mdc-berlin.de
Hi Maxim,

thanks for your patience in addressing my comments. I appreciate it!

Toggle quote (27 lines)
>> I think the double application of file-sans-extension and the
>> intermediate variable name “root-directory” for something that is a file
>> is a little confusing, but I don’t have a better proposal (other than to
>> replace file-extension and file-sans-extension with a match expression).
>
> Done, w.r.t. using "match":
>
> --8<---------------cut here---------------start------------->8---
> @@ -198,10 +198,12 @@ be extracted in a temporary directory."
> ;; that will be created upon decompressing it. If the filetype is not
> ;; supported, return #f.
> (if (compressed-file? url)
> - (let ((root-directory (file-sans-extension (basename url))))
> - (if (string=? "tar" (file-extension root-directory))
> - (file-sans-extension root-directory)
> - root-directory))
> + (match (file-sans-extension (basename url))
> + (root-directory
> + (match (file-extension root-directory)
> + ("tar"
> + (file-sans-extension root-directory))
> + (_ root-directory))))
> (begin
> (warning (G_ "Unsupported archive format (~a): \
> cannot determine package dependencies") (file-extension url))
> --8<---------------cut here---------------end--------------->8---

The first application of “match” matches anything. What I had in mind
was really a slightly different approach, namely to split up the “url”
string at dots and then match the resulting list of strings.

Something like this:

(match (string-split "hello.tar.gz" #\.)
((base "tar" (or "bz2" "gz")) base)
((base ext) base))

Toggle quote (16 lines)
> I don't see much of a problem with the current design since there are
> two questions being answered:
>
> 1) What should be the directory name of the extracted package (retrieved
> from the base name of the archive).
> 2) What extractor should be used (zip vs tar).
>
> These two questions are orthogonal, and that the same primitive get used
> to answer both is an implementation, or rather, an optimization detail.
>
>> I wonder if we could do better and answer the question just once.
>
> The questions are different :-). We could optimize, but that would be at
> the price of expressiveness (squash the two questions into one solving
> space).

Okay. I guess I’m too picky :)
I’d be happy if we could move the checks to tiny named procedures, but
it’s probably fine the way it is.

Thanks!

--
Ricardo
M
M
Maxim Cournoyer wrote on 11 Jun 2019 02:39
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
874l4x550r.fsf@gmail.com
Hello!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (15 lines)
> Patch number 6:
>
>> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 00:26:03 -0400
>> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
>> metadata.json.
>
>> With newer Wheel releases, there is no more metadata.json file; the METADATA
>> file should be used instead (see: https://github.com/pypa/wheel/issues/195).
>
>> This change updates our PyPI importer so that it uses the later.
>
> Typo: should be “latter” instead of “later”.

Fixed.

Toggle quote (6 lines)
>> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
>> the PARSE-WHEEL-METADATA method.
>
> Please remove the indentation here. Also, please don’t use “method”
> (because it’s not); use “procedure” instead.

Done. Thanks for fixing my terminology :-).

Toggle quote (4 lines)
>> (parse-wheel-metadata): Add method.
>
> Same here.

Done.

Toggle quote (14 lines)
>> + (define (requires-dist-header? line)
>> + ;; Return #t if the given LINE is a Requires-Dist header.
>> + (regexp-match? (string-match "^Requires-Dist: " line)))
>> +
>> + (define (requires-dist-value line)
>> + (string-drop line (string-length "Requires-Dist: ")))
>> +
>> + (define (extra? line)
>> + ;; Return #t if the given LINE is an "extra" requirement.
>> + (regexp-match? (string-match "extra == " line)))
>
> The use of “regexp-match?” here isn’t strictly necessary as the return
> value is true-ish anyway.

Done.

Toggle quote (19 lines)
>> + (call-with-input-file metadata
>> + (lambda (port)
>> + (let loop ((requirements '()))
>> + (let ((line (read-line port)))
>> + ;; Stop at the first 'Provides-Extra' section: the non-optional
>> + ;; requirements appear before the optional ones.
>> + (if (eof-object? line)
>> + (reverse (delete-duplicates requirements))
>> + (cond
>> + ((and (requires-dist-header? line) (not (extra? line)))
>> + (loop (cons (specification->requirement-name
>> + (requires-dist-value line))
>> + requirements)))
>> + (else
>> + (loop requirements)))))))))
>> +
>
> As before you can simplify the nested let and merge “if” and "cond“.

Oh, I get it now, I think:

Toggle snippet (34 lines)
(call-with-input-file metadata
(lambda (port)
(let loop ((requirements '()))
- (let ((line (read-line port)))
- ;; Stop at the first 'Provides-Extra' section: the non-optional
- ;; requirements appear before the optional ones.
- (if (eof-object? line)
- (reverse (delete-duplicates requirements))
- (cond
- ((and (requires-dist-header? line) (not (extra? line)))
- (loop (cons (specification->requirement-name
- (requires-dist-value line))
- requirements)))
- (else
- (loop requirements)))))))))
+ (match (read-line port)
+ (line
+ ;; Stop at the first 'Provides-Extra' section: the non-optional
+ ;; requirements appear before the optional ones.
+ (cond
+ ((eof-object? line)
+ (reverse (delete-duplicates requirements)))
+ ((and (requires-dist-header? line) (not (extra? line)))
+ (loop (cons (specification->requirement-name
+ (requires-dist-value line))
+ requirements)))
+ (else
+ (loop requirements)))))))))
(define (guess-requirements source-url wheel-url archive)
"Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list

Toggle quote (36 lines)
>> (define (read-wheel-metadata wheel-archive)
>> ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> - ;; requirements.
>> + ;; requirements, or #f if the metadata file contained therein couldn't be
>> + ;; extracted.
>> (let* ((dirname (wheel-url->extracted-directory wheel-url))
>> - (json-file (string-append dirname "/metadata.json")))
>> - (and (zero? (system* "unzip" "-q" wheel-archive json-file))
>> - (dynamic-wind
>> - (const #t)
>> - (lambda ()
>> - (call-with-input-file json-file
>> - (lambda (port)
>> - (let* ((metadata (json->scm port))
>> - (run_requires (hash-ref metadata "run_requires"))
>> - (requirements (if run_requires
>> - (hash-ref (list-ref run_requires 0)
>> - "requires")
>> - '())))
>> - (map specification->requirement-name requirements)))))
>> - (lambda ()
>> - (delete-file json-file)
>> - (rmdir dirname))))))
>> + (metadata (string-append dirname "/METADATA")))
>> + (call-with-temporary-directory
>> + (lambda (dir)
>> + (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
>> + (parse-wheel-metadata (string-append dir "/" metadata))
>> + (begin
>> + (warning
>> + (G_ "Failed to extract file: ~a from wheel.~%") metadata)
>> + #f))))))
>
> The old approach took care of removing the unpacked archive no matter
> what happened. The new code doesn’t do that.

The temporary directory where the archive is unpacked should be cleared
when leaving upon leaving its scope; the docstring of
"call-with-temporary-directory" says:

"Call PROC with a name of a temporary directory; close the directory and
delete it when leaving the dynamic extent of this call."

Toggle quote (5 lines)
>> --- a/tests/pypi.scm
>> +++ b/tests/pypi.scm
>
> Thanks for the tests!

:-)

Maxim
R
R
Ricardo Wurmus wrote on 11 Jun 2019 13:56
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
87v9xc9vy9.fsf@mdc-berlin.de
Hi Maxim,

Toggle quote (56 lines)
>>> + (call-with-input-file metadata
>>> + (lambda (port)
>>> + (let loop ((requirements '()))
>>> + (let ((line (read-line port)))
>>> + ;; Stop at the first 'Provides-Extra' section: the non-optional
>>> + ;; requirements appear before the optional ones.
>>> + (if (eof-object? line)
>>> + (reverse (delete-duplicates requirements))
>>> + (cond
>>> + ((and (requires-dist-header? line) (not (extra? line)))
>>> + (loop (cons (specification->requirement-name
>>> + (requires-dist-value line))
>>> + requirements)))
>>> + (else
>>> + (loop requirements)))))))))
>>> +
>>
>> As before you can simplify the nested let and merge “if” and "cond“.
>
> Oh, I get it now, I think:
>
> --8<---------------cut here---------------start------------->8---
>
> (call-with-input-file metadata
> (lambda (port)
> (let loop ((requirements '()))
> - (let ((line (read-line port)))
> - ;; Stop at the first 'Provides-Extra' section: the non-optional
> - ;; requirements appear before the optional ones.
> - (if (eof-object? line)
> - (reverse (delete-duplicates requirements))
> - (cond
> - ((and (requires-dist-header? line) (not (extra? line)))
> - (loop (cons (specification->requirement-name
> - (requires-dist-value line))
> - requirements)))
> - (else
> - (loop requirements)))))))))
> + (match (read-line port)
> + (line
> + ;; Stop at the first 'Provides-Extra' section: the non-optional
> + ;; requirements appear before the optional ones.
> + (cond
> + ((eof-object? line)
> + (reverse (delete-duplicates requirements)))
> + ((and (requires-dist-header? line) (not (extra? line)))
> + (loop (cons (specification->requirement-name
> + (requires-dist-value line))
> + requirements)))
> + (else
> + (loop requirements)))))))))
>
> (define (guess-requirements source-url wheel-url archive)
> "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> --8<---------------cut here---------------end--------------->8---

Not quite. Your ‘match’ expression here doesn’t do anything that a
‘let’ wouldn’t have done. It really just binds the return value of
(read-line port) to ‘line’; that’s the same as (let ((line (read-line
port))) …).

I gave a match example using predicate matchers in a previous reply. In
any case, using ‘cond’ inside of a let would be just fine. If you
wanted to go with ‘match’, though, you’d replace the ‘cond’.

--
Ricardo
M
M
Maxim Cournoyer wrote on 12 Jun 2019 05:00
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87imtb33tp.fsf@gmail.com
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (2 lines)
> Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

Ehe! I had to look up the reference; I'm not much of a Star Trek fan
obviously :-P.

Toggle quote (14 lines)
>> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Thu, 28 Mar 2019 23:12:26 -0400
>> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>> native-inputs.
>>
>> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
>> (test-section?): New predicate.
>> (parse-requires.txt): Collect the optional test inputs, and return them as the
>> second element of the returned list.
>
> AFAICT parse-requires.txt now returns a list of pairs, but used to
> return a plain list before. Is this correct?

Right, a list of two lists to be technically correct.

Toggle quote (21 lines)
>> (define (parse-requires.txt requires.txt)
>> - "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> -requirement names."
>> - ;; This is a very incomplete parser, which job is to select the non-optional
>> - ;; dependencies and strip them out of any version information.
>> + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the
>> +second the optional test dependencies. Note that currently, optional,
>> +non-test dependencies are omitted since these can be difficult or expensive to
>> +satisfy."
>> +
>> + ;; This is a very incomplete parser, which job is to read in the requirement
>> + ;; specification lines, and strip them out of any version information.
>> ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>> ;; library and the requirements grammar defined by PEP-0508
>> ;; (https://www.python.org/dev/peps/pep-0508/).
>
> Does it really return a pair? Or a list of pairs? Or is it a
> two-element list of lists?

The latter! I've fixed the docstring accordingly.

Toggle quote (25 lines)
>> (call-with-input-file requires.txt
>> (lambda (port)
>> - (let loop ((result '()))
>> + (let loop ((required-deps '())
>> + (test-deps '())
>> + (inside-test-section? #f)
>> + (optional? #f))
>> (let ((line (read-line port)))
>> - ;; Stop when a section is encountered, as sections contains optional
>> - ;; (extra) requirements. Non-optional requirements must appear
>> - ;; before any section is defined.
>> - (if (or (eof-object? line) (section-header? line))
>> + (if (eof-object? line)
>> ;; Duplicates can occur, since the same requirement can be
>> ;; listed multiple times with different conditional markers, e.g.
>> ;; pytest >= 3 ; python_version >= "3.3"
>> ;; pytest < 3 ; python_version < "3.3"
>> - (reverse (delete-duplicates result))
>> + (map (compose reverse delete-duplicates)
>> + (list required-deps test-deps))
>
> Looks like a list of lists to me. “delete-duplicates” now won’t delete
> a name that is in both “required-deps” as well as in “test-deps”. Is
> this acceptable?

It is acceptable, as this corner case cannot exist given the current
code (a requirement can exist in either required-deps or test-deps, but
never in both). It also doesn't make sense that a run time requirement
would also be listed as a test requirement, so that corner case is not
likely to exist in the future either.

Toggle quote (3 lines)
> Personally, I’m not a fan of using data structures for returning
> multiple values, because we can simply return multiple values.

I thought the Guile supported multiple values return value would be
great here as well, but I've found that for this specific case here, a
list of lists worked better, since the two lists contain requirements to
be processed the same, which "map" can readily do (i.e. less ceremony is
required).

Toggle quote (5 lines)
> Or we could have more than just strings. The meaning of these strings
> is provided by the bin into which they are thrown — either
> “required-deps” or “test-deps”. It could be an option to collect tagged
> values instead and have the caller deal with filtering.

Sounds neat, but I'd rather punt on this one for now.

Toggle quote (11 lines)
>> (define (parse-wheel-metadata metadata)
>> - "Given METADATA, a Wheel metadata file, return a list of requirement names."
>> + "Given METADATA, a Wheel metadata file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the second the optional
>> +test dependencies. Note that currently, optional, non-test dependencies are
>> +omitted since these can be difficult or expensive to satisfy."
>> ;; METADATA is a RFC-2822-like, header based file.
>
> This sounds like this is going to duplicate the previous procedures.

Instead of duplicating the docstring, I'm now referring to that of
PARSE-REQUIRES.TXT for PARSE-WHEEL-METADATA. The two procedures share
the same interface, but implement a different parser, which consist of
mostly a loop + conditional branches. IMHO, it's not worth, or even,
desirable to try to merge those two parsers into one; I prefer to keep
the logic of two distinct parsers separate.

Toggle quote (16 lines)
>> (define (requires-dist-header? line)
>> ;; Return #t if the given LINE is a Requires-Dist header.
>> - (regexp-match? (string-match "^Requires-Dist: " line)))
>> + (string-match "^Requires-Dist: " line))
>>
>> (define (requires-dist-value line)
>> (string-drop line (string-length "Requires-Dist: ")))
>>
>> (define (extra? line)
>> ;; Return #t if the given LINE is an "extra" requirement.
>> - (regexp-match? (string-match "extra == " line)))
>> + (string-match "extra == '(.*)'" line))
>
> These hunks should be part of the previous patch where they were
> introduced. (See my comments there about “regexp-match?”.)

Done.

Toggle quote (8 lines)
>> + (define (test-requirement? line)
>> + (let ((extra-label (match:substring (extra? line) 1)))
>> + (and extra-label (test-section? extra-label))))
>
> You can use “and=>” instead of binding a name:
>
> (and=> (match:substring (extra? line) 1) test-section?)

Neat! I still don't have the reflex to use "and=>", thanks for reminding
me about it!

Toggle quote (31 lines)
>> (call-with-input-file metadata
>> (lambda (port)
>> - (let loop ((requirements '()))
>> + (let loop ((required-deps '())
>> + (test-deps '()))
>> (let ((line (read-line port)))
>> - ;; Stop at the first 'Provides-Extra' section: the non-optional
>> - ;; requirements appear before the optional ones.
>> (if (eof-object? line)
>> - (reverse (delete-duplicates requirements))
>> + (map (compose reverse delete-duplicates)
>> + (list required-deps test-deps))
>> (cond
>> ((and (requires-dist-header? line) (not (extra? line)))
>> (loop (cons (specification->requirement-name
>> (requires-dist-value line))
>> - requirements)))
>> + required-deps)
>> + test-deps))
>> + ((and (requires-dist-header? line) (test-requirement? line))
>> + (loop required-deps
>> + (cons (specification->requirement-name (requires-dist-value line))
>> + test-deps)))
>> (else
>> - (loop requirements)))))))))
>> + (loop required-deps test-deps))))))))) ;skip line
>
> Could you refactor this so that the other parser can be reused? If not,
> the same comments about “if” and “cond” and the use of pairs/lists
> instead of “values” applies here.

Done w.r.t. using only "cond" rather than "if" + "cond". As explained
above, the docstring's body now refer to the documentation of
PARSE-REQUIRES.TXT; otherwise I prefer to keep the parsers separate.

Toggle quote (30 lines)
>> (define (guess-requirements source-url wheel-url archive)
>> - "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
>> + "Given SOURCE-URL, WHEEL-URL and an ARCHIVE of the package, return a list
>> of the required packages specified in the requirements.txt file. ARCHIVE will
>> be extracted in a temporary directory."
>>
>> @@ -244,7 +289,10 @@ cannot determine package dependencies") (file-extension url))
>> (metadata (string-append dirname "/METADATA")))
>> (call-with-temporary-directory
>> (lambda (dir)
>> - (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
>> + (if (zero?
>> + (parameterize ((current-error-port (%make-void-port "rw+"))
>> + (current-output-port (%make-void-port "rw+")))
>> + (system* "unzip" wheel-archive "-d" dir metadata)))
>> (parse-wheel-metadata (string-append dir "/" metadata))
>> (begin
>> (warning
>> @@ -283,32 +331,41 @@ cannot determine package dependencies") (file-extension url))
>> (warning
>> (G_ "Failed to extract file: ~a from source.~%")
>> requires.txt)
>> - '())))))
>> - '())))
>> + (list '() '()))))))
>> + (list '() '()))))
>
> I would like to see cosmetic changes like these three hunks in separate
> commits.

Done for the first two hunks; the last one isn't cosmetic; it changes
the default return from an empty list to a list of two empty lists.

Toggle quote (46 lines)
>> (define (compute-inputs source-url wheel-url archive)
>> - "Given the SOURCE-URL of an already downloaded ARCHIVE, return a list of
>> -name/variable pairs describing the required inputs of this package. Also
>> + "Given the SOURCE-URL and WHEEL-URL of an already downloaded ARCHIVE, return
>> +a pair of lists, each consisting of a list of name/variable pairs, for the
>> +propagated inputs and the native inputs, respectively. Also
>> return the unaltered list of upstream dependency names."
>> - (let ((dependencies
>> - (remove (cut string=? "argparse" <>)
>> - (guess-requirements source-url wheel-url archive))))
>> - (values (sort
>> - (map (lambda (input)
>> - (let ((guix-name (python->package-name input)))
>> - (list guix-name (list 'unquote (string->symbol guix-name)))))
>> - dependencies)
>> - (lambda args
>> - (match args
>> - (((a _ ...) (b _ ...))
>> - (string-ci<? a b)))))
>> - dependencies)))
>> +
>> + (define (strip-argparse deps)
>> + (remove (cut string=? "argparse" <>) deps))
>> +
>> + (define (requirement->package-name/sort deps)
>> + (sort
>> + (map (lambda (input)
>> + (let ((guix-name (python->package-name input)))
>> + (list guix-name (list 'unquote (string->symbol guix-name)))))
>> + deps)
>> + (lambda args
>> + (match args
>> + (((a _ ...) (b _ ...))
>> + (string-ci<? a b))))))
>> +
>> + (define process-requirements
>> + (compose requirement->package-name/sort strip-argparse))
>> +
>> + (let ((dependencies (guess-requirements source-url wheel-url archive)))
>> + (values (map process-requirements dependencies)
>> + (concatenate dependencies))))
>
> Giving names to these processing steps is fine and improves clarity, but
> I’m not so comfortable about returning ad-hoc pairs *and* multiple
> values (like above).

Using "values" is required by the API of the recursive importer.

Toggle quote (28 lines)
>> + (match guix-dependencies
>> + ((required-inputs test-inputs)
>> + (values
>> + `(package
>> + (name ,(python->package-name name))
>> + (version ,version)
>> + (source (origin
>> + (method url-fetch)
>> + ;; Sometimes 'pypi-uri' doesn't quite work due to mixed
>> + ;; cases in NAME, for instance, as is the case with
>> + ;; "uwsgi". In that case, fall back to a full URL.
>> + (uri (pypi-uri ,(string-downcase name) version))
>> + (sha256
>> + (base32
>> + ,(guix-hash-url temp)))))
>> + (build-system python-build-system)
>> + ,@(maybe-inputs required-inputs 'propagated-inputs)
>> + ,@(maybe-inputs test-inputs 'native-inputs)
>> + (home-page ,home-page)
>> + (synopsis ,synopsis)
>> + (description ,description)
>> + (license ,(license->symbol license)))
>> + ;; Flatten the nested lists and return the upstream
>> + ;; dependencies.
>> + upstream-dependencies))))))))
>
> I don’t see anything being flattened here?

Good catch! Seems a remnant of something that is now gone :-).

Thanks for the review.

Maxim
R
R
Ricardo Wurmus wrote on 12 Jun 2019 08:39
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 24450@debbugs.gnu.org)
87imtb9ujz.fsf@mdc-berlin.de
Hi Maxim,

Toggle quote (31 lines)
>>> (call-with-input-file requires.txt
>>> (lambda (port)
>>> - (let loop ((result '()))
>>> + (let loop ((required-deps '())
>>> + (test-deps '())
>>> + (inside-test-section? #f)
>>> + (optional? #f))
>>> (let ((line (read-line port)))
>>> - ;; Stop when a section is encountered, as sections contains optional
>>> - ;; (extra) requirements. Non-optional requirements must appear
>>> - ;; before any section is defined.
>>> - (if (or (eof-object? line) (section-header? line))
>>> + (if (eof-object? line)
>>> ;; Duplicates can occur, since the same requirement can be
>>> ;; listed multiple times with different conditional markers, e.g.
>>> ;; pytest >= 3 ; python_version >= "3.3"
>>> ;; pytest < 3 ; python_version < "3.3"
>>> - (reverse (delete-duplicates result))
>>> + (map (compose reverse delete-duplicates)
>>> + (list required-deps test-deps))
>>
>> Looks like a list of lists to me. “delete-duplicates” now won’t delete
>> a name that is in both “required-deps” as well as in “test-deps”. Is
>> this acceptable?
>
> It is acceptable, as this corner case cannot exist given the current
> code (a requirement can exist in either required-deps or test-deps, but
> never in both). It also doesn't make sense that a run time requirement
> would also be listed as a test requirement, so that corner case is not
> likely to exist in the future either.

I mentioned it because I believe I’ve seen this in the past where the
importer would return some of the same inputs as both regular inputs and
test dependencies.

Toggle quote (9 lines)
>> Personally, I’m not a fan of using data structures for returning
>> multiple values, because we can simply return multiple values.
>
> I thought the Guile supported multiple values return value would be
> great here as well, but I've found that for this specific case here, a
> list of lists worked better, since the two lists contain requirements to
> be processed the same, which "map" can readily do (i.e. less ceremony is
> required).

“map” can also operate on more than one list at a time:

(call-with-values
(lambda ()
(values (list 1 2 3)
(list 9 8 7)))
(lambda (a b) (map + a b)))

=> (10 10 10)

Of course, it would be simpler to just use a single list of tagged
items.

--
Ricardo
M
M
Maxim Cournoyer wrote on 16 Jun 2019 07:10
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
878su22jz7.fsf@gmail.com
Hello Ricardo!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (15 lines)
>> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 23:13:26 -0400
>> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
>> file.
>
>> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
>> (guess-requirements)[archive-root-directory]: Remove procedure.
>
> Oh, I guess I reviewed this procedure in vain :(
>
> Please modify the commits so that added procedures are not removed in
> later commits. This is easier on the reviewer and makes for a clearer
> commit history.

Indeed; I'll be more careful about this is the future; sorry!

I've squashed this commit along with the one enabling more archive types
support, as this is where the modified (and later removed) procedure
originated.

Toggle quote (39 lines)
>> (define (guess-requirements-from-source)
>> ;; Return the package's requirements by guessing them from the source.
>> - (let ((dirname (archive-root-directory source-url))
>> - (extension (file-extension source-url)))
>> - (if (string? dirname)
>> - (call-with-temporary-directory
>> - (lambda (dir)
>> - (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>> - (requires.txt (string-append dirname "/" pypi-name
>> - ".egg-info" "/requires.txt"))
>> - (exit-code
>> - (parameterize ((current-error-port (%make-void-port "rw+"))
>> - (current-output-port (%make-void-port "rw+")))
>> - (if (string=? "zip" extension)
>> - (system* "unzip" archive "-d" dir requires.txt)
>> - (system* "tar" "xf" archive "-C" dir requires.txt)))))
>> - (if (zero? exit-code)
>> - (parse-requires.txt (string-append dir "/" requires.txt))
>> - (begin
>> - (warning
>> - (G_ "Failed to extract file: ~a from source.~%")
>> - requires.txt)
>> - (list '() '()))))))
>> + (if (compressed-file? source-url)
>> + (call-with-temporary-directory
>> + (lambda (dir)
>> + (parameterize ((current-error-port (%make-void-port "rw+"))
>> + (current-output-port (%make-void-port "rw+")))
>> + (if (string=? "zip" (file-extension source-url))
>> + (invoke "unzip" archive "-d" dir)
>> + (invoke "tar" "xf" archive "-C" dir)))
>> + (let ((requires.txt-files
>> + (find-files dir (lambda (abs-file-name _)
>> + (string-match "\\.egg-info/requires.txt$"
>> + abs-file-name)))))
>> + (if (> (length requires.txt-files) 0)
>
> Let’s work on the empty list directly. Here “match” would be better.

Done, like this:

Toggle snippet (12 lines)
- (if (> (length requires.txt-files) 0)
- (parse-requires.txt (first requires.txt-files))
- (begin (warning (G_ "Cannot guess requirements from source archive:\
+ (match requires.txt-files
+ (()
+ (warning (G_ "Cannot guess requirements from source archive:\
no requires.txt file found.~%"))
- '())))))
+ '())
+ (else (parse-requires.txt (first requires.txt-files)))))))

Toggle quote (5 lines)
>> + (begin
>> + (parse-requires.txt (first requires.txt-files)))
>
> No need for “begin” here.

Fixed.

Toggle quote (15 lines)
>> + (begin (warning (G_ "Cannot guess requirements from source archive:\
>> + no requires.txt file found.~%"))
>> + (list '() '()))))))
>
> I know that this is from an earlier commit, but I don’t like the look of
> “(list '() '())” at all :)
>
>> + (begin
>> + (warning (G_ "Unsupported archive format; \
>> +cannot determine package dependencies from source archive: ~a~%")
>> + (basename source-url))
>> (list '() '()))))
>
> Same here. Certainly there’s a better return value.

This might look ugly, but I can't think of a better return value, since
using anything else would mean having to introduce extra logic in the
callers, while it is now a correct value that needs no special case.

Maxim
M
M
Maxim Cournoyer wrote on 16 Jun 2019 07:53
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
871rzu2i02.fsf@gmail.com
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (2 lines)
> And finally: Number 9!

Yay!

Toggle quote (12 lines)
>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>> pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes <https://bugs.gnu.org/33046>.

Done!

Toggle quote (12 lines)
>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

The permanent redirection was found while fixing the issue; but it's
better to have the fix separate. I've separated it into its own commit.

Thank you!
M
M
Maxim Cournoyer wrote on 16 Jun 2019 08:05
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87wohm12vs.fsf@gmail.com
Hi again,

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (27 lines)
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> While I agree that a regexp is a bigger hammer than basic string
>> manipulation, I see some merit to it here:
>>
>> 1) We can be assured of conformance with upstream, again, per PEP-0508.
>> 2) It is easier to extend; we might want to add parsing for the version
>> spec in order to disregard dependencies specified for Python < 3, for
>> example.
>>
>> The use of the PEP-0508 grammar to define the regexp is useful to detail
>> in a more human-friendly language the components of the regexp. We
>> could have otherwise used the more cryptic regexp for Python
>> distribution names:
>>
>> --8<---------------cut here---------------start------------->8---
>> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
>> --8<---------------cut here---------------end--------------->8---
>>
>> So I guess that what I'm saying is that I prefer this approach to using
>> string-index with invalid characters, for the reasons above.
>>
>> [0] https://www.python.org/dev/peps/pep-0508/
>
> Okay, sounds good. Please make sure to note this in a comment, so that
> I won’t be asking myself this same question in a year :)

Done!

Maxim
M
M
Maxim Cournoyer wrote on 16 Jun 2019 16:11
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87sgs91uyy.fsf@gmail.com
Hello!

Continued feedback about your much appreciated comments! :-)

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (36 lines)
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>>>> + ;; (extra) requirements. Non-optional requirements must appear
>>>> + ;; before any section is defined.
>>>> + (if (or (eof-object? line) (section-header? line))
>>>> + (reverse result)
>>>> + (cond
>>>> + ((or (string-null? line) (comment? line))
>>>> + (loop result))
>>>> + (else
>>>> + (loop (cons (clean-requirement line)
>>>> + result))))))))))
>>>> +
>>>
>>> I think it would be better to use “match” here instead of nested “let”,
>>> “if” and “cond”. At least you can drop the “if” and just use cond.
>>>
>>> The loop let and the inner let can be merged.
>>
>> I'm not sure I understand; wouldn't merging the named let with the plain
>> let mean adding an extra LINE argument to my LOOP procedure? I don't
>> want that.
>
> Let’s forget about merging the nested “let”, because you would indeed
> need to change a few more things. It’s fine to keep that as it is. But
> (if … (cond …)) is not pretty. At least it could be done in one “cond”:
>
> (cond
> ((or (eof-object? line) (section-header? line))
> (reverse result))
> ((or (string-null? line) (comment? line))
> (loop result))
> (else
> (loop (cons (clean-requirement line)
> result))))

Agreed and fixed, thanks.

Toggle quote (14 lines)
>> Also, how could the above code be expressed using "match"? I'm using
>> predicates which tests for (special) characters in a string; I don't see
>> how the more primitive pattern language of "match" will enable me to do
>> the same.
>
> “match” has support for predicates, so you could do something like this:
>
> (match line
> ((or (eof-object) (? section-header?))
> (reverse result))
> ((or '() (? comment?))
> (loop result))
> (_ (loop (cons (clean-requirement line) result))))

Oh, that's neat! I had no idea that predicates could be used with
"match". '() would need to be replaced by "" to match the empty
string. Another gotcha with "match", is that the "or" seems to evaluate
every component, no matter if a early true condition was found; this
resulted in the following error:

Toggle snippet (8 lines)
+ (wrong-type-arg
+ "string-trim"
+ "Wrong type argument in position ~A (expecting ~A): ~S"
+ (1 "string" #<eof>)
+ (#<eof>))
result: FAIL

Due to the "(or (eof-object) (? section-header?)" match clause
evaluating the section-header? predicate despite the line being an EOF
character.

Toggle quote (4 lines)
> This allows you to match “eof-object” and '() directly. Whenever I see
> “string-null?” I think it might be better to “match” on the empty list
> directly.

string-null? and an empty list are not the same, unless I'm missing something.

Toggle quote (3 lines)
> But really, that’s up to you. I only feel strongly about avoiding “(if
> … (cond …))”.

Due to the problem mentioned above, I stayed with "cond".

Thanks!

Maxim
M
M
Maxim Cournoyer wrote on 16 Jun 2019 16:29
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87o92x1u3d.fsf@gmail.com
Hey Ricardo :-)

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:

Toggle quote (37 lines)
> Hi Maxim,
>
>>>> (call-with-input-file requires.txt
>>>> (lambda (port)
>>>> - (let loop ((result '()))
>>>> + (let loop ((required-deps '())
>>>> + (test-deps '())
>>>> + (inside-test-section? #f)
>>>> + (optional? #f))
>>>> (let ((line (read-line port)))
>>>> - ;; Stop when a section is encountered, as sections contains optional
>>>> - ;; (extra) requirements. Non-optional requirements must appear
>>>> - ;; before any section is defined.
>>>> - (if (or (eof-object? line) (section-header? line))
>>>> + (if (eof-object? line)
>>>> ;; Duplicates can occur, since the same requirement can be
>>>> ;; listed multiple times with different conditional markers, e.g.
>>>> ;; pytest >= 3 ; python_version >= "3.3"
>>>> ;; pytest < 3 ; python_version < "3.3"
>>>> - (reverse (delete-duplicates result))
>>>> + (map (compose reverse delete-duplicates)
>>>> + (list required-deps test-deps))
>>>
>>> Looks like a list of lists to me. “delete-duplicates” now won’t delete
>>> a name that is in both “required-deps” as well as in “test-deps”. Is
>>> this acceptable?
>>
>> It is acceptable, as this corner case cannot exist given the current
>> code (a requirement can exist in either required-deps or test-deps, but
>> never in both). It also doesn't make sense that a run time requirement
>> would also be listed as a test requirement, so that corner case is not
>> likely to exist in the future either.
>
> I mentioned it because I believe I’ve seen this in the past where the
> importer would return some of the same inputs as both regular inputs and
> test dependencies.

OK!

Toggle quote (19 lines)
>>> Personally, I’m not a fan of using data structures for returning
>>> multiple values, because we can simply return multiple values.
>>
>> I thought the Guile supported multiple values return value would be
>> great here as well, but I've found that for this specific case here, a
>> list of lists worked better, since the two lists contain requirements to
>> be processed the same, which "map" can readily do (i.e. less ceremony is
>> required).
>
> “map” can also operate on more than one list at a time:
>
> (call-with-values
> (lambda ()
> (values (list 1 2 3)
> (list 9 8 7)))
> (lambda (a b) (map + a b)))
>
> => (10 10 10)

That's what I meant by "requires more ceremony". I can simply apply
"map" to the return value of the function and get what I need, rather
than having to use "values" in the callee, then "call-with-values" in
the caller and establish a binding for each list.
Toggle quote (3 lines)
> Of course, it would be simpler to just use a single list of tagged
> items.

Do you feel strongly about it? I don't; I'm open to try to use a tagged
list if you feel this is worth it.

Maxim
M
M
Maxim Cournoyer wrote on 16 Jun 2019 16:36
Re: bug#24450: [PATCHv3] Re: pypi importer outputs strange character series in optional dependency case.
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 24450@debbugs.gnu.org)
87pnndzjem.fsf_-_@gmail.com
Here's the current patch set, version 3, with the modifications
as discussed in the previous exchanges: