[PATCH] Update to Pandas, enable Excel writer support

  • Done
Details
3 participants
  • Maxim Cournoyer
  • Marius Bakke
  • Ricardo Wurmus
Owner
unassigned
Submitted by
Maxim Cournoyer
Severity
normal
Maxim Cournoyer wrote on 16 Mar 2019 04:48
(name . guix-patches)(address . guix-patches@gnu.org)
87k1gzqy2c.fsf@gmail.com
From 4903d5a179a0a441910217653285226981905c4e Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Fri, 11 Jan 2019 11:25:54 -0500
Subject: [PATCH 1/5] gnu: Add python-et-xmlfile.

* gnu/packages/python-xyz.scm (python-et-xmlfile): New variable.
---
gnu/packages/python-xyz.scm | 35 ++++++++++++++++++++++++++++++++++-
1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index fa2701bc1c..6c2f01e3cc 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -54,7 +54,7 @@
;;; Copyright © 2018 Nicolas Goaziou <mail@nicolasgoaziou.fr>
;;; Copyright © 2018 Oleg Pykhalov <go.wigust@gmail.com>
;;; Copyright © 2018 Clément Lassieur <clement@lassieur.org>
-;;; Copyright © 2018 Maxim Cournoyer <maxim.cournoyer@gmail.com>
+;;; Copyright © 2018, 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
;;; Copyright © 2018 Luther Thompson <lutheroto@gmail.com>
;;; Copyright © 2018 Vagrant Cascadian <vagrant@debian.org>
;;; Copyright © 2019 Brett Gilio <brettg@posteo.net>
@@ -832,6 +832,39 @@ messages in color.")
(define-public python2-coloredlogs
(package-with-python2 python-coloredlogs))
+(define-public python-et-xmlfile
+ (package
+ (name "python-et-xmlfile")
+ (version "1.0.1")
+ (source
+ (origin
+ (method url-fetch)
+ (uri (pypi-uri "et_xmlfile" version))
+ (sha256
+ (base32
+ "0nrkhcb6jdrlb6pwkvd4rycw34y3s931hjf409ij9xkjsli9fkb1"))))
+ (build-system python-build-system)
+ (arguments
+ `(#:phases (modify-phases %standard-phases
+ (replace 'check
+ (lambda _
+ (invoke "pytest"))))))
+ (native-inputs
+ `(("python-pytest" ,python-pytest)
+ ("python-lxml" ,python-lxml)))
+ (home-page
+ "https://bitbucket.org/openpyxl/et_xmlfile")
+ (synopsis
+ "Low memory implementation of @code{lxml.xmlfile}")
+ (description
+ "This Python library is based upon the @code{xmlfile} module
+from @code{lxml}. It aims to provide a low memory, compatible implementation
+of @code{xmlfile}.")
+ (license license:expat)))
+
+(define-public python2-et-xmlfile
+ (package-with-python2 python-et-xmlfile))
+
(define-public python-eventlet
(package
(name "python-eventlet")
--
2.20.1
From 992f103c131897a1c0f310490cab46dd23e9c625 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Fri, 11 Jan 2019 11:26:28 -0500
Subject: [PATCH 2/5] gnu: Add python-jdcal.

* gnu/packages/python-xyz.scm (python-jdcal): New variable.
---
gnu/packages/python-xyz.scm | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index 6c2f01e3cc..f9595ae2cb 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -1791,6 +1791,36 @@ version numbers.")
(define-public python2-vcversioner
(package-with-python2 python-vcversioner))
+(define-public python-jdcal
+ (package
+ (name "python-jdcal")
+ (version "1.4")
+ (source
+ (origin
+ (method url-fetch)
+ (uri (pypi-uri "jdcal" version))
+ (sha256
+ (base32
+ "1ja6j2xq97bsl6rv09mhdx7n0xnrsfx0mj5xqza0mxghqmkm02pa"))))
+ (build-system python-build-system)
+ (arguments
+ `(#:phases (modify-phases %standard-phases
+ (replace 'check
+ (lambda _
+ (invoke "pytest"))))))
+ (native-inputs
+ `(("python-pytest" ,python-pytest)))
+ (home-page "https://github.com/phn/jdcal")
+ (synopsis
+ "Functions to convert between Julian dates and Gregorian dates")
+ (description
+ "This Python library provides functions for converting between Julian
+dates and Gregorian dates.")
+ (license license:bsd-2)))
+
+(define-public python2-jdcal
+ (package-with-python2 python-jdcal))
+
(define-public python-jsonschema
(package
(name "python-jsonschema")
--
2.20.1
From 4b72713af3e8de61bae8d7a76895a9b2f32f09c9 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Fri, 11 Jan 2019 11:27:10 -0500
Subject: [PATCH 3/5] gnu: Add python-openpyxl.

* gnu/packages/python-xyz.scm (python-openpyxl): New variable.
---
gnu/packages/python-xyz.scm | 38 +++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index f9595ae2cb..427d6818e2 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -142,6 +142,7 @@
#:use-module (guix packages)
#:use-module (guix download)
#:use-module (guix git-download)
+ #:use-module (guix hg-download)
#:use-module (guix utils)
#:use-module (guix build-system gnu)
#:use-module (guix build-system cmake)
@@ -865,6 +866,43 @@ of @code{xmlfile}.")
(define-public python2-et-xmlfile
(package-with-python2 python-et-xmlfile))
+(define-public python-openpyxl
+ (package
+ (name "python-openpyxl")
+ (version "2.6.0")
+ (source
+ (origin
+ (method hg-fetch)
+ (uri (hg-reference
+ (url "https://bitbucket.org/openpyxl/openpyxl")
+ (changeset version)))
+ (file-name (string-append name "-" version "-checkout"))
+ (sha256
+ (base32
+ "1x47ngn7ybaqdbvg90c8h2x0j6yfdfj25gjfinp2w5rf62gsany7"))))
+ (build-system python-build-system)
+ (arguments
+ `(#:phases (modify-phases %standard-phases
+ (replace 'check
+ (lambda _
+ (invoke "pytest"))))))
+ (native-inputs
+ `(("python-lxml" ,python-lxml)
+ ;; For the test suite.
+ ("python-pillow" ,python-pillow)
+ ("python-pytest" ,python-pytest)))
+ (propagated-inputs
+ `(("python-et-xmlfile" ,python-et-xmlfile)
+ ("python-jdcal" ,python-jdcal)))
+ (home-page "https://openpyxl.readthedocs.io")
+ (synopsis
+ "Python library to read/write Excel 2010 XLSX/XLSM files")
+ (description
+ "This Python library allows reading and writing to the Excel XLSX, XLSM,
+XLTX and XLTM file formats that are defined by the Office Open XML (OOXML)
+standard.")
+ (license license:expat)))
+
(define-public python-eventlet
(package
(name "python-eventlet")
--
2.20.1
From d13e1c487fc63f5ec878a492c048b947b6cb11e7 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Fri, 11 Jan 2019 13:24:43 -0500
Subject: [PATCH 4/5] gnu: python-pandas: Enable Excel file format support.

* gnu/packages/python-xyz.scm (python-pandas)[phases]{check}: Reinstate the
tests from the test_excel.py module.
* gnu/packages/python-xyz.scm (python-pandas)[propagated-inputs]: Add
python-openpyxl and python-xlrd.
---
gnu/packages/python-xyz.scm | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index 427d6818e2..321c881f4d 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -1046,7 +1046,6 @@ human-friendly syntax.")
'("pandas/tests/io/conftest.py"
"pandas/tests/io/json/test_compression.py"
"pandas/tests/io/parser/test_network.py"
- "pandas/tests/io/test_excel.py"
"pandas/tests/io/test_parquet.py"))
(invoke "pytest" "-vv" "pandas" "--skip-slow"
"--skip-network" "-k"
@@ -1054,8 +1053,10 @@ human-friendly syntax.")
"not test_read_s3_jsonl"))))))))
(propagated-inputs
`(("python-numpy" ,python-numpy)
+ ("python-openpyxl" ,python-openpyxl)
("python-pytz" ,python-pytz)
- ("python-dateutil" ,python-dateutil)))
+ ("python-dateutil" ,python-dateutil)
+ ("python-xlrd" ,python-xlrd)))
(native-inputs
`(("python-cython" ,python-cython)
("python-beautifulsoup4" ,python-beautifulsoup4)
--
2.20.1
Marius Bakke wrote on 17 Mar 2019 20:45
87tvg18eui.fsf@fastmail.com
Hello Maxim,

Overall LGTM, some comments inline.

[...]

> +(define-public python-et-xmlfile
> + (package
> + (name "python-et-xmlfile")
> + (version "1.0.1")
> + (source
> + (origin
> + (method url-fetch)
> + (uri (pypi-uri "et_xmlfile" version))
> + (sha256
> + (base32
> + "0nrkhcb6jdrlb6pwkvd4rycw34y3s931hjf409ij9xkjsli9fkb1"))))
> + (build-system python-build-system)
> + (arguments
> + `(#:phases (modify-phases %standard-phases
> + (replace 'check
> + (lambda _
> + (invoke "pytest"))))))
> + (native-inputs
> + `(("python-pytest" ,python-pytest)
> + ("python-lxml" ,python-lxml)))

Should python-lxml be a propagated-input?

> + (home-page
> + "https://bitbucket.org/openpyxl/et_xmlfile")
> + (synopsis
> + "Low memory implementation of @code{lxml.xmlfile}")

Please remove the extra newlines in these patches.

> + (description
> + "This Python library is based upon the @code{xmlfile} module
> +from @code{lxml}. It aims to provide a low memory, compatible implementation
> +of @code{xmlfile}.")
> + (license license:expat)))

[...]
> +(define-public python-openpyxl
> + (package
> + (name "python-openpyxl")
> + (version "2.6.0")
> + (source
> + (origin
> + (method hg-fetch)
> + (uri (hg-reference
> + (url "https://bitbucket.org/openpyxl/openpyxl")
> + (changeset version)))
> + (file-name (string-append name "-" version "-checkout"))
> + (sha256
> + (base32
> + "1x47ngn7ybaqdbvg90c8h2x0j6yfdfj25gjfinp2w5rf62gsany7"))))

Can you leave a comment about why we take it from this repository
instead of PyPi?

> + (native-inputs
> + `(("python-lxml" ,python-lxml)

Why is python-lxml a native-input?

> + ;; For the test suite.
> + ("python-pillow" ,python-pillow)
> + ("python-pytest" ,python-pytest)))
> + (propagated-inputs
> + `(("python-et-xmlfile" ,python-et-xmlfile)
> + ("python-jdcal" ,python-jdcal)))
> + (home-page "https://openpyxl.readthedocs.io")
> + (synopsis
> + "Python library to read/write Excel 2010 XLSX/XLSM files")
> + (description
> + "This Python library allows reading and writing to the Excel XLSX, XLSM,
> +XLTX and XLTM file formats that are defined by the Office Open XML (OOXML)
> +standard.")
> + (license license:expat)))

[...]

> From ad1f0efe4a5c3d28ee9d7e2e5da275721af9e172 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 9 Feb 2019 00:25:51 -0500
> Subject: [PATCH 5/5] gnu: python-pandas: Update to 0.24.2.
>
> * gnu/packages/python-xyz.scm (python-pandas): Update to 0.24.2.
> [phases]{patch-which}: Add phase.
> [inputs]: Add WHICH.
> ---
> gnu/packages/python-xyz.scm | 65 ++++++++++++++++++++++---------------
> 1 file changed, 38 insertions(+), 27 deletions(-)
>
> diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
> index 321c881f4d..bbf1403758 100644
> --- a/gnu/packages/python-xyz.scm
> +++ b/gnu/packages/python-xyz.scm
> @@ -1014,56 +1014,67 @@ human-friendly syntax.")
> (define-public python-pandas
> (package
> (name "python-pandas")
> - (version "0.23.4")
> + (version "0.24.2")
> (source
> (origin
> (method url-fetch)
> (uri (pypi-uri "pandas" version))
> (sha256
> - (base32 "1x54pd7hr3y7qahx6b5bf2wzj54xvl8r3s1h4pl254pnmi3wl92v"))))
> + (base32 "18imlm8xbhcbwy4wa957a1fkamrcb0z988z006jpfda3ki09z4ag"))))
> (build-system python-build-system)
> (arguments
> `(#:modules ((guix build utils)
> (guix build python-build-system)
> (ice-9 ftw)
> (srfi srfi-26))
> - #:phases (modify-phases %standard-phases
> - (replace 'check
> - (lambda _
> - (let ((build-directory
> - (string-append
> - (getcwd) "/build/"
> - (car (scandir "build"
> - (cut string-prefix? "lib." <>))))))
> - ;; Disable the "strict data files" option which causes
> - ;; the build to error out if required data files are not
> - ;; available (as is the case with PyPI archives).
> - (substitute* "setup.cfg"
> - (("addopts = --strict-data-files") "addopts = "))
> - (with-directory-excursion build-directory
> - ;; Delete tests that require "moto" which is not yet in Guix.
> - (for-each delete-file
> - '("pandas/tests/io/conftest.py"
> - "pandas/tests/io/json/test_compression.py"
> - "pandas/tests/io/parser/test_network.py"
> - "pandas/tests/io/test_parquet.py"))
> - (invoke "pytest" "-vv" "pandas" "--skip-slow"
> - "--skip-network" "-k"
> - ;; XXX: Due to the deleted tests above.
> - "not test_read_s3_jsonl"))))))))
> + #:phases
> + (modify-phases %standard-phases
> + (add-after 'unpack 'patch-which
> + (lambda* (#:key inputs #:allow-other-keys)
> + (let ((which (assoc-ref inputs "which")))
> + (substitute* "pandas/io/clipboard/__init__.py"
> + (("^CHECK_CMD = .*")
> + (string-append "CHECK_CMD = \"" which "\"\n"))))
> + #t))
> + (replace 'check
> + (lambda _
> + (let ((build-directory
> + (string-append
> + (getcwd) "/build/"
> + (car (scandir "build"
> + (cut string-prefix? "lib." <>))))))
> + ;; Disable the "strict data files" option which causes
> + ;; the build to error out if required data files are not
> + ;; available (as is the case with PyPI archives).
> + (substitute* "setup.cfg"
> + (("addopts = --strict-data-files") "addopts = "))
> + (with-directory-excursion build-directory
> + ;; Delete tests that require "moto" which is not yet in Guix.
> + (for-each delete-file
> + '("pandas/tests/io/conftest.py"
> + "pandas/tests/io/json/test_compression.py"
> + "pandas/tests/io/parser/test_network.py"
> + "pandas/tests/io/test_parquet.py"))
> + (invoke "pytest" "-vv" "pandas" "--skip-slow"
> + "--skip-network" "-k"
> + ;; XXX: Due to the deleted tests above.
> + "not test_read_s3_jsonl"))))))))

LGTM, although I'd prefer not to reindent the phases section. It makes
the patch harder to read, and I prefer the "deep" indentation for
logically separate chunks of code anyway (though I am probably in the
minority here..). YMMV!

Thanks!

> (propagated-inputs
> `(("python-numpy" ,python-numpy)
> ("python-openpyxl" ,python-openpyxl)
> ("python-pytz" ,python-pytz)
> ("python-dateutil" ,python-dateutil)
> ("python-xlrd" ,python-xlrd)))
> + (inputs
> + `(("which" ,which)))
> (native-inputs
> `(("python-cython" ,python-cython)
> ("python-beautifulsoup4" ,python-beautifulsoup4)
> ("python-lxml" ,python-lxml)
> ("python-html5lib" ,python-html5lib)
> ("python-nose" ,python-nose)
> - ("python-pytest" ,python-pytest)))
> + ("python-pytest" ,python-pytest)
> + ("python-pytest-mock" ,python-pytest-mock)))
> (home-page "https://pandas.pydata.org")
> (synopsis "Data structures for data analysis, time series, and statistics")
> (description
> --
> 2.20.1

Maxim Cournoyer wrote on 18 Mar 2019 14:18
(name . Marius Bakke)(address . mbakke@fastmail.com)(address . 34882-done@debbugs.gnu.org)
87sgvkiam8.fsf@gmail.com
Hi Marius, and thanks for having a look!

Marius Bakke <mbakke@fastmail.com> writes:

> Hello Maxim,
>
> Overall LGTM, some comments inline.
>
> [...]
>
>> +(define-public python-et-xmlfile
>> + (package
>> + (name "python-et-xmlfile")
>> + (version "1.0.1")
>> + (source
>> + (origin
>> + (method url-fetch)
>> + (uri (pypi-uri "et_xmlfile" version))
>> + (sha256
>> + (base32
>> + "0nrkhcb6jdrlb6pwkvd4rycw34y3s931hjf409ij9xkjsli9fkb1"))))
>> + (build-system python-build-system)
>> + (arguments
>> + `(#:phases (modify-phases %standard-phases
>> + (replace 'check
>> + (lambda _
>> + (invoke "pytest"))))))
>> + (native-inputs
>> + `(("python-pytest" ,python-pytest)
>> + ("python-lxml" ,python-lxml)))
>
> Should python-lxml be a propagated-input?

No, otherwise this package would be pretty pointless, as it aims to be
a "low memory implementation of a component of lxml" :-). The lxml
dependency is used in the test suite (I'm guessing to validate that both
implementations' behaviors match).

>
>> + (home-page
>> + "https://bitbucket.org/openpyxl/et_xmlfile")
>> + (synopsis
>> + "Low memory implementation of @code{lxml.xmlfile}")
>
> Please remove the extra newlines in these patches.

Done.

>> + (description
>> + "This Python library is based upon the @code{xmlfile} module
>> +from @code{lxml}. It aims to provide a low memory, compatible implementation
>> +of @code{xmlfile}.")
>> + (license license:expat)))
>
> [...]
>
>> +(define-public python-openpyxl
>> + (package
>> + (name "python-openpyxl")
>> + (version "2.6.0")
>> + (source
>> + (origin
>> + (method hg-fetch)
>> + (uri (hg-reference
>> + (url "https://bitbucket.org/openpyxl/openpyxl")
>> + (changeset version)))
>> + (file-name (string-append name "-" version "-checkout"))
>> + (sha256
>> + (base32
>> + "1x47ngn7ybaqdbvg90c8h2x0j6yfdfj25gjfinp2w5rf62gsany7"))))
>
> Can you leave a comment about why we take it from this repository
> instead of PyPi?

Done. The reason is that the tests are missing from the PyPI
release.

>> + (native-inputs
>> + `(("python-lxml" ,python-lxml)
>
> Why is python-lxml a native-input?

Here also it is a test dependency. lxml is an optional backend. I've
moved the existing comment (" ;; For the test suite.") above this
native-input as well.

>> + ;; For the test suite.
>> + ("python-pillow" ,python-pillow)
>> + ("python-pytest" ,python-pytest)))
>> + (propagated-inputs
>> + `(("python-et-xmlfile" ,python-et-xmlfile)
>> + ("python-jdcal" ,python-jdcal)))
>> + (home-page "https://openpyxl.readthedocs.io")
>> + (synopsis
>> + "Python library to read/write Excel 2010 XLSX/XLSM files")
>> + (description
>> + "This Python library allows reading and writing to the Excel XLSX, XLSM,
>> +XLTX and XLTM file formats that are defined by the Office Open XML (OOXML)
>> +standard.")
>> + (license license:expat)))
>
> [...]
>
>> From ad1f0efe4a5c3d28ee9d7e2e5da275721af9e172 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 9 Feb 2019 00:25:51 -0500
>> Subject: [PATCH 5/5] gnu: python-pandas: Update to 0.24.2.
>>
>> * gnu/packages/python-xyz.scm (python-pandas): Update to 0.24.2.
>> [phases]{patch-which}: Add phase.
>> [inputs]: Add WHICH.
>> ---
>> gnu/packages/python-xyz.scm | 65 ++++++++++++++++++++++---------------
>> 1 file changed, 38 insertions(+), 27 deletions(-)
>>
>> diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
>> index 321c881f4d..bbf1403758 100644
>> --- a/gnu/packages/python-xyz.scm
>> +++ b/gnu/packages/python-xyz.scm
>> @@ -1014,56 +1014,67 @@ human-friendly syntax.")
>> (define-public python-pandas
>> (package
>> (name "python-pandas")
>> - (version "0.23.4")
>> + (version "0.24.2")
>> (source
>> (origin
>> (method url-fetch)
>> (uri (pypi-uri "pandas" version))
>> (sha256
>> - (base32 "1x54pd7hr3y7qahx6b5bf2wzj54xvl8r3s1h4pl254pnmi3wl92v"))))
>> + (base32 "18imlm8xbhcbwy4wa957a1fkamrcb0z988z006jpfda3ki09z4ag"))))
>> (build-system python-build-system)
>> (arguments
>> `(#:modules ((guix build utils)
>> (guix build python-build-system)
>> (ice-9 ftw)
>> (srfi srfi-26))
>> - #:phases (modify-phases %standard-phases
>> - (replace 'check
>> - (lambda _
>> - (let ((build-directory
>> - (string-append
>> - (getcwd) "/build/"
>> - (car (scandir "build"
>> - (cut string-prefix? "lib." <>))))))
>> - ;; Disable the "strict data files" option which causes
>> - ;; the build to error out if required data files are not
>> - ;; available (as is the case with PyPI archives).
>> - (substitute* "setup.cfg"
>> - (("addopts = --strict-data-files") "addopts = "))
>> - (with-directory-excursion build-directory
>> - ;; Delete tests that require "moto" which is not yet in Guix.
>> - (for-each delete-file
>> - '("pandas/tests/io/conftest.py"
>> - "pandas/tests/io/json/test_compression.py"
>> - "pandas/tests/io/parser/test_network.py"
>> - "pandas/tests/io/test_parquet.py"))
>> - (invoke "pytest" "-vv" "pandas" "--skip-slow"
>> - "--skip-network" "-k"
>> - ;; XXX: Due to the deleted tests above.
>> - "not test_read_s3_jsonl"))))))))
>> + #:phases
>> + (modify-phases %standard-phases
>> + (add-after 'unpack 'patch-which
>> + (lambda* (#:key inputs #:allow-other-keys)
>> + (let ((which (assoc-ref inputs "which")))
>> + (substitute* "pandas/io/clipboard/__init__.py"
>> + (("^CHECK_CMD = .*")
>> + (string-append "CHECK_CMD = \"" which "\"\n"))))
>> + #t))
>> + (replace 'check
>> + (lambda _
>> + (let ((build-directory
>> + (string-append
>> + (getcwd) "/build/"
>> + (car (scandir "build"
>> + (cut string-prefix? "lib." <>))))))
>> + ;; Disable the "strict data files" option which causes
>> + ;; the build to error out if required data files are not
>> + ;; available (as is the case with PyPI archives).
>> + (substitute* "setup.cfg"
>> + (("addopts = --strict-data-files") "addopts = "))
>> + (with-directory-excursion build-directory
>> + ;; Delete tests that require "moto" which is not yet in Guix.
>> + (for-each delete-file
>> + '("pandas/tests/io/conftest.py"
>> + "pandas/tests/io/json/test_compression.py"
>> + "pandas/tests/io/parser/test_network.py"
>> + "pandas/tests/io/test_parquet.py"))
>> + (invoke "pytest" "-vv" "pandas" "--skip-slow"
>> + "--skip-network" "-k"
>> + ;; XXX: Due to the deleted tests above.
>> + "not test_read_s3_jsonl"))))))))
>
> LGTM, although I'd prefer not to reindent the phases section. It makes
> the patch harder to read, and I prefer the "deep" indentation for
> logically separate chunks of code anyway (though I am probably in the
> minority here..). YMMV!

While I loathe any "deep" indentation, I've reverted my indentation
change here as it was a bit gratuitous (I needn't struggle to fit into
the 80 chars guideline).

> Thanks!

I pushed this change as c0d43f6223 with modifications based on your feedback.

Thank you!

Maxim
Closed
Ricardo Wurmus wrote on 18 Mar 2019 14:50
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 34882@debbugs.gnu.org)
87sgvkxpel.fsf@elephly.net
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> From ad1f0efe4a5c3d28ee9d7e2e5da275721af9e172 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sat, 9 Feb 2019 00:25:51 -0500
> Subject: [PATCH 5/5] gnu: python-pandas: Update to 0.24.2.
>
> * gnu/packages/python-xyz.scm (python-pandas): Update to 0.24.2.
> [phases]{patch-which}: Add phase.
> [inputs]: Add WHICH.

I have no objections to updating Pandas, but please make sure that this
version of Pandas works well with the other scientific Python packages
like numpy, scipy, sklearn, numba, etc.

These packages usually have rather strict interdependencies and need to
be updated together to avoid breakage.

--
Ricardo
Ricardo Wurmus wrote on 18 Mar 2019 18:28
Re: bug#34882: [PATCH] Update to Pandas, enable Excel writer support
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87lg1cxfbp.fsf@elephly.net
Hi Maxim,

> I pushed this change as c0d43f6223 with modifications based on your feedback.

Have you checked if this version of Pandas is known to be compatible
with our versions of the scientific Python stack, including numpy,
scipy, statsmodels, matplotlib, sklearn, etc?

--
Ricardo
Closed
Maxim Cournoyer wrote on 18 Mar 2019 22:04
Re: [bug#34882] [PATCH] Update to Pandas, enable Excel writer support
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 34882@debbugs.gnu.org)
87ef73rj0x.fsf@gmail.com
Hello Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> From ad1f0efe4a5c3d28ee9d7e2e5da275721af9e172 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sat, 9 Feb 2019 00:25:51 -0500
>> Subject: [PATCH 5/5] gnu: python-pandas: Update to 0.24.2.
>>
>> * gnu/packages/python-xyz.scm (python-pandas): Update to 0.24.2.
>> [phases]{patch-which}: Add phase.
>> [inputs]: Add WHICH.
>
> I have no objections to updating Pandas, but please make sure that this
> version of Pandas works well with the other scientific Python packages
> like numpy, scipy, sklearn, numba, etc.
>
> These packages usually have rather strict interdependencies and need to
> be updated together to avoid breakage.

I had already gone ahead and merged those changes, but I retested the
following (on master) to make sure:

for o in $(./pre-inst-env guix refresh -l python-pandas | cut -d':' -f2); do ./pre-inst-env guix build --check --no-grafts "$o" && echo "$o OK" >> build.results || echo "$o NOK" >> build.results; done
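
For readability, the one-liner above can be sketched as a small shell
function.  This is only an illustration: the names record_build_results and
BUILD_CMD are made up here so that the loop can be tried with a stand-in
command; by default it runs the same build command as above.  It uses
if/else rather than '&& ... ||', so a failing 'echo' cannot cause a
spurious NOK entry.

```shell
#!/bin/sh
# Sketch only: record_build_results and BUILD_CMD are hypothetical names,
# not part of Guix.  BUILD_CMD defaults to the command used in this thread.
BUILD_CMD=${BUILD_CMD:-./pre-inst-env guix build --check --no-grafts}

# Usage: record_build_results RESULTS_FILE PACKAGE...
# Appends "PACKAGE OK" or "PACKAGE NOK" to RESULTS_FILE for each package.
record_build_results() {
    results=$1
    shift
    for pkg in "$@"; do
        # BUILD_CMD is left unquoted on purpose so that it splits into
        # a command name and its options.
        if $BUILD_CMD "$pkg"; then
            echo "$pkg OK" >> "$results"
        else
            echo "$pkg NOK" >> "$results"
        fi
    done
}

# Example (the package list comes from 'guix refresh -l', as above):
# record_build_results build.results \
#     $(./pre-inst-env guix refresh -l python-pandas | cut -d':' -f2)
```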

And then:
$ cat build.results
cnvkit@0.9.5 OK
deeptools@3.1.3 NOK
nanopolish@0.10.2-1.50e8b5c NOK
pigx@0.0.3 NOK
python-biom-format@2.1.7 NOK
python-feather-format@0.4.0 NOK
python-hic2cool@0.4.2 OK
python-plastid@0.4.8 OK
python-pybedtools@0.8.0 OK
python-pygenometracks@2.0 OK
python-scanpy@1.2.2 OK
python-scikit-image@0.14.2 OK
python-velocyto@0.17.17 OK

So nanopolish, deeptools, python-feather-format, python-biom-format, and
pigx are currently broken, but...

When using master at commit g8c72f13fd4 (and re-running the same script as
earlier):

cat build.results.g8c72f13fd4
cnvkit@0.9.5 OK
deeptools@3.1.3 NOK
nanopolish@0.10.2-1.50e8b5c NOK
pigx@0.0.3 NOK
python-biom-format@2.1.7 NOK
python-feather-format@0.4.0 NOK
python-hic2cool@0.4.2 OK
python-plastid@0.4.8 OK
python-pybedtools@0.8.0 NOK
python-pygenometracks@2.0 OK
python-scanpy@1.2.2 OK
python-scikit-image@0.14.2 OK
python-velocyto@0.17.17 OK

they already were!
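
The comparison between the two result files can also be done mechanically.
The following is a minimal sketch, assuming both files use the
"PACKAGE STATUS" line format shown above; the function name status_changes
is invented for this illustration.

```shell
#!/bin/sh
# Sketch only: status_changes is a hypothetical helper, not a Guix command.
# Usage: status_changes OLD_RESULTS NEW_RESULTS
# Prints "PACKAGE OLD -> NEW" for each package whose status differs.
status_changes() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && before[$1] != $2 {
             print $1, before[$1], "->", $2
         }' "$1" "$2"
}

# Example, with the file names used in this thread:
# status_changes build.results.g8c72f13fd4 build.results
```

On the two listings above, the only package whose status differs is
python-pybedtools@0.8.0, which went from NOK to OK.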

I've also found out while testing that Pandas was not reproducible (this
was true also before my changes).

I will create tickets for all of these problems.

Apart from that, I have successfully run a script which uses Pandas
(and the Pandas test suite passes).

Are these verifications sufficient? And why does 'guix refresh -l' seem
to miss some packages which depend on python-pandas, e.g. python-seaborn?

Thanks,

Maxim
Ricardo Wurmus wrote on 18 Mar 2019 23:34
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 34882@debbugs.gnu.org)
87a7hryfpi.fsf@elephly.net
Hi Maxim,

> deeptools@3.1.3 NOK

I can’t reproduce this. I get a substitute for deeptools.

> nanopolish@0.10.2-1.50e8b5c NOK

I can’t reproduce this. I get a substitute for nanopolish.

> pigx@0.0.3 NOK

This is broken since the upgrade to python-loompy. The authors are
working on fixing it.

> python-biom-format@2.1.7 NOK

I just fixed this.

> python-feather-format@0.4.0 NOK

This is broken because apache-arrow is broken. I’m trying to fix this
now. I just updated arrow to 0.10.0 (couldn’t build 0.12.0).

> python-pybedtools@0.8.0 NOK

I can’t reproduce this. I get a substitute.

--
Ricardo
Maxim Cournoyer wrote on 19 Mar 2019 02:15
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 34882@debbugs.gnu.org)
87r2b3mzqg.fsf@gmail.com
Hello Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

> Hi Maxim,
>
>> deeptools@3.1.3 NOK
>
> I can’t reproduce this. I get a substitute for deeptools.

It builds, but isn't reproducible. Try with --check and --no-grafts, it
should give you something like: guix build: error: derivation
`/gnu/store/7a80qjk898f7lhh46bjvv6mbbsrgaq5i-deeptools-3.1.3.drv' may
not be deterministic: output
`/gnu/store/f3z6fczw70j6692ddy467pbagbjck009-deeptools-3.1.3' differs
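
Since a plain build can succeed while the --check rebuild fails, it can help
to record the two outcomes separately instead of a single NOK.  A rough
sketch follows; classify_build and GUIX_BUILD are hypothetical names used
only for this illustration, and a failing --check run is assumed here to
mean that the outputs differ, which may not be the only possible cause.

```shell
#!/bin/sh
# Sketch only: classify_build and GUIX_BUILD are hypothetical names,
# not part of Guix.
GUIX_BUILD=${GUIX_BUILD:-guix build}

# Usage: classify_build PACKAGE
# Prints "PACKAGE FAILS" if the package does not build at all,
# "PACKAGE UNREPRODUCIBLE" if it builds but the --check rebuild fails,
# and "PACKAGE OK" otherwise.  Build output is sent to stderr so that
# only the verdict appears on stdout.
classify_build() {
    pkg=$1
    if ! $GUIX_BUILD --no-grafts "$pkg" >&2; then
        echo "$pkg FAILS"
    elif ! $GUIX_BUILD --check --no-grafts "$pkg" >&2; then
        echo "$pkg UNREPRODUCIBLE"
    else
        echo "$pkg OK"
    fi
}

# Example:
# classify_build deeptools
```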

>> nanopolish@0.10.2-1.50e8b5c NOK
>
> I can’t reproduce this. I get a substitute for nanopolish.

It builds, but isn't reproducible.

>> pigx@0.0.3 NOK
>
> This is broken since the upgrade to python-loompy. The authors are
> working on fixing it.

OK

>> python-biom-format@2.1.7 NOK
>
> I just fixed this.

Cool!

>> python-feather-format@0.4.0 NOK
>
> This is broken because apache-arrow is broken. I’m trying to fix this
> now. I just updated arrow to 0.10.0 (couldn’t build 0.12.0).

This builds fine here now! :-)

>> python-pybedtools@0.8.0 NOK
>
> I can’t reproduce this. I get a substitute.

Yeah, this builds fine on master, but it seems it had trouble on the older
commit. Nothing to worry about!

Thanks for the follow-up! Should we create tickets for the
reproducibility issues?

Maxim
Ricardo Wurmus wrote on 19 Mar 2019 09:41
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 34882@debbugs.gnu.org)
87bm27mf32.fsf@elephly.net
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>>> deeptools@3.1.3 NOK
>>
>> I can’t reproduce this. I get a substitute for deeptools.
>
> It builds, but isn't reproducible. Try with --check and --no-grafts, it
> should give you something like: guix build: error: derivation
> `/gnu/store/7a80qjk898f7lhh46bjvv6mbbsrgaq5i-deeptools-3.1.3.drv' may
> not be deterministic: output
> `/gnu/store/f3z6fczw70j6692ddy467pbagbjck009-deeptools-3.1.3' differs

Indeed.

“lib/python3.7/site-packages/deeptoolsintervals/tree.cpython-37m-x86_64-linux-gnu.so”
differs, but looking at the diffoscope output I can’t figure out why.

>>> nanopolish@0.10.2-1.50e8b5c NOK
>>
>> I can’t reproduce this. I get a substitute for nanopolish.
>
> It builds, but isn't reproducible.

Yes, here it’s “bin/nanopolish” that differs. I’ll investigate.

[…]
> Thanks for the follow-up! Should we create tickets for the
> reproducibility issues?

Sure, thanks!

--
Ricardo
Maxim Cournoyer wrote on 21 Mar 2019 04:14
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 34882@debbugs.gnu.org)
87ef706hqn.fsf@gmail.com
I created the issues #34934 and #34935 to track the reproducibility
problems of deeptools and nanopolish, respectively.