[PATCH 0/2] gnu: Add python-pyjanitor.

Status: Done
Participants: Sharlatan Hellseher, Troy Figiel
Owner: unassigned
Submitted by: Troy Figiel
Severity: normal

Troy Figiel wrote on 28 Jan 23:49 +0100
(address . guix-patches@gnu.org)
87il3df4y5.fsf@troyfigiel.com
This patch series adds python-pyjanitor and its dependency python-unyt.

Troy Figiel (2):
gnu: Add python-unyt.
gnu: Add python-pyjanitor.

gnu/packages/python-science.scm | 88 +++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)


base-commit: 08ed3ec64ecd571d92d497b2493f5c0225102c99
--
2.42.0
Troy Figiel wrote on 28 Jan 22:47 +0100
[PATCH 1/2] gnu: Add python-unyt.
(address . 68789@debbugs.gnu.org)
87h6ixf4u6.fsf@troyfigiel.com
* gnu/packages/python-science.scm (python-unyt): New variable.
---
gnu/packages/python-science.scm | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 9d72608de4..3013c77c34 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -42,6 +42,7 @@
(define-module (gnu packages python-science)
#:use-module ((guix licenses) #:prefix license:)
#:use-module (gnu packages)
+ #:use-module (gnu packages astronomy)
#:use-module (gnu packages base)
#:use-module (gnu packages bioinformatics)
#:use-module (gnu packages boost)
@@ -1217,6 +1218,34 @@ (define-public python-statannot
annotations on an existing boxplots and barplots generated by seaborn.")
(license license:expat)))
+(define-public python-unyt
+ (package
+ (name "python-unyt")
+ (version "3.0.1")
+ (source
+ (origin
+ (method url-fetch)
+ (uri (pypi-uri "unyt" version))
+ (sha256
+ (base32 "00900bw24rxgcgwgxp9xlx0l5im96r1n5hn0r3mxvbdgc3lyyq48"))))
+ (build-system pyproject-build-system)
+ (propagated-inputs (list python-h5py ;optional import
+ python-matplotlib ;optional import
+ python-numpy
+ python-sympy))
+ ;; python-astropy and python-pint are also optional imports, but we do not
+ ;; propagate them due to their sizes.
+ (native-inputs (list python-astropy python-pint python-pytest))
+ (home-page "https://unyt.readthedocs.io")
+ (synopsis "Library for working with data that has physical units")
+ (description
+ "Writing code that deals with data with physical units can be confusing.
+A function might return an array but at least with plain @code{numpy}, there
+is no way to easily tell what the units of the data are without somehow
+knowing a priori. @code{unyt} handles this problem by providing a subclass of
+the @code{ndarray} class in @code{numpy} that is unit aware.")
+ (license license:bsd-3)))
+
(define-public python-upsetplot
(package
(name "python-upsetplot")
--
2.42.0
Troy Figiel wrote on 28 Jan 23:13 +0100
[PATCH 2/2] gnu: Add python-pyjanitor.
(address . 68789@debbugs.gnu.org)
87fryhf4u0.fsf@troyfigiel.com
* gnu/packages/python-science.scm (python-pyjanitor): New variable.
---
gnu/packages/python-science.scm | 59 +++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 3013c77c34..00b7e6cae1 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -48,6 +48,7 @@ (define-module (gnu packages python-science)
#:use-module (gnu packages boost)
#:use-module (gnu packages build-tools)
#:use-module (gnu packages check)
+ #:use-module (gnu packages chemistry)
#:use-module (gnu packages cpp)
#:use-module (gnu packages crypto)
#:use-module (gnu packages databases)
@@ -771,6 +772,64 @@ (define-public python-pandera
@end itemize")
(license license:expat)))
+(define-public python-pyjanitor
+ (package
+ (name "python-pyjanitor")
+ (version "0.26.0")
+ (source
+ (origin
+ ;; The build requires the mkdocs directory for the description in
+ ;; setup.py. This is not included in the PyPI tarball.
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/pyjanitor-devs/pyjanitor")
+ (commit (string-append "v" version))))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32 "1f8xbl1k9l2z56bapp7v6bd3016zrk48igcaz6hb553r6yfl7vfx"))))
+ (build-system pyproject-build-system)
+ ;; Pyjanitor has an extensive test suite. For quick debugging, the tests
+ ;; marked turtle can be skipped using "-m" "not turtle".
+ (arguments
+ (list
+ #:test-flags '(list
+ ;; Tries to connect to the internet.
+ "-k"
+ "not test_is_connected"
+
+ ;; PySpark has not been packaged yet.
+ "--ignore"
+ "tests/spark")
+ #:phases #~(modify-phases %standard-phases
+ (add-before 'check 'set-env-ci
+ (lambda _
+ ;; Some tests are skipped if the JANITOR_CI_MACHINE
+ ;; variable is not set.
+ (setenv "JANITOR_CI_MACHINE" "1"))))))
+ (propagated-inputs (list python-multipledispatch
+ python-natsort
+ python-pandas-flavor
+ python-scipy
+
+ ;; Optional imports.
+ python-biopython ;biology submodule
+ python-unyt)) ;engineering submodule
+ (native-inputs (list python-pytest
+
+ ;; Optional imports. We do not propagate them due to
+ ;; their size.
+ python-numba ;speedup of joins
+ rdkit)) ;chemistry submodule
+ (home-page "https://github.com/pyjanitor-devs/pyjanitor")
+ (synopsis "Tools for cleaning and transforming pandas DataFrames")
+ (description
+ "@code{pyjanitor} provides a set of data cleaning routines for
+@code{pandas} DataFrames. These routines extend the method chaining API
+defined by @code{pandas} for a subset of its methods. Originally, this
+package was a port of the R package by the same name and it is inspired by the
+ease-of-use and expressiveness of the @code{dplyr} package.")
+ (license license:expat)))
+
(define-public python-pythran
(package
(name "python-pythran")
--
2.42.0
Sharlatan Hellseher wrote on 29 Jan 15:26 +0100
[PATCH 0/2] gnu: Add python-pyjanitor.
(name . Troy Figiel)(address . troy@troyfigiel.com)(address . 68789@debbugs.gnu.org)
87bk94tdvi.fsf@gmail.com
Hi Troy,

Thank you for the patches!

I'm in the process of packaging python-yt in (gnu packages astronomy)
and I noticed that python-unyt is part of it, which brought me here
:-) So I have started reviewing this issue.

One note: you introduced a module cycle that did not exist before:
astronomy->python-science->astronomy. If the requirement on
python-astropy is soft, let's silence it for now.

Also, I've already updated the whole chain depending on python-astropy
after its update to 6.0.0; I'm letting you know in case your work requires
a fresh Astropy version. It will be in review on the 20th of next month.

What do you think?

Regards,
Oleg
Troy Figiel wrote on 29 Jan 18:18 +0100
(name . Sharlatan Hellseher)(address . sharlatanus@gmail.com)(address . 68789@debbugs.gnu.org)
bce44714-c405-4dbc-a06d-998ad6345bda@troyfigiel.com
Hi Oleg,

Thanks for the check!

On 2024-01-29 15:26, Sharlatan Hellseher wrote:
> One note: you introduced a module cycle that did not exist before:
> astronomy->python-science->astronomy. If the requirement on
> python-astropy is soft, let's silence it for now.

Removing the python-astropy dependency should be fine for python-unyt. I
agree that avoiding module cycles would be better. If I recall
correctly, Astropy was only used in tests, because it has a similar
submodule dealing with physical units.

The build was successful and the cycle did not show up in the linter.
How did you find it? Did you happen to notice it when you saw the imports?

Best wishes,

Troy
Sharlatan Hellseher wrote on 29 Jan 18:31 +0100
(name . Troy Figiel)(address . troy@troyfigiel.com)(address . 68789@debbugs.gnu.org)
CAO+9K5o8cVbjYx8pht+L-yts7dne6mjGXpC5TqDKAfBCzuK0Cg@mail.gmail.com
Hi,


> How did you find it? Did you happen to notice it when you saw the imports?

It usually pops up in issues about efforts to break module cycles.

I'm not quite sure how critical it is right now, but there was a discussion
that cycles in modules slow down ~guix pull~.

Let's comment astropy out, with a note about the optional test dependency
and the potential module cycle.

Looking forward to v2; the patches look good.

If you have a wider plan of upcoming patches, please share it so we can
coordinate efforts ;-).

Regards,
Oleg
Troy Figiel wrote on 29 Jan 19:13 +0100
(name . Sharlatan Hellseher)(address . sharlatanus@gmail.com)(address . 68789@debbugs.gnu.org)
737483e8-7798-436d-96d5-67d4cfb47e85@troyfigiel.com
Now that you mention it, there are quite a few cycles. To name a few:

- astronomy->python-science->python-xyz->astronomy
- databases->python-xyz->databases
- bioinformatics->python-science->bioinformatics
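
A minimal sketch of how such cycles could be spotted, assuming a
hand-written import graph rather than the real module headers. The
`imports` table and the `find-cycle` helper are hypothetical and only
encode the edges named in the list above:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical import graph: each entry maps a module to the modules it
;; imports.  Only the edges mentioned in this thread are listed.
(define imports
  '((astronomy      . (python-science))
    (python-science . (python-xyz bioinformatics))
    (python-xyz     . (astronomy databases))
    (databases      . (python-xyz))
    (bioinformatics . (python-science))))

(define (neighbours node)
  ;; Modules imported by NODE, or the empty list if NODE is unknown.
  (or (assq-ref imports node) '()))

(define (find-cycle start)
  ;; Depth-first search from START.  Return the first path found that
  ;; leads back to START, or #f when START is not part of a cycle.
  (define visited (make-hash-table))
  (let visit ((node start) (path (list start)))
    (any (lambda (next)
           (cond ((eq? next start) (reverse (cons next path)))
                 ((hashq-ref visited next) #f)
                 (else
                  (hashq-set! visited next #t)
                  (visit next (cons next path)))))
         (neighbours node))))

;; Print one cycle (or #f) per module in the table.
(for-each (lambda (entry)
            (format #t "~a: ~a~%" (car entry) (find-cycle (car entry))))
          imports)
```

To run this against the real tree, the table would have to be built from
the `#:use-module` clauses of each file under gnu/packages, which this
sketch does not attempt.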

On 2024-01-29 18:31, Sharlatan Hellseher wrote:
> If you have a wider plan of upcoming patches, please share it so we can
> coordinate efforts ;-).

There is only the guix-devel list, right? No Python-specific list?

When it comes to the Python ecosystem, I have been looking at

- python-shap
- python-cocotb (#68153)
- ruff

Unfortunately, ruff has caused me some headaches since it uses a Rust
workspace definition. I will probably have to write to guix-devel for
advice sooner or later.

I also still have some Golang packages on my radar, since long-term I
would like to see opentofu and gotenberg included. That might be going
off-topic a bit :-)
Troy Figiel wrote on 29 Jan 19:18 +0100
[PATCH v2 0/2] gnu: Add python-pyjanitor.
(address . 68789@debbugs.gnu.org)
87ede0f1ck.fsf@troyfigiel.com
This is the updated patch series. I have rebased it on the current master and made the suggested changes.

Troy Figiel (2):
gnu: Add python-unyt.
gnu: Add python-pyjanitor.

gnu/packages/python-science.scm | 88 +++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)


base-commit: 21e4d6cd6913eca131f2c0fd0cd509fc843c7eb8
--
2.42.0
Troy Figiel wrote on 29 Jan 19:16 +0100
[PATCH 1/2] gnu: Add python-unyt.
(address . 68789@debbugs.gnu.org)
87cytkf1bl.fsf@troyfigiel.com
* gnu/packages/python-science.scm (python-unyt): New variable.
---
gnu/packages/python-science.scm | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index f775d46349..3390b918a4 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -1287,6 +1287,35 @@ (define-public python-statannot
annotations on an existing boxplots and barplots generated by seaborn.")
(license license:expat)))
+(define-public python-unyt
+ (package
+ (name "python-unyt")
+ (version "3.0.1")
+ (source
+ (origin
+ (method url-fetch)
+ (uri (pypi-uri "unyt" version))
+ (sha256
+ (base32 "00900bw24rxgcgwgxp9xlx0l5im96r1n5hn0r3mxvbdgc3lyyq48"))))
+ (build-system pyproject-build-system)
+ ;; Astropy is an optional import, but we do not include it as it creates a
+ ;; module cycle: astronomy->python-science->astronomy.
+ (propagated-inputs (list python-h5py ;optional import
+ python-matplotlib ;optional import
+ python-numpy
+ python-sympy))
+ ;; Pint is optional, but we do not propagate it due to its size.
+ (native-inputs (list python-pint python-pytest))
+ (home-page "https://unyt.readthedocs.io")
+ (synopsis "Library for working with data that has physical units")
+ (description
+ "Writing code that deals with data with physical units can be confusing.
+A function might return an array but at least with plain @code{numpy}, there
+is no way to easily tell what the units of the data are without somehow
+knowing a priori. @code{unyt} handles this problem by providing a subclass of
+the @code{ndarray} class in @code{numpy} that is unit aware.")
+ (license license:bsd-3)))
+
(define-public python-upsetplot
(package
(name "python-upsetplot")
--
2.42.0
Troy Figiel wrote on 29 Jan 19:17 +0100
[PATCH 2/2] gnu: Add python-pyjanitor.
(address . 68789@debbugs.gnu.org)
87bk94f1an.fsf@troyfigiel.com
* gnu/packages/python-science.scm (python-pyjanitor): New variable.
---
gnu/packages/python-science.scm | 59 +++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 3390b918a4..643fb69f3f 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -47,6 +47,7 @@ (define-module (gnu packages python-science)
#:use-module (gnu packages boost)
#:use-module (gnu packages build-tools)
#:use-module (gnu packages check)
+ #:use-module (gnu packages chemistry)
#:use-module (gnu packages cpp)
#:use-module (gnu packages crypto)
#:use-module (gnu packages databases)
@@ -840,6 +841,64 @@ (define-public python-pandera
@end itemize")
(license license:expat)))
+(define-public python-pyjanitor
+ (package
+ (name "python-pyjanitor")
+ (version "0.26.0")
+ (source
+ (origin
+ ;; The build requires the mkdocs directory for the description in
+ ;; setup.py. This is not included in the PyPI tarball.
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/pyjanitor-devs/pyjanitor")
+ (commit (string-append "v" version))))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32 "1f8xbl1k9l2z56bapp7v6bd3016zrk48igcaz6hb553r6yfl7vfx"))))
+ (build-system pyproject-build-system)
+ ;; Pyjanitor has an extensive test suite. For quick debugging, the tests
+ ;; marked turtle can be skipped using "-m" "not turtle".
+ (arguments
+ (list
+ #:test-flags '(list
+ ;; Tries to connect to the internet.
+ "-k"
+ "not test_is_connected"
+
+ ;; PySpark has not been packaged yet.
+ "--ignore"
+ "tests/spark")
+ #:phases #~(modify-phases %standard-phases
+ (add-before 'check 'set-env-ci
+ (lambda _
+ ;; Some tests are skipped if the JANITOR_CI_MACHINE
+ ;; variable is not set.
+ (setenv "JANITOR_CI_MACHINE" "1"))))))
+ (propagated-inputs (list python-multipledispatch
+ python-natsort
+ python-pandas-flavor
+ python-scipy
+
+ ;; Optional imports.
+ python-biopython ;biology submodule
+ python-unyt)) ;engineering submodule
+ (native-inputs (list python-pytest
+
+ ;; Optional imports. We do not propagate them due to
+ ;; their size.
+ python-numba ;speedup of joins
+ rdkit)) ;chemistry submodule
+ (home-page "https://github.com/pyjanitor-devs/pyjanitor")
+ (synopsis "Tools for cleaning and transforming pandas DataFrames")
+ (description
+ "@code{pyjanitor} provides a set of data cleaning routines for
+@code{pandas} DataFrames. These routines extend the method chaining API
+defined by @code{pandas} for a subset of its methods. Originally, this
+package was a port of the R package by the same name and it is inspired by the
+ease-of-use and expressiveness of the @code{dplyr} package.")
+ (license license:expat)))
+
(define-public python-pythran
(package
(name "python-pythran")
--
2.42.0
Sharlatan Hellseher wrote on 30 Jan 00:01 +0100
[PATCH 0/2] gnu: Add python-pyjanitor.
(address . 68789-done@debbugs.gnu.org)
875xzbu4ku.fsf@gmail.com
Modifications applied:

- python-unyt :: rephrase description, partly sourced and combined from

- python-pyjanitor :: speed up tests with python-pytest-xdist (~3x
  faster on 16 threads), remove blank lines, disable the specific tests
  related to PySpark.
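
As a rough sketch of the python-pytest-xdist change, the two affected
fields of python-pyjanitor might look like the fragment below. This is an
assumption based on the v2 patch, not a copy of the pushed commit: the
PySpark tests that were actually disabled are not listed in this thread,
so the `--ignore` flag from the patch is kept as-is.

```scheme
;; Fragment only: these fields would replace the corresponding ones in
;; the python-pyjanitor definition from patch 2/2.
(arguments
 (list
  #:test-flags
  #~(list
     ;; Run the tests in parallel through pytest-xdist, using as many
     ;; workers as the build is allowed to use.
     "-n" (number->string (parallel-job-count))
     ;; Tries to connect to the internet.
     "-k" "not test_is_connected"
     ;; PySpark has not been packaged yet.
     "--ignore" "tests/spark")
  #:phases
  #~(modify-phases %standard-phases
      (add-before 'check 'set-env-ci
        (lambda _
          ;; Some tests are skipped if the JANITOR_CI_MACHINE
          ;; variable is not set.
          (setenv "JANITOR_CI_MACHINE" "1"))))))
(native-inputs
 (list python-pytest
       python-pytest-xdist ;provides the "-n" option
       ;; Optional imports. We do not propagate them due to their size.
       python-numba        ;speedup of joins
       rdkit))             ;chemistry submodule
```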

Pushed as 370b79b4f5..cde0adaacd to master.

Thanks,
Oleg
Closed