[patch] gnu: Add python-cmseq, python-hclust2, python-phylophlan and python-metaphlan.

  • Open
Details
3 participants
  • MadalinIonel.Patrascu@mdc-berlin.de
  • Mădălin Ionel Patrașcu
  • Ricardo Wurmus
Owner
unassigned
Submitted by
MadalinIonel.Patrascu@mdc-berlin.de
Severity
normal
MadalinIonel.Patrascu@mdc-berlin.de wrote on 22 Jan 2023 01:44
(name . guix-patches@gnu.org)(address . guix-patches@gnu.org)
55c300694f6c49f6a89a8018a3f86729@mdc-berlin.de

Attachment: file
Mădălin Ionel Patrașcu wrote on 22 Jan 2023 01:49
[PATCH 2/4] gnu: Add python-hclust2.
(address . 60997@debbugs.gnu.org)
20230122004951.119277-2-madalinionel.patrascu@mdc-berlin.de
* gnu/packages/bioinformatics.scm (python-hclust2): New variable.
---
gnu/packages/bioinformatics.scm | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 4e7bc07a5f..bf01c9c7e4 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -1041,6 +1041,32 @@ (define-public python-cmseq
 and sequence consensus.")
     (license license:expat)))

+(define-public python-hclust2
+  (package
+    (name "python-hclust2")
+    (version "1.0.0")
+    (source (origin
+              (method url-fetch)
+              (uri (pypi-uri "hclust2" version))
+              (sha256
+               (base32
+                "0v89n2g42d7jhgfs8glf06apgxx6aswp3mfisgnhm518cv8z2rwn"))))
+    (build-system python-build-system)
+    (arguments
+     (list
+      #:tests? #f)) ;;pypi no tests
+    (propagated-inputs
+     (list python-matplotlib
+           python-numpy
+           python-pandas
+           python-scipy))
+    (home-page "http://github.com/SegataLab/hclust2/")
+    (synopsis "Plotting heat-maps for publications")
+    (description
+     "Hclust2 is a handy tool for plotting heat-maps with several useful options
+to produce high quality figures that can be used in publication")
+    (license license:expat)))
+
 (define-public python-htsget
   (package
     (name "python-htsget")
--
2.39.1
Mădălin Ionel Patrașcu wrote on 22 Jan 2023 01:49
[PATCH 4/4] gnu: Add python-metaphlan.
(address . 60997@debbugs.gnu.org)
20230122004951.119277-4-madalinionel.patrascu@mdc-berlin.de
* gnu/packages/bioinformatics.scm (python-metaphlan): New variable.
---
gnu/packages/bioinformatics.scm | 37 +++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 5c9c222d59..884e5aa1e5 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -11234,6 +11234,43 @@ (define-public python-biothings-client
 API services.")
     (license license:bsd-3)))

+(define-public python-metaphlan
+  (package
+    (name "python-metaphlan")
+    (version "4.0.4")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "MetaPhlAn" version))
+       (sha256
+        (base32
+         "1jw29m8p8bcwn4q5qvh8s96qlgqv1kaizbmm87jk55f34k1y3y8a"))))
+    (build-system python-build-system)
+    (arguments
+     (list
+      #:tests? #f)) ;pypi no tests
+    (propagated-inputs
+     (list python-biom-format
+           python-biopython
+           python-cmseq
+           python-dendropy
+           python-h5py
+           python-hclust2
+           python-numpy
+           python-pandas
+           python-phylophlan
+           python-pysam
+           python-requests
+           python-scipy))
+    (home-page "http://github.com/biobakery/MetaPhlAn/")
+    (synopsis "Metagenomic phylogenetic analysis")
+    (description
+     "MetaPhlAn is a computational tool for profiling the composition of microbial
+communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing
+data (i.e. not 16S) with species-level. With the newly added StrainPhlAn module,
+it is now possible to perform accurate strain-level microbial profiling.")
+    (license license:expat)))
+
 (define-public python-multivelo
   (package
     (name "python-multivelo")
--
2.39.1
Mădălin Ionel Patrașcu wrote on 22 Jan 2023 01:49
[PATCH 1/4] gnu: Add python-cmseq.
(address . 60997@debbugs.gnu.org)
20230122004951.119277-1-madalinionel.patrascu@mdc-berlin.de
* gnu/packages/bioinformatics.scm (python-cmseq): New variable.
---
gnu/packages/bioinformatics.scm | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 36c9db90bd..4e7bc07a5f 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -11,7 +11,7 @@
 ;;; Copyright © 2017, 2021, 2022 Arun Isaac <arunisaac@systemreboot.net>
 ;;; Copyright © 2018 Joshua Sierles, Nextjournal <joshua@nextjournal.com>
 ;;; Copyright © 2018 Gábor Boskovits <boskovits@gmail.com>
-;;; Copyright © 2018, 2019, 2020, 2021, 2022 Mădălin Ionel Patrașcu <madalinionel.patrascu@mdc-berlin.de>
+;;; Copyright © 2018-2023 Mădălin Ionel Patrașcu <madalinionel.patrascu@mdc-berlin.de>
 ;;; Copyright © 2019, 2020, 2021 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;; Copyright © 2019 Brian Leung <bkleung89@gmail.com>
 ;;; Copyright © 2019 Brett Gilio <brettg@gnu.org>
@@ -1014,6 +1014,33 @@ (define-public python-cellbender
 from high-throughput single-cell RNA sequencing (scRNA-seq) data.")
     (license license:bsd-3)))

+(define-public python-cmseq
+  (package
+    (name "python-cmseq")
+    (version "1.0.4")
+    (source (origin
+              (method url-fetch)
+              (uri (pypi-uri "CMSeq" version))
+              (sha256
+               (base32
+                "0p6a99c299m5wi2z57dgqz52m1z3nfr8mv7kdnk2jvl2p9nql0wk"))))
+    (build-system python-build-system)
+    (arguments
+     (list #:tests? #f )) ;pypi no tests
+    (propagated-inputs
+     (list python-bcbio-gff
+           python-biopython
+           python-numpy
+           python-pandas
+           python-pysam
+           python-scipy))
+    (home-page "http://github.com/SegataLab/cmseq/")
+    (synopsis "Set of utilities on sequences and BAM files")
+    (description
+     "CMSeq is a set of commands to provide an interface to .bam files for coverage
+and sequence consensus.")
+    (license license:expat)))
+
 (define-public python-htsget
   (package
     (name "python-htsget")

base-commit: f088763356e88c4911ee933fdafcad6ed66a7aa3
--
2.39.1
Mădălin Ionel Patrașcu wrote on 22 Jan 2023 01:49
[PATCH 3/4] gnu: Add python-phylophlan.
(address . 60997@debbugs.gnu.org)
20230122004951.119277-3-madalinionel.patrascu@mdc-berlin.de
* gnu/packages/bioinformatics.scm (python-phylophlan): New variable.
---
gnu/packages/bioinformatics.scm | 43 +++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index bf01c9c7e4..5c9c222d59 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -1089,6 +1089,49 @@ (define-public python-htsget
 servers supporting the protocol.")
     (license license:asl2.0)))

+(define-public python-phylophlan
+  (package
+    (name "python-phylophlan")
+    (version "3.0.3")
+    (source (origin
+              (method url-fetch)
+              (uri (pypi-uri "PhyloPhlAn" version))
+              (sha256
+               (base32
+                "1r1bnnh4d38l410hfzf882y43ln8fd2lcsqbralqshxqw2hzc7x7"))))
+    (build-system python-build-system)
+    (arguments
+     (list
+      #:tests? #f ;pypi no tests
+      #:phases
+      #~(modify-phases %standard-phases
+          ;;pypi does not provide the readme.md file
+          (add-before 'build 'loose-readme-file-requirement
+            (lambda _
+              (substitute* "setup.py"
+                (("long_description")
+                 "#long_description")))))))
+    (propagated-inputs
+     (list python-biopython
+           python-dendropy
+           python-matplotlib
+           python-numpy
+           python-pandas
+           python-seaborn))
+    (home-page "https://github.com/biobakery/phylophlan")
+    (synopsis
+     "Phylogenetic analysis of microbial isolates and genomes from metagenomes")
+    (description
+     "This package is an integrated pipeline for large-scale phylogenetic profiling
+of genomes and metagenomes. PhyloPhlAn is an accurate, rapid, and easy-to-use
+method for large-scale microbial genome characterization and phylogenetic analysis
+at multiple levels of resolution. This software package can assign both genomes
+and @acronym{MAGs, metagenome-assembled genomes} to @acronym{SGBs, species-level
+genome bins}. PhyloPhlAn can reconstruct strain-level phylogenies using clade-
+specific maximally informative phylogenetic markers, and can also scale to very
+large phylogenies comprising >17,000 microbial species.")
+    (license license:expat)))
+
 (define-public python-pybedtools
   (package
     (name "python-pybedtools")
--
2.39.1
Ricardo Wurmus wrote on 24 Jan 2023 09:36
Re: [bug#60997] [PATCH 1/4] gnu: Add python-cmseq.
(name . Mădălin Ionel Patrașcu)(address . madalinionel.patrascu@mdc-berlin.de)(address . 60997@debbugs.gnu.org)
87lelse36h.fsf@elephly.net
Hi Mădălin,

> * gnu/packages/bioinformatics.scm (python-cmseq): New variable.

Thanks for the patch.

Unfortunately, this is incomplete:

- The tool calls out to samtools (see cmseq/cmseq.py), so it needs
samtools as an input and the call needs to be patched.

- The README says that biopython <= 1.76 is needed for polymut.py.
You’re using 1.80.

> +    (build-system python-build-system)

Please consider using the pyproject-build-system.

> +    (arguments
> +     (list #:tests? #f )) ;pypi no tests

Apparently, there are no tests anywhere. It’s not a pypi problem.
Please update the comment and remove that extra space after #f.

> +    (home-page "http://github.com/SegataLab/cmseq/")

Please use HTTPS. “guix lint” informs you about the redirect.

--
Ricardo
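
The review points above could be addressed roughly as follows. This is only a sketch and not the definition that was eventually committed. It assumes that cmseq/cmseq.py refers to the program by the literal string 'samtools' (the actual call may look different), and that gnu/packages/bioinformatics.scm, where the definition is meant to live, already imports (guix build-system pyproject). The phase name patch-samtools-reference is made up for illustration. The biopython constraint is left as a comment because the thread does not say which older variant to use.

```scheme
(define-public python-cmseq
  (package
    (name "python-cmseq")
    (version "1.0.4")
    (source (origin
              (method url-fetch)
              (uri (pypi-uri "CMSeq" version))
              (sha256
               (base32
                "0p6a99c299m5wi2z57dgqz52m1z3nfr8mv7kdnk2jvl2p9nql0wk"))))
    (build-system pyproject-build-system)
    (arguments
     (list
      #:tests? #f                       ;there are no tests upstream
      #:phases
      #~(modify-phases %standard-phases
          ;; Hypothetical phase: replace the bare program name with its
          ;; absolute store path.  The matched pattern is an assumption
          ;; about how cmseq.py invokes samtools.
          (add-after 'unpack 'patch-samtools-reference
            (lambda* (#:key inputs #:allow-other-keys)
              (substitute* "cmseq/cmseq.py"
                (("'samtools'")
                 (string-append "'"
                                (search-input-file inputs "bin/samtools")
                                "'"))))))))
    ;; samtools is run as an external program, so it is a regular input.
    (inputs (list samtools))
    (propagated-inputs
     ;; The README asks for biopython <= 1.76 for polymut.py; an older
     ;; variant would have to replace python-biopython here.
     (list python-bcbio-gff
           python-biopython
           python-numpy
           python-pandas
           python-pysam
           python-scipy))
    (home-page "https://github.com/SegataLab/cmseq/")
    (synopsis "Set of utilities on sequences and BAM files")
    (description
     "CMSeq is a set of commands to provide an interface to .bam files for coverage
and sequence consensus.")
    (license license:expat)))
```

With the HTTPS home page in place, “guix lint python-cmseq” should no longer report the redirect mentioned in the review.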
Ricardo Wurmus wrote on 24 Jan 2023 10:10
Re: [bug#60997] [PATCH 3/4] gnu: Add python-phylophlan.
(name . Mădălin Ionel Patrașcu)(address . madalinionel.patrascu@mdc-berlin.de)
87h6wge1q7.fsf@elephly.net
Mădălin Ionel Patrașcu <madalinionel.patrascu@mdc-berlin.de> writes:

> +    (build-system python-build-system)

Please use pyproject-build-system where possible.

> +    (arguments
> +     (list
> +      #:tests? #f ;pypi no tests

There are no tests, nothing to do with pypi.

> +      #:phases
> +      #~(modify-phases %standard-phases
> +          ;;pypi does not provide the readme.md file
> +          (add-before 'build 'loose-readme-file-requirement
> +            (lambda _
> +              (substitute* "setup.py"
> +                (("long_description")
> +                 "#long_description")))))))

I changed this to just fetch the source from git.

> +    (propagated-inputs
> +     (list python-biopython

This will become a problem down the line when combined with cmseq.

--
Ricardo
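
Fetching the source from git, as mentioned above, might look like the fragment below. It is a sketch of the source field only, not the committed change. It assumes that upstream tags releases with the bare version string and that the file imports (guix git-download). The hash is a deliberate placeholder, since the thread does not give the real one.

```scheme
;; Replacement for the (source ...) field of python-phylophlan.
(source (origin
          (method git-fetch)
          (uri (git-reference
                (url "https://github.com/biobakery/phylophlan")
                ;; Assumption: the release tag equals the version string.
                (commit version)))
          (file-name (git-file-name name version))
          (sha256
           ;; Placeholder, not a real hash: the first "guix build"
           ;; attempt fails and prints the actual value.
           (base32
            "0000000000000000000000000000000000000000000000000000"))))
```

If the git checkout ships the README that setup.py reads, the loose-readme-file-requirement phase should no longer be needed.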
Ricardo Wurmus wrote on 24 Jan 2023 10:11
Re: [bug#60997] [PATCH 4/4] gnu: Add python-metaphlan.
(name . Mădălin Ionel Patrașcu)(address . madalinionel.patrascu@mdc-berlin.de)
87cz74e1ji.fsf@elephly.net
Mădălin Ionel Patrașcu <madalinionel.patrascu@mdc-berlin.de> writes:

> * gnu/packages/bioinformatics.scm (python-metaphlan): New variable.

Unfortunately, this one is also incomplete.

The package includes R code that needs to have its dependencies
satisfied. It also calls out to bowtie2, and needs raxml, muscle,
blast, etc.

The clash of biopython versions will also need to be addressed.

I’ll push as much as is feasible, but I’d like to ask you to rework this
package definition in particular.

Thanks!

--
Ricardo
Ricardo Wurmus wrote on 25 Jan 2023 10:54
(name . GNU bug tracker automated control server)(address . control@debbugs.gnu.org)
87edrjc52f.fsf@elephly.net
tag 60997 moreinfo
thanks

--
Ricardo