[PATCH] gnu: Add pbgzip.

  • Done
Details
4 participants
  • Efraim Flashner
  • Maxime Devos
  • Xinglu Chen
  • Roel Janssen
Owner
unassigned
Submitted by
Roel Janssen
Severity
normal
Roel Janssen wrote on 21 Apr 2021 14:26
(address . guix-patches@gnu.org)
eb1b7cd3-e60b-46c5-da4a-371b5038b3a3@gnu.org
Hi Guix,

Here's a patch to add pbgzip.  Lint complains that there is no release
on the GitHub page, but there's nothing I can do about it.

Kind regards,
Roel Janssen
From 3d34e82ee67f5ee0b226de350f40d7f881169a56 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Wed, 21 Apr 2021 14:24:07 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 42 ++++++++++++++++++++++++++++++++-
1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 31205c473a..35601378c2 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -569,6 +569,46 @@ input and output BAMs must adhere to the PacBio BAM format specification.
Non-PacBio BAMs will cause exceptions to be thrown.")
(license license:bsd-3)))
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (arguments
+       `(#:phases
+         (modify-phases %standard-phases
+           (add-after 'unpack 'autogen
+             (lambda _
+               (zero? (system* "sh" "autogen.sh")))))))
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Xinglu Chen wrote on 21 Apr 2021 23:44
875z0f71nz.fsf@yoctocell.xyz
On Wed, Apr 21 2021, Roel Janssen wrote:

> * gnu/packages/bioinformatics.scm (pbgzip): New variable.
>
> [...]
>
> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

I think using (git-version VERSION REVISION COMMIT) is preferred.
Something like (git-version "0.0.0" "0" commit).
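
As a rough illustration of the suggestion above, the sketch below evaluates `git-version` on the commit from the patch. It assumes a Guile session with Guix's modules on the load path (for instance one started with `guix repl`); the base version "0.0.0" and revision "0" are the placeholders proposed in this review, not an upstream release number.

```scheme
;; Sketch only: assumes (guix git-download) is on the Guile load path.
(use-modules (guix git-download))

(define commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974")

;; git-version joins VERSION, REVISION and the first seven characters
;; of COMMIT into a single version string.
(display (git-version "0.0.0" "0" commit))
(newline)
;; prints: 0.0.0-0.2b09f97
```

Unlike a bare `(string-take commit 7)`, the result carries a revision counter, so a later snapshot can be ordered after this one by bumping "0" to "1" even though commit hashes themselves do not sort chronologically.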

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))

Use (git-file-name name version).
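
For comparison, here is a small sketch of what `git-file-name` returns. It again assumes a Guile session with Guix's modules available (for example via `guix repl`); the version string is the one `git-version` would produce with the placeholder values "0.0.0" and "0" suggested in this review.

```scheme
;; Sketch only: assumes (guix git-download) is on the Guile load path.
(use-modules (guix git-download))

(define name "pbgzip")
(define version "0.0.0-0.2b09f97")  ; placeholder version from this review

;; git-file-name appends "-checkout" so the store item is recognizable
;; as a version-control checkout rather than an unpacked tarball.
(display (git-file-name name version))
(newline)
;; prints: pbgzip-0.0.0-0.2b09f97-checkout
```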

> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (arguments
> +       `(#:phases
> +         (modify-phases %standard-phases
> +           (add-after 'unpack 'autogen
> +             (lambda _
> +               (zero? (system* "sh" "autogen.sh")))))))

IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Builds fine, but I haven’t tested it.

Maxime Devos wrote on 22 Apr 2021 18:40
e926a502496bd2def92b248cd328d2e180dedef4.camel@telenet.be
Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> On Wed, Apr 21 2021, Roel Janssen wrote:
>
> > [...]
> > +      (arguments
> > +       `(#:phases
> > +         (modify-phases %standard-phases
> > +           (add-after 'unpack 'autogen
> > +             (lambda _
> > +               (zero? (system* "sh" "autogen.sh")))))))
>
> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Try running (system* "does-not-exist"). It will fail by returning
something non-zero. If I recall how to call "invoke" correctly,
I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
an exception when the command fails, instead of returning something.

Greetings,
Maxime.
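
The difference described above can be sketched as follows. This is only an illustration, assuming `(guix build utils)` is on the Guile load path (as it is inside a package build, or in `guix repl`); `autogen-phase` is a hypothetical name used here for the phase body and is not part of the patch.

```scheme
;; Sketch only: assumes (guix build utils) is on the Guile load path.
(use-modules (guix build utils))

;; system* merely returns a wait status; a failing command goes
;; unnoticed unless the caller inspects that status.
(define status (system* "sh" "-c" "exit 3"))
(format #t "exit value: ~a~%" (status:exit-val status))
;; prints: exit value: 3

;; invoke returns #t on success and raises an exception on failure,
;; so the phase body needs neither zero? nor an explicit return value.
;; (Hypothetical name; defining it does not run autogen.sh.)
(define (autogen-phase . _)
  (invoke "sh" "autogen.sh"))
```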
Efraim Flashner wrote on 29 Apr 2021 09:29
(name . Maxime Devos)(address . maximedevos@telenet.be)
YIpgaLrbEoiZX7iM@3900XT
On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> > On Wed, Apr 21 2021, Roel Janssen wrote:
> >
> > > [...]
> > > +      (arguments
> > > +       `(#:phases
> > > +         (modify-phases %standard-phases
> > > +           (add-after 'unpack 'autogen
> > > +             (lambda _
> > > +               (zero? (system* "sh" "autogen.sh")))))))
> >
> > IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>
> Try running (system* "does-not-exist"). It will fail by returning
> something non-zero. If I recall how to call "invoke" correctly,
> I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
> an exception when the command fails, instead of returning something.

While we're at it, can this phase replace 'bootstrap? It seems to me we
shouldn't need both phases.
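
One way to read this suggestion, as a hedged sketch rather than what was eventually committed: the custom phase takes the place of 'bootstrap instead of being added next to it. The fragment below is part of a package definition, not a standalone program. Note that the follow-up patch in this thread goes further and drops the custom phase entirely, keeping only autoconf and automake in the native-inputs, which suggests the stock 'bootstrap phase was sufficient for this package.

```scheme
;; Fragment of a package definition; illustrative only.
(arguments
 `(#:phases
   (modify-phases %standard-phases
     ;; Run upstream's script in place of the default 'bootstrap phase
     ;; rather than adding a second, separate 'autogen phase.
     (replace 'bootstrap
       (lambda _
         (invoke "sh" "autogen.sh"))))))
```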


--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Roel Janssen wrote on 29 Apr 2021 14:22
(address . 47930@debbugs.gnu.org)
ad09ffe1-b06b-8c25-0cf8-a6dbf094bacb@gnu.org
On 4/29/21 9:29 AM, Efraim Flashner wrote:
> On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
>> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
>>> On Wed, Apr 21 2021, Roel Janssen wrote:
>>>
>>>> [...]
>>>> +      (arguments
>>>> +       `(#:phases
>>>> +         (modify-phases %standard-phases
>>>> +           (add-after 'unpack 'autogen
>>>> +             (lambda _
>>>> +               (zero? (system* "sh" "autogen.sh")))))))
>>> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>> Try running (system* "does-not-exist"). It will fail by returning
>> something non-zero. If I recall how to call "invoke" correctly,
>> I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
>> an exception when the command fails, instead of returning something.
> While we're at it, can this phase replace 'bootstrap? It seems to me we
> shouldn't need both phases.

This indeed seems to be the best thing to do.  I attached a new patch.

I had to leave autoconf and automake in the native-inputs because
otherwise the commands "aclocal" and "autom4te" couldn't be found.

Thanks all for the feedback!  I hope this new patch is fine.

Kind regards,
Roel Janssen
From b03f8d8926cdd6a28502f2bdc6db74854144f050 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Thu, 29 Apr 2021 14:18:30 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..8c4d0fc649 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
Non-PacBio BAMs will cause exceptions to be thrown.")
(license license:bsd-3)))
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Xinglu Chen wrote on 30 Apr 2021 10:30
(address . 47930@debbugs.gnu.org)
87czucnphi.fsf@yoctocell.xyz
On Thu, Apr 29 2021, Roel Janssen wrote:

> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

Maybe you missed my previous suggestions?

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))
> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (native-inputs
> +       `(("autoconf" ,autoconf)
> +         ("automake" ,automake)))
> +      (inputs
> +       `(("zlib" ,zlib)))
> +      (home-page "https://github.com/nh13/pbgzip")
> +      (synopsis "Parallel Block GZIP")
> +      (description "This package implements parallel block gzip.  For many
> +formats, in particular genomics data formats, data are compressed in
> +fixed-length blocks such that they can be easily indexed based on a (genomic)
> +coordinate order, since typically each block is sorted according to this order.
> +This allows for each block to be individually compressed (deflated), or more
> +importantly, decompressed (inflated), with the latter enabling random retrieval
> +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> +to any particular format, but certain features are tailored to genomics data
> +formats when enabled.  Parallel decompression is somewhat faster, but truly the
                                                                     ^^^^^^^^^^^^^
> +speedup comes during compression.")
   ^^^^^^^

“but the true speedup” instead?
Roel Janssen wrote on 30 Apr 2021 13:48
(address . 47930@debbugs.gnu.org)
052ee880cea08e4e1627a2181f7173ab9587b6c8.camel@gnu.org
On Fri, 2021-04-30 at 10:30 +0200, Xinglu Chen wrote:
> On Thu, Apr 29 2021, Roel Janssen wrote:
>
> > +(define-public pbgzip
> > +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> > +    (package
> > +      (name "pbgzip")
> > +      (version (string-take commit 7))
>
> Maybe you missed my previous suggestions?
>
>   https://issues.guix.gnu.org/47930#2
>

I'm sorry, I forgot to adapt.
>
> > +      (source (origin
> > +                (method git-fetch)
> > +                (uri (git-reference
> > +                      (url "https://github.com/nh13/pbgzip")
> > +                      (commit commit)))
> > +                (file-name (string-append name "-" version))
> > +                (sha256
> > +                 (base32
> > +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> > +      (build-system gnu-build-system)
> > +      (native-inputs
> > +       `(("autoconf" ,autoconf)
> > +         ("automake" ,automake)))
> > +      (inputs
> > +       `(("zlib" ,zlib)))
> > +      (home-page "https://github.com/nh13/pbgzip")
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but truly the
>                                                                      ^^^^^^^^^^^^^
> > +speedup comes during compression.")
>    ^^^^^^^
>
> “but the true speedup” instead?

Sure. I usually don't change descriptions as given by the creators of
the software, but I applied your suggestion.

Thank you for the elaborate suggestions!

I attached another version of the patch, which I hope is fine now. :)

Kind regards,
Roel Janssen
From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Fri, 30 Apr 2021 13:47:43 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..cd2dae05d5 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
Non-PacBio BAMs will cause exceptions to be thrown.")
(license license:bsd-3)))
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (git-version "0.0.0" "0" commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but the true
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Efraim Flashner wrote on 30 Apr 2021 13:53
(name . Roel Janssen)(address . roel@gnu.org)
YIvvxUphQsqgDL2S@3900XT
Attachment: file
Roel Janssen wrote on 30 Apr 2021 18:47
(name . Efraim Flashner)(address . efraim@flashner.co.il)
288c615cc9a12c6333a28f7002429140c2924ae5.camel@gnu.org
On Fri, 2021-04-30 at 14:53 +0300, Efraim Flashner wrote:
>
>
> > From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
> > From: Roel Janssen <roel@gnu.org>
> > Date: Fri, 30 Apr 2021 13:47:43 +0200
> > Subject: [PATCH] gnu: Add pbgzip.
> >
> > ...
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
>
> I wasn't sure about 'data are' vs 'data is' but I think data here is
> plural, so 'data are' should be right.
>
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but the true
> > +speedup comes during compression.")
> > +      (license license:expat))))
> > +
> >  (define-public blasr-libcpp
> >    (package
> >      (name "blasr-libcpp")
> > --
> > 2.31.1
> >
>
> Looks good to me!
>

Thank you Efraim, and thank you Xinglu Chen.
I pushed this patch.

Kind regards,
Roel Janssen
Closed