[PATCH] gnu: Add pbgzip.

  • Done
Details
4 participants
  • Efraim Flashner
  • Maxime Devos
  • Xinglu Chen
  • Roel Janssen
Owner
unassigned
Submitted by
Roel Janssen
Severity
normal
Roel Janssen wrote on 21 Apr 2021 14:26
(address . guix-patches@gnu.org)
eb1b7cd3-e60b-46c5-da4a-371b5038b3a3@gnu.org
Hi Guix,

Here's a patch to add pbgzip.  Lint complains that there is no release
on the GitHub page, but there's nothing I can do about it.

Kind regards,
Roel Janssen
From 3d34e82ee67f5ee0b226de350f40d7f881169a56 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Wed, 21 Apr 2021 14:24:07 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 42 ++++++++++++++++++++++++++++++++-
1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 31205c473a..35601378c2 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -569,6 +569,46 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (arguments
+       `(#:phases
+         (modify-phases %standard-phases
+           (add-after 'unpack 'autogen
+             (lambda _
+               (zero? (system* "sh" "autogen.sh")))))))
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Xinglu Chen wrote on 21 Apr 2021 23:45
874kfz71ni.fsf@yoctocell.xyz
On Wed, Apr 21 2021, Roel Janssen wrote:

> * gnu/packages/bioinformatics.scm (pbgzip): New variable.
>
> [...]
>
> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

I think using (git-version VERSION REVISION COMMIT) is preferred.
Something like (git-version "0.0.0" "0" commit).

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))

Use (git-file-name name version).
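
For readers unfamiliar with these two helpers, here is a minimal plain-Guile sketch of what they compute, as far as I understand the definitions in (guix git-download). The starred procedures are local stand-ins written for this illustration so that the snippet runs without Guix; "0.0.0" and "0" are the placeholder version and revision suggested above.

```scheme
;; Local stand-ins (hypothetical names) that mimic the Guix helpers,
;; so this runs in plain Guile without loading (guix git-download).
(define (git-version* version revision commit)
  ;; Upstream version, then the revision, then the first 7 characters
  ;; of the commit hash.
  (string-append version "-" revision "." (string-take commit 7)))

(define (git-file-name* name version)
  ;; File name used for the source checkout.
  (string-append name "-" version "-checkout"))

(define commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974")
(define pkg-version (git-version* "0.0.0" "0" commit))

(display pkg-version)                             ; 0.0.0-0.2b09f97
(newline)
(display (git-file-name* "pbgzip" pkg-version))   ; pbgzip-0.0.0-0.2b09f97-checkout
(newline)
```

Compared with (string-take commit 7) alone, the version string keeps an upstream version and a revision counter in front of the commit, so later snapshots of the same repository can be ordered.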

> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (arguments
> +       `(#:phases
> +         (modify-phases %standard-phases
> +           (add-after 'unpack 'autogen
> +             (lambda _
> +               (zero? (system* "sh" "autogen.sh")))))))

IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Builds fine, but I haven’t tested it.
Maxime Devos wrote on 22 Apr 2021 18:40
e926a502496bd2def92b248cd328d2e180dedef4.camel@telenet.be
Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> On Wed, Apr 21 2021, Roel Janssen wrote:
>
> > [...]
> > +      (arguments
> > +       `(#:phases
> > +         (modify-phases %standard-phases
> > +           (add-after 'unpack 'autogen
> > +             (lambda _
> > +               (zero? (system* "sh" "autogen.sh")))))))
>
> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Try running (system* "does-not-exist"). It will fail by returning
something non-zero. If I recall how to call "invoke" correctly,
I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
an exception when the command fails, instead of returning something.

Greetings,
Maxime.
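
The difference described above can be sketched in plain Guile. `invoke*` below is a hypothetical local stand-in for `invoke` from (guix build utils), written only for this illustration; the real procedure raises a dedicated error condition rather than calling `error`. The sketch assumes the `true` and `false` commands are on PATH.

```scheme
;; `invoke*' is a hypothetical stand-in for `invoke' from (guix build utils).
(define (invoke* program . args)
  (let ((status (apply system* program args)))
    (unless (zero? status)
      (error "command failed:" (cons program args) status))
    #t))

;; system* reports failure only through its return value, so a phase
;; that ignores the value carries on as if nothing had happened.
(define ignored-status (system* "false"))   ; non-zero, but no exception

;; invoke* turns the same failure into an exception that stops the phase.
(define raised?
  (catch #t
    (lambda () (invoke* "false") #f)
    (lambda (key . args) #t)))

(display (list (zero? ignored-status) raised?))   ; prints (#f #t)
(newline)
```

So dropping `zero?` while keeping `system*` would silently ignore a failing autogen.sh, whereas an `invoke`-style call fails loudly.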
Efraim Flashner wrote on 29 Apr 2021 09:29
(name . Maxime Devos)(address . maximedevos@telenet.be)
YIpgaLrbEoiZX7iM@3900XT
On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> > On Wed, Apr 21 2021, Roel Janssen wrote:
> >
> > > [...]
> > > +      (arguments
> > > +       `(#:phases
> > > +         (modify-phases %standard-phases
> > > +           (add-after 'unpack 'autogen
> > > +             (lambda _
> > > +               (zero? (system* "sh" "autogen.sh")))))))
> >
> > IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>
> Try running (system* "does-not-exist"). It will fail by returning
> something non-zero. If I recall how to call "invoke" correctly,
> I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
> an exception when the command fails, instead of returning something.

While we're at it, can this phase replace 'bootstrap? It seems to me we
shouldn't need both phases.


--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Roel Janssen wrote on 29 Apr 2021 14:22
(address . 47930@debbugs.gnu.org)
ad09ffe1-b06b-8c25-0cf8-a6dbf094bacb@gnu.org
On 4/29/21 9:29 AM, Efraim Flashner wrote:
> On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
>> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
>>> On Wed, Apr 21 2021, Roel Janssen wrote:
>>>
>>>> [...]
>>>> +      (arguments
>>>> +       `(#:phases
>>>> +         (modify-phases %standard-phases
>>>> +           (add-after 'unpack 'autogen
>>>> +             (lambda _
>>>> +               (zero? (system* "sh" "autogen.sh")))))))
>>> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>> Try running (system* "does-not-exist"). It will fail by returning
>> something non-zero. If I recall how to call "invoke" correctly,
>> I would recommend (invoke "sh" "autogen.sh") here. "invoke" raises
>> an exception when the command fails, instead of returning something.
> While we're at it, can this phase replace 'bootstrap? It seems to me we
> shouldn't need both phases.
This indeed seems to be the best thing to do.  I attached a new patch.

I had to leave autoconf and automake in the native-inputs because
otherwise the commands "aclocal" and "autom4te" couldn't be found.
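
For comparison, the other reading of the suggestion (replacing the 'bootstrap phase explicitly instead of dropping the custom phase) might look like the untested sketch below. It is written as a standalone file for `guix build -f`; the file name is made up, the description is shortened, and it uses the `git-version` and `git-file-name` helpers proposed earlier in this thread. Whether autogen.sh needs further native inputs is not verified here.

```scheme
;; pbgzip-sketch.scm (hypothetical file name): untested sketch,
;; intended for `guix build -f pbgzip-sketch.scm'.
(use-modules (guix packages)
             (guix git-download)
             (guix build-system gnu)
             ((guix licenses) #:prefix license:)
             (gnu packages autotools)
             (gnu packages compression))

(let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
  (package
    (name "pbgzip")
    (version (git-version "0.0.0" "0" commit))
    (source (origin
              (method git-fetch)
              (uri (git-reference
                    (url "https://github.com/nh13/pbgzip")
                    (commit commit)))
              (file-name (git-file-name name version))
              (sha256
               (base32
                "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
    (build-system gnu-build-system)
    (arguments
     `(#:phases
       (modify-phases %standard-phases
         ;; Run autogen.sh in place of the default 'bootstrap phase;
         ;; `invoke' raises an error if the script fails.
         (replace 'bootstrap
           (lambda _
             (invoke "sh" "autogen.sh"))))))
    (native-inputs
     `(("autoconf" ,autoconf)
       ("automake" ,automake)))
    (inputs
     `(("zlib" ,zlib)))
    (home-page "https://github.com/nh13/pbgzip")
    (synopsis "Parallel Block GZIP")
    (description "This package implements parallel block gzip.")
    (license license:expat)))
```

The attached patch takes the simpler route of removing the `arguments` field and relying on the default phases, which keeps the definition shorter.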

Thanks all for the feedback!  I hope this new patch is fine.

Kind regards,
Roel Janssen
From b03f8d8926cdd6a28502f2bdc6db74854144f050 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Thu, 29 Apr 2021 14:18:30 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..8c4d0fc649 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Xinglu Chen wrote on 30 Apr 2021 10:30
(address . 47930@debbugs.gnu.org)
87czucnphi.fsf@yoctocell.xyz
On Thu, Apr 29 2021, Roel Janssen wrote:

> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

Maybe you missed my previous suggestions?

  https://issues.guix.gnu.org/47930#2

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))
> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (native-inputs
> +       `(("autoconf" ,autoconf)
> +         ("automake" ,automake)))
> +      (inputs
> +       `(("zlib" ,zlib)))
> +      (home-page "https://github.com/nh13/pbgzip")
> +      (synopsis "Parallel Block GZIP")
> +      (description "This package implements parallel block gzip.  For many
> +formats, in particular genomics data formats, data are compressed in
> +fixed-length blocks such that they can be easily indexed based on a (genomic)
> +coordinate order, since typically each block is sorted according to this order.
> +This allows for each block to be individually compressed (deflated), or more
> +importantly, decompressed (inflated), with the latter enabling random retrieval
> +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> +to any particular format, but certain features are tailored to genomics data
> +formats when enabled.  Parallel decompression is somewhat faster, but truly the
                                                                     ^^^^^^^^^^^^^
> +speedup comes during compression.")
   ^^^^^^^

“but the true speedup” instead?
Roel Janssen wrote on 30 Apr 2021 13:48
(address . 47930@debbugs.gnu.org)
052ee880cea08e4e1627a2181f7173ab9587b6c8.camel@gnu.org
On Fri, 2021-04-30 at 10:30 +0200, Xinglu Chen wrote:
> On Thu, Apr 29 2021, Roel Janssen wrote:
>
> > +(define-public pbgzip
> > +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> > +    (package
> > +      (name "pbgzip")
> > +      (version (string-take commit 7))
>
> Maybe you missed my previous suggestions?
>
>   https://issues.guix.gnu.org/47930#2
>

I'm sorry, I forgot to apply them.
>
> > +      (source (origin
> > +                (method git-fetch)
> > +                (uri (git-reference
> > +                      (url "https://github.com/nh13/pbgzip")
> > +                      (commit commit)))
> > +                (file-name (string-append name "-" version))
> > +                (sha256
> > +                 (base32
> > +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> > +      (build-system gnu-build-system)
> > +      (native-inputs
> > +       `(("autoconf" ,autoconf)
> > +         ("automake" ,automake)))
> > +      (inputs
> > +       `(("zlib" ,zlib)))
> > +      (home-page "https://github.com/nh13/pbgzip")
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but truly the
>                                                                      ^^^^^^^^^^^^^
> > +speedup comes during compression.")
>    ^^^^^^^
>
> “but the true speedup” instead?

Sure. I usually don't change descriptions as given by the creators of
the software, but I applied your suggestion.

Thank you for the elaborate suggestions!

I attached another version of the patch, which I hope is fine now. :)

Kind regards,
Roel Janssen
From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Fri, 30 Apr 2021 13:47:43 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..cd2dae05d5 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (git-version "0.0.0" "0" commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but the true
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
--
2.31.1
Efraim Flashner wrote on 30 Apr 2021 13:53
(name . Roel Janssen)(address . roel@gnu.org)
YIvvxUphQsqgDL2S@3900XT
Attachment: file
Roel Janssen wrote on 30 Apr 2021 18:47
(name . Efraim Flashner)(address . efraim@flashner.co.il)
288c615cc9a12c6333a28f7002429140c2924ae5.camel@gnu.org
On Fri, 2021-04-30 at 14:53 +0300, Efraim Flashner wrote:
>
>
> > From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
> > From: Roel Janssen <roel@gnu.org>
> > Date: Fri, 30 Apr 2021 13:47:43 +0200
> > Subject: [PATCH] gnu: Add pbgzip.
> >
> > ...
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
>
> I wasn't sure about 'data are' vs 'data is' but I think data here is
> plural, so 'data are' should be right.
>
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but the true
> > +speedup comes during compression.")
> > +      (license license:expat))))
> > +
> >  (define-public blasr-libcpp
> >    (package
> >      (name "blasr-libcpp")
> > --
> > 2.31.1
> >
>
> Looks good to me!
>
>

Thank you Efraim, and thank you Xinglu Chen.
I pushed this patch.

Kind regards,
Roel Janssen
Closed
This issue is archived.

To comment on this conversation send an email to 47930@debbugs.gnu.org
