[PATCH 0/5] Update vcflib

  • Open
Details
2 participants
  • Efraim Flashner
  • Ricardo Wurmus
Owner
unassigned
Submitted by
Efraim Flashner
Severity
normal
Efraim Flashner wrote on 29 Jan 2023 14:09
(address . guix-patches@gnu.org)
cover.1674997469.git.efraim@flashner.co.il
vcflib has a new release out, so it's time to update it and its dependent
packages. libvcfh doesn't seem to be used; perhaps Pjotr knows if it's
needed? I don't love '-DTABIX_FOUND=ON' in vcflib but it seems to work.
I'm also not sure that the simde headers are actually used.

I'd like to use more of the unbundled sources, since it seems like a
waste to unbundle them and then just toss the sources back into the
build, but they don't natively produce libraries, so I'd rather not.
Perhaps it'd be better to just not unbundle them in the first place?

Efraim Flashner (5):
gnu: intervaltree: Update to 0.1-1.aa59377.
gnu: tabixpp: Update to 1.1.2.
gnu: Add wfa2-lib.
gnu: Add libvcfh.
gnu: vcflib: Update to 1.0.6.

gnu/packages/bioinformatics.scm | 217 ++++++++++++++++++++++++--------
1 file changed, 166 insertions(+), 51 deletions(-)


base-commit: b9e6e31877cdb96cceba4d1ec6268f86b824dec4
--
Efraim Flashner <efraim@flashner.co.il>
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Efraim Flashner wrote on 29 Jan 2023 14:12
[PATCH 1/5] gnu: intervaltree: Update to 0.1-1.aa59377.
(address . 61148@debbugs.gnu.org)(name . Efraim Flashner)(address . efraim@flashner.co.il)
4fd1568af1aa4bafb18a1efbcdfd084ca6b18711.1674997469.git.efraim@flashner.co.il
* gnu/packages/bioinformatics.scm (intervaltree): Update to
0.1-1.aa59377.
[arguments]: Fix cross-compiling.
---
gnu/packages/bioinformatics.scm | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 8c75607a65..5cea726f8e 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -13632,10 +13632,10 @@ (define-public gffcompare
license:artistic2.0))))) ;license for gclib
(define-public intervaltree
- (let ((commit "b90527f9e6d51cd36ecbb50429e4524d3a418ea5"))
+ (let ((commit "aa5937755000f1cd007402d03b6f7ce4427c5d21"))
(package
(name "intervaltree")
- (version (git-version "0.0.0" "1" commit))
+ (version (git-version "0.1" "1" commit))
(source
(origin
(method git-fetch)
@@ -13644,15 +13644,18 @@ (define-public intervaltree
(commit commit)))
(file-name (git-file-name name version))
(sha256
- (base32 "0rgv6q5fl4x5d74n6p5wvdna6zmbdbqpb4jqqh6vq3670gn08xad"))))
+ (base32 "0p9aphy6sc01dg67xzqpnhvjmk21xa380bpfbkz24a23s6krhjwl"))))
(build-system gnu-build-system)
(arguments
- `(#:tests? #f ; No tests.
+ (list
+ #:tests? #f ; No tests.
#:make-flags
- ,#~(list (string-append "PREFIX=" #$output) "DESTDIR=\"\"")
+ #~(list (string-append "PREFIX=" #$output)
+ (string-append "CXX=" #$(cxx-for-target))
+ "DESTDIR=")
#:phases
- (modify-phases %standard-phases
- (delete 'configure)))) ; There is no configure phase.
+ #~(modify-phases %standard-phases
+ (delete 'configure)))) ; There is no configure phase.
(home-page "https://github.com/ekg/intervaltree")
(synopsis "Minimal C++ interval tree implementation")
(description "An interval tree can be used to efficiently find a set of
Efraim Flashner wrote on 29 Jan 2023 14:12
[PATCH 2/5] gnu: tabixpp: Update to 1.1.2.
(address . 61148@debbugs.gnu.org)(name . Efraim Flashner)(address . efraim@flashner.co.il)
29b7080a94dbfdc13c890742a4fabcf1d731617c.1674997469.git.efraim@flashner.co.il
* gnu/packages/bioinformatics.scm (tabixpp): Update to 1.1.2.
[source]: Add snippet to keep library name the same.
[arguments]: Adjust the make-flags to find htslib. Enable the tests.
Remove custom 'build-libraries phase, it is built by default now. Add a
phase to symlink the shared library to a generic .so name. Don't
override the 'install phase. Add a phase after 'install to create a
pkg-config file.
---
gnu/packages/bioinformatics.scm | 60 ++++++++++++++++++---------------
1 file changed, 33 insertions(+), 27 deletions(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 5cea726f8e..fa0a6c0dd6 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -4,7 +4,7 @@
;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
-;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021, 2022 Efraim Flashner <efraim@flashner.co.il>
+;;; Copyright © 2016-2023 Efraim Flashner <efraim@flashner.co.il>
;;; Copyright © 2016, 2020, 2022 Marius Bakke <marius@gnu.org>
;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
;;; Copyright © 2017, 2018 Tobias Geerinckx-Rice <me@tobias.gr>
@@ -15640,7 +15640,7 @@ (define-public tbsp
(define-public tabixpp
(package
(name "tabixpp")
- (version "1.1.0")
+ (version "1.1.2")
(source (origin
(method git-fetch)
(uri (git-reference
@@ -15648,43 +15648,47 @@ (define-public tabixpp
(commit (string-append "v" version))))
(file-name (git-file-name name version))
(sha256
- (base32 "1k2a3vbq96ic4lw72iwp5s3mwwc4xhdffjj584yn6l9637q9j1yd"))
+ (base32 "00aqs147yn8zcvxims5njwxqsbnlbjv7lnmiwqy80bfdcbhljkqf"))
(modules '((guix build utils)))
(snippet
#~(begin
- (delete-file-recursively "htslib")))))
+ (delete-file-recursively "htslib")
+ ;; Keep it named tabixpp.
+ (substitute* "Makefile"
+ (("libtabix") "libtabixpp"))))))
(build-system gnu-build-system)
- (inputs
- (list bzip2 htslib xz zlib))
(arguments
(list #:make-flags #~(list (string-append "CC=" #$(cc-for-target))
(string-append "CXX=" #$(cxx-for-target))
+ (string-append "AR=" #$(ar-for-target))
"HTS_HEADERS="
- (string-append "HTS_LIB="
- (search-input-file %build-inputs
- "/lib/libhts.a"))
- "INCLUDES=")
- #:tests? #f ; There are no tests to run.
+ "HTS_LIB="
+ (string-append
+ "INCLUDES= -I"
+ (search-input-directory %build-inputs
+ "include/htslib"))
+ (string-append
+ "LIBPATH= -L. -L"
+ (dirname
+ (search-input-file %build-inputs
+ "/lib/libhts.a")))
+ (string-append "PREFIX=" #$output)
+ "DESTDIR=")
+ #:test-target "test"
#:phases
#~(modify-phases %standard-phases
(delete 'configure) ; There is no configure phase.
- ;; Build shared and static libraries.
- (add-after 'build 'build-libraries
- (lambda* (#:key inputs #:allow-other-keys)
- (invoke #$(cxx-for-target)
- "-shared" "-o" "libtabixpp.so" "tabix.o" "-lhts")
- (invoke #$(ar-for-target) "rcs" "libtabixpp.a" "tabix.o")))
- (replace 'install
+ (add-after 'install 'symlink-shared-library
+ (lambda* (#:key outputs #:allow-other-keys)
+ (with-directory-excursion
+ (string-append (assoc-ref outputs "out") "/lib")
+ (symlink "libtabixpp.so.1" "libtabixpp.so"))))
+ (add-after 'install 'make-pkg-config-file
(lambda* (#:key outputs #:allow-other-keys)
(let* ((out (assoc-ref outputs "out"))
- (lib (string-append out "/lib"))
- (bin (string-append out "/bin")))
- (install-file "tabix++" bin)
- (install-file "libtabixpp.so" lib)
- (install-file "libtabixpp.a" lib)
- (install-file "tabix.hpp" (string-append out "/include"))
- (mkdir-p (string-append lib "/pkgconfig"))
- (with-output-to-file (string-append lib "/pkgconfig/tabixpp.pc")
+ (pkgconfig (string-append out "/lib/pkgconfig")))
+ (mkdir-p pkgconfig)
+ (with-output-to-file (string-append pkgconfig "/tabixpp.pc")
(lambda _
(format #t "prefix=~a~@
exec_prefix=${prefix}~@
@@ -15692,12 +15696,14 @@ (define-public tabixpp
includedir=${prefix}/include~@
~@
~@
- Name: libtabixpp~@
+ Name: tabixpp~@
Version: ~a~@
Description: C++ wrapper around tabix project~@
Libs: -L${libdir} -ltabixpp~@
Cflags: -I${includedir}~%"
out #$version)))))))))
+ (inputs
+ (list bzip2 curl htslib xz zlib))
(home-page "https://github.com/ekg/tabixpp")
(synopsis "C++ wrapper around tabix project")
(description "This is a C++ wrapper around the Tabix project which abstracts
Efraim Flashner wrote on 29 Jan 2023 14:12
[PATCH 3/5] gnu: Add wfa2-lib.
(address . 61148@debbugs.gnu.org)(name . Efraim Flashner)(address . efraim@flashner.co.il)
8cd6e9d8b601af9fd507a60dd8efbd0b255eba96.1674997469.git.efraim@flashner.co.il
* gnu/packages/bioinformatics.scm (wfa2-lib): New variable.
---
gnu/packages/bioinformatics.scm | 34 +++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index fa0a6c0dd6..7b5d5c5e8c 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -15710,6 +15710,40 @@ (define-public tabixpp
some of the details of opening and jumping in tabix-indexed files.")
(license license:expat)))
+(define-public wfa2-lib
+ (let ((commit "188b522ae634add3c692ca7547595b7266f1fa19")
+ (revision "1"))
+ (package
+ (name "wfa2-lib")
+ (version (git-version "2.3.1" revision commit))
+ (source (origin
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/smarco/WFA2-lib")
+ (commit commit)))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32 "1pq844zsl7v5zk6pzkbh5j2k2g1ac54nlvgyihla2wlwi0ibndax"))))
+ (build-system cmake-build-system)
+ (arguments
+ (list
+ #:configure-flags
+ #~(list "-DOPENMP=ON")))
+ (native-inputs
+ (list pkg-config))
+ (home-page "https://github.com/smarco/WFA2-lib")
+ (synopsis "Wavefront alignment algorithm library")
+ (description "The @acronym{wavefront alignment, WFA} algorithm is an exact
+gap-affine algorithm that takes advantage of homologous regions between the
+sequences to accelerate the alignment process. Unlike traditional dynamic
+programming algorithms that run in quadratic time, the WFA runs in time
+@code{O(ns+s^2)}, proportional to the sequence length @code{n} and the alignment
+score @code{s}, using @code{O(s^2)} memory (or @code{O(s)} using the
+ultralow/BiWFA mode). Moreover, the WFA algorithm exhibits simple computational
+patterns that modern compilers can automatically vectorize for different
+architectures without adapting the code.")
+ (license license:expat))))
+
(define-public smithwaterman
(let ((commit "2610e259611ae4cde8f03c72499d28f03f6d38a7"))
(package
Efraim Flashner wrote on 29 Jan 2023 14:12
[PATCH 4/5] gnu: Add libvcfh.
(address . 61148@debbugs.gnu.org)(name . Efraim Flashner)(address . efraim@flashner.co.il)
9625ac451fe44ad7e2ea707e999cdc94d1389fce.1674997469.git.efraim@flashner.co.il
* gnu/packages/bioinformatics.scm (libvcfh): New variable.
---
gnu/packages/bioinformatics.scm | 37 +++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 7b5d5c5e8c..05a07af7f3 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -15744,6 +15744,43 @@ (define-public wfa2-lib
architectures without adapting the code.")
(license license:expat))))
+(define-public libvcfh
+ (let ((commit "44b6580639a216a484fd96de75a839091f25768a")
+ (revision "1"))
+ (package
+ (name "libvcfh")
+ (version (git-version "0.0.0" revision commit))
+ (source (origin
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/edawson/libVCFH.git")
+ (commit commit)))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32 "0jjqnzvai0849czh1hi5inm6y0228cw2s97i76f3vhyxj21mzvwm"))))
+ (build-system gnu-build-system)
+ (arguments
+ (list
+ #:phases
+ #~(modify-phases %standard-phases
+ (delete 'configure) ; No configure script
+ (replace 'check
+ (lambda* (#:key tests? make-flags #:allow-other-keys)
+ (when tests?
+ (apply invoke "make" "test" make-flags)
+ (invoke "./test"))))
+ (replace 'install
+ (lambda* (#:key outputs #:allow-other-keys)
+ (let ((out (assoc-ref outputs "out")))
+ (install-file "libvcfh.a" (string-append out "/lib"))
+ (install-file "vcfheader.hpp"
+ (string-append out "/include/libvcfh"))))))))
+ (home-page "https://github.com/edawson/libVCFH")
+ (synopsis "Library for generating VCF headers")
+ (description "@code{libVCFH} is a set of data structures you can populate
+to print a VCF header. It should be in spec with VCF4.1/4.2.")
+ (license license:expat))))
+
(define-public smithwaterman
(let ((commit "2610e259611ae4cde8f03c72499d28f03f6d38a7"))
(package
Efraim Flashner wrote on 29 Jan 2023 14:12
[PATCH 5/5] gnu: vcflib: Update to 1.0.6.
(address . 61148@debbugs.gnu.org)(name . Efraim Flashner)(address . efraim@flashner.co.il)
c7d697d5ebaec22ad98468b67eff00e2e3663203.1674997470.git.efraim@flashner.co.il
* gnu/packages/bioinformatics.scm (vcflib): Update to 1.0.6.
[source]: Adjust snippet to use unbundled wfa2-lib include directory.
Also unbundle simde, wfa2-lib. Remove googletest from unbundle list, it
is no longer in use.
[inputs]: Add curl, simde, wfa2-lib.
[native-inputs]: Add pybind11, pandoc when available. Add the sources
for libvcfh.
[arguments]: Adjust configure-flags to build without zig and to use the
unbundled wfa2-lib. Adjust custom 'build-shared-library phase for
changes in the source. Adjust custom 'unpack-submodule-sources for
changes in the source.
---
gnu/packages/bioinformatics.scm | 69 +++++++++++++++++++++++++--------
1 file changed, 52 insertions(+), 17 deletions(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 05a07af7f3..c4eeb6d68f 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -16033,74 +16033,108 @@ (define-public fastahack
(define-public vcflib
(package
(name "vcflib")
- (version "1.0.3")
+ (version "1.0.6")
(source
(origin
(method git-fetch)
(uri (git-reference
(url "https://github.com/vcflib/vcflib")
- (commit (string-append "v" version))))
+ (commit (string-append "v" version))
+ (recursive? #t)))
(file-name (git-file-name name version))
(sha256
- (base32 "1r7pnajg997zdjkf1b38m14v0zqnfx52w7nbldwh1xpbpahb1hjh"))
+ (base32 "0zcs8j3vdajram53srvjmq353f3prqdbn8fvzja4412w4zay79fz"))
(modules '((guix build utils)))
(snippet
#~(begin
(substitute* "CMakeLists.txt"
((".*fastahack.*") "")
((".*smithwaterman.*") "")
+ ;; Also look for fastahack and smithwaterman since
+ ;; we've just unbundled them.
(("(pkg_check_modules\\(TABIXPP)" text)
(string-append
"pkg_check_modules(FASTAHACK REQUIRED fastahack)\n"
"pkg_check_modules(SMITHWATERMAN REQUIRED smithwaterman)\n"
text))
+ ;; Also link vcflib to fastahack and smithwaterman.
(("\\$\\{TABIXPP_LIBRARIES\\}" text)
(string-append "${FASTAHACK_LIBRARIES} "
"${SMITHWATERMAN_LIBRARIES} "
- text)))
+ text))
+ ;; Honor setting WFA_INCLUDE_DIRS and not look at
+ ;; PREFIX/include/wfa2lib.
+ (((string-append "\\$\\{CMAKE_INSTALL_PREFIX\\}/"
+ "\\$\\{CMAKE_INSTALL_INCLUDEDIR\\}/wfa2lib"))
+ "${WFA_INCLUDE_DIRS}"))
(substitute* (find-files "." "\\.(h|c)(pp)?$")
(("\"SmithWatermanGotoh.h\"") "<smithwaterman/SmithWatermanGotoh.h>")
(("\"convert.h\"") "<smithwaterman/convert.h>")
(("\"disorder.h\"") "<smithwaterman/disorder.h>")
(("Fasta.h") "fastahack/Fasta.h"))
- (for-each delete-file-recursively
- '("fastahack" "filevercmp" "fsom" "googletest" "intervaltree"
- "libVCFH" "multichoose" "smithwaterman"))))))
+ (substitute* "src/Variant.h"
+ (("wavefront/wfa.hpp") "wfa2lib/wavefront/wfa.hpp"))
+ (delete-file-recursively "src/simde")
+ (with-directory-excursion "contrib"
+ (for-each delete-file-recursively
+ '(;"c-progress-bar"
+ "fastahack"
+ "filevercmp"
+ "fsom"
+ "intervaltree"
+ "libVCFH"
+ "multichoose"
+ "smithwaterman"
+ "tabixpp"
+ "WFA2-lib")))))))
(build-system cmake-build-system)
(inputs
(list bzip2
+ curl
htslib
fastahack
perl
python
+ simde
smithwaterman
tabixpp
+ wfa2-lib
xz
zlib))
(native-inputs
`(("pkg-config" ,pkg-config)
+ ("pybind11" ,pybind11)
+ ,@(if (member (%current-system)
+ (package-transitive-supported-systems pandoc))
+ `(("pandoc" ,pandoc))
+ '())
;; Submodules.
- ;; This package builds against the .o files so we need to extract the source.
+ ;; Not all of these packages provide libraries to link against.
("filevercmp-src" ,(package-source filevercmp))
("fsom-src" ,(package-source fsom))
("intervaltree-src" ,(package-source intervaltree))
+ ("libvcfh-src" ,(package-source libvcfh))
("multichoose-src" ,(package-source multichoose))))
(arguments
(list #:configure-flags
- #~(list (string-append
+ #~(list "-DZIG=NO"
+ "-DWFA_GITMODULE=OFF"
+ (string-append "-DWFA_INCLUDE_DIRS="
+ (search-input-directory %build-inputs
+ "include/wfa2lib"))
+ "-DTABIX_FOUND=ON" ; Default to found
+ (string-append
"-DPKG_CONFIG_EXECUTABLE="
(search-input-file
%build-inputs (string-append
"/bin/" #$(pkg-config-for-target)))))
- #:tests? #f ; no tests
+ #:tests? #f ; Tests need more configuring.
#:phases
#~(modify-phases %standard-phases
(add-after 'unpack 'build-shared-library
(lambda _
(substitute* "CMakeLists.txt"
- (("vcflib STATIC") "vcflib SHARED"))
- (substitute* "test/Makefile"
- (("libvcflib.a") "libvcflib.so"))))
+ (("vcflib STATIC") "vcflib SHARED"))))
(add-after 'unpack 'unpack-submodule-sources
(lambda* (#:key inputs native-inputs #:allow-other-keys)
(let ((unpack (lambda (source target)
@@ -16114,10 +16148,11 @@ (define-public vcflib
source
"--strip-components=1")))))))
(and
- (unpack "filevercmp-src" "filevercmp")
- (unpack "fsom-src" "fsom")
- (unpack "intervaltree-src" "intervaltree")
- (unpack "multichoose-src" "multichoose")))))
+ (unpack "filevercmp-src" "contrib/filevercmp")
+ (unpack "fsom-src" "contrib/fsom")
+ (unpack "intervaltree-src" "contrib/intervaltree")
+ (unpack "libvcfh-src" "contrib/libvcfh")
+ (unpack "multichoose-src" "contrib/multichoose")))))
;; This pkg-config file is provided by other distributions.
(add-after 'install 'install-pkg-config-file
(lambda* (#:key outputs #:allow-other-keys)
Ricardo Wurmus wrote on 27 Oct 2023 14:38
[PATCH 0/5] Update vcflib
(address . 61148@debbugs.gnu.org)
874jici6hy.fsf@elephly.net
These patches look good to me.

One note: you can use #$output directly instead of (assoc-ref outputs "out").

--
Ricardo
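
A minimal sketch of that suggestion, applied to the 'symlink-shared-library
phase from patch 2/5. It assumes the phase stays inside the
#~(modify-phases %standard-phases ...) gexp, where #$output is available; it
is an illustration, not a tested revision of the patch.

```scheme
;; Fragment only: meant to sit inside
;;   #:phases #~(modify-phases %standard-phases ...)
;; The phase no longer needs the `outputs' keyword argument, because
;; #$output expands to the store path of the default output.
(add-after 'install 'symlink-shared-library
  (lambda _
    (with-directory-excursion (string-append #$output "/lib")
      (symlink "libtabixpp.so.1" "libtabixpp.so"))))
```

The same substitution would apply to the 'make-pkg-config-file phase and to
the 'install phase of libvcfh, each of which binds `out' from
(assoc-ref outputs "out").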