[PATCH 0/2] Package some dependencies for Argos Translate

Status: Open
Participants: Nguyễn Gia Phong
Owner: unassigned
Submitted by: Nguyễn Gia Phong
Severity: normal
Nguyễn Gia Phong wrote on 14 Mar 09:29 +0100
(address . guix-patches@gnu.org)(name . Nguyễn Gia Phong)(address . mcsinyx@disroot.org)
cover.1710404630.git.mcsinyx@disroot.org
Argos Translate is an offline translation library based on OpenNMT.

Below are some of its dependencies that are trivial to package.
The last one missing is CTranslate2 https://opennmt.net/CTranslate2.

Nguyễn Gia Phong (2):
gnu: Add python-sacremoses.
gnu: Add python-stanza.

gnu/packages/machine-learning.scm | 30 +++++++++++++++++++++++++++
gnu/packages/python-xyz.scm | 34 +++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)


base-commit: 76a3414a1bc500626a9feca013673f994eb51a34
--
2.41.0
Nguyễn Gia Phong wrote on 14 Mar 09:32 +0100
[PATCH 1/2] gnu: Add python-sacremoses.
(address . guix-patches@gnu.org)(name . Nguyễn Gia Phong)(address . mcsinyx@disroot.org)
03cb7e5cac1e4af60d9e655285b76bfd8dbf76c9.1710404630.git.mcsinyx@disroot.org
* gnu/packages/python-xyz.scm (python-sacremoses): New variable.

Change-Id: I2c2cd94c054d7e952ffb4b3afdedd2ee8ce905bf
---
gnu/packages/python-xyz.scm | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index 232b5d69993c..ad33d98db142 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -149,6 +149,7 @@
;;; Copyright © 2024 Timothee Mathieu <timothee.mathieu@inria.fr>
;;; Copyright © 2024 Ian Eure <ian@retrospec.tv>
;;; Copyright © 2024 Adriel Dumas--Jondeau <leirda@disroot.org>
+;;; Copyright © 2024 Nguyễn Gia Phong <mcsinyx@disroot.org>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -21897,6 +21898,39 @@ (define-public python-nltk
reasoning, wrappers for natural language processing libraries.")
(license license:asl2.0)))
+(define-public python-sacremoses
+  (package
+    (name "python-sacremoses")
+    (version "0.1.0")
+    (source (origin
+              (method git-fetch)
+              (uri (git-reference
+                    (url "https://github.com/hplt-project/sacremoses")
+                    (commit version)))
+              (sha256
+               (base32
+                "0g70vchfniknp65n4wnx7chg6g49d4xrz1wagv7f7ir2swdzyn9b"))))
+    (build-system python-build-system)
+    (arguments
+     '(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key tests? #:allow-other-keys)
+             (when tests?
+               ;; Skip truecaser tests which fetch https://norvig.com/big.txt
+               (invoke "python" "-m" "unittest"
+                       "sacremoses/test/test_corpus.py"
+                       "sacremoses/test/test_no_redos_has_numeric_only.py"
+                       "sacremoses/test/test_normalizer.py"
+                       "sacremoses/test/test_tokenizer.py")))))))
+    (propagated-inputs
+     (list python-click-7 python-joblib python-regex python-tqdm))
+    (home-page "https://github.com/hplt-project/sacremoses")
+    (synopsis "Natural language tokenizer, truecaser and normalizer")
+    (description "SacreMoses is a Python port of Moses'
+tokenizer, detokenizer, truecaser and punctuation normalizer.")
+    (license license:expat)))
+
(define-public python-pymongo
(package
(name "python-pymongo")
--
2.41.0
Nguyễn Gia Phong wrote on 14 Mar 09:32 +0100
[PATCH 2/2] gnu: Add python-stanza.
(address . guix-patches@gnu.org)(name . Nguyễn Gia Phong)(address . mcsinyx@disroot.org)
d45e620b075a501f144a561a5416ccdeba3a6136.1710404630.git.mcsinyx@disroot.org
* gnu/packages/machine-learning.scm (python-stanza): New variable.

Change-Id: Ibde67dcb8a015b91554f6a1e36dbf5eef0b73f36
---
gnu/packages/machine-learning.scm | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 5c18a2e9d57d..5e403d905c49 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -27,6 +27,7 @@
;;; Copyright © 2024 David Pflug <david@pflug.io>
;;; Copyright © 2024 Timothee Mathieu <timothee.mathieu@inria.fr>
;;; Copyright © 2024 Spencer King <spencer.king@geneoscopy.com>
+;;; Copyright © 2024 Nguyễn Gia Phong <mcsinyx@disroot.org>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -1127,6 +1128,35 @@ (define-public python-spacy
model packaging, deployment and workflow management.")
(license license:expat)))
+(define-public python-stanza
+  (package
+    (name "python-stanza")
+    (version "1.8.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "stanza" version))
+       (sha256
+        (base32 "1drq9wyafisnf44jgby1sh45svp0pj2svb01v397i9h0bczc5i08"))))
+    (build-system python-build-system)
+    (propagated-inputs (list python-emoji
+                             python-numpy
+                             python-protobuf
+                             python-requests
+                             python-networkx
+                             python-toml
+                             python-pytorch
+                             python-tqdm))
+    ;; Tests require downloading of datasets.
+    (arguments (list #:tests? #false))
+    (home-page "https://stanfordnlp.github.io/stanza")
+    (synopsis "Stanford NLP Python library for many human languages")
+    (description "Stanza is a collection of accurate and efficient tools
+for the linguistic analysis of many human languages. Starting from raw text,
+Stanza divides it into sentences and words, and then can recognize
+parts of speech and entities, do syntactic analysis, and more.")
+    (license license:asl2.0)))
+
(define-public shogun
(package
(name "shogun")
--
2.41.0

To comment on this conversation send an email to 69794@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it:
  mumi current 69794
Then, you may apply the latest patchset in this issue (with sign-off):
  mumi am -- -s
Or compose a reply to this issue:
  mumi compose
Or send patches to this issue:
  mumi send-email *.patch