[PATCH 0/3] Add Optuna.

Open. Submitted by Vinicius Monego.
Owner: unassigned
Severity: normal
Vinicius Monego wrote on 2 Oct 2021 05:39
(address . guix-patches@gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211002033916.64690-1-monego@posteo.net
This patchset adds the Optuna hyperparameter optimization framework.

Optuna's full test suite requires a number of ML frameworks, not all of which are available in Guix. I deleted the test files that depend on them and added comments naming the missing packages.
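
Besides deleting files, the check phase in the python-optuna patch deselects three individual tests through pytest's -k option. As a rough illustration of how that expression is assembled and what it filters, here is a minimal Python sketch. It is a simplified model of -k (case-insensitive substring matching on test names only), and build_k_expression, is_selected, and the sample test names in the usage note are hypothetical, not part of the patch or of pytest.

```python
# Simplified model of pytest's -k deselection.  Real pytest also matches
# module names, class names, and markers, which this sketch ignores.

# Test names taken from the check phase of the python-optuna patch.
EXCLUDED = [
    "test_retry_failed_trial_callback",
    "test_failed_trial_callback",
    "test_fail_stale_trials_with_optimize",
]


def build_k_expression(names):
    """Join the names the way the patch's string-append call does."""
    return " and ".join(f"not {name}" for name in names)


def is_selected(test_name, excluded=EXCLUDED):
    """Return True if none of the excluded substrings occur in test_name."""
    lowered = test_name.lower()
    return not any(name.lower() in lowered for name in excluded)


k_expression = build_k_expression(EXCLUDED)
```

Under this model, is_selected("test_failed_trial_callback[redis]") is false because the excluded name is a substring of the parametrized test ID, while an unrelated name such as "test_study_optimize" stays selected.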

I packaged Skorch in the series, but it is an optional dependency that is only used in the integration tests, which are completely disabled. The latest version of scikit-optimize (0.8.1), another optional dependency, is incompatible with the scikit-learn version in Guix (0.24.x), so I am waiting for the next release to package it. The other dependencies are trickier to package.
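
The python-optuna patch deals with the optional dependencies it cannot satisfy by blanking every setup.py line that mentions them (the substitute* call in the dont-check-deps phase). The Python sketch below approximates that effect. It is only an illustration: SAMPLE_SETUP is a made-up excerpt rather than Optuna's real setup.py, prune_requirements is a hypothetical helper, and the sketch drops whole lines without trying to reproduce exactly how substitute* treats the trailing newline.

```python
import re

# Subset of the optional dependencies named in the patch's substitute* call.
UNAVAILABLE = ["bokeh", "plotly", "xgboost"]

# Hypothetical excerpt; the real setup.py is longer and laid out differently.
SAMPLE_SETUP = '''\
install_requires = [
    "alembic",
    "bokeh<2.0.0",
    "numpy",
    "plotly>=4.0.0",
    "xgboost",
]
'''


def prune_requirements(text, names):
    """Drop every line that mentions one of the given names."""
    kept = []
    for line in text.splitlines(keepends=True):
        # Like the ".*name.*" patterns, this matches a substring anywhere
        # on the line, not a whole requirement name.
        if any(re.search(re.escape(name), line) for name in names):
            continue
        kept.append(line)
    return "".join(kept)


pruned = prune_requirements(SAMPLE_SETUP, UNAVAILABLE)
```

Because the match is on substrings, a short pattern reaches further than its name suggests: a pattern for "torch" also removes any line that mentions "torchvision", which is worth keeping in mind when some of these dependencies are packaged and re-enabled.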

The executable in bin/ throws an error that comes from pygobject, which I couldn't update. Although there is a CLI, Optuna is mainly used as a library, which is why I prefixed the package name with python-.

Vinicius Monego (3):
gnu: Add python-cma.
gnu: Add python-skorch.
gnu: Add python-optuna.

gnu/packages/machine-learning.scm | 166 ++++++++++++++++++++++++++++++
1 file changed, 167 insertions(+)

--
2.30.2
Vinicius Monego wrote on 2 Oct 2021 05:41
[PATCH 1/3] gnu: Add python-cma.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211002034117.64876-1-monego@posteo.net
* gnu/packages/machine-learning.scm (python-cma): New variable.
---
gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index e34de5df43..281c88d6f8 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1150,6 +1150,33 @@ good at identifying feature interactions that are normally overlooked by
 standard feature selection algorithms.")
     (license license:expat)))
 
+(define-public python-cma
+  (package
+    (name "python-cma")
+    (version "3.1.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "cma" version))
+       (sha256
+        (base32 "1ip32lnilbhmv1fyvmmdn5rcf084c0ps4q9dr3cf2ax5wdzhg0rv"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "cma.test")))))))
+    (propagated-inputs
+     `(("python-numpy" ,python-numpy)))
+    (home-page "https://github.com/CMA-ES/pycma")
+    (synopsis "Python implementation of CMA-ES")
+    (description "@code{pycma} is a Python implementation of CMA-ES and a few
+related numerical optimization tools.")
+    (license license:bsd-3)))
+
 (define-public python-cmaes
   (package
     (name "python-cmaes")
-- 
2.30.2
Vinicius Monego wrote on 2 Oct 2021 05:41
[PATCH 2/3] gnu: Add python-skorch.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211002034117.64876-2-monego@posteo.net
* gnu/packages/machine-learning.scm (python-skorch): New variable.
---
gnu/packages/machine-learning.scm | 42 +++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 281c88d6f8..fd3e6b2090 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1050,6 +1050,48 @@ number of threads used in the threadpool-backed of common native libraries used
 for scientific computing and data science (e.g. BLAS and OpenMP).")
     (license license:bsd-3)))
 
+(define-public python-skorch
+  (package
+    (name "python-skorch")
+    (version "0.10.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "skorch" version))
+       (sha256
+        (base32 "196hr0q5nw1nzckwanfv27myasayfdxxhx80iv9whm7675rzj44r"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "pytest" "--pyargs" "skorch" "-k"
+                       (string-append
+                        ;; Errors because of missing weight and pickle files.
+                        "not test_load_cuda_params_to_cpu"
+                        " and not test_pickle_load"))))))))
+    (propagated-inputs
+     `(("python-numpy" ,python-numpy)
+       ("python-scikit-learn" ,python-scikit-learn)
+       ("python-scipy" ,python-scipy)
+       ("python-tabulate" ,python-tabulate)
+       ("python-tqdm" ,python-tqdm)))
+    (native-inputs
+     `(("python-flaky" ,python-flaky)
+       ("python-pandas" ,python-pandas)
+       ("python-pytest" ,python-pytest)
+       ("python-pytest-cov" ,python-pytest-cov)
+       ("python-pytorch" ,python-pytorch)))
+    (home-page "https://github.com/skorch-dev/skorch")
+    (synopsis "Scikit-learn compatible neural network library for PyTorch")
+    (description "Skorch is a scikit-learn compatible neural network library
+that wraps PyTorch.  It allows building and training PyTorch models using a
+scikit-learn-like API.")
+    (license license:bsd-3)))
+
 (define-public python-pynndescent
   (package
     (name "python-pynndescent")
-- 
2.30.2
Vinicius Monego wrote on 2 Oct 2021 05:41
[PATCH 3/3] gnu: Add python-optuna.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211002034117.64876-3-monego@posteo.net
* gnu/packages/machine-learning.scm (python-optuna): New variable.
---
gnu/packages/machine-learning.scm | 96 +++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index fd3e6b2090..3b6f709c4e 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -74,6 +74,7 @@
   #:use-module (gnu packages ocaml)
   #:use-module (gnu packages onc-rpc)
   #:use-module (gnu packages parallel)
+  #:use-module (gnu packages openstack)
   #:use-module (gnu packages perl)
   #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages protobuf)
@@ -938,6 +939,101 @@ computing environments.")
     (home-page "http://dlib.net")
     (license license:boost1.0)))
 
+(define-public python-optuna
+  (package
+    (name "python-optuna")
+    (version "2.9.1")
+    (source
+     (origin
+       ;; No tests in the PyPI tarball.
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/optuna/optuna")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1fx80qjrkmnvn2mg9fx26qn3sjlwnwqlmkaf6sqhdw79pn6khlpi"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'dont-check-deps
+           ;; Don't check for dependencies we don't have or don't need.
+           ;; TODO: Package and enable some of these.
+           (lambda _
+             (substitute* "setup.py"
+               ((".*allennlp.*") "")
+               ((".*bokeh.*") "")
+               ((".*catalyst.*") "")
+               ((".*chainer.*") "")
+               ((".*fastai.*") "")
+               ((".*keras.*") "")
+               ((".*lightgbm.*") "")
+               ((".*mlflow.*") "")
+               ((".*mxnet.*") "")
+               ((".*plotly.*") "")
+               ((".*scikit-optimize.*") "")
+               ((".*skorch.*") "")
+               ((".*tensorflow.*") "")
+               ((".*torch.*") "")
+               ((".*xgboost.*") ""))))
+         (add-after 'unpack 'disable-some-tests
+           (lambda _
+             ;; Integration tests require most of the dependencies above.
+             (delete-file-recursively "tests/integration_tests")
+             ;; TODO: Requires scikit-optimize.
+             (delete-file "tests/samplers_tests/test_samplers.py")
+             ;; TODO: Requires bokeh.
+             (delete-file "tests/test_dashboard.py")
+             ;; FIXME: "Optuna" executable is not found.
+             (delete-file "tests/test_cli.py")
+             ;; FIXME: Files below require plotly but setup fails to identify
+             ;; plotly version and suggests an upgrade to >= 4.0.0.
+             (delete-file "tests/visualization_tests/test_contour.py")
+             (delete-file (string-append "tests/multi_objective_tests/"
+                                         "visualization_tests/"
+                                         "test_pareto_front.py"))
+             (delete-file-recursively "tests/visualization_tests")))
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "pytest" "-k"
+                       ;; redis.exceptions.ResponseError: unknown command 'TIME'.
+                       (string-append
+                        "not test_retry_failed_trial_callback"
+                        " and not test_failed_trial_callback"
+                        " and not test_fail_stale_trials_with_optimize"))))))))
+    (propagated-inputs
+     `(("python-alembic" ,python-alembic)
+       ("python-cliff" ,python-cliff)
+       ("python-cmaes" ,python-cmaes)
+       ("python-colorlog" ,python-colorlog)
+       ("python-numpy" ,python-numpy)
+       ("python-packaging" ,python-packaging)
+       ("python-pyyaml" ,python-pyyaml)
+       ("python-scipy" ,python-scipy)
+       ("python-sqlalchemy" ,python-sqlalchemy)
+       ("python-tqdm" ,python-tqdm)))
+    (native-inputs
+     `(("python-cma" ,python-cma)
+       ("python-fakeredis" ,python-fakeredis)
+       ("python-matplotlib" ,python-matplotlib)
+       ("python-mpi4py" ,python-mpi4py)
+       ("python-pandas" ,python-pandas)
+       ("python-pytest" ,python-pytest)
+       ("python-redis" ,python-redis)
+       ("python-scikit-learn" ,python-scikit-learn)
+       ("which" ,which)))
+    (home-page "https://optuna.org/")
+    (synopsis "Hyperparameter optimization framework")
+    (description "Optuna is an automatic hyperparameter optimization software
+framework, particularly designed for machine learning.  It features an
+imperative, @emph{define-by-run} style user API.  Thanks to it, the code
+written with Optuna enjoys high modularity, and the user of Optuna can
+dynamically construct the search spaces for the hyperparameters.")
+    (license license:expat)))
+
 (define-public python-scikit-learn
   (package
     (name "python-scikit-learn")
-- 
2.30.2
Vinicius Monego wrote on 13 Oct 2021 04:00
[PATCH v2 1/4] gnu: Add python-scikit-optimize.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211013020035.53000-1-monego@posteo.net
* gnu/packages/machine-learning.scm (python-scikit-optimize): New variable.
---
scikit-optimize 0.9.0 was released today, so I added it as a patch in the series.

gnu/packages/machine-learning.scm | 38 +++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 029422677a..478e9548e9 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1218,6 +1218,44 @@ main intended application of Autograd is gradient-based optimization.")
 (define-public python2-autograd
   (package-with-python2 python-autograd))
 
+(define-public python-scikit-optimize
+  (package
+    (name "python-scikit-optimize")
+    (version "0.9.0")
+    (source
+     (origin
+       (method git-fetch) ; no tests in PyPI tarball
+       (uri (git-reference
+             (url "https://github.com/scikit-optimize/scikit-optimize")
+             (commit (string-append "v" version))))
+       (sha256
+        (base32 "0hsq6pmryimxc275yrcy4bv217bx7ma6rz0q6m4138bv4zgq18d1"))
+       (file-name (git-file-name name version))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "pytest")))))))
+    (native-inputs
+     `(("python-pytest" ,python-pytest)))
+    (propagated-inputs
+     `(("python-joblib" ,python-joblib)
+       ("python-numpy" ,python-numpy)
+       ("python-pyaml" ,python-pyaml)
+       ("python-scikit-learn" ,python-scikit-learn)
+       ("python-scipy" ,python-scipy)))
+    (home-page "https://scikit-optimize.github.io/")
+    (synopsis "Sequential model-based optimization")
+    (description
+     "Scikit-Optimize, or @code{skopt}, is a library to minimize (very)
+expensive and noisy black-box functions.  It implements several methods
+for sequential model-based optimization.")
+    (license license:bsd-3)))
+
 (define-public lightgbm
   (package
     (name "lightgbm")
-- 
2.30.2
Vinicius Monego wrote on 13 Oct 2021 04:00
[PATCH v2 2/4] gnu: Add python-cma.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211013020035.53000-2-monego@posteo.net
* gnu/packages/machine-learning.scm (python-cma): New variable.
---
gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 478e9548e9..6b7736702a 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1150,6 +1150,33 @@ good at identifying feature interactions that are normally overlooked by
 standard feature selection algorithms.")
     (license license:expat)))
 
+(define-public python-cma
+  (package
+    (name "python-cma")
+    (version "3.1.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "cma" version))
+       (sha256
+        (base32 "1ip32lnilbhmv1fyvmmdn5rcf084c0ps4q9dr3cf2ax5wdzhg0rv"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "cma.test")))))))
+    (propagated-inputs
+     `(("python-numpy" ,python-numpy)))
+    (home-page "https://github.com/CMA-ES/pycma")
+    (synopsis "Python implementation of CMA-ES")
+    (description "@code{pycma} is a Python implementation of CMA-ES and a few
+related numerical optimization tools.")
+    (license license:bsd-3)))
+
 (define-public python-cmaes
   (package
     (name "python-cmaes")
-- 
2.30.2
Vinicius Monego wrote on 13 Oct 2021 04:00
[PATCH v2 3/4] gnu: Add python-skorch.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211013020035.53000-3-monego@posteo.net
* gnu/packages/machine-learning.scm (python-skorch): New variable.
---
Swapped the order of native and propagated inputs.

gnu/packages/machine-learning.scm | 42 +++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 6b7736702a..09767584bb 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1050,6 +1050,48 @@ number of threads used in the threadpool-backed of common native libraries used
 for scientific computing and data science (e.g. BLAS and OpenMP).")
     (license license:bsd-3)))
 
+(define-public python-skorch
+  (package
+    (name "python-skorch")
+    (version "0.10.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "skorch" version))
+       (sha256
+        (base32 "196hr0q5nw1nzckwanfv27myasayfdxxhx80iv9whm7675rzj44r"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "pytest" "--pyargs" "skorch" "-k"
+                       (string-append
+                        ;; Errors because of missing weight and pickle files.
+                        "not test_load_cuda_params_to_cpu"
+                        " and not test_pickle_load"))))))))
+    (native-inputs
+     `(("python-flaky" ,python-flaky)
+       ("python-pandas" ,python-pandas)
+       ("python-pytest" ,python-pytest)
+       ("python-pytest-cov" ,python-pytest-cov)
+       ("python-pytorch" ,python-pytorch)))
+    (propagated-inputs
+     `(("python-numpy" ,python-numpy)
+       ("python-scikit-learn" ,python-scikit-learn)
+       ("python-scipy" ,python-scipy)
+       ("python-tabulate" ,python-tabulate)
+       ("python-tqdm" ,python-tqdm)))
+    (home-page "https://github.com/skorch-dev/skorch")
+    (synopsis "Scikit-learn compatible neural network library for PyTorch")
+    (description "Skorch is a scikit-learn compatible neural network library
+that wraps PyTorch.  It allows building and training PyTorch models using a
+scikit-learn-like API.")
+    (license license:bsd-3)))
+
 (define-public python-pynndescent
   (package
     (name "python-pynndescent")
-- 
2.30.2
Vinicius Monego wrote on 13 Oct 2021 04:00
[PATCH v2 4/4] gnu: Add python-optuna.
(address . 50956@debbugs.gnu.org)(name . Vinicius Monego)(address . monego@posteo.net)
20211013020035.53000-4-monego@posteo.net
* gnu/packages/machine-learning.scm (python-optuna): New variable.
---
Updated to 2.10.0 and added scikit-optimize to native-inputs, so the test file that requires it is no longer deleted. Swapped the order of native and propagated inputs.

gnu/packages/machine-learning.scm | 93 +++++++++++++++++++++++++++++++
1 file changed, 93 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 09767584bb..44033cfb77 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -74,6 +74,7 @@
   #:use-module (gnu packages ocaml)
   #:use-module (gnu packages onc-rpc)
   #:use-module (gnu packages parallel)
+  #:use-module (gnu packages openstack)
   #:use-module (gnu packages perl)
   #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages protobuf)
@@ -938,6 +939,98 @@ computing environments.")
     (home-page "http://dlib.net")
     (license license:boost1.0)))
 
+(define-public python-optuna
+  (package
+    (name "python-optuna")
+    (version "2.10.0")
+    (source
+     (origin
+       (method git-fetch) ; no tests in PyPI tarball
+       (uri (git-reference
+             (url "https://github.com/optuna/optuna")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "0fha0pwxq6n3mbpvpz3vk8hh61zqncj5cnq063kzfl5d8rd48vcd"))))
+    (build-system python-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'dont-check-deps
+           ;; Don't check for test dependencies we don't have.
+           ;; TODO: Package and enable some of these.
+           (lambda _
+             (substitute* "setup.py"
+               ((".*allennlp.*") "")
+               ((".*bokeh.*") "")
+               ((".*catalyst.*") "")
+               ((".*chainer.*") "")
+               ((".*fastai.*") "")
+               ((".*keras.*") "")
+               ((".*lightgbm.*") "")
+               ((".*mlflow.*") "")
+               ((".*mxnet.*") "")
+               ((".*plotly.*") "")
+               ((".*skorch.*") "")
+               ((".*tensorflow.*") "")
+               ((".*torch.*") "")
+               ((".*xgboost.*") ""))))
+         (add-after 'unpack 'disable-some-tests
+           (lambda _
+             ;; Integration tests require most of the dependencies above.
+             (delete-file-recursively "tests/integration_tests")
+             ;; TODO: Requires bokeh.
+             (delete-file "tests/test_dashboard.py")
+             ;; FIXME: "Optuna" executable is not found.
+             (delete-file "tests/test_cli.py")
+             ;; FIXME: Files below require plotly but setup fails to identify
+             ;; plotly version and suggests an upgrade to >= 4.0.0.
+             (delete-file "tests/visualization_tests/test_contour.py")
+             (delete-file (string-append "tests/multi_objective_tests/"
+                                         "visualization_tests/"
+                                         "test_pareto_front.py"))
+             (delete-file-recursively "tests/visualization_tests")))
+         (replace 'check
+           (lambda* (#:key inputs outputs tests? #:allow-other-keys)
+             (when tests?
+               (add-installed-pythonpath inputs outputs)
+               (invoke "python" "-m" "pytest" "-k"
+                       ;; redis.exceptions.ResponseError: unknown command 'TIME'.
+                       (string-append
+                        "not test_retry_failed_trial_callback"
+                        " and not test_failed_trial_callback"
+                        " and not test_fail_stale_trials_with_optimize"))))))))
+    (native-inputs
+     `(("python-cma" ,python-cma)
+       ("python-fakeredis" ,python-fakeredis)
+       ("python-matplotlib" ,python-matplotlib)
+       ("python-mpi4py" ,python-mpi4py)
+       ("python-pandas" ,python-pandas)
+       ("python-pytest" ,python-pytest)
+       ("python-redis" ,python-redis)
+       ("python-scikit-learn" ,python-scikit-learn)
+       ("python-scikit-optimize" ,python-scikit-optimize)
+       ("which" ,which)))
+    (propagated-inputs
+     `(("python-alembic" ,python-alembic)
+       ("python-cliff" ,python-cliff)
+       ("python-cmaes" ,python-cmaes)
+       ("python-colorlog" ,python-colorlog)
+       ("python-numpy" ,python-numpy)
+       ("python-packaging" ,python-packaging)
+       ("python-pyyaml" ,python-pyyaml)
+       ("python-scipy" ,python-scipy)
+       ("python-sqlalchemy" ,python-sqlalchemy)
+       ("python-tqdm" ,python-tqdm)))
+    (home-page "https://optuna.org/")
+    (synopsis "Hyperparameter optimization framework")
+    (description "Optuna is an automatic hyperparameter optimization software
+framework, particularly designed for machine learning.  It features an
+imperative, @emph{define-by-run} style user API.  Thanks to it, the code
+written with Optuna enjoys high modularity, and the user of Optuna can
+dynamically construct the search spaces for the hyperparameters.")
+    (license license:expat)))
+
 (define-public python-scikit-learn
   (package
     (name "python-scikit-learn")
-- 
2.30.2