[PATCH] import: Generate list of importers based on available modules

Status: Open
Participants: Sarah Morgensen, Maxime Devos, pinoaffe, zimoun
Owner: unassigned
Submitted by: pinoaffe
Severity: normal
pinoaffe wrote on 23 Sep 2021 14:24
(address . guix-patches@gnu.org)
87tuibh43w.fsf@airmail.cc
* guix/scripts/import.scm (importers): Generate a list of all importers by
looping over available guile modules, allowing for extensibility.
---
guix/scripts/import.scm | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
index 40fa6759ae..44cbaf13d6 100644
--- a/guix/scripts/import.scm
+++ b/guix/scripts/import.scm
@@ -23,6 +23,7 @@

 (define-module (guix scripts import)
   #:use-module (guix ui)
+  #:use-module (guix discovery)
   #:use-module (guix scripts)
   #:use-module (guix utils)
   #:use-module (srfi srfi-1)
@@ -78,9 +79,11 @@ rather than \\n."
 ;;; Entry point.
 ;;;

-(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
-                    "gem" "go" "cran" "crate" "texlive" "json" "opam"
-                    "minetest"))
+(define importers (map (lambda (module)
+                         (symbol->string (caddr (module-name module))))
+                       (all-modules (map (lambda (entry)
+                                           `(,entry . "guix/import"))
+                                         %load-path))))

 (define (resolve-importer name)
   (let ((module (resolve-interface
--
2.32.0
Sarah Morgensen wrote on 23 Sep 2021 20:07
(name . pinoaffe)(address . pinoaffe@airmail.cc)(address . 50755@debbugs.gnu.org)
86pmszkvwz.fsf@mgsn.dev
Hello,

This looks like a good improvement! Thanks for submitting the patch.
Just reading this, I have a couple of comments.

pinoaffe <pinoaffe@airmail.cc> writes:

> * guix/scripts/import.scm (importers): Generate a list of all importers by
> looping over available guile modules, allowing for extensibility.
> ---
> guix/scripts/import.scm | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
> index 40fa6759ae..44cbaf13d6 100644
> --- a/guix/scripts/import.scm
> +++ b/guix/scripts/import.scm
> @@ -23,6 +23,7 @@
>
> (define-module (guix scripts import)
> #:use-module (guix ui)
> + #:use-module (guix discovery)
> #:use-module (guix scripts)
> #:use-module (guix utils)
> #:use-module (srfi srfi-1)
> @@ -78,9 +79,11 @@ rather than \\n."
> ;;; Entry point.
> ;;;
>
> -(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
> - "gem" "go" "cran" "crate" "texlive" "json" "opam"
> - "minetest"))
> +(define importers (map (lambda (module)
> + (symbol->string (caddr (module-name module))))

Prefer ice-9 'match'/'match-lambda' over 'car'/'cadr'/'caddr'/etc, or if
necessary, SRFI-1 'first', 'second', ..., 'last'.
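
As an illustration of this suggestion, here is a minimal sketch comparing positional access with pattern matching. The module names in `names` and the helper names `importer-name/caddr` and `importer-name/match` are hypothetical and only stand in for what 'module-name' would return; they are not part of the patch.

```scheme
(use-modules (ice-9 match)
             (srfi srfi-1))

;; Hypothetical module names, standing in for the result of 'module-name'.
(define names
  '((guix import cran)
    (guix import pypi)
    (guix utils)))

;; Positional access: picks the third element without checking the shape
;; of the list, and raises an error on a list that is too short.
(define (importer-name/caddr name)
  (symbol->string (caddr name)))

;; Pattern matching: the expected shape is spelled out, and '_' is the
;; wildcard that catches everything else.  A literal such as #t in that
;; position would only match the value #t itself.
(define importer-name/match
  (match-lambda
    (('guix 'import importer) (symbol->string importer))
    (_ #f)))

;; Keep only the names that have the expected shape.
(define importer-names
  (filter-map importer-name/match names))
```

With 'match-lambda', a name such as (guix utils) is skipped rather than causing an error, which is what 'filter-map' relies on here.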

> + (all-modules (map (lambda (entry)
> + `(,entry . "guix/import"))
should this be guix/scripts/import? ^

> + %load-path))))
>
> (define (resolve-importer name)
> (let ((module (resolve-interface

--
Sarah
pinoaffe wrote on 23 Sep 2021 23:17
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)(address . 50755@debbugs.gnu.org)
87zgs3oute.fsf@airmail.cc
Hi, thanks for the comments!

Sarah Morgensen writes:
> Prefer ice-9 'match'/'match-lambda' over 'car'/'cadr'/'caddr'/etc, or if
> necessary, SRFI-1 'first', 'second', ..., 'last'.
okay, I'll change this

> should this be guix/scripts/import? ^
It definitely should be, oopsidaisy :)

will send a new patch in a sec
pinoaffe wrote on 23 Sep 2021 23:19
[PATCH v2] import: Generate list of importers based on available modules
(address . 50755@debbugs.gnu.org)
87wnn7ouq3.fsf@airmail.cc
* guix/scripts/import.scm (importers): Generate a list of all importers by
looping over available guile modules, allowing for extensibility.
---
guix/scripts/import.scm | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
index 40fa6759ae..ed702d3bff 100644
--- a/guix/scripts/import.scm
+++ b/guix/scripts/import.scm
@@ -23,6 +23,7 @@

 (define-module (guix scripts import)
   #:use-module (guix ui)
+  #:use-module (guix discovery)
   #:use-module (guix scripts)
   #:use-module (guix utils)
   #:use-module (srfi srfi-1)
@@ -78,9 +79,14 @@ rather than \\n."
 ;;; Entry point.
 ;;;

-(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
-                    "gem" "go" "cran" "crate" "texlive" "json" "opam"
-                    "minetest"))
+(define importers (filter-map (lambda (module)
+                                (match (module-name module)
+                                  (`(guix scripts import ,importer)
+                                   (symbol->string importer))
+                                  ( #t #f)))
+                              (all-modules (map (lambda (entry)
+                                                  `(,entry . "guix/scripts/import"))
+                                                %load-path))))

 (define (resolve-importer name)
   (let ((module (resolve-interface
--
2.32.0
zimoun wrote on 27 Sep 2021 16:28
(name . pinoaffe)(address . pinoaffe@airmail.cc)(address . 50755@debbugs.gnu.org)
CAJ3okZ3FtEN8ytJqjv0HiCkpgMxp2KbofUB07i3aV-3c7c3SOA@mail.gmail.com
Hi,

Thanks. Two comments.

On Thu, 23 Sept 2021 at 23:20, pinoaffe <pinoaffe@airmail.cc> wrote:

> -(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
> - "gem" "go" "cran" "crate" "texlive" "json" "opam"
> - "minetest"))
> +(define importers (filter-map (lambda (module)
> + (match (module-name module)
> + (`(guix scripts import ,importer)
> + (symbol->string importer))
> + ( #t #f)))
> + (all-modules (map (lambda (entry)
> + `(,entry . "guix/scripts/import"))
> + %load-path))))
>
> (define (resolve-importer name)
> (let ((module (resolve-interface

First, I think it breaks "guix import --help", so this patch
needs a v3. :-)

Second, what is the average extra time added with a cold cache? On my
machine, with a hot cache, I get:

$ time guix import cran -h

real 0m0.113s
user 0m0.110s
sys 0m0.025s

$ time ./pre-inst-env guix import cran -h

real 0m0.470s
user 0m0.529s
sys 0m0.054s

which is noticeable. With a cold cache, it is:

real 0m10.438s
user 0m0.164s
sys 0m0.082s

vs

real 0m12.226s
user 0m0.897s
sys 0m0.190s

but these numbers are not very meaningful because the variability is
high; hence my question about the average. :-)

Because of 'filter-map', it walks all the modules, so there is a
performance loss. The question is: how much performance loss is
acceptable here?
In other words, is the code improvement worth the performance decrease?

All the best,
simon
pinoaffe wrote on 27 Sep 2021 20:20
(address . 50755@debbugs.gnu.org)
87mtnxswwq.fsf@airmail.cc
* guix/scripts/import.scm (importers): Generate a list of all importers by
looping over available guile modules, allowing for extensibility.
---
guix/scripts/import.scm | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
index 40fa6759ae..71ee8cc00b 100644
--- a/guix/scripts/import.scm
+++ b/guix/scripts/import.scm
@@ -23,6 +23,7 @@

 (define-module (guix scripts import)
   #:use-module (guix ui)
+  #:use-module (guix discovery)
   #:use-module (guix scripts)
   #:use-module (guix utils)
   #:use-module (srfi srfi-1)
@@ -78,9 +79,15 @@ rather than \\n."
 ;;; Entry point.
 ;;;

-(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
-                    "gem" "go" "cran" "crate" "texlive" "json" "opam"
-                    "minetest"))
+(define importers (delete-duplicates
+                   (filter-map (lambda (module)
+                                 (match (module-name module)
+                                   (`(guix scripts import ,importer)
+                                    (symbol->string importer))
+                                   ( #t #f)))
+                               (all-modules (map (lambda (entry)
+                                                   `(,entry . "guix/scripts/import"))
+                                                 %load-path)))))

 (define (resolve-importer name)
   (let ((module (resolve-interface
--
2.32.0

Date: Mon, 27 Sep 2021 20:20:21 +0200
Message-ID: <87o88dswwq.fsf@airmail.cc>
pinoaffe wrote on 27 Sep 2021 20:27
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 50755@debbugs.gnu.org)
87ilylswkt.fsf@airmail.cc
Hi,

thank you for your feedback!

zimoun writes:
> First, I think, it breaks "guix import --help". Therefore, this patch
> needs a v3. :-)
I sent a v3, thanks :)

> The question is: which performance loss is acceptable here?
To me, a performance loss similar to the one you describe would be
acceptable, particularly since it should be a constant performance hit
each time ~guix import~ is called, i.e. it won't significantly
impact long-running import commands. However, I think I may be somewhat
biased on this one, so it'd be great if others could weigh in :)

Kind regards,
pinoaffe
zimoun wrote on 27 Sep 2021 22:09
Re: [bug#50755] [PATCH v3] import: Generate list of importers based on available modules
(name . pinoaffe)(address . pinoaffe@airmail.cc)(address . 50755@debbugs.gnu.org)
CAJ3okZ05LLzBLTY_srqarf1LG-tq2sN7wJmg3UaU7fxFBMovHQ@mail.gmail.com
Hi,

On Mon, 27 Sept 2021 at 20:21, pinoaffe <pinoaffe@airmail.cc> wrote:

> +(define importers (delete-duplicates

This fixes my first point...

> + (filter-map (lambda (module)
> + (match (module-name module)
> + (`(guix scripts import ,importer)
> + (symbol->string importer))
> + ( #t #f)))
> + (all-modules (map (lambda (entry)
> + `(,entry . "guix/scripts/import"))
> + %load-path)))))

...and it means it is walking more than needed. Therefore, what is
the performance loss?

For instance, on my machine with a hot cache, it is 4x slower. And this
readability improvement is not worth it, IMHO.
With a cold cache, I do not have meaningful numbers because that requires
running it several times and then computing an average. What are the
numbers on your machine?

All the best,
simon
Maxime Devos wrote on 28 Sep 2021 11:51
(address . 50755@debbugs.gnu.org)
4ddbb2adfa86c9ed2e1cf01ad5c1d0129553cbae.camel@telenet.be
zimoun schreef op ma 27-09-2021 om 22:09 [+0200]:
> Hi,
>
> On Mon, 27 Sept 2021 at 20:21, pinoaffe <pinoaffe@airmail.cc> wrote:
>
> > +(define importers (delete-duplicates
>
> This fixes my first point...
>
> > + (filter-map (lambda (module)
> > + (match (module-name module)
> > + (`(guix scripts import ,importer)
> > + (symbol->string importer))
> > + ( #t #f)))
> > + (all-modules (map (lambda (entry)
> > + `(,entry . "guix/scripts/import"))
> > + %load-path)))))
>
> ...and it means it is walking more than needed. Therefore, what is
> the performance loss?
>
> For instance, on my machine and hot cache, it is 4x slower.

FWIW, calling ./pre-inst-env guix ... will always be slower than guix ...,
because the former will have a longer %load-path (IIRC) and possibly
other reasons.

Using "guix pull --profile=..." and then "time .../guix import ..." might be a better test.

To measure only the time required for defining 'importers', wrap delete-duplicates
in a call to 'time' from (ice-9 time).
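
A minimal sketch of that measurement technique, assuming a stand-in computation: `compute-importers` and its data are hypothetical placeholders for the real expression from the patch, so the sketch runs in plain Guile without Guix.

```scheme
(use-modules (ice-9 time)
             (srfi srfi-1))

;; Hypothetical stand-in for the expression that builds the importer list.
(define (compute-importers)
  (delete-duplicates '("cran" "pypi" "cran" "gem")))

;; 'time' evaluates the expression, prints one line of timings
;; (clock, utime, stime, cutime, cstime, gctime) and returns the
;; value of the expression, so it can wrap a definition directly.
(define importers
  (time (compute-importers)))
```

Because 'time' returns the value of the wrapped expression, the surrounding code keeps working unchanged while the timings are printed.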

Greetings,
Maxime.
pinoaffe wrote on 28 Sep 2021 16:39
(name . Maxime Devos)(address . maximedevos@telenet.be)
87k0j0hii3.fsf@airmail.cc
Hi,

Maxime Devos writes:
> To only measure the time required for defining 'importers', wrap
> delete-duplicates in a call to 'time' from (ice-9 time).

Running

(time (for-each (lambda (_)
                  (delete-duplicates (filter-map (lambda (module)
                                                   (match (module-name module)
                                                     (`(guix scripts import ,importer)
                                                      (symbol->string importer))
                                                     (#t #f)))
                                                 (all-modules (map (lambda (entry)
                                                                     `(,entry . "guix/scripts/import"))
                                                                   %load-path)))))
                (iota 1000)))

in a guix repl on my system results in

clock utime stime cutime cstime gctime
0.96 1.67 0.07 0.00 0.00 1.19

If I'm interpreting that correctly, that would amount to a couple of
thousandths of a second per run.

Kind regards,
pinoaffe
Maxime Devos wrote on 29 Sep 2021 22:59
(name . pinoaffe)(address . pinoaffe@airmail.cc)
a78030ec4daefc13a4ca7c6b9502e4660b127bf6.camel@telenet.be
pinoaffe schreef op di 28-09-2021 om 16:39 [+0200]:
> Hi,
>
> Maxime Devos writes:
> > To only measure the time required for defining 'importers', wrap
> > delete-duplicates in a call to 'time' from (ice-9 time).
>
> Running
>
> (time (for-each (lambda (_)
> (delete-duplicates (filter-map (lambda (module)
> (match (module-name module)
> (`(guix scripts import ,importer)
> (symbol->string importer))
> (#t #f)))
> (all-modules (map (lambda (entry)
> `(,entry . "guix/scripts/import"))
> %load-path)))))
> (iota 1000)))
>
> in a guix repl on my system results in
>
> clock utime stime cutime cstime gctime
> 0.96 1.67 0.07 0.00 0.00 1.19
>
> If I'm interpreting that correctly, that would amount to a couple of
> thousandths of a second per run.

These numbers turn out to be misleading, because 'scheme-modules'
(indirectly called from all-modules) calls 'resolve-interface' on every module
name. For a module name, the first 'resolve-module' incurs disk I/O and some
CPU for loading the module, but the second 'resolve-module' on the same module
name would be free, as the module is already loaded.
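
The caching effect described above can be observed in plain Guile with a small sketch. The choice of (ice-9 ftw) is only an assumption: it is an example of a module that a fresh Guile process has usually not loaded yet, and any other not-yet-loaded module would do.

```scheme
(use-modules (ice-9 time))

;; First lookup: the module may have to be found on disk and loaded.
(define first-lookup
  (time (resolve-module '(ice-9 ftw))))

;; Second lookup: the module is already loaded, so nothing is read again.
(define second-lookup
  (time (resolve-module '(ice-9 ftw))))

;; Both lookups return the very same module object.
(define same-module?
  (eq? first-lookup second-lookup))
```

This is why repeating the measurement 1000 times in one process mostly times the cheap lookups and hides the cost of the first one.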

Greetings,
Maxime
pinoaffe wrote on 30 Sep 2021 10:17
(name . Maxime Devos)(address . maximedevos@telenet.be)
87r1d6zdcj.fsf@airmail.cc
Maxime Devos writes:
> These numbers turn out to be misleading, because 'scheme-modules'
> (indirectly called from all-modules) calls 'resolve-interface' on every module
> name. For a module name, the first 'resolve-module' incurs disk I/O and some
> CPU for loading the module, but the second 'resolve-module' on the same module
> name would be free, as the module is already loaded.
okay, the first incantation of

(time (for-each (lambda (_)
                  (delete-duplicates (filter-map (lambda (module)
                                                   (match (module-name module)
                                                     (`(guix scripts import ,importer)
                                                      (symbol->string importer))
                                                     (#t #f)))
                                                 (all-modules (map (lambda (entry)
                                                                     `(,entry . "guix/scripts/import"))
                                                                   %load-path)))))
                (iota 1)))

on a "fresh" guix repl on my system results in

clock utime stime cutime cstime gctime
1.28 0.76 0.13 0.00 0.00 0.16

which is indeed a significant amount of time, though I don't think it'd
be much of an issue considering that it's not likely that users will run
lots of `guix import` shell commands in rapid succession.

kind regards,
pinoaffe
Maxime Devos wrote on 30 Sep 2021 10:37
(name . pinoaffe)(address . pinoaffe@airmail.cc)
b49ef3ad39319d3d9057b1a69c62dee2c0ce1bbb.camel@telenet.be
pinoaffe schreef op do 30-09-2021 om 10:17 [+0200]:
> Maxime Devos writes:
> > These numbers turn out to be misleading, because 'scheme-modules'
> > (indirectly called from all-modules) calls 'resolve-interface' on every module
> > name. For a module name, the first 'resolve-module' incurs disk I/O and some
> > CPU for loading the module, but the second 'resolve-module' on the same module
> > name would be free, as the module is already loaded.
> okay, the first incantation of
>
> (time (for-each (lambda (_)
> (delete-duplicates (filter-map (lambda (module)
> (match (module-name module)
> (`(guix scripts import ,importer)
> (symbol->string importer))
> (#t #f)))
> (all-modules (map (lambda (entry)
> `(,entry . "guix/scripts/import"))
> %load-path)))))
> (iota 1)))
>
> on a "fresh" guix repl on my system results in
>
> clock utime stime cutime cstime gctime
> 1.28 0.76 0.13 0.00 0.00 0.16

On my fresh guix repl, it's a bit longer:

clock utime stime cutime cstime gctime
9.54 1.79 0.31 0.00 0.00 0.53

(9 or 10 seconds)

If I restart the guix repl and run it again,
I get about half a second:

clock utime stime cutime cstime gctime
0.47 0.57 0.02 0.00 0.00 0.19

> which is indeed a significant amount of time, though I don't think it'd
> be much of an issue considering that it's not likely that users will run
> lots of `guix import` shell commands in rapid succession.

The list of importers is only needed for two purposes, right?

1. to print a list of importers when "guix import --help" is run
2. to verify the string actually specifies an importer

Then 'guix import SOME-IMPORTER STUFF' could be optimised:

resolve-importer and guix-import could be modified to skip the validation
step and let resolve-importer print the error if the module couldn't be
found. Possibly (resolve-module '(the possibly undefined module) #:ensure #f)
might be useful. Then 'importers' would only be required for purpose (1),
so it could be wrapped in a promise, such that if "guix import some-importer stuff"
is called, only the required importer module is loaded.
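
The promise idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual change: `scan-importers`, `scan-count`, `%importers` and `guix-import-sketch` are hypothetical names, and the hard-coded list stands in for the expensive module walk.

```scheme
(use-modules (ice-9 match))

;; Counts how often the expensive scan runs (for illustration only).
(define scan-count 0)

;; Hypothetical stand-in for walking all importer modules.
(define (scan-importers)
  (set! scan-count (+ scan-count 1))
  '("cran" "gem" "pypi"))

;; Wrapping the list in a promise: nothing is scanned at definition time.
(define %importers
  (delay (scan-importers)))

;; Only the help path forces the promise.  The importer path would
;; resolve just the one requested module and never touches the list.
(define (guix-import-sketch . args)
  (match args
    (("--help")
     (force %importers))
    ((importer . _)
     importer)))
```

Forcing a promise a second time reuses the stored value, so even repeated help requests in one process would scan at most once.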

Greetings,
Maxime.
zimoun wrote on 11 Oct 2021 13:51
(name . Maxime Devos)(address . maximedevos@telenet.be)
CAJ3okZ3WfJeN0=U4JuZU-2sQ9KAdq07jaVxy_tAWAa3YKHx-tQ@mail.gmail.com
Hi,

On Thu, 30 Sept 2021 at 10:37, Maxime Devos <maximedevos@telenet.be> wrote:

> The list of importers is only needed for two purposes, right?
>
> 1. to print a list of importers when "guix import --help" is run
> 2. to verify the string actually specifies an importer
>
> Then 'guix import SOME-IMPORTER STUFF' could be optimised:
>
> resolve-importer and guix-import could be modified to skip the validation
> step and let resolve-importer print the error if the module couldn't be
> found. Possibly (resolve-module '(the possibly undefined module) #:ensure #f)
> might be useful. Then 'importers' would only be required for purpose (1),
> so it could be wrapped in a promise, such that if "guix import some-importer stuff"
> is called, only the required importer module is loaded.

My comment is about elegance versus performance loss. On old
machines, Guix is becoming impractical for many operations (almost all
operations, indeed) and I would rather not add another slowdown. I am
fine with sacrificing some performance if it is worth it. However, the
question is always: what is gained and what is lost? Here the gain is
a little code elegance against a performance loss for the end user. The
question: is it worth it? From my point of view, no, this change is
not worth it. For what my opinion is worth here. ;-)

Cheers,
simon