`guix environment` performance issues

OpenSubmitted by Lars-Dominik Braun.
Details
2 participants
  • Lars-Dominik Braun
  • Ludovic Courtès
Owner
unassigned
Severity
important
L
L
Lars-Dominik Braun wrote on 4 Jun 10:23 +0200
(address . bug-guix@gnu.org)
20200604082316.GA3146@zpidnp36
Hi,
I’ve noticed that `guix environment` can be very very slow for some packages.Whereas usually a call like
time guix environment --ad-hoc <package> -- true
takes 600ms to 1.5s, it takes 8.4s for the package r-learnr on my Ryzen 5 3600with NVMe SSD, 32G of RAM and a warm cache. This seems to accumulate whentrying to install multiple unrelated “slow” packages into the same environment,i.e. jupyter (2.9s) + r-rlearnr (8.4s) + r-stm (2.9s) = 12.1s. What thesepackages seem to have in common is a deep chain of propagated inputs.
And in fact, if I create artificial test packages where test-999 propagatestest-998, test-998→test-997 and so on the command above takes 1m 9s. A singlepackage propagating 999 packages takes 4.5s.
Instrumenting guix-environment’s with-store call with statprof I see a lot oftime is spent in the garbage collector:
Total time: 24.672604202 seconds (19.431122691 seconds in GC)./pre-inst-env guix environment --ad-hoc r-learnr -- true 25,18s user 0,24s system 308% cpu 8,248 total
More specifically in an anonymous function and reap-pipes, which is a gc hook,I believe:
% cumulative selftime seconds seconds calls procedure 33.41 14.49 8.24 anon #xbb8480 27.95 6.90 6.90 ice-9/popen.scm:145:0:reap-pipes 4.37 1.08 1.08 anon #xbbdcd8 3.28 0.86 0.81 ice-9/vlist.scm:539:0:vhash-assq 2.40 2.37 0.59 guix/grafts.scm:202:22
Not sure if that helps tracking the issue down though.
Cheers,Lars
PS: Here’s `guix describe`:Generation 1 Jun 03 2020 10:38:17 (current) guix 9ab2790 repository URL: https://git.savannah.gnu.org/git/guix.git branch: master commit: 9ab2790e2acf872c0f2cf41ec180de80d1eee3f3 zpid bab33c1 repository URL: https://github.com/leibniz-psychology/guix-zpid.git branch: master commit: bab33c19e0f0304d7aa82751653eb85d8cad8c0a
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7Yr3EACgkQQhN3ARo3hEZvXgwAortYSBFszMEb4QyWF4EtfFQ8GMqhj9Av2r72HXGKtqnu+ZoY4VLK3eZxNtt0HIn50+BLQ/0TkWRCy5znN5t4NPVg7Qdn46DlCM9rOL6kkr61xrVGmmi5yzvarir2cUIgLFHEwHQPP3AoKZAVrkzIoKxHhHPzFGyLZPglf0MER82X30pj5Yq14h3rWlA+u6ktDsnHGYc9dD25mTSvuhITT8NAp4viNBC328FEZ+b/ct2/lpeln/khZYmUHSG+bYI9R+fwg6isCPtKuZB+B7T5703KOyzvwbiasnCToMr1/PCV91z3Arba8MsCeYZUmv8FfsrENz7iFs7le9J3csH39bhrZn7DNADwSt3HTILGHrrbEikgD3R3LyvqRcYRA9NZLZHSRdYntPbktFvK4i4H7UIBHx/b2MmiGAgrhqYvAPPFgTU8D0Y/w4Rdrq6uXckgl3WlGnbmsVO7CEefanSv3RwWg7+m2LJ4sziXpT5fmyp+qru/l58yCUsZMiyfjkrB=lFqp-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote on 6 Jun 16:34 +0200
control message for bug #41702
(address . control@debbugs.gnu.org)
87sgf8tg9p.fsf@gnu.org
severity 41702 importantquit
L
L
Ludovic Courtès wrote on 6 Jun 18:08 +0200
Re: bug#41702: `guix environment` performance issues
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87mu5gtbwg.fsf@gnu.org
Hi,
Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
Toggle quote (14 lines)> Total time: 24.672604202 seconds (19.431122691 seconds in GC)> ./pre-inst-env guix environment --ad-hoc r-learnr -- true 25,18s user 0,24s system 308% cpu 8,248 total>> More specifically in an anonymous function and reap-pipes, which is a gc hook,> I believe:>> % cumulative self> time seconds seconds calls procedure> 33.41 14.49 8.24 anon #xbb8480> 27.95 6.90 6.90 ice-9/popen.scm:145:0:reap-pipes> 4.37 1.08 1.08 anon #xbbdcd8> 3.28 0.86 0.81 ice-9/vlist.scm:539:0:vhash-assq> 2.40 2.37 0.59 guix/grafts.scm:202:22
Guile master has a fix for statprof that yields more useful info:
Toggle snippet (38 lines)scheme@(guile-user)> ,use(guix scripts environment)scheme@(guile-user)> ,pr (guix-environment "--ad-hoc" "r-learnr" "--" "true")% cumulative self time seconds seconds procedure 29.84 9.87 6.16 append 19.56 4.04 4.04 %after-gc-thunk 6.85 1.87 1.42 ice-9/vlist.scm:539:0:vhash-assq 5.44 1.17 1.12 write 3.23 0.67 0.67 guix/derivations.scm:665:0:derivation->output-paths 2.82 0.58 0.58 string=? 2.42 2.37 0.50 guix/grafts.scm:202:22 2.42 0.50 0.50 list? 2.22 0.46 0.46 hashq 2.02 0.42 0.42 display 1.61 15.82 0.33 guix/grafts.scm:186:0:reference-origin 1.61 0.87 0.33 guix/grafts.scm:204:31 1.21 0.33 0.25 guix/derivations.scm:667:7 1.21 0.29 0.25 srfi/srfi-1.scm:817:0:any 1.01 1232.14 0.21 srfi/srfi-1.scm:584:5:map1 0.81 0.83 0.17 guix/derivations.scm:697:0:derivation/masked-inputs 0.81 0.75 0.17 srfi/srfi-1.scm:580:2:map 0.81 0.17 0.17 guix/derivations.scm:158:0:%derivation-input-derivation-procedure 0.60 0.17 0.12 reverse 0.60 0.12 0.12 hashq-ref 0.60 0.12 0.12 get-bytevector-n 0.60 0.12 0.12 procedure? 0.40 0.67 0.08 guix/packages.scm:1232:0:fold-bag-dependencies 0.40 0.12 0.08 string->utf8 0.40 0.12 0.08 ice-9/vlist.scm:534:0:vhash-assoc 0.40 0.12 0.08 ice-9/vlist.scm:449:0:vhash-cons 0.40 0.12 0.08 delete-duplicates 0.40 0.08 0.08 ice-9/boot-9.scm:1389:0:->bool 0.40 0.08 0.08 ice-9/boot-9.scm:2201:0:%load-announce 0.40 0.08 0.08 hash 0.40 0.08 0.08 guix/derivations.scm:665:0:derivation->output-paths 0.20 20.73 0.04 guix/gexp.scm:1061:2
Notice that the same command with ‘--no-grafts’ takes 2s instead of 11s.
The patch below arranges so that ‘cumulative-grafts’ processesdependencies in a batch, such that the derivation’s dependency graph istraversed once for all, which makes a difference for derivations withlots of inputs.
Here’s the before/after comparison:
Toggle snippet (14 lines)$ time guix environment --ad-hoc r-learnr --search-pathsexport PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH"
real 0m11.328suser 0m20.155ssys 0m0.172s$ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-pathsexport PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH"
real 0m4.602suser 0m6.189ssys 0m0.136s
There’s still room for improvement, but it’s much better.
Ludo’.
Toggle diff (126 lines)diff --git a/guix/grafts.scm b/guix/grafts.scmindex 69d6fe4469..910dcadc8a 100644--- a/guix/grafts.scm+++ b/guix/grafts.scm@@ -20,10 +20,12 @@ #:use-module (guix store) #:use-module (guix monads) #:use-module (guix records)+ #:use-module (guix combinators) #:use-module (guix derivations) #:use-module ((guix utils) #:select (%current-system)) #:use-module (guix sets) #:use-module (srfi srfi-1)+ #:use-module (srfi srfi-11) #:use-module (srfi srfi-9 gnu) #:use-module (srfi srfi-26) #:use-module (srfi srfi-34)@@ -183,32 +185,47 @@ references." (set-current-state (vhash-cons key result cache)) (return result))))))) -(define (reference-origin drv item)- "Return the derivation/output pair among the inputs of DRV, recursively,-that produces ITEM. Return #f if ITEM is not produced by a derivation (i.e.,-it's a content-addressed \"source\"), or if it's not produced by a dependency-of DRV."+(define (reference-origins drv items)+ "Return the derivation/output pairs among the inputs of DRV, recursively,+that produce ITEMS. Elements of ITEMS not produced by a derivation (i.e.,+it's a content-addressed \"source\"), or not produced by a dependency of DRV,+have no corresponding element in the resulting list."+ (define (lookup-derivers drv result items)+ ;; Return RESULT augmented by all the drv/output pairs producing one of+ ;; ITEMS, and ITEMS stripped of matching items.+ (fold2 (match-lambda*+ (((output . file) result items)+ (if (member file items)+ (values (alist-cons drv output result)+ (delete file items))+ (values result items))))+ result items+ (derivation->output-paths drv)))+ ;; Perform a breadth-first traversal of the dependency graph of DRV in- ;; search of the derivation that produces ITEM.+ ;; search of the derivations that produce ITEMS. (let loop ((drv (list drv))+ (items items)+ (result '()) (visited (setq))) (match drv (()- #f)+ result) ((drv . rest)- (if (set-contains? visited drv)- (loop rest visited)- (let ((inputs (derivation-inputs drv)))- (or (any (lambda (input)- (let ((drv (derivation-input-derivation input)))- (any (match-lambda- ((output . file)- (and (string=? file item)- (cons drv output))))- (derivation->output-paths drv))))- inputs)- (loop (append rest (map derivation-input-derivation inputs))- (set-insert drv visited)))))))))+ (cond ((null? items)+ result)+ ((set-contains? visited drv)+ (loop rest items result visited))+ (else+ (let*-values (((inputs)+ (map derivation-input-derivation+ (derivation-inputs drv)))+ ((result items)+ (fold2 lookup-derivers+ result items inputs)))+ (loop (append rest inputs)+ items result+ (set-insert drv visited))))))))) (define* (cumulative-grafts store drv grafts #:key@@ -233,25 +250,27 @@ derivations to the corresponding set of grafts." (_ #f))) - (define (dependency-grafts item)- (match (reference-origin drv item)- ((drv . output)- ;; If GRAFTS already contains a graft from DRV, do not override it.- (if (find (cut graft-origin? drv <>) grafts)- (state-return grafts)- (cumulative-grafts store drv grafts- #:outputs (list output)- #:guile guile- #:system system)))- (#f- (state-return grafts))))+ (define (dependency-grafts items)+ (mapm %store-monad+ (lambda (drv+output)+ (match drv+output+ ((drv . output)+ ;; If GRAFTS already contains a graft from DRV, do not+ ;; override it.+ (if (find (cut graft-origin? drv <>) grafts)+ (state-return grafts)+ (cumulative-grafts store drv grafts+ #:outputs (list output)+ #:guile guile+ #:system system)))))+ (reference-origins drv items))) (with-cache (cons (derivation-file-name drv) outputs) (match (non-self-references store drv outputs) (() ;no dependencies (return grafts)) (deps ;one or more dependencies- (mlet %state-monad ((grafts (mapm %state-monad dependency-grafts deps)))+ (mlet %state-monad ((grafts (dependency-grafts deps))) (let ((grafts (delete-duplicates (concatenate grafts) equal?))) (match (filter (lambda (graft) (member (graft-origin-file-name graft) deps))
L
L
Ludovic Courtès wrote on 6 Jun 23:40 +0200
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87d06bub30.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:
Toggle quote (5 lines)> The patch below arranges so that ‘cumulative-grafts’ processes> dependencies in a batch, such that the derivation’s dependency graph is> traversed once for all, which makes a difference for derivations with> lots of inputs.
L
L
Lars-Dominik Braun wrote on 8 Jun 11:04 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200608090453.GC3166@zpidnp36
Hi Ludo,
Toggle quote (14 lines)> --8<---------------cut here---------------start------------->8---> $ time guix environment --ad-hoc r-learnr --search-paths> export PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH"> > real 0m11.328s> user 0m20.155s> sys 0m0.172s> $ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-paths> export PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH"> > real 0m4.602s> user 0m6.189s> sys 0m0.136s> --8<---------------cut here---------------end--------------->8---
that’s awesome and brings me much closer to my goal of running all applicationsinside a `guix environment` container for reproducibility. Including theprotocol fixes from #41720 I’m now down to ~30s from ~50s, which may be calledsomewhat usable. Obviously I’d be very interested in further speedups.
Thanks,Lars
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7d/zEACgkQQhN3ARo3hEYcoQwAtb945F+lwytXrepW+8lYfu/S2Pdi8uwyyMfFFjx8oP+t9CmQbVByt2244N7YLTJReP1BFEN3lnOiV8R1bA9stKCWaSCcUR2jI4LgtOlIDFfr6YwMRSf2krboylpAX6x9dHxdEl32M++ODzBGy5BgxDBKga6gk7OPoGUzld9nFYQX6s7VQGB6IcJgIyjP9zFhCTEO9eTFoUyUynIdx/LCT6l0vRobX0hy+sStXoMa7weaoBwEMbQ08Hedltxv35iBAYcOJoTM68r/Ph9G8ipvL6KQMBIPfTRxGm/UXH2z1OH4/sDYST0wNbZ8mG4JGt/oiewQ/ssZRM8JzrRAd5s0pOK5c0B9olojRerfUvF8ubfVvZIqpvfyKPX5ny8SN75MyEDhvodqQbjCAXlAPFvlW+0vWxlMTC+SufZceQ4cBaGCDNeuV2K3KtzWywW/nwkpHm5y5d3jePyBZreB8WO9ml7UTfRX7AawlmKSMx6OGWM7m3tYJh2fghJ5rMerWnCh=SDlJ-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote on 8 Jun 23:59 +0200
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87k10hp6ba.fsf@gnu.org
Hi,
Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
Toggle quote (19 lines)>> --8<---------------cut here---------------start------------->8--->> $ time guix environment --ad-hoc r-learnr --search-paths>> export PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH">> >> real 0m11.328s>> user 0m20.155s>> sys 0m0.172s>> $ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-paths>> export PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH">> >> real 0m4.602s>> user 0m6.189s>> sys 0m0.136s>> --8<---------------cut here---------------end--------------->8---> that’s awesome and brings me much closer to my goal of running all applications> inside a `guix environment` container for reproducibility. Including the> protocol fixes from #41720 I’m now down to ~30s from ~50s, which may be called> somewhat usable. Obviously I’d be very interested in further speedups.
That’s over SSH, right?
Probably what’s killing us is the round-trip time for all these smallRPCs. We would need pipelining but the RPC protocol is not designed tomake that easy.
Perhaps you could “strace -Tt” the thing to check whether thishypothesis is correct by looking at the time we spend waiting forreplies?
As for the CPU cost (i.e., going below the 4.6s above), we should keepprofiling just like you did.
Thanks,Ludo’.
L
L
Lars-Dominik Braun wrote on 9 Jun 11:15 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200609091522.GB8597@zpidnp36
Hey,
Toggle quote (1 lines)> That’s over SSH, right?
correct, the worst possible case: Inside two VM’s on a Laptop, SSH transportbetween them and /gnu+/var/guix on an NFS share (nfsd is in the same VM asguix-daemon).
Toggle quote (3 lines)> Probably what’s killing us is the round-trip time for all these small> RPCs. We would need pipelining but the RPC protocol is not designed to> make that easy.
That would have been my best guess too, but it does not seem to be the biggestproblem right now. Looking at the numbers again (both patches applied) with theattached manifest[1], I see that:
---snip---Local UNIX socket with and without --no-grafts: N Min Max Median Avg Stddevx 10 6.07 6.35 6.145 6.16 0.08232726+ 10 17.47 17.89 17.545 17.602 0.14351152Difference at 99.0% confidence 11.442 +/- 0.150576 185.747% +/- 4.07133%
Local UNIX socket vs. guix://localhost transport: N Min Max Median Avg Stddevx 10 17.47 17.89 17.545 17.602 0.14351152+ 10 17.43 18.1 17.61 17.642 0.20131788No difference proven at 99.0% confidence
Local UNIX socket vs ssh://localhost transport: N Min Max Median Avg Stddevx 10 17.47 17.89 17.545 17.602 0.14351152+ 10 33.46 35.27 34.315 34.359 0.53873205Difference at 99.0% confidence 16.757 +/- 0.5074 95.1994% +/- 3.13957%---snap---
So I would conclude:
1) Grafting still takes a lot of time and needs more work2) Linux optimizes localhost networking pretty well3) Our SSH transport is terribly slow
Moving to non-localhost communication between two VM’s:
---snip---guix://localhost vs. guix://remote-host transport: N Min Max Median Avg Stddevx 10 17.43 18.1 17.61 17.642 0.20131788+ 10 20.88 22.58 21.095 21.222 0.49689704Difference at 99.0% confidence 3.58 +/- 0.487934 20.2925% +/- 2.85159%
guix://remote-host vs. ssh://remote-host: N Min Max Median Avg Stddevx 10 20.88 22.58 21.095 21.222 0.49689704+ 10 30.1 32.56 31.005 31.093 0.70740606Difference at 99.0% confidence 9.871 +/- 0.786769 46.5131% +/- 4.35326%---snap---
Conclusion here is the same: Not alot of impact of networking/NFS and SSHtransport is still terribly slow. (Confusingly faster than localhost though.)
Toggle quote (3 lines)> Perhaps you could “strace -Tt” the thing to check whether this> hypothesis is correct by looking at the time we spend waiting for> replies?
I’m not sure this will yield meaningful data for SSH, so I analyzet it forguix://localhost vs. guix://remote-host. Takeaway is, yes, of course there is astatistically significant difference and it’s about 40%±50%, which means thismethod is pretty useless, because we can’t bin RPC’s by type.
So, I guess it would make sense for me to look at the SSH transport itselfagain and see if there are any other low-hanging fruit. Not sure how much I canhelp with profiling guile/guix itself. A different/better RPC protocol isprobably GSoC/v2.0-worthy?
Sorry for all the lengthy emails,Lars
[1] You’ll need this channel: https://github.com/leibniz-psychology/guix-zpid
(specifications->manifest '( "coreutils" "findutils" "procps" "strace" "openssh" "mit-krb5" "bash" "which" "zip" "geeqie" "util-linux" "grep" "glibc" ;; for locales "glibc-locales" ;; front-end software "jupyter-zpid" "python-jupyterlab" ;; available notebook kernel ; provided by jupyter-zpid ;"python-ipykernel" "r-irkernel" "rstudio-server-zpid" "r" ;; for RMarkdown "r-knitr" "r-yaml" "r-markdown" "r-rmarkdown" "texlive" ;; commonly used r packages "r-psych" "r-ggplot2" "r-lattice" "r-foreign" "r-readr" "r-haven" "r-dplyr" "r-tidyr" "r-stringr" "r-forecast" "r-lme4" "r-nlme" "r-nnet" "r-glmnet" "r-caret" "r-xmisc" "r-splitstackshape" "r-tm" "r-quanteda" "r-topicmodels" "r-stm" ;;"r-parallel" "r-dt" "r-nlp" "r-data-table" "r-hmisc" "r-learnr" "r-metafor" ;; for rmarkdown "ghc-pandoc" ))
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7fUycACgkQQhN3ARo3hEbGwgv9G/Jy0viE4XGqrmhCfmnQSKMvxgGZg7WPUBNYaOFcUr3fjpR13egOlcqSmvebo/JZ8cs+nRNQB7/v+FTYFPTmk/xFIN3jLNdHnqUmHHBmhLsirbyQRdMXYr6FsD7eho57lQ/nlnTsEU6NZQwO3WA3kd4aFjvNxSBpaVeO1HPaxyWTPEQVqpWGN7u3nFzw4866BBBNP1R1nSdbUKsAzilHKTWMJ0B6O34dNK1/fdaVSJ/yNaAzehN+cykFSgYCzl1eUYquNUtZvyHjNptYm9BX54YQPA7HrMmpsd8qOaeTe6R4HHrV6+JPy/oKrrr2RPsVnI1bY5o58mRKzk8up75FsTlN6tWBwuempr00dIsjny/gjBHuL6J5v3vCGqmHOMhjF66PNmiwheqCGHNae8tq7Li4diFmSo9uV+kFpwgmGQ13ywZeKex2l6xVg0vpX0bGkr1gxK/XWy2dYRtgkNYf5ycAhUBBLOhTlN6/+nWI6QLEgKiAmZR16Ju5AVtbp5Gm=zWIs-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote on 27 Jun 23:20 +0200
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87wo3sjjdw.fsf@gnu.org
Hi,
Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
Toggle quote (32 lines)> That would have been my best guess too, but it does not seem to be the biggest> problem right now. Looking at the numbers again (both patches applied) with the> attached manifest[1], I see that:>> ---snip---> Local UNIX socket with and without --no-grafts:> N Min Max Median Avg Stddev> x 10 6.07 6.35 6.145 6.16 0.08232726> + 10 17.47 17.89 17.545 17.602 0.14351152> Difference at 99.0% confidence> 11.442 +/- 0.150576> 185.747% +/- 4.07133%>> Local UNIX socket vs. guix://localhost transport:> N Min Max Median Avg Stddev> x 10 17.47 17.89 17.545 17.602 0.14351152> + 10 17.43 18.1 17.61 17.642 0.20131788> No difference proven at 99.0% confidence>> Local UNIX socket vs ssh://localhost transport:> N Min Max Median Avg Stddev> x 10 17.47 17.89 17.545 17.602 0.14351152> + 10 33.46 35.27 34.315 34.359 0.53873205> Difference at 99.0% confidence> 16.757 +/- 0.5074> 95.1994% +/- 3.13957%> ---snap--->> So I would conclude:>> 1) Grafting still takes a lot of time and needs more work
Yes, I noticed that there’s still redundant work being done, so we mayneed global caching in addition to what58bb833365db4e8934a386497d5b00a063cfd27d did.
Toggle quote (3 lines)> 2) Linux optimizes localhost networking pretty well> 3) Our SSH transport is terribly slow
Yes.
The patch below is a noticeable improvement for me. On my laptop,
GUIX_DAEMON_SOCKET=ssh://localhost ./pre-inst-env guix build libreoffice -n
goes from 5.8s to 3.3s. It just does the same thing as we do forguix://.
Could you check what results it gives you?
Toggle quote (5 lines)> So, I guess it would make sense for me to look at the SSH transport itself> again and see if there are any other low-hanging fruit. Not sure how much I can> help with profiling guile/guix itself. A different/better RPC protocol is> probably GSoC/v2.0-worthy?
There’s a project to rewrite the daemon in Scheme, started by CalebRistvedt (reepa) as part of GSoC a few years ago. This could be anopportunity to add a new version of the protocol that would supportpipelining.
Thanks,Ludo’.
Toggle diff (86 lines)diff --git a/guix/ssh.scm b/guix/ssh.scmindex 2d7ca7d01d..a9312b7c8c 100644--- a/guix/ssh.scm+++ b/guix/ssh.scm@@ -98,7 +98,7 @@ actual key does not match." key type)))))))) (define* (open-ssh-session host #:key user port identity- host-key+ host-key open-connection (compression %compression) (timeout 3600)) "Open an SSH session for HOST and return it. IDENTITY specifies the file@@ -110,6 +110,10 @@ When HOST-KEY is true, it must be a string like \"ssh-ed25519 AAAAC3Nz… root@example.org\"; the server is authenticated and an error is raised if its host key is different from HOST-KEY. +When OPEN-CONNECTION is true, it must be a two-argument procedure; it is+passed HOST and PORT and must return a socket (a file port). When+OPEN-CONNECTION is false, Guile-SSH takes care of opening the connection.+ Install TIMEOUT as the maximum time in seconds after which a read or write operation on a channel of the returned session is considered as failing. @@ -134,6 +138,13 @@ Throw an error on failure." ;; Honor ~/.ssh/config. (session-parse-config! session) + (when open-connection+ (let* ((sock (open-connection host port)))+ ;; Since 'session-set!' doesn't increase the revealed count of SOCK+ ;; (as of Guile-SSH 0.12.0), do it ourselves.+ (set-port-revealed! sock 1)+ (session-set! session 'fd sock)))+ (match (connect! session) ('ok (if host-keydiff --git a/guix/store/ssh.scm b/guix/store/ssh.scmindex 09c0832505..d07f0d7467 100644--- a/guix/store/ssh.scm+++ b/guix/store/ssh.scm@@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU-;;; Copyright © 2017 Ludovic Courtès <ludo@gnu.org>+;;; Copyright © 2017, 2020 Ludovic Courtès <ludo@gnu.org> ;;; ;;; This file is part of GNU Guix. ;;;@@ -19,6 +19,7 @@ (define-module (guix store ssh) #:use-module (guix ssh) #:use-module (web uri)+ #:use-module (ssh session) #:export (connect-to-daemon)) ;;; Commentary:@@ -29,11 +30,27 @@ ;;; ;;; End: +(define (open-connection host port)+ "Open a connection to HOST and PORT. Use the standard SSH port if PORT is+false."+ (let* ((lst (getaddrinfo host+ (if port (number->string port) "ssh")+ (if port AI_NUMERICSERV 0)))+ (addr (addrinfo:addr (car lst)))+ (sock (socket (sockaddr:fam addr) SOCK_STREAM 0)))+ ;; Setting this option makes a dramatic difference because it avoids the+ ;; "ACK delay" on our RPC messages.+ (setsockopt sock IPPROTO_TCP TCP_NODELAY 1)++ (connect sock addr)+ sock))+ (define (connect-to-daemon uri) "Connect to the SSH daemon at URI, a URI object with the 'ssh' scheme." (remote-daemon-channel (open-ssh-session (uri-host uri) #:port (or (uri-port uri) 22)- #:user (uri-userinfo uri))))+ #:user (uri-userinfo uri)+ #:open-connection open-connection))) ;;; ssh.scm ends here
L
L
Ludovic Courtès wrote 6 days ago
control message for bug #41702
(address . control@debbugs.gnu.org)
87pn9ha8wx.fsf@gnu.org
tags 41702 + patchquit
L
L
Lars-Dominik Braun wrote 6 days ago
Re: bug#41702: `guix environment` performance issues
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200630095936.GA3630@zpidnp36
Hi Ludo,
Toggle quote (8 lines)> The patch below is a noticeable improvement for me. On my laptop,> > GUIX_DAEMON_SOCKET=ssh://localhost ./pre-inst-env guix build libreoffice -n > > goes from 5.8s to 3.3s. It just does the same thing as we do for> guix://.> > Could you check what results it gives you?
looking pretty good, comparing master without/with patch:
---snip--- N Min Max Median Avg Stddevx 10 16.87 20.77 17.44 17.802 1.1228515+ 10 9.92 13.67 10.19 10.569 1.1254574Difference at 99.0% confidence -7.233 +/- 1.44688 -40.6303% +/- 6.68735%---snap---
Toggle quote (4 lines)> There’s a project to rewrite the daemon in Scheme, started by Caleb> Ristvedt (reepa) as part of GSoC a few years ago. This could be an> opportunity to add a new version of the protocol that would support> pipelining.
Given how much improvement you’ve achieved so far I’d say we can (for now)probably get away with the current protocol. Still, what’s the status of thisdaemon?
Lars
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl77DQMACgkQQhN3ARo3hEZuWAwAs5btXPUr3WH4g/FqdVHSqbn6QRe6Lo8WPqKZq2IHKpk+HR+VWwRT+uyuo9S41JWMk00cbFr9I4xi2xyyld1J3NSDDOrGPP8sjxz7Z/y/h13Dh2vV3lHborEx2ItAxK9+nGThJaFdFFdVOTd/1DZb7nLnaoQ6jrhVaQ2kMD3XmXd2dkNu/vEd8tatDV2li5sWznnvmyglyGA7gRyPVdJITy8lhRPeOn1MVNWVRxAgHZMmB/pUKcL+7KB1msmUuylIa0suKoFwqSOh9Mw6t8tGIaNEPaaJtjWKu5ew9tg0vnEsRPJU3cunuDtJRS/7tWk3nU+0wXCMErUgdfb9Tm3yebTopHCuyIFhC+z9M5X1owPmUNQdk5hdgbxE+972C/fzlVABPtj2UEhfV9E48gEuON66NJOjdQBeGH0qohAadTQZmy4ln1ZMkYdw+cKeLzWKj/w07oeZQGO9IE38QO26xHMye0Dx219gmQ/ar0pFgTA4QPs1bs8j5BM6XvzW6T5G=YULY-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote 5 days ago
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87y2o3zf00.fsf@gnu.org
Hi,
Ludovic Courtès <ludo@gnu.org> skribis:
Toggle quote (7 lines)> + (when open-connection> + (let* ((sock (open-connection host port)))> + ;; Since 'session-set!' doesn't increase the revealed count of SOCK> + ;; (as of Guile-SSH 0.12.0), do it ourselves.> + (set-port-revealed! sock 1)> + (session-set! session 'fd sock)))
The problem with this approach is that ‘open-connection’ is past the rawhost name; aliases, proxy commands, etc. from ~/.ssh/config are ignored.
So it cannot go in as is. :-/
What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allowus to get at the actual file descriptor after the connection has beenopened, and to set TCP_NODELAY there:
http://api.libssh.org/stable/group__libssh__session.html#gafe509fcea47714b5cd277d1e35e83276
It’s not 100% clear though if it returns that “right” FD when there’s aproxy command.
Ludo’.
L
L
Lars-Dominik Braun wrote 5 days ago
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200701112453.GB25179@zpidnp36
Hey,
Toggle quote (8 lines)> What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allow> us to get at the actual file descriptor after the connection has been> opened, and to set TCP_NODELAY there:> > http://api.libssh.org/stable/group__libssh__session.html#gafe509fcea47714b5cd277d1e35e83276> > It’s not 100% clear though if it returns that “right” FD when there’s a> proxy command.
wouldn’t it be easier to add the SSH_OPTIONS_NODELAY option (see [1]) toguile-ssh (should be trivial[2]) and let guile-ssh handle all the connectionstuff without touching the socket? Or do we need the raw socket elsewhere?
Cheers,Lars
[1] http://api.libssh.org/stable/group__libssh__session.html#ga7a801b85800baa3f4e16f5b47db0a73d[2] https://github.com/artyom-poptsov/guile-ssh/blob/master/libguile-ssh/session-func.c#L52
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl78coIACgkQQhN3ARo3hEZJEQv9GJYnyKD23RlY9Et60BMB2YsJSjSqWxxjUneLfLnFDrIvusQZf+Lhtj3Vo7aBVzoJ6GEPrSOTi2FcecSUGfuSaC1w17Zs2prjMuqq66Opkx9caeBS/dv/4iLZHFf+Kci8+gaWQkn4BR7+tspfkjRxIulywpK4P7nP8KxUDhVDKXi6NLSJB5XCH4bgqwNSzeQnu9L0CDzlZPHq5h5bZ7Op4VyZ/oMa1koRYBBVA0KCDGHZUQGs/Mh7QaYe5N5AMRFabXWP4hN0qDw3H7Dic/+4jYi+2QMicx1Ysh+6bVB0R666/Qwg0PGZuHvXsTwfcuHVHtHLubXEKCnzh7P5WkhzC8NpRL5iTaaadSQL/mrYe3y7ihPRtthfMkVDzv53GVPXUnxWUc1JcxMO+9OrHVdQY9HPiNbrFmJ+FjEFGuqvmpAoA8K54aolIepfD9dd9tC21M/yYcI7RCgJ5lSGbbWkHM5fz/P1b9UmKIbyQkk8WvqSt1N+IQQZM6DNQbXGmtBu=COgO-----END PGP SIGNATURE-----

L
L
Ludovic Courtès wrote 4 days ago
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87eepuyk99.fsf@gnu.org
Hi,
Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
Toggle quote (12 lines)>> What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allow>> us to get at the actual file descriptor after the connection has been>> opened, and to set TCP_NODELAY there:>> >> http://api.libssh.org/stable/group__libssh__session.html#gafe509fcea47714b5cd277d1e35e83276>> >> It’s not 100% clear though if it returns that “right” FD when there’s a>> proxy command.> wouldn’t it be easier to add the SSH_OPTIONS_NODELAY option (see [1]) to> guile-ssh (should be trivial[2]) and let guile-ssh handle all the connection> stuff without touching the socket? Or do we need the raw socket elsewhere?
Good idea, I had overlooked that possibility.
Would you like to send them a patch?
Thanks,Ludo’.
L
L
Lars-Dominik Braun wrote 4 days ago
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200702070000.GA4532@zpidnp36
Hi,
Toggle quote (1 lines)> Would you like to send them a patch?
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl79hewACgkQQhN3ARo3hEakxwwA1QUBS0VlTTNH5L49/5dGCmvkc4pTJRv9BtneXSMJTUryIIpSEv/mPYEeyn5Lr02uyZPTqlp75sOndJmhuRnSCz6F9HTHXgTAls7fsU3Ka0kiUO4WHhM8gOH4sfyYUEq654ZuYU54sjAE3x/aMCfhxECXPmGECiz6/R5J0BcaFvr5h+lTqjWVtg6gKFKPc8C/UQb4DxLjJaEC5KPzA70T/t6zoFk4ZYTa4bPArEHbiqfvbXNpifee/8npe8KD6IL4rBEepblPdAAtdJ+x0EuFm40Ac4fGJv9b+QIe+d2go2Ixz5m2Y5x6U/Hrkn9v8ZNKFUZukQj23RY6mke0Nek+WKgTAvV2FfYZnJ2OEhhJDW7oO4U4jxbOaomc26WTErFEOvVls/ZXPIhBf4dsqcMUNmo6w2l59LnMNeWQkD19TE7LsOUGMQ6+aWQoyBf4sl1wihnG1+ljSmf088COqNtsKNHEBxWYL317BoqIiD0Rk3V6OwDjrIuis7J9Ou2EYXVE=jsIk-----END PGP SIGNATURE-----

L
L
Lars-Dominik Braun wrote 4 days ago
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200702120333.GB4532@zpidnp36
Hi (again),
and the attached patch uses the new nodelay option, if we don’t want to waitfor another guile-ssh release.
Lars
From 2892f79f819dd2dd9420f7e74bcb6e293d377452 Mon Sep 17 00:00:00 2001From: Lars-Dominik Braun <ldb@leibniz-psychology.org>Date: Thu, 2 Jul 2020 13:59:51 +0200Subject: [PATCH] guix: Add nodelay to SSH store connection
* gnu/packages/ssh.scm (guile-ssh)[patches]: Add patch.* gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch: New file.* gnu/local.mk (dist_patch_DATA): Add it.* guix/ssh.scm (open-ssh-session): Use new nodelay option.--- gnu/local.mk | 1 + .../patches/guile-ssh-0.12.0-nodelay.patch | 81 +++++++++++++++++++ gnu/packages/ssh.scm | 1 + guix/ssh.scm | 4 +- 4 files changed, 86 insertions(+), 1 deletion(-) create mode 100644 gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch
Toggle diff (128 lines)diff --git a/gnu/local.mk b/gnu/local.mkindex 8bb56010c2..26725cbca1 100644--- a/gnu/local.mk+++ b/gnu/local.mk@@ -1068,6 +1068,7 @@ dist_patch_DATA = \ %D%/packages/patches/guile-present-coding.patch \ %D%/packages/patches/guile-rsvg-pkgconfig.patch \ %D%/packages/patches/guile-emacs-fix-configure.patch \+ %D%/packages/patches/guile-ssh-0.12.0-nodelay.patch \ %D%/packages/patches/gtk2-respect-GUIX_GTK2_PATH.patch \ %D%/packages/patches/gtk2-respect-GUIX_GTK2_IM_MODULE_FILE.patch \ %D%/packages/patches/gtk2-theme-paths.patch \diff --git a/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch b/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patchnew file mode 100644index 0000000000..bc649b7469--- /dev/null+++ b/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch@@ -0,0 +1,81 @@+commit f697c603ef196eb88c1d7d629e456bf5054e35a4+Author: Lars-Dominik Braun <ldb@leibniz-psychology.org>+Date: Thu Jul 2 13:26:21 2020 +0200++ modules/ssh/session.scm: Add nodelay to make-session+ + * modules/ssh/session.scm (make-session): Add key nodelay++diff --git a/modules/ssh/session.scm b/modules/ssh/session.scm+index d7c0831..042bd5f 100644+--- a/modules/ssh/session.scm++++ b/modules/ssh/session.scm+@@ -74,7 +74,7 @@+ knownhosts timeout timeout-usec ssh1 ssh2 log-verbosity+ ciphers-c-s ciphers-s-c compression-c-s compression-s-c+ proxycommand stricthostkeycheck compression+- compression-level callbacks config)++ compression-level nodelay callbacks config)+ "Make a new SSH session with specified configuration.\n+ Return a new SSH session."+ (let ((session (%make-session)))+@@ -98,6 +98,7 @@ Return a new SSH session."+ (session-set-if-specified! stricthostkeycheck)+ (session-set-if-specified! compression)+ (session-set-if-specified! compression-level)++ (session-set-if-specified! nodelay)+ (session-set-if-specified! callbacks)+ + (when config++commit b80121ae3e851a676f25fd2b7e33371b94a6e030+Author: Lars-Dominik Braun <ldb@leibniz-psychology.org>+Date: Thu Jul 2 08:52:13 2020 +0200++ libguile-ssh/session-func.c: Add NODELAY option+ + * libguile-ssh/session-func.c (session_options, set_option): Add libssh’s+ nodelay option.+ * tests/session.scm ("session-set!, valid values", "session-set!, invalid+ values"): Add unit-tests.++diff --git a/libguile-ssh/session-func.c b/libguile-ssh/session-func.c+index 48db779..9240734 100644+--- a/libguile-ssh/session-func.c++++ b/libguile-ssh/session-func.c+@@ -71,6 +71,7 @@ static struct symbol_mapping session_options[] = {+ { "stricthostkeycheck", SSH_OPTIONS_STRICTHOSTKEYCHECK },+ { "compression", SSH_OPTIONS_COMPRESSION },+ { "compression-level", SSH_OPTIONS_COMPRESSION_LEVEL },++ { "nodelay", SSH_OPTIONS_NODELAY },+ { "callbacks", GSSH_OPTIONS_CALLBACKS },+ { NULL, -1 }+ };+@@ -366,6 +367,7 @@ set_option (SCM scm_session, struct session_data* sd, int type, SCM value)+ case SSH_OPTIONS_SSH1:+ case SSH_OPTIONS_SSH2:+ case SSH_OPTIONS_STRICTHOSTKEYCHECK:++ case SSH_OPTIONS_NODELAY:+ return set_bool_opt (session, type, value);+ + case SSH_OPTIONS_FD:+diff --git a/tests/session.scm b/tests/session.scm+index 2eb1df7..83a864d 100644+--- a/tests/session.scm++++ b/tests/session.scm+@@ -70,6 +70,7 @@+ nolog)+ (compression "yes" "no")+ (compression-level 1 2 3 4 5 6 7 8 9)++ (nodelay #f #t)+ (callbacks ((user-data . "hello")+ (global-request-callback . ,(const #f))))))+ (res #t))+@@ -95,6 +96,7 @@+ (log-verbosity "string" -1 0 1 2 3 4 5)+ (compression 12345)+ (compression-level -1 0 10)++ (nodelay 12345 "string")+ (callbacks "not a list"+ ((global-request-callback . #f)))))+ (res #t))diff --git a/gnu/packages/ssh.scm b/gnu/packages/ssh.scmindex ea42e72de2..0d9fa5fd54 100644--- a/gnu/packages/ssh.scm+++ b/gnu/packages/ssh.scm@@ -259,6 +259,7 @@ Additionally, various channel-specific options can be negotiated.") (sha256 (base32 "054hd9rzfhb48gc1hw3rphhp0cnnd4bs5qmidy5ygsyvy9ravlad"))+ (patches (search-patches "guile-ssh-0.12.0-nodelay.patch")) (modules '((guix build utils))))) (build-system gnu-build-system) (outputs '("out" "debug"))diff --git a/guix/ssh.scm b/guix/ssh.scmindex 2d7ca7d01d..df7a9bed4a 100644--- a/guix/ssh.scm+++ b/guix/ssh.scm@@ -129,7 +129,9 @@ Throw an error on failure." ;; We need lightweight compression when ;; exchanging full archives. #:compression compression- #:compression-level 3)))+ #:compression-level 3++ #:nodelay #t))) ;; Honor ~/.ssh/config. (session-parse-config! session)-- 2.20.1
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl79zQ8ACgkQQhN3ARo3hEbEKAwAuHGzi/EBcAbji98iXuwjFX5pKKTvVGnb+EusvCFDwgQzOfKUBnFmDcb3+C3R0/Y4rsDuvMfmpeNSdiZB4GPxiX7bOJ+uyzJLUDi/1CPvyGDsAHvF0oAGF+vp+316dHsPryS8mYUgkYOb9aLzbzTaCm/qikohd19S8jkpcb9rbxvLVxlvEsAz7s22fxDfuYhJOGxprK3XWRK0xga8aG58x48O8uHSh93TXkLr77829Mu4zoKKzfA6Dds6ChJWpOEbGGBnl2txafXulLTDpywbGs6Ir4kDvfYVBENAyOXsws/VVtZwfNWbgZ6BPuSDP9dXaOa0hRU9hcpzNma9h3qwYEvad1TG62V9la6aBtO3/f6P49uXUW9Azo3q3zj/npLqKcrn7fr+vk29p62KUqze5rvJ80IYbPEyhWtOhMD3CS5dkRiW2Q8igKGSDxVVz0R/AFgFPJc/RTpF1KvyPBvSeQgv48OamXBOYEF5C18qZO/yOE0lCmO4JVwivrjtLqdk=tkfu-----END PGP SIGNATURE-----

?