`guix environment` performance issues

OpenSubmitted by Lars-Dominik Braun.
Details
3 participants
  • Lars-Dominik Braun
  • Ludovic Courtès
  • Maxim Cournoyer
Owner
unassigned
Severity
important
Merged with
L
L
Lars-Dominik Braun wrote on 4 Jun 2020 10:23
(address . bug-guix@gnu.org)
20200604082316.GA3146@zpidnp36
Hi,

I’ve noticed that `guix environment` can be very very slow for some packages.
Whereas usually a call like

time guix environment --ad-hoc <package> -- true

takes 600ms to 1.5s, it takes 8.4s for the package r-learnr on my Ryzen 5 3600
with NVMe SSD, 32G of RAM and a warm cache. This seems to accumulate when
trying to install multiple unrelated “slow” packages into the same environment,
i.e. jupyter (2.9s) + r-rlearnr (8.4s) + r-stm (2.9s) = 12.1s. What these
packages seem to have in common is a deep chain of propagated inputs.

And in fact, if I create artificial test packages where test-999 propagates
test-998, test-998→test-997 and so on the command above takes 1m 9s. A single
package propagating 999 packages takes 4.5s.

Instrumenting guix-environment’s with-store call with statprof I see a lot of
time is spent in the garbage collector:

Total time: 24.672604202 seconds (19.431122691 seconds in GC)
./pre-inst-env guix environment --ad-hoc r-learnr -- true 25,18s user 0,24s system 308% cpu 8,248 total

More specifically in an anonymous function and reap-pipes, which is a gc hook,
I believe:

% cumulative self
time seconds seconds calls procedure
33.41 14.49 8.24 anon #xbb8480
27.95 6.90 6.90 ice-9/popen.scm:145:0:reap-pipes
4.37 1.08 1.08 anon #xbbdcd8
3.28 0.86 0.81 ice-9/vlist.scm:539:0:vhash-assq
2.40 2.37 0.59 guix/grafts.scm:202:22

Not sure if that helps tracking the issue down though.

Cheers,
Lars

PS: Here’s `guix describe`:
Generation 1 Jun 03 2020 10:38:17 (current)
guix 9ab2790
branch: master
commit: 9ab2790e2acf872c0f2cf41ec180de80d1eee3f3
zpid bab33c1
branch: master
commit: bab33c19e0f0304d7aa82751653eb85d8cad8c0a
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7Yr3EACgkQQhN3ARo3
hEZvXgwAortYSBFszMEb4QyWF4EtfFQ8GMqhj9Av2r72HXGKtqnu+ZoY4VLK3eZx
Ntt0HIn50+BLQ/0TkWRCy5znN5t4NPVg7Qdn46DlCM9rOL6kkr61xrVGmmi5yzva
rir2cUIgLFHEwHQPP3AoKZAVrkzIoKxHhHPzFGyLZPglf0MER82X30pj5Yq14h3r
WlA+u6ktDsnHGYc9dD25mTSvuhITT8NAp4viNBC328FEZ+b/ct2/lpeln/khZYmU
HSG+bYI9R+fwg6isCPtKuZB+B7T5703KOyzvwbiasnCToMr1/PCV91z3Arba8MsC
eYZUmv8FfsrENz7iFs7le9J3csH39bhrZn7DNADwSt3HTILGHrrbEikgD3R3Lyvq
RcYRA9NZLZHSRdYntPbktFvK4i4H7UIBHx/b2MmiGAgrhqYvAPPFgTU8D0Y/w4Rd
rq6uXckgl3WlGnbmsVO7CEefanSv3RwWg7+m2LJ4sziXpT5fmyp+qru/l58yCUsZ
MiyfjkrB
=lFqp
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 6 Jun 2020 16:34
control message for bug #41702
(address . control@debbugs.gnu.org)
87sgf8tg9p.fsf@gnu.org
severity 41702 important
quit
L
L
Ludovic Courtès wrote on 6 Jun 2020 18:08
Re: bug#41702: `guix environment` performance issues
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87mu5gtbwg.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (14 lines)
> Total time: 24.672604202 seconds (19.431122691 seconds in GC)
> ./pre-inst-env guix environment --ad-hoc r-learnr -- true 25,18s user 0,24s system 308% cpu 8,248 total
>
> More specifically in an anonymous function and reap-pipes, which is a gc hook,
> I believe:
>
> % cumulative self
> time seconds seconds calls procedure
> 33.41 14.49 8.24 anon #xbb8480
> 27.95 6.90 6.90 ice-9/popen.scm:145:0:reap-pipes
> 4.37 1.08 1.08 anon #xbbdcd8
> 3.28 0.86 0.81 ice-9/vlist.scm:539:0:vhash-assq
> 2.40 2.37 0.59 guix/grafts.scm:202:22

Guile master has a fix for statprof that yields more useful info:

Toggle snippet (38 lines)
scheme@(guile-user)> ,use(guix scripts environment)
scheme@(guile-user)> ,pr (guix-environment "--ad-hoc" "r-learnr" "--" "true")
% cumulative self
time seconds seconds procedure
29.84 9.87 6.16 append
19.56 4.04 4.04 %after-gc-thunk
6.85 1.87 1.42 ice-9/vlist.scm:539:0:vhash-assq
5.44 1.17 1.12 write
3.23 0.67 0.67 guix/derivations.scm:665:0:derivation->output-paths
2.82 0.58 0.58 string=?
2.42 2.37 0.50 guix/grafts.scm:202:22
2.42 0.50 0.50 list?
2.22 0.46 0.46 hashq
2.02 0.42 0.42 display
1.61 15.82 0.33 guix/grafts.scm:186:0:reference-origin
1.61 0.87 0.33 guix/grafts.scm:204:31
1.21 0.33 0.25 guix/derivations.scm:667:7
1.21 0.29 0.25 srfi/srfi-1.scm:817:0:any
1.01 1232.14 0.21 srfi/srfi-1.scm:584:5:map1
0.81 0.83 0.17 guix/derivations.scm:697:0:derivation/masked-inputs
0.81 0.75 0.17 srfi/srfi-1.scm:580:2:map
0.81 0.17 0.17 guix/derivations.scm:158:0:%derivation-input-derivation-procedure
0.60 0.17 0.12 reverse
0.60 0.12 0.12 hashq-ref
0.60 0.12 0.12 get-bytevector-n
0.60 0.12 0.12 procedure?
0.40 0.67 0.08 guix/packages.scm:1232:0:fold-bag-dependencies
0.40 0.12 0.08 string->utf8
0.40 0.12 0.08 ice-9/vlist.scm:534:0:vhash-assoc
0.40 0.12 0.08 ice-9/vlist.scm:449:0:vhash-cons
0.40 0.12 0.08 delete-duplicates
0.40 0.08 0.08 ice-9/boot-9.scm:1389:0:->bool
0.40 0.08 0.08 ice-9/boot-9.scm:2201:0:%load-announce
0.40 0.08 0.08 hash
0.40 0.08 0.08 guix/derivations.scm:665:0:derivation->output-paths
0.20 20.73 0.04 guix/gexp.scm:1061:2

Notice that the same command with ‘--no-grafts’ takes 2s instead of 11s.

The patch below arranges so that ‘cumulative-grafts’ processes
dependencies in a batch, such that the derivation’s dependency graph is
traversed once for all, which makes a difference for derivations with
lots of inputs.

Here’s the before/after comparison:

Toggle snippet (14 lines)
$ time guix environment --ad-hoc r-learnr --search-paths
export PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH"

real 0m11.328s
user 0m20.155s
sys 0m0.172s
$ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-paths
export PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH"

real 0m4.602s
user 0m6.189s
sys 0m0.136s

There’s still room for improvement, but it’s much better.

Ludo’.
Toggle diff (126 lines)
diff --git a/guix/grafts.scm b/guix/grafts.scm
index 69d6fe4469..910dcadc8a 100644
--- a/guix/grafts.scm
+++ b/guix/grafts.scm
@@ -20,10 +20,12 @@
   #:use-module (guix store)
   #:use-module (guix monads)
   #:use-module (guix records)
+  #:use-module (guix combinators)
   #:use-module (guix derivations)
   #:use-module ((guix utils) #:select (%current-system))
   #:use-module (guix sets)
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-9 gnu)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
@@ -183,32 +185,47 @@ references."
            (set-current-state (vhash-cons key result cache))
            (return result)))))))
 
-(define (reference-origin drv item)
-  "Return the derivation/output pair among the inputs of DRV, recursively,
-that produces ITEM.  Return #f if ITEM is not produced by a derivation (i.e.,
-it's a content-addressed \"source\"), or if it's not produced by a dependency
-of DRV."
+(define (reference-origins drv items)
+  "Return the derivation/output pairs among the inputs of DRV, recursively,
+that produce ITEMS.  Elements of ITEMS not produced by a derivation (i.e.,
+it's a content-addressed \"source\"), or not produced by a dependency of DRV,
+have no corresponding element in the resulting list."
+  (define (lookup-derivers drv result items)
+    ;; Return RESULT augmented by all the drv/output pairs producing one of
+    ;; ITEMS, and ITEMS stripped of matching items.
+    (fold2 (match-lambda*
+             (((output . file) result items)
+              (if (member file items)
+                  (values (alist-cons drv output result)
+                          (delete file items))
+                  (values result items))))
+           result items
+           (derivation->output-paths drv)))
+
   ;; Perform a breadth-first traversal of the dependency graph of DRV in
-  ;; search of the derivation that produces ITEM.
+  ;; search of the derivations that produce ITEMS.
   (let loop ((drv (list drv))
+             (items items)
+             (result '())
              (visited (setq)))
     (match drv
       (()
-       #f)
+       result)
       ((drv . rest)
-       (if (set-contains? visited drv)
-           (loop rest visited)
-           (let ((inputs (derivation-inputs drv)))
-             (or (any (lambda (input)
-                        (let ((drv (derivation-input-derivation input)))
-                          (any (match-lambda
-                                 ((output . file)
-                                  (and (string=? file item)
-                                       (cons drv output))))
-                               (derivation->output-paths drv))))
-                      inputs)
-                 (loop (append rest (map derivation-input-derivation inputs))
-                       (set-insert drv visited)))))))))
+       (cond ((null? items)
+              result)
+             ((set-contains? visited drv)
+              (loop rest items result visited))
+             (else
+              (let*-values (((inputs)
+                             (map derivation-input-derivation
+                                  (derivation-inputs drv)))
+                            ((result items)
+                             (fold2 lookup-derivers
+                                    result items inputs)))
+                (loop (append rest inputs)
+                      items result
+                      (set-insert drv visited)))))))))
 
 (define* (cumulative-grafts store drv grafts
                             #:key
@@ -233,25 +250,27 @@ derivations to the corresponding set of grafts."
       (_
        #f)))
 
-  (define (dependency-grafts item)
-    (match (reference-origin drv item)
-      ((drv . output)
-       ;; If GRAFTS already contains a graft from DRV, do not override it.
-       (if (find (cut graft-origin? drv <>) grafts)
-           (state-return grafts)
-           (cumulative-grafts store drv grafts
-                              #:outputs (list output)
-                              #:guile guile
-                              #:system system)))
-      (#f
-       (state-return grafts))))
+  (define (dependency-grafts items)
+    (mapm %store-monad
+          (lambda (drv+output)
+            (match drv+output
+              ((drv . output)
+               ;; If GRAFTS already contains a graft from DRV, do not
+               ;; override it.
+               (if (find (cut graft-origin? drv <>) grafts)
+                   (state-return grafts)
+                   (cumulative-grafts store drv grafts
+                                      #:outputs (list output)
+                                      #:guile guile
+                                      #:system system)))))
+          (reference-origins drv items)))
 
   (with-cache (cons (derivation-file-name drv) outputs)
     (match (non-self-references store drv outputs)
       (()                                         ;no dependencies
        (return grafts))
       (deps                                       ;one or more dependencies
-       (mlet %state-monad ((grafts (mapm %state-monad dependency-grafts deps)))
+       (mlet %state-monad ((grafts (dependency-grafts deps)))
          (let ((grafts (delete-duplicates (concatenate grafts) equal?)))
            (match (filter (lambda (graft)
                             (member (graft-origin-file-name graft) deps))
L
L
Ludovic Courtès wrote on 6 Jun 2020 23:40
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87d06bub30.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (5 lines)
> The patch below arranges so that ‘cumulative-grafts’ processes
> dependencies in a batch, such that the derivation’s dependency graph is
> traversed once for all, which makes a difference for derivations with
> lots of inputs.

L
L
Lars-Dominik Braun wrote on 8 Jun 2020 11:04
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200608090453.GC3166@zpidnp36
Hi Ludo,

Toggle quote (14 lines)
> --8<---------------cut here---------------start------------->8---
> $ time guix environment --ad-hoc r-learnr --search-paths
> export PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH"
>
> real 0m11.328s
> user 0m20.155s
> sys 0m0.172s
> $ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-paths
> export PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH"
>
> real 0m4.602s
> user 0m6.189s
> sys 0m0.136s
> --8<---------------cut here---------------end--------------->8---
that’s awesome and brings me much closer to my goal of running all applications
inside a `guix environment` container for reproducibility. Including the
protocol fixes from #41720 I’m now down to ~30s from ~50s, which may be called
somewhat usable. Obviously I’d be very interested in further speedups.

Thanks,
Lars
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7d/zEACgkQQhN3ARo3
hEYcoQwAtb945F+lwytXrepW+8lYfu/S2Pdi8uwyyMfFFjx8oP+t9CmQbVByt224
4N7YLTJReP1BFEN3lnOiV8R1bA9stKCWaSCcUR2jI4LgtOlIDFfr6YwMRSf2krbo
ylpAX6x9dHxdEl32M++ODzBGy5BgxDBKga6gk7OPoGUzld9nFYQX6s7VQGB6IcJg
IyjP9zFhCTEO9eTFoUyUynIdx/LCT6l0vRobX0hy+sStXoMa7weaoBwEMbQ08Hed
ltxv35iBAYcOJoTM68r/Ph9G8ipvL6KQMBIPfTRxGm/UXH2z1OH4/sDYST0wNbZ8
mG4JGt/oiewQ/ssZRM8JzrRAd5s0pOK5c0B9olojRerfUvF8ubfVvZIqpvfyKPX5
ny8SN75MyEDhvodqQbjCAXlAPFvlW+0vWxlMTC+SufZceQ4cBaGCDNeuV2K3KtzW
ywW/nwkpHm5y5d3jePyBZreB8WO9ml7UTfRX7AawlmKSMx6OGWM7m3tYJh2fghJ5
rMerWnCh
=SDlJ
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 8 Jun 2020 23:59
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87k10hp6ba.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (19 lines)
>> --8<---------------cut here---------------start------------->8---
>> $ time guix environment --ad-hoc r-learnr --search-paths
>> export PATH="/gnu/store/n4wxbmqpafjfyawrla8xymzzdm5hxwph-profile/bin${PATH:+:}$PATH"
>>
>> real 0m11.328s
>> user 0m20.155s
>> sys 0m0.172s
>> $ time ./pre-inst-env guix environment --ad-hoc r-learnr --search-paths
>> export PATH="/gnu/store/if6z77la3mx0qdzvcyl4qv9i5cyp48i0-profile/bin${PATH:+:}$PATH"
>>
>> real 0m4.602s
>> user 0m6.189s
>> sys 0m0.136s
>> --8<---------------cut here---------------end--------------->8---
> that’s awesome and brings me much closer to my goal of running all applications
> inside a `guix environment` container for reproducibility. Including the
> protocol fixes from #41720 I’m now down to ~30s from ~50s, which may be called
> somewhat usable. Obviously I’d be very interested in further speedups.

That’s over SSH, right?

Probably what’s killing us is the round-trip time for all these small
RPCs. We would need pipelining but the RPC protocol is not designed to
make that easy.

Perhaps you could “strace -Tt” the thing to check whether this
hypothesis is correct by looking at the time we spend waiting for
replies?

As for the CPU cost (i.e., going below the 4.6s above), we should keep
profiling just like you did.

Thanks,
Ludo’.
L
L
Lars-Dominik Braun wrote on 9 Jun 2020 11:15
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200609091522.GB8597@zpidnp36
Hey,

Toggle quote (1 lines)
> That’s over SSH, right?
correct, the worst possible case: Inside two VM’s on a Laptop, SSH transport
between them and /gnu+/var/guix on an NFS share (nfsd is in the same VM as
guix-daemon).

Toggle quote (3 lines)
> Probably what’s killing us is the round-trip time for all these small
> RPCs. We would need pipelining but the RPC protocol is not designed to
> make that easy.
That would have been my best guess too, but it does not seem to be the biggest
problem right now. Looking at the numbers again (both patches applied) with the
attached manifest[1], I see that:

---snip---
Local UNIX socket with and without --no-grafts:
N Min Max Median Avg Stddev
x 10 6.07 6.35 6.145 6.16 0.08232726
+ 10 17.47 17.89 17.545 17.602 0.14351152
Difference at 99.0% confidence
11.442 +/- 0.150576
185.747% +/- 4.07133%

Local UNIX socket vs. guix://localhost transport:
N Min Max Median Avg Stddev
x 10 17.47 17.89 17.545 17.602 0.14351152
+ 10 17.43 18.1 17.61 17.642 0.20131788
No difference proven at 99.0% confidence

Local UNIX socket vs ssh://localhost transport:
N Min Max Median Avg Stddev
x 10 17.47 17.89 17.545 17.602 0.14351152
+ 10 33.46 35.27 34.315 34.359 0.53873205
Difference at 99.0% confidence
16.757 +/- 0.5074
95.1994% +/- 3.13957%
---snap---

So I would conclude:

1) Grafting still takes a lot of time and needs more work
2) Linux optimizes localhost networking pretty well
3) Our SSH transport is terribly slow

Moving to non-localhost communication between two VM’s:

---snip---
guix://localhost vs. guix://remote-host transport:
N Min Max Median Avg Stddev
x 10 17.43 18.1 17.61 17.642 0.20131788
+ 10 20.88 22.58 21.095 21.222 0.49689704
Difference at 99.0% confidence
3.58 +/- 0.487934
20.2925% +/- 2.85159%

guix://remote-host vs. ssh://remote-host:
N Min Max Median Avg Stddev
x 10 20.88 22.58 21.095 21.222 0.49689704
+ 10 30.1 32.56 31.005 31.093 0.70740606
Difference at 99.0% confidence
9.871 +/- 0.786769
46.5131% +/- 4.35326%
---snap---

Conclusion here is the same: Not alot of impact of networking/NFS and SSH
transport is still terribly slow. (Confusingly faster than localhost though.)

Toggle quote (3 lines)
> Perhaps you could “strace -Tt” the thing to check whether this
> hypothesis is correct by looking at the time we spend waiting for
> replies?
I’m not sure this will yield meaningful data for SSH, so I analyzet it for
guix://localhost vs. guix://remote-host. Takeaway is, yes, of course there is a
statistically significant difference and it’s about 40%±50%, which means this
method is pretty useless, because we can’t bin RPC’s by type.

So, I guess it would make sense for me to look at the SSH transport itself
again and see if there are any other low-hanging fruit. Not sure how much I can
help with profiling guile/guix itself. A different/better RPC protocol is
probably GSoC/v2.0-worthy?

Sorry for all the lengthy emails,
Lars

[1] You’ll need this channel: https://github.com/leibniz-psychology/guix-zpid
(specifications->manifest '( "coreutils" "findutils" "procps" "strace" "openssh" "mit-krb5" "bash" "which" "zip" "geeqie" "util-linux" "grep" "glibc" ;; for locales "glibc-locales" ;; front-end software "jupyter-zpid" "python-jupyterlab" ;; available notebook kernel ; provided by jupyter-zpid ;"python-ipykernel" "r-irkernel" "rstudio-server-zpid" "r" ;; for RMarkdown "r-knitr" "r-yaml" "r-markdown" "r-rmarkdown" "texlive" ;; commonly used r packages "r-psych" "r-ggplot2" "r-lattice" "r-foreign" "r-readr" "r-haven" "r-dplyr" "r-tidyr" "r-stringr" "r-forecast" "r-lme4" "r-nlme" "r-nnet" "r-glmnet" "r-caret" "r-xmisc" "r-splitstackshape" "r-tm" "r-quanteda" "r-topicmodels" "r-stm" ;;"r-parallel" "r-dt" "r-nlp" "r-data-table" "r-hmisc" "r-learnr" "r-metafor" ;; for rmarkdown "ghc-pandoc" ))
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl7fUycACgkQQhN3ARo3
hEbGwgv9G/Jy0viE4XGqrmhCfmnQSKMvxgGZg7WPUBNYaOFcUr3fjpR13egOlcqS
mvebo/JZ8cs+nRNQB7/v+FTYFPTmk/xFIN3jLNdHnqUmHHBmhLsirbyQRdMXYr6F
sD7eho57lQ/nlnTsEU6NZQwO3WA3kd4aFjvNxSBpaVeO1HPaxyWTPEQVqpWGN7u3
nFzw4866BBBNP1R1nSdbUKsAzilHKTWMJ0B6O34dNK1/fdaVSJ/yNaAzehN+cykF
SgYCzl1eUYquNUtZvyHjNptYm9BX54YQPA7HrMmpsd8qOaeTe6R4HHrV6+JPy/oK
rrr2RPsVnI1bY5o58mRKzk8up75FsTlN6tWBwuempr00dIsjny/gjBHuL6J5v3vC
GqmHOMhjF66PNmiwheqCGHNae8tq7Li4diFmSo9uV+kFpwgmGQ13ywZeKex2l6xV
g0vpX0bGkr1gxK/XWy2dYRtgkNYf5ycAhUBBLOhTlN6/+nWI6QLEgKiAmZR16Ju5
AVtbp5Gm
=zWIs
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 27 Jun 2020 23:20
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87wo3sjjdw.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (32 lines)
> That would have been my best guess too, but it does not seem to be the biggest
> problem right now. Looking at the numbers again (both patches applied) with the
> attached manifest[1], I see that:
>
> ---snip---
> Local UNIX socket with and without --no-grafts:
> N Min Max Median Avg Stddev
> x 10 6.07 6.35 6.145 6.16 0.08232726
> + 10 17.47 17.89 17.545 17.602 0.14351152
> Difference at 99.0% confidence
> 11.442 +/- 0.150576
> 185.747% +/- 4.07133%
>
> Local UNIX socket vs. guix://localhost transport:
> N Min Max Median Avg Stddev
> x 10 17.47 17.89 17.545 17.602 0.14351152
> + 10 17.43 18.1 17.61 17.642 0.20131788
> No difference proven at 99.0% confidence
>
> Local UNIX socket vs ssh://localhost transport:
> N Min Max Median Avg Stddev
> x 10 17.47 17.89 17.545 17.602 0.14351152
> + 10 33.46 35.27 34.315 34.359 0.53873205
> Difference at 99.0% confidence
> 16.757 +/- 0.5074
> 95.1994% +/- 3.13957%
> ---snap---
>
> So I would conclude:
>
> 1) Grafting still takes a lot of time and needs more work

Yes, I noticed that there’s still redundant work being done, so we may
need global caching in addition to what
58bb833365db4e8934a386497d5b00a063cfd27d did.

Toggle quote (3 lines)
> 2) Linux optimizes localhost networking pretty well
> 3) Our SSH transport is terribly slow

Yes.

The patch below is a noticeable improvement for me. On my laptop,

GUIX_DAEMON_SOCKET=ssh://localhost ./pre-inst-env guix build libreoffice -n

goes from 5.8s to 3.3s. It just does the same thing as we do for
guix://.

Could you check what results it gives you?

Toggle quote (5 lines)
> So, I guess it would make sense for me to look at the SSH transport itself
> again and see if there are any other low-hanging fruit. Not sure how much I can
> help with profiling guile/guix itself. A different/better RPC protocol is
> probably GSoC/v2.0-worthy?

There’s a project to rewrite the daemon in Scheme, started by Caleb
Ristvedt (reepa) as part of GSoC a few years ago. This could be an
opportunity to add a new version of the protocol that would support
pipelining.

Thanks,
Ludo’.
Toggle diff (86 lines)
diff --git a/guix/ssh.scm b/guix/ssh.scm
index 2d7ca7d01d..a9312b7c8c 100644
--- a/guix/ssh.scm
+++ b/guix/ssh.scm
@@ -98,7 +98,7 @@ actual key does not match."
                                 key type))))))))
 
 (define* (open-ssh-session host #:key user port identity
-                           host-key
+                           host-key open-connection
                            (compression %compression)
                            (timeout 3600))
   "Open an SSH session for HOST and return it.  IDENTITY specifies the file
@@ -110,6 +110,10 @@ When HOST-KEY is true, it must be a string like \"ssh-ed25519 AAAAC3Nz…
 root@example.org\"; the server is authenticated and an error is raised if its
 host key is different from HOST-KEY.
 
+When OPEN-CONNECTION is true, it must be a two-argument procedure; it is
+passed HOST and PORT and must return a socket (a file port).  When
+OPEN-CONNECTION is false, Guile-SSH takes care of opening the connection.
+
 Install TIMEOUT as the maximum time in seconds after which a read or write
 operation on a channel of the returned session is considered as failing.
 
@@ -134,6 +138,13 @@ Throw an error on failure."
     ;; Honor ~/.ssh/config.
     (session-parse-config! session)
 
+    (when open-connection
+      (let* ((sock (open-connection host port)))
+        ;; Since 'session-set!' doesn't increase the revealed count of SOCK
+        ;; (as of Guile-SSH 0.12.0), do it ourselves.
+        (set-port-revealed! sock 1)
+        (session-set! session 'fd sock)))
+
     (match (connect! session)
       ('ok
        (if host-key
diff --git a/guix/store/ssh.scm b/guix/store/ssh.scm
index 09c0832505..d07f0d7467 100644
--- a/guix/store/ssh.scm
+++ b/guix/store/ssh.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2017 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2017, 2020 Ludovic Courtès <ludo@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -19,6 +19,7 @@
 (define-module (guix store ssh)
   #:use-module (guix ssh)
   #:use-module (web uri)
+  #:use-module (ssh session)
   #:export (connect-to-daemon))
 
 ;;; Commentary:
@@ -29,11 +30,27 @@
 ;;;
 ;;; End:
 
+(define (open-connection host port)
+  "Open a connection to HOST and PORT.  Use the standard SSH port if PORT is
+false."
+  (let* ((lst  (getaddrinfo host
+                            (if port (number->string port) "ssh")
+                            (if port AI_NUMERICSERV 0)))
+         (addr (addrinfo:addr (car lst)))
+         (sock (socket (sockaddr:fam addr) SOCK_STREAM 0)))
+    ;; Setting this option makes a dramatic difference because it avoids the
+    ;; "ACK delay" on our RPC messages.
+    (setsockopt sock IPPROTO_TCP TCP_NODELAY 1)
+
+    (connect sock addr)
+    sock))
+
 (define (connect-to-daemon uri)
   "Connect to the SSH daemon at URI, a URI object with the 'ssh' scheme."
   (remote-daemon-channel
    (open-ssh-session (uri-host uri)
                      #:port (or (uri-port uri) 22)
-                     #:user (uri-userinfo uri))))
+                     #:user (uri-userinfo uri)
+                     #:open-connection open-connection)))
 
 ;;; ssh.scm ends here
L
L
Ludovic Courtès wrote on 29 Jun 2020 22:55
control message for bug #41702
(address . control@debbugs.gnu.org)
87pn9ha8wx.fsf@gnu.org
tags 41702 + patch
quit
L
L
Lars-Dominik Braun wrote on 30 Jun 2020 11:59
Re: bug#41702: `guix environment` performance issues
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200630095936.GA3630@zpidnp36
Hi Ludo,

Toggle quote (8 lines)
> The patch below is a noticeable improvement for me. On my laptop,
>
> GUIX_DAEMON_SOCKET=ssh://localhost ./pre-inst-env guix build libreoffice -n
>
> goes from 5.8s to 3.3s. It just does the same thing as we do for
> guix://.
>
> Could you check what results it gives you?
looking pretty good, comparing master without/with patch:

---snip---
N Min Max Median Avg Stddev
x 10 16.87 20.77 17.44 17.802 1.1228515
+ 10 9.92 13.67 10.19 10.569 1.1254574
Difference at 99.0% confidence
-7.233 +/- 1.44688
-40.6303% +/- 6.68735%
---snap---

Toggle quote (4 lines)
> There’s a project to rewrite the daemon in Scheme, started by Caleb
> Ristvedt (reepa) as part of GSoC a few years ago. This could be an
> opportunity to add a new version of the protocol that would support
> pipelining.
Given how much improvement you’ve achieved so far I’d say we can (for now)
probably get away with the current protocol. Still, what’s the status of this
daemon?

Lars
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl77DQMACgkQQhN3ARo3
hEZuWAwAs5btXPUr3WH4g/FqdVHSqbn6QRe6Lo8WPqKZq2IHKpk+HR+VWwRT+uyu
o9S41JWMk00cbFr9I4xi2xyyld1J3NSDDOrGPP8sjxz7Z/y/h13Dh2vV3lHborEx
2ItAxK9+nGThJaFdFFdVOTd/1DZb7nLnaoQ6jrhVaQ2kMD3XmXd2dkNu/vEd8tat
DV2li5sWznnvmyglyGA7gRyPVdJITy8lhRPeOn1MVNWVRxAgHZMmB/pUKcL+7KB1
msmUuylIa0suKoFwqSOh9Mw6t8tGIaNEPaaJtjWKu5ew9tg0vnEsRPJU3cunuDtJ
RS/7tWk3nU+0wXCMErUgdfb9Tm3yebTopHCuyIFhC+z9M5X1owPmUNQdk5hdgbxE
+972C/fzlVABPtj2UEhfV9E48gEuON66NJOjdQBeGH0qohAadTQZmy4ln1ZMkYdw
+cKeLzWKj/w07oeZQGO9IE38QO26xHMye0Dx219gmQ/ar0pFgTA4QPs1bs8j5BM6
XvzW6T5G
=YULY
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 1 Jul 2020 12:49
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87y2o3zf00.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (7 lines)
> + (when open-connection
> + (let* ((sock (open-connection host port)))
> + ;; Since 'session-set!' doesn't increase the revealed count of SOCK
> + ;; (as of Guile-SSH 0.12.0), do it ourselves.
> + (set-port-revealed! sock 1)
> + (session-set! session 'fd sock)))

The problem with this approach is that ‘open-connection’ is past the raw
host name; aliases, proxy commands, etc. from ~/.ssh/config are ignored.

So it cannot go in as is. :-/

What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allow
us to get at the actual file descriptor after the connection has been
opened, and to set TCP_NODELAY there:


It’s not 100% clear though if it returns that “right” FD when there’s a
proxy command.

Ludo’.
L
L
Lars-Dominik Braun wrote on 1 Jul 2020 13:24
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200701112453.GB25179@zpidnp36
Hey,

Toggle quote (8 lines)
> What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allow
> us to get at the actual file descriptor after the connection has been
> opened, and to set TCP_NODELAY there:
>
> http://api.libssh.org/stable/group__libssh__session.html#gafe509fcea47714b5cd277d1e35e83276
>
> It’s not 100% clear though if it returns that “right” FD when there’s a
> proxy command.
wouldn’t it be easier to add the SSH_OPTIONS_NODELAY option (see [1]) to
guile-ssh (should be trivial[2]) and let guile-ssh handle all the connection
stuff without touching the socket? Or do we need the raw socket elsewhere?

Cheers,
Lars

-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl78coIACgkQQhN3ARo3
hEZJEQv9GJYnyKD23RlY9Et60BMB2YsJSjSqWxxjUneLfLnFDrIvusQZf+Lhtj3V
o7aBVzoJ6GEPrSOTi2FcecSUGfuSaC1w17Zs2prjMuqq66Opkx9caeBS/dv/4iLZ
HFf+Kci8+gaWQkn4BR7+tspfkjRxIulywpK4P7nP8KxUDhVDKXi6NLSJB5XCH4bg
qwNSzeQnu9L0CDzlZPHq5h5bZ7Op4VyZ/oMa1koRYBBVA0KCDGHZUQGs/Mh7QaYe
5N5AMRFabXWP4hN0qDw3H7Dic/+4jYi+2QMicx1Ysh+6bVB0R666/Qwg0PGZuHvX
sTwfcuHVHtHLubXEKCnzh7P5WkhzC8NpRL5iTaaadSQL/mrYe3y7ihPRtthfMkVD
zv53GVPXUnxWUc1JcxMO+9OrHVdQY9HPiNbrFmJ+FjEFGuqvmpAoA8K54aolIepf
D9dd9tC21M/yYcI7RCgJ5lSGbbWkHM5fz/P1b9UmKIbyQkk8WvqSt1N+IQQZM6DN
QbXGmtBu
=COgO
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 1 Jul 2020 23:53
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87eepuyk99.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (12 lines)
>> What we need is Guile-SSH bindings for ‘ssh_get_fd’, which would allow
>> us to get at the actual file descriptor after the connection has been
>> opened, and to set TCP_NODELAY there:
>>
>> http://api.libssh.org/stable/group__libssh__session.html#gafe509fcea47714b5cd277d1e35e83276
>>
>> It’s not 100% clear though if it returns that “right” FD when there’s a
>> proxy command.
> wouldn’t it be easier to add the SSH_OPTIONS_NODELAY option (see [1]) to
> guile-ssh (should be trivial[2]) and let guile-ssh handle all the connection
> stuff without touching the socket? Or do we need the raw socket elsewhere?

Good idea, I had overlooked that possibility.

Would you like to send them a patch?

Thanks,
Ludo’.
L
L
Lars-Dominik Braun wrote on 2 Jul 2020 09:00
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200702070000.GA4532@zpidnp36
Hi,

Toggle quote (1 lines)
> Would you like to send them a patch?
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl79hewACgkQQhN3ARo3
hEakxwwA1QUBS0VlTTNH5L49/5dGCmvkc4pTJRv9BtneXSMJTUryIIpSEv/mPYEe
yn5Lr02uyZPTqlp75sOndJmhuRnSCz6F9HTHXgTAls7fsU3Ka0kiUO4WHhM8gOH4
sfyYUEq654ZuYU54sjAE3x/aMCfhxECXPmGECiz6/R5J0BcaFvr5h+lTqjWVtg6g
KFKPc8C/UQb4DxLjJaEC5KPzA70T/t6zoFk4ZYTa4bPArEHbiqfvbXNpifee/8np
e8KD6IL4rBEepblPdAAtdJ+x0EuFm40Ac4fGJv9b+QIe+d2go2Ixz5m2Y5x6U/Hr
kn9v8ZNKFUZukQj23RY6mke0Nek+WKgTAvV2FfYZnJ2OEhhJDW7oO4U4jxbOaomc
26WTErFEOvVls/ZXPIhBf4dsqcMUNmo6w2l59LnMNeWQkD19TE7LsOUGMQ6+aWQo
yBf4sl1wihnG1+ljSmf088COqNtsKNHEBxWYL317BoqIiD0Rk3V6OwDjrIuis7J9
Ou2EYXVE
=jsIk
-----END PGP SIGNATURE-----


L
L
Lars-Dominik Braun wrote on 2 Jul 2020 14:03
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200702120333.GB4532@zpidnp36
Hi (again),

and the attached patch uses the new nodelay option, if we don’t want to wait
for another guile-ssh release.

Lars
From 2892f79f819dd2dd9420f7e74bcb6e293d377452 Mon Sep 17 00:00:00 2001
From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
Date: Thu, 2 Jul 2020 13:59:51 +0200
Subject: [PATCH] guix: Add nodelay to SSH store connection

* gnu/packages/ssh.scm (guile-ssh)[patches]: Add patch.
* gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch: New file.
* gnu/local.mk (dist_patch_DATA): Add it.
* guix/ssh.scm (open-ssh-session): Use new nodelay option.
---
gnu/local.mk | 1 +
.../patches/guile-ssh-0.12.0-nodelay.patch | 81 +++++++++++++++++++
gnu/packages/ssh.scm | 1 +
guix/ssh.scm | 4 +-
4 files changed, 86 insertions(+), 1 deletion(-)
create mode 100644 gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch

Toggle diff (128 lines)
diff --git a/gnu/local.mk b/gnu/local.mk
index 8bb56010c2..26725cbca1 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1068,6 +1068,7 @@ dist_patch_DATA =						\
   %D%/packages/patches/guile-present-coding.patch		\
   %D%/packages/patches/guile-rsvg-pkgconfig.patch		\
   %D%/packages/patches/guile-emacs-fix-configure.patch		\
+  %D%/packages/patches/guile-ssh-0.12.0-nodelay.patch		\
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_PATH.patch	\
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_IM_MODULE_FILE.patch \
   %D%/packages/patches/gtk2-theme-paths.patch			\
diff --git a/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch b/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch
new file mode 100644
index 0000000000..bc649b7469
--- /dev/null
+++ b/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch
@@ -0,0 +1,81 @@
+commit f697c603ef196eb88c1d7d629e456bf5054e35a4
+Author: Lars-Dominik Braun <ldb@leibniz-psychology.org>
+Date:   Thu Jul 2 13:26:21 2020 +0200
+
+    modules/ssh/session.scm: Add nodelay to make-session
+    
+    * modules/ssh/session.scm (make-session): Add key nodelay
+
+diff --git a/modules/ssh/session.scm b/modules/ssh/session.scm
+index d7c0831..042bd5f 100644
+--- a/modules/ssh/session.scm
++++ b/modules/ssh/session.scm
+@@ -74,7 +74,7 @@
+                        knownhosts timeout timeout-usec ssh1 ssh2 log-verbosity
+                        ciphers-c-s ciphers-s-c compression-c-s compression-s-c
+                        proxycommand stricthostkeycheck compression
+-                       compression-level callbacks config)
++                       compression-level nodelay callbacks config)
+   "Make a new SSH session with specified configuration.\n
+ Return a new SSH session."
+   (let ((session (%make-session)))
+@@ -98,6 +98,7 @@ Return a new SSH session."
+     (session-set-if-specified! stricthostkeycheck)
+     (session-set-if-specified! compression)
+     (session-set-if-specified! compression-level)
++    (session-set-if-specified! nodelay)
+     (session-set-if-specified! callbacks)
+ 
+     (when config
+
+commit b80121ae3e851a676f25fd2b7e33371b94a6e030
+Author: Lars-Dominik Braun <ldb@leibniz-psychology.org>
+Date:   Thu Jul 2 08:52:13 2020 +0200
+
+    libguile-ssh/session-func.c: Add NODELAY option
+    
+    * libguile-ssh/session-func.c (session_options, set_option): Add libssh’s
+      nodelay option.
+    * tests/session.scm ("session-set!, valid values", "session-set!, invalid
+      values"): Add unit-tests.
+
+diff --git a/libguile-ssh/session-func.c b/libguile-ssh/session-func.c
+index 48db779..9240734 100644
+--- a/libguile-ssh/session-func.c
++++ b/libguile-ssh/session-func.c
+@@ -71,6 +71,7 @@ static struct symbol_mapping session_options[] = {
+   { "stricthostkeycheck", SSH_OPTIONS_STRICTHOSTKEYCHECK },
+   { "compression",        SSH_OPTIONS_COMPRESSION        },
+   { "compression-level",  SSH_OPTIONS_COMPRESSION_LEVEL  },
++  { "nodelay",            SSH_OPTIONS_NODELAY            },
+   { "callbacks",          GSSH_OPTIONS_CALLBACKS         },
+   { NULL,                 -1 }
+ };
+@@ -366,6 +367,7 @@ set_option (SCM scm_session, struct session_data* sd, int type, SCM value)
+     case SSH_OPTIONS_SSH1:
+     case SSH_OPTIONS_SSH2:
+     case SSH_OPTIONS_STRICTHOSTKEYCHECK:
++    case SSH_OPTIONS_NODELAY:
+       return set_bool_opt (session, type, value);
+ 
+     case SSH_OPTIONS_FD:
+diff --git a/tests/session.scm b/tests/session.scm
+index 2eb1df7..83a864d 100644
+--- a/tests/session.scm
++++ b/tests/session.scm
+@@ -70,6 +70,7 @@
+                                   nolog)
+                    (compression   "yes" "no")
+                    (compression-level 1 2 3 4 5 6 7 8 9)
++                   (nodelay      #f #t)
+                    (callbacks     ((user-data . "hello")
+                                    (global-request-callback . ,(const #f))))))
+         (res #t))
+@@ -95,6 +96,7 @@
+                    (log-verbosity     "string" -1 0 1 2 3 4 5)
+                    (compression       12345)
+                    (compression-level -1 0 10)
++                   (nodelay           12345 "string")
+                    (callbacks         "not a list"
+                                       ((global-request-callback . #f)))))
+         (res #t))
diff --git a/gnu/packages/ssh.scm b/gnu/packages/ssh.scm
index ea42e72de2..0d9fa5fd54 100644
--- a/gnu/packages/ssh.scm
+++ b/gnu/packages/ssh.scm
@@ -259,6 +259,7 @@ Additionally, various channel-specific options can be negotiated.")
               (sha256
                (base32
                 "054hd9rzfhb48gc1hw3rphhp0cnnd4bs5qmidy5ygsyvy9ravlad"))
+              (patches (search-patches "guile-ssh-0.12.0-nodelay.patch"))
               (modules '((guix build utils)))))
     (build-system gnu-build-system)
     (outputs '("out" "debug"))
diff --git a/guix/ssh.scm b/guix/ssh.scm
index 2d7ca7d01d..df7a9bed4a 100644
--- a/guix/ssh.scm
+++ b/guix/ssh.scm
@@ -129,7 +129,9 @@ Throw an error on failure."
                                ;; We need lightweight compression when
                                ;; exchanging full archives.
                                #:compression compression
-                               #:compression-level 3)))
+                               #:compression-level 3
+
+                               #:nodelay #t)))
 
     ;; Honor ~/.ssh/config.
     (session-parse-config! session)
-- 
2.20.1
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl79zQ8ACgkQQhN3ARo3
hEbEKAwAuHGzi/EBcAbji98iXuwjFX5pKKTvVGnb+EusvCFDwgQzOfKUBnFmDcb3
+C3R0/Y4rsDuvMfmpeNSdiZB4GPxiX7bOJ+uyzJLUDi/1CPvyGDsAHvF0oAGF+vp
+316dHsPryS8mYUgkYOb9aLzbzTaCm/qikohd19S8jkpcb9rbxvLVxlvEsAz7s22
fxDfuYhJOGxprK3XWRK0xga8aG58x48O8uHSh93TXkLr77829Mu4zoKKzfA6Dds6
ChJWpOEbGGBnl2txafXulLTDpywbGs6Ir4kDvfYVBENAyOXsws/VVtZwfNWbgZ6B
PuSDP9dXaOa0hRU9hcpzNma9h3qwYEvad1TG62V9la6aBtO3/f6P49uXUW9Azo3q
3zj/npLqKcrn7fr+vk29p62KUqze5rvJ80IYbPEyhWtOhMD3CS5dkRiW2Q8igKGS
DxVVz0R/AFgFPJc/RTpF1KvyPBvSeQgv48OamXBOYEF5C18qZO/yOE0lCmO4JVwi
vrjtLqdk
=tkfu
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 6 Jul 2020 10:49
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87k0zhkoyf.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (3 lines)
> and the attached patch uses the new nodelay option, if we don’t want to wait
> for another guile-ssh release.

I’d rather wait; perhaps you can ask Artyom whether they’re planning for
a new release soonish?

If there’s no plan for a release within two weeks, we can go ahead with
the patch you propose.

Toggle quote (10 lines)
> From 2892f79f819dd2dd9420f7e74bcb6e293d377452 Mon Sep 17 00:00:00 2001
> From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
> Date: Thu, 2 Jul 2020 13:59:51 +0200
> Subject: [PATCH] guix: Add nodelay to SSH store connection
>
> * gnu/packages/ssh.scm (guile-ssh)[patches]: Add patch.
> * gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch: New file.
> * gnu/local.mk (dist_patch_DATA): Add it.
> * guix/ssh.scm (open-ssh-session): Use new nodelay option.

If we take this route, the two patches (guile-ssh changes, then (guix
ssh) changes) should be separate.

Also #:nodelay will cause a hard failure for people using an older
Guile-SSH. I wonder how to guard against that; configure check, or
run-time Guile-SSH version check?

Toggle quote (2 lines)
> +++ b/gnu/packages/patches/guile-ssh-0.12.0-nodelay.patch

It would be good to include the URL of the upstream issue

Thanks!

Ludo’.
L
L
Lars-Dominik Braun wrote on 6 Jul 2020 14:58
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200706125800.GA3176@zpidnp36
Hi Ludo,

Toggle quote (2 lines)
> I’d rather wait; perhaps you can ask Artyom whether they’re planning for
> a new release soonish?
“I'll see if I can fix some random test failures with Guile 2.2 that sometimes
occur, and then I'll prepare a new release.”

Toggle quote (2 lines)
> If there’s no plan for a release within two weeks, we can go ahead with
> the patch you propose.
Alright, I’ll prepare a new patch if either of the two events happen.

Cheers,
Lars
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl8DH9QACgkQQhN3ARo3
hEYWtAv+OuMtQymFw/zfJqmiftlasJUPsEQzhxmZEyIy31YkPIoE/LUVEBwf1Vmi
it+3N/GSAex4OboX3xb1zjLdDrju+joLCGMLqJhQvRnp7tJOTmZC0daYlIwlF/uR
xXTx2JWbCWlkuiN0X6UCuG7wT0fIVe12ngpwLsZH2umcclUYH+4PZzH9nlwlPCGE
mnq9VZTYwXzcxMCX4Dy2CmF9v5ZrB02NJlAKHRSlUOmk5aJGkX5iPPNFDydq7gow
zs72yQQXOn+NfZs6+CZnHzmApjV60eJtcGGX7UALA7IWkRzcdgBsNTj7gAsaLrN8
i62fdT41fJd59QdRPGrnzFEWwy8Ke4kgobUVaDD/kv8FJg6r22KoiDUKEMDDRUjZ
SptSBE7q2je4pV+bem2c+rDxyl0SledUst148O/rYzceM4CRn6s2ubBaHz7WlEza
g5pu2uMkh0QjHL75hWrzGmz92ugdMwoqZwJqbgDlLK4pHEsdlpI2xlHETnZwJLFA
9DWnFPkp
=xtF5
-----END PGP SIGNATURE-----


L
L
Lars-Dominik Braun wrote on 20 Jul 2020 11:50
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41702@debbugs.gnu.org)
20200720095059.GB3143@zpidnp36
Hi Ludo,

a new version has been released, so here is the updated patchset. I’m not sure
if it’s possible to make the #:nodelay optional somehow, i.e. retaining
compatibility with guile-ssh<0.13.0. For now there’s a hard feature check.

Lars
From 42ff75c9d4da4eb2aeece96119ae5a65a9ccac82 Mon Sep 17 00:00:00 2001
From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
Date: Mon, 20 Jul 2020 11:27:35 +0200
Subject: [PATCH 1/2] gnu: guile-ssh: Update to 0.13.0.

* gnu/packages/ssh.scm (guile-ssh): Update to 0.13.0.
---
gnu/packages/ssh.scm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Toggle diff (24 lines)
diff --git a/gnu/packages/ssh.scm b/gnu/packages/ssh.scm
index a612b9dd15..50316ea1be 100644
--- a/gnu/packages/ssh.scm
+++ b/gnu/packages/ssh.scm
@@ -297,7 +297,7 @@ Additionally, various channel-specific options can be negotiated.")
 (define-public guile-ssh
   (package
     (name "guile-ssh")
-    (version "0.12.0")
+    (version "0.13.0")
     (home-page "https://github.com/artyom-poptsov/guile-ssh")
     (source (origin
               (method git-fetch)
@@ -307,7 +307,7 @@ Additionally, various channel-specific options can be negotiated.")
               (file-name (string-append name "-" version ".tar.gz"))
               (sha256
                (base32
-                "054hd9rzfhb48gc1hw3rphhp0cnnd4bs5qmidy5ygsyvy9ravlad"))
+                "1q96h98p6x7ah6nc0d2wfx503fmsj36riv9ka9s79z3lzwaf0k26"))
               (modules '((guix build utils)))))
     (build-system gnu-build-system)
     (outputs '("out" "debug"))
-- 
2.20.1
From 6a18c9a4533075d1ee9795777a998fde3e64030f Mon Sep 17 00:00:00 2001
From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
Date: Mon, 20 Jul 2020 11:28:51 +0200
Subject: [PATCH 2/2] guix: Speed up RPC via SSH connections.

See issue #41702

* guix/ssh.scm (open-ssh-session): Enable #:nodelay.
* m4/guix.m4 (GUIX_CHECK_GUILE_SSH): Add feature check for this new parameter.
---
guix/ssh.scm | 5 ++++-
m4/guix.m4 | 6 ++++--
2 files changed, 8 insertions(+), 3 deletions(-)

Toggle diff (41 lines)
diff --git a/guix/ssh.scm b/guix/ssh.scm
index 2d7ca7d01d..3bffb0b525 100644
--- a/guix/ssh.scm
+++ b/guix/ssh.scm
@@ -129,7 +129,10 @@ Throw an error on failure."
                                ;; We need lightweight compression when
                                ;; exchanging full archives.
                                #:compression compression
-                               #:compression-level 3)))
+                               #:compression-level 3
+
+                               ;; Speed up RPC.
+                               #:nodelay #t)))
 
     ;; Honor ~/.ssh/config.
     (session-parse-config! session)
diff --git a/m4/guix.m4 b/m4/guix.m4
index 7c27ae74df..cce03045db 100644
--- a/m4/guix.m4
+++ b/m4/guix.m4
@@ -142,14 +142,16 @@ dnl GUIX_CHECK_GUILE_SSH
 dnl
 dnl Check whether a recent-enough Guile-SSH is available.
 AC_DEFUN([GUIX_CHECK_GUILE_SSH], [
-  dnl Check whether 'userauth-gssapi!' (introduced in 0.12.0) is present.
+  dnl Check whether '#:nodelay' paramater to 'make-session' (introduced in
+  dnl 0.13.0) is present.
   AC_CACHE_CHECK([whether Guile-SSH is available and recent enough],
     [guix_cv_have_recent_guile_ssh],
     [GUILE_CHECK([retval],
       [(and (@ (ssh channel) channel-send-eof)
             (@ (ssh popen) open-remote-pipe)
             (@ (ssh dist node) node-eval)
-            (@ (ssh auth) userauth-gssapi!))])
+            (@ (ssh auth) userauth-gssapi!)
+            ((@ (ssh session) make-session) #:nodelay #t))])
      if test "$retval" = 0; then
        guix_cv_have_recent_guile_ssh="yes"
      else
-- 
2.20.1
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl8VaQAACgkQQhN3ARo3
hEakuQwA0kCZuyAU1nBpZSQcTuvhuHwbgqIkX2JBGe9oasSkakBZ5XM5dVtARCer
k7lZuyd7Z6QjgUU0+yvLQ2yY05++xGx7Bh+7Cazp7LtbNjKbfTxzVUUKKbqE+qnc
5XYWSHpB7XmuP6RfbBNUtBvw6SUgkNO3mCU1FgSRgVAzFpI8uNS4nTZ8Ise8/X3d
zJRimFqs5qh7gGHI1zRju346c60MhUTexJwkFgxW4UfwlIuu7wZ2WSTTNbXfehrQ
w9zqDCGG8MKftYPLDE9KG1CcB0XB00ssgI7NTf/mlpQOmAizpkLK3PsyV5Cmu4NV
PBAIzn12XQBUCU0brelhfiLTE0t/ueJc3oen2D18KA+Etsjy6hzhFEpfNzLoWgqU
r2niVSUBGvTrwSH2iAdbWL0c2g8PbBQtx6KYoaTORH0xAc7PJcR+pq0QzbM4PSD4
V7StAoQ3hAQOPd3FFXYv2XSZ/yv8VXkJ26y7pT9tFz9e+NsOgnD6JyYy3wK8LRWj
CPdtRngb
=eeqN
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 20 Jul 2020 23:51
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)(address . 41702@debbugs.gnu.org)
87mu3tq2hh.fsf@gnu.org
Hi Lars,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (7 lines)
> From 42ff75c9d4da4eb2aeece96119ae5a65a9ccac82 Mon Sep 17 00:00:00 2001
> From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
> Date: Mon, 20 Jul 2020 11:27:35 +0200
> Subject: [PATCH 1/2] gnu: guile-ssh: Update to 0.13.0.
>
> * gnu/packages/ssh.scm (guile-ssh): Update to 0.13.0.

[...]

Toggle quote (10 lines)
> From 6a18c9a4533075d1ee9795777a998fde3e64030f Mon Sep 17 00:00:00 2001
> From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
> Date: Mon, 20 Jul 2020 11:28:51 +0200
> Subject: [PATCH 2/2] guix: Speed up RPC via SSH connections.
>
> See issue #41702
>
> * guix/ssh.scm (open-ssh-session): Enable #:nodelay.
> * m4/guix.m4 (GUIX_CHECK_GUILE_SSH): Add feature check for this new parameter.

I adjusted the commit log and pushed.

I think it’s okay to use the new Guile-SSH feature right away because
it’s not central, and ‘guix pull’ gives people the ability to get the
new Guile-SSH version.

Thanks!

Ludo’.
M
M
Maxim Cournoyer wrote on 22 Oct 2020 23:03
(name . Ludovic Courtès)(address . ludo@gnu.org)
87r1pqc6sy.fsf@gmail.com
Hello,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (29 lines)
> Hi Lars,
>
> Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
>
>> From 42ff75c9d4da4eb2aeece96119ae5a65a9ccac82 Mon Sep 17 00:00:00 2001
>> From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
>> Date: Mon, 20 Jul 2020 11:27:35 +0200
>> Subject: [PATCH 1/2] gnu: guile-ssh: Update to 0.13.0.
>>
>> * gnu/packages/ssh.scm (guile-ssh): Update to 0.13.0.
>
> [...]
>
>> From 6a18c9a4533075d1ee9795777a998fde3e64030f Mon Sep 17 00:00:00 2001
>> From: Lars-Dominik Braun <ldb@leibniz-psychology.org>
>> Date: Mon, 20 Jul 2020 11:28:51 +0200
>> Subject: [PATCH 2/2] guix: Speed up RPC via SSH connections.
>>
>> See issue #41702
>>
>> * guix/ssh.scm (open-ssh-session): Enable #:nodelay.
>> * m4/guix.m4 (GUIX_CHECK_GUILE_SSH): Add feature check for this new parameter.
>
> I adjusted the commit log and pushed.
>
> I think it’s okay to use the new Guile-SSH feature right away because
> it’s not central, and ‘guix pull’ gives people the ability to get the
> new Guile-SSH version.

Judging from the above, it seems this issue has been resolved.

Closing.

Thank you,

Maxim
Closed
L
L
Lars-Dominik Braun wrote on 23 Oct 2020 09:26
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
20201023072621.GA3115@zpidnp36
Hi Maxim,

Toggle quote (1 lines)
> Judging from the above, it seems this issue has been resolved.
grafting is still a performance issue imo. Compare for example:

$ time guix environment --ad-hoc --search-paths r-learnr
guix environment --ad-hoc --search-paths r-learnr 5,90s user 0,09s system 210% cpu 2,844 total
$ time guix environment --ad-hoc --search-paths r-learnr --no-grafts
guix environment --ad-hoc --search-paths r-learnr --no-grafts 2,03s user 0,08s system 164% cpu 1,277 total

Cheers,
Lars
-----BEGIN PGP SIGNATURE-----

iQGzBAABCgAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAl+ShZkACgkQQhN3ARo3
hEZzLgv/eXYwosDaui61yH06wGUcj41RHTGO/fkq56vI2oqXXBcwxEXvIqoixdj9
MMbU5YWIDg8W2GCFQ3Nf0Nul2XyyEtt7z+DeQmbBLEeqnasuPdbkesLuWpE/1vlm
qShuckvvSvhz4DsYPsOFQuQQDXGDSUP6+vWIcP5WVGIUl+eXlmYmxCMOFPPT//wS
ItJbiOICE68ghSaUf5X8U4sgIM2XipK+sf1lLLlmQX8Ws8PCOBWYQsKdMq5pq3Bt
YvDKqD40c4RLlZ4mw642cf6YB7jYjCZnzIOIc7DkgcD0mM/fFE28LEy5TTKQxE7R
WLTDFJ10K1glr/OOpXDZsUg7yR7ERDTEGg6iVy5Rr80gBx+D9tG7Ku2XCfE3Aoob
q7hcYwMOVwkS5agNlr5tY4+0GhCicOgDcYeos8qWVAkhbP+TOl5NnMRDpvT8XAFE
yNDcD2rfDfPL9KEpkXd8U215cLPQ+hVljmPJbVQZG4HvmtxzWtCcOM7cSFLgX82K
2nzwqE3m
=Wdtz
-----END PGP SIGNATURE-----


Closed
L
L
Ludovic Courtès wrote on 23 Oct 2020 16:51
(name . Lars-Dominik Braun)(address . ldb@leibniz-psychology.org)
873625ngg9.fsf@gnu.org
Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:

Toggle quote (2 lines)
> grafting is still a performance issue imo. Compare for example:

Agreed. The fix in 58bb833365db4e8934a386497d5b00a063cfd27d is
incomplete: we’re still potentially doing things several times.

Toggle quote (5 lines)
> $ time guix environment --ad-hoc --search-paths r-learnr
> guix environment --ad-hoc --search-paths r-learnr 5,90s user 0,09s system 210% cpu 2,844 total
> $ time guix environment --ad-hoc --search-paths r-learnr --no-grafts
> guix environment --ad-hoc --search-paths r-learnr --no-grafts 2,03s user 0,08s system 164% cpu 1,277 total

Here’s what I get over a local socket:

Toggle snippet (14 lines)
ludo@ribbon ~/src/guix$ time guix environment --ad-hoc --search-paths r-learnr --no-grafts
export PATH="/gnu/store/wkhd1fnmg1q30g21bfn5jprz24m041w6-profile/bin${PATH:+:}$PATH"

real 0m1.790s
user 0m1.928s
sys 0m0.102s
ludo@ribbon ~/src/guix$ time guix environment --ad-hoc --search-paths r-learnr
export PATH="/gnu/store/pwzrpzdbli2561f2yq70kljywn93h5w6-profile/bin${PATH:+:}$PATH"

real 0m3.820s
user 0m4.954s
sys 0m0.105s

Ludo’.
Closed
L
L
Ludovic Courtès wrote on 23 Oct 2020 16:52
control message for bug #41702
(address . control@debbugs.gnu.org)
871rhpngf9.fsf@gnu.org
reopen 41702
tags 41702 - fixed patch
quit
L
L
Ludovic Courtès wrote on 27 May 2021 18:32
(address . control@debbugs.gnu.org)
87wnrkjfwv.fsf@gnu.org
merge 41702 48005
quit
?