[PATCH 0/3] Improve store caching; improve 'map/accumulate-builds' performance

DoneSubmitted by Ludovic Courtès.
Details
One participant
  • Ludovic Courtès
Owner
unassigned
Severity
normal
L
L
Ludovic Courtès wrote on 13 May 16:59 +0200
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513145947.11951-1-ludo@gnu.org
Hello!

The first patch improves cache handling, providing a generic
way to trace caches. The second patch adds a separate package/graft
cache, as was previously suggested in a comment. The cache hit rate
can be seen by setting GUIX_PROFILING:

Toggle snippet (14 lines)
$ time GUIX_PROFILING=package-graft-cache ./pre-inst-env guix home build -v1 -n ~/src/configuration/home-config.scm
28.5 MB would be downloaded

Package Graft Cache:
fresh caches: 1
lookups: 794
hits: 784 (98.7%)
cache size: 10 entries

real 0m7.953s
user 0m9.091s
sys 0m0.291s

The last patch improves performance in the presence of grafts by trimming
exploration of the call tree in ‘map/accumulate-builds’. This is very
much a heuristic, but it does help significantly for ‘guix system’,
‘guix home’, or ‘guix package’ with many packages.

Thoughts?

Ludo’.

Ludovic Courtès (3):
store: 'mcached' users can specify a cache ID.
packages: Use separate package/graft cache.
store: Use a decaying cutoff in 'map/accumulate-builds'.

guix/packages.scm | 12 ++++--
guix/store.scm | 104 +++++++++++++++++++++++++++++++++-------------
2 files changed, 84 insertions(+), 32 deletions(-)


base-commit: 0b4300d4fd8c972f0cb9d6751fc824b9a065b780
--
2.36.0
L
L
Ludovic Courtès wrote on 13 May 17:00 +0200
[PATCH 1/3] store: 'mcached' users can specify a cache ID.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-1-ludo@gnu.org
Users of 'mcached' can now specify a cache ID; furthermore, the cache
hit rate is automatically recorded for all the caches accessed with
'mcached'.

* guix/store.scm (%max-store-connection-caches)
(%store-connection-cache-names): New variables.
(recorder-for-cache): New procedure.
(record-cache-lookup!): Add 'cache-id' parameter and rewrite in terms of
'recorder-for-cache'.
(lookup-cached-object): Add 'cache-id' parameter and honor it.
(%mcached): Add #:cache parameter and honor it.
(mcached): Add '=>' keyword and corresponding clauses.
---
guix/store.scm | 65 ++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 53 insertions(+), 12 deletions(-)

Toggle diff (138 lines)
diff --git a/guix/store.scm b/guix/store.scm
index 1d176fb99d..220901f6ce 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012-2022 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2018 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;; Copyright © 2019, 2020 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2020 Florian Pelz <pelzflorian@pelzflorian.de>
@@ -1793,6 +1793,14 @@ (define-operation (clear-failed-paths (store-path-list items))
 ;; the 'caches' vector of <store-connection>.
 (define %store-connection-caches (make-atomic-box 0))
 
+(define %max-store-connection-caches
+  ;; Maximum number of caches returned by 'allocate-store-connection-cache'.
+  32)
+
+(define %store-connection-cache-names
+  ;; Mapping of cache ID to symbol.
+  (make-vector %max-store-connection-caches))
+
 (define (allocate-store-connection-cache name)
   "Allocate a new cache for store connections and return its identifier.  Said
 identifier can be passed as an argument to "
@@ -1800,7 +1808,9 @@ (define (allocate-store-connection-cache name)
     (let ((previous (atomic-box-compare-and-swap! %store-connection-caches
                                                   current (+ current 1))))
       (if (= previous current)
-          current
+          (begin
+            (vector-set! %store-connection-cache-names current name)
+            current)
           (loop current)))))
 
 (define %object-cache-id
@@ -1926,16 +1936,37 @@ (define (cache-lookup-recorder component title)
       (lambda (x y)
         #t)))
 
-(define record-cache-lookup!
-  (cache-lookup-recorder "object-cache" "Store object cache"))
+(define recorder-for-cache
+  (let ((recorders (make-vector %max-store-connection-caches)))
+    (lambda (cache-id)
+      "Return a procedure to record lookup stats for CACHE-ID."
+      (match (vector-ref recorders cache-id)
+        ((? unspecified?)
+         (let* ((name (symbol->string
+                       (vector-ref %store-connection-cache-names cache-id)))
+                (description
+                 (string-titlecase
+                  (string-map (match-lambda
+                                (#\- #\space)
+                                (chr chr))
+                              name))))
+           (let ((proc (cache-lookup-recorder name description)))
+             (vector-set! recorders cache-id proc)
+             proc)))
+        (proc proc)))))
 
-(define-inlinable (lookup-cached-object object keys vhash-fold*)
-  "Return the cached object in the store connection corresponding to OBJECT
+(define (record-cache-lookup! cache-id value cache)
+  "Record the lookup of VALUE in CACHE-ID, whose current value is CACHE."
+  (let ((record! (recorder-for-cache cache-id)))
+    (record! value cache)))
+
+(define-inlinable (lookup-cached-object cache-id object keys vhash-fold*)
+  "Return the object in store cache CACHE-ID corresponding to OBJECT
 and KEYS; use VHASH-FOLD* to look for OBJECT in the cache.  KEYS is a list of
 additional keys to match against, and which are compared with 'equal?'.
 Return #f on failure and the cached result otherwise."
   (lambda (store)
-    (let* ((cache (store-connection-cache store %object-cache-id))
+    (let* ((cache (store-connection-cache store cache-id))
 
            ;; Escape as soon as we find the result.  This avoids traversing
            ;; the whole vlist chain and significantly reduces the number of
@@ -1949,40 +1980,50 @@ (define-inlinable (lookup-cached-object object keys vhash-fold*)
                                           result))))
                                  #f object
                                  cache))))
-      (record-cache-lookup! value cache)
+      (record-cache-lookup! cache-id value cache)
       (values value store))))
 
 (define* (%mcached mthunk object #:optional (keys '())
                    #:key
+                   (cache %object-cache-id)
                    (vhash-cons vhash-consq)
                    (vhash-fold* vhash-foldq*))
   "Bind the monadic value returned by MTHUNK, which supposedly corresponds to
 OBJECT/KEYS, or return its cached value.  Use VHASH-CONS to insert OBJECT into
 the cache, and VHASH-FOLD* to look it up."
-  (mlet %store-monad ((cached (lookup-cached-object object keys
+  (mlet %store-monad ((cached (lookup-cached-object cache object keys
                                                     vhash-fold*)))
     (if cached
         (return cached)
         (>>= (mthunk)
              (lambda (result)
                (cache-object-mapping object keys result
+                                     #:cache cache
                                      #:vhash-cons vhash-cons))))))
 
 (define-syntax mcached
-  (syntax-rules (eq? equal?)
+  (syntax-rules (eq? equal? =>)
     "Run MVALUE, which corresponds to OBJECT/KEYS, and cache it; or return the
 value associated with OBJECT/KEYS in the store's object cache if there is
 one."
-    ((_ eq? mvalue object keys ...)
+    ((_ eq? (=> cache) mvalue object keys ...)
      (%mcached (lambda () mvalue)
                object (list keys ...)
+               #:cache cache
                #:vhash-cons vhash-consq
                #:vhash-fold* vhash-foldq*))
-    ((_ equal? mvalue object keys ...)
+    ((_ equal? (=> cache) mvalue object keys ...)
      (%mcached (lambda () mvalue)
                object (list keys ...)
+               #:cache cache
                #:vhash-cons vhash-cons
                #:vhash-fold* vhash-fold*))
+    ((_ eq? mvalue object keys ...)
+     (mcached eq? (=> %object-cache-id)
+              mvalue object keys ...))
+    ((_ equal? mvalue object keys ...)
+     (mcached equal? (=> %object-cache-id)
+              mvalue object keys ...))
     ((_ mvalue object keys ...)
      (mcached eq? mvalue object keys ...))))
 
-- 
2.36.0
L
L
Ludovic Courtès wrote on 13 May 17:00 +0200
[PATCH 2/3] packages: Use separate package/graft cache.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-2-ludo@gnu.org
* guix/packages.scm (%package-graft-cache): New variable.
(input-graft): Add (=> %package-graft-cache).
---
guix/packages.scm | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

Toggle diff (39 lines)
diff --git a/guix/packages.scm b/guix/packages.scm
index a79b36d03d..7ee65e9b6b 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -1618,6 +1618,11 @@ (define* (package->bag package #:optional
                                  (&package-error
                                   (package package))))))))))))
 
+(define %package-graft-cache
+  ;; Cache mapping <package> records to <graft> records, for packages that
+  ;; have a replacement.
+  (allocate-store-connection-cache 'package-graft-cache))
+
 (define (input-graft system)
   "Return a monadic procedure that, given a package with a graft, returns a
 graft, and #f otherwise."
@@ -1626,9 +1631,8 @@ (define (input-graft system)
       (((? package? package) output)
        (let ((replacement (package-replacement package)))
          (if replacement
-             ;; XXX: We should use a separate cache instead of abusing the
-             ;; object cache.
-             (mcached (mlet %store-monad ((orig (package->derivation package system
+             (mcached eq? (=> %package-graft-cache)
+                      (mlet %store-monad ((orig (package->derivation package system
                                                                      #:graft? #f))
                                           (new  (package->derivation replacement system
                                                                      #:graft? #t)))
@@ -1637,7 +1641,7 @@ (define (input-graft system)
                                   (origin-output output)
                                   (replacement new)
                                   (replacement-output output))))
-                      package 'graft output system)
+                      package output system)
              (return #f))))
       (_
        (return #f)))))
-- 
2.36.0
L
L
Ludovic Courtès wrote on 13 May 17:00 +0200
[PATCH 3/3] store: Use a decaying cutoff in 'map/accumulate-builds'.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-3-ludo@gnu.org
This reduces the wall-clock time of:

./pre-inst-env guix system vm gnu/system/examples/desktop.tmpl -n

from 2m13s to 53s (the timings depend on which derivations have already
been built and are in store; in this case, many were missing).

* guix/store.scm (default-cutoff): New variable.
(map/accumulate-builds): Use it. Parameterize it in recursive calls to
have decaying cutoff.
---
guix/store.scm | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)

Toggle diff (60 lines)
diff --git a/guix/store.scm b/guix/store.scm
index 220901f6ce..a3240eb2e0 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1362,8 +1362,12 @@ (define (build-accumulator expected-store)
         (unresolved things continue)
         (continue #t))))
 
+(define default-cutoff
+  ;; Default cutoff parameter for 'map/accumulate-builds'.
+  (make-parameter 32))
+
 (define* (map/accumulate-builds store proc lst
-                                #:key (cutoff 30))
+                                #:key (cutoff (default-cutoff)))
   "Apply PROC over each element of LST, accumulating 'build-things' calls and
 coalescing them into a single call.
 
@@ -1377,21 +1381,24 @@ (define accumulator
     (build-accumulator store))
 
   (define-values (result rest)
-    (let loop ((lst lst)
-               (result '())
-               (unresolved 0))
-      (match lst
-        ((head . tail)
-         (match (with-build-handler accumulator
-                  (proc head))
-           ((? unresolved? obj)
-            (if (>= unresolved cutoff)
-                (values (reverse (cons obj result)) tail)
-                (loop tail (cons obj result) (+ 1 unresolved))))
-           (obj
-            (loop tail (cons obj result) unresolved))))
-        (()
-         (values (reverse result) lst)))))
+    ;; Have the default cutoff decay as we go deeper in the call stack to
+    ;; avoid pessimal behavior.
+    (parameterize ((default-cutoff (quotient cutoff 2)))
+      (let loop ((lst lst)
+                 (result '())
+                 (unresolved 0))
+        (match lst
+          ((head . tail)
+           (match (with-build-handler accumulator
+                    (proc head))
+             ((? unresolved? obj)
+              (if (>= unresolved cutoff)
+                  (values (reverse (cons obj result)) tail)
+                  (loop tail (cons obj result) (+ 1 unresolved))))
+             (obj
+              (loop tail (cons obj result) unresolved))))
+          (()
+           (values (reverse result) lst))))))
 
   (match (append-map (lambda (obj)
                        (if (unresolved? obj)
-- 
2.36.0
L
L
Ludovic Courtès wrote on 19 May 00:10 +0200
Re: bug#55398: [PATCH 0/3] Improve store caching; improve 'map/accumulate-builds' performance
(address . 55398-done@debbugs.gnu.org)
875ym21ohh.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> store: 'mcached' users can specify a cache ID.
> packages: Use separate package/graft cache.
> store: Use a decaying cutoff in 'map/accumulate-builds'.

Pushed as 2f170893719e6e9fc8e19cc5f0568e20a95d92b4.

Ludo’.
Closed
?