[PATCH 0/3] Improve store caching; improve 'map/accumulate-builds' performance

  • Done
  • quality assurance status badge
Details
One participant
  • Ludovic Courtès
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
L
L
Ludovic Courtès wrote on 13 May 2022 16:59
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513145947.11951-1-ludo@gnu.org
Hello!

The first patch improves cache handling, providing a generic
way to trace caches. The second patch adds a separate package/graft
cache, as was previously suggested in a comment. The cache hit rate
can be seen by setting GUIX_PROFILING:

Toggle snippet (14 lines)
$ time GUIX_PROFILING=package-graft-cache ./pre-inst-env guix home build -v1 -n ~/src/configuration/home-config.scm
28.5 MB would be downloaded

Package Graft Cache:
fresh caches: 1
lookups: 794
hits: 784 (98.7%)
cache size: 10 entries

real 0m7.953s
user 0m9.091s
sys 0m0.291s

The last patch improves performance in the presence of grafts by trimming
exploration of the call tree in ‘map/accumulate-builds’. This is very
much a heuristic, but it does help significantly for ‘guix system’,
‘guix home’, or ‘guix package’ with many packages.

Thoughts?

Ludo’.

Ludovic Courtès (3):
store: 'mcached' users can specify a cache ID.
packages: Use separate package/graft cache.
store: Use a decaying cutoff in 'map/accumulate-builds'.

guix/packages.scm | 12 ++++--
guix/store.scm | 104 +++++++++++++++++++++++++++++++++-------------
2 files changed, 84 insertions(+), 32 deletions(-)


base-commit: 0b4300d4fd8c972f0cb9d6751fc824b9a065b780
--
2.36.0
L
L
Ludovic Courtès wrote on 13 May 2022 17:00
[PATCH 1/3] store: 'mcached' users can specify a cache ID.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-1-ludo@gnu.org
Users of 'mcached' can now specify a cache ID; furthermore, the cache
hit rate is automatically recorded for all the caches accessed with
'mcached'.

* guix/store.scm (%max-store-connection-caches)
(%store-connection-cache-names): New variables.
(recorder-for-cache): New procedure.
(record-cache-lookup!): Add 'cache-id' parameter and rewrite in terms of
'recorder-for-cache'.
(lookup-cached-object): Add 'cache-id' parameter and honor it.
(%mcached): Add #:cache parameter and honor it.
(mcached): Add '=>' keyword and corresponding clauses.
---
guix/store.scm | 65 ++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 53 insertions(+), 12 deletions(-)

Toggle diff (138 lines)
diff --git a/guix/store.scm b/guix/store.scm
index 1d176fb99d..220901f6ce 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012-2022 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2018 Jan Nieuwenhuizen <janneke@gnu.org>
;;; Copyright © 2019, 2020 Mathieu Othacehe <m.othacehe@gmail.com>
;;; Copyright © 2020 Florian Pelz <pelzflorian@pelzflorian.de>
@@ -1793,6 +1793,14 @@ (define-operation (clear-failed-paths (store-path-list items))
;; the 'caches' vector of <store-connection>.
(define %store-connection-caches (make-atomic-box 0))
+(define %max-store-connection-caches
+ ;; Maximum number of caches returned by 'allocate-store-connection-cache'.
+ 32)
+
+(define %store-connection-cache-names
+ ;; Mapping of cache ID to symbol.
+ (make-vector %max-store-connection-caches))
+
(define (allocate-store-connection-cache name)
"Allocate a new cache for store connections and return its identifier. Said
identifier can be passed as an argument to "
@@ -1800,7 +1808,9 @@ (define (allocate-store-connection-cache name)
(let ((previous (atomic-box-compare-and-swap! %store-connection-caches
current (+ current 1))))
(if (= previous current)
- current
+ (begin
+ (vector-set! %store-connection-cache-names current name)
+ current)
(loop current)))))
(define %object-cache-id
@@ -1926,16 +1936,37 @@ (define (cache-lookup-recorder component title)
(lambda (x y)
#t)))
-(define record-cache-lookup!
- (cache-lookup-recorder "object-cache" "Store object cache"))
+(define recorder-for-cache
+ (let ((recorders (make-vector %max-store-connection-caches)))
+ (lambda (cache-id)
+ "Return a procedure to record lookup stats for CACHE-ID."
+ (match (vector-ref recorders cache-id)
+ ((? unspecified?)
+ (let* ((name (symbol->string
+ (vector-ref %store-connection-cache-names cache-id)))
+ (description
+ (string-titlecase
+ (string-map (match-lambda
+ (#\- #\space)
+ (chr chr))
+ name))))
+ (let ((proc (cache-lookup-recorder name description)))
+ (vector-set! recorders cache-id proc)
+ proc)))
+ (proc proc)))))
-(define-inlinable (lookup-cached-object object keys vhash-fold*)
- "Return the cached object in the store connection corresponding to OBJECT
+(define (record-cache-lookup! cache-id value cache)
+ "Record the lookup of VALUE in CACHE-ID, whose current value is CACHE."
+ (let ((record! (recorder-for-cache cache-id)))
+ (record! value cache)))
+
+(define-inlinable (lookup-cached-object cache-id object keys vhash-fold*)
+ "Return the object in store cache CACHE-ID corresponding to OBJECT
and KEYS; use VHASH-FOLD* to look for OBJECT in the cache. KEYS is a list of
additional keys to match against, and which are compared with 'equal?'.
Return #f on failure and the cached result otherwise."
(lambda (store)
- (let* ((cache (store-connection-cache store %object-cache-id))
+ (let* ((cache (store-connection-cache store cache-id))
;; Escape as soon as we find the result. This avoids traversing
;; the whole vlist chain and significantly reduces the number of
@@ -1949,40 +1980,50 @@ (define-inlinable (lookup-cached-object object keys vhash-fold*)
result))))
#f object
cache))))
- (record-cache-lookup! value cache)
+ (record-cache-lookup! cache-id value cache)
(values value store))))
(define* (%mcached mthunk object #:optional (keys '())
#:key
+ (cache %object-cache-id)
(vhash-cons vhash-consq)
(vhash-fold* vhash-foldq*))
"Bind the monadic value returned by MTHUNK, which supposedly corresponds to
OBJECT/KEYS, or return its cached value. Use VHASH-CONS to insert OBJECT into
the cache, and VHASH-FOLD* to look it up."
- (mlet %store-monad ((cached (lookup-cached-object object keys
+ (mlet %store-monad ((cached (lookup-cached-object cache object keys
vhash-fold*)))
(if cached
(return cached)
(>>= (mthunk)
(lambda (result)
(cache-object-mapping object keys result
+ #:cache cache
#:vhash-cons vhash-cons))))))
(define-syntax mcached
- (syntax-rules (eq? equal?)
+ (syntax-rules (eq? equal? =>)
"Run MVALUE, which corresponds to OBJECT/KEYS, and cache it; or return the
value associated with OBJECT/KEYS in the store's object cache if there is
one."
- ((_ eq? mvalue object keys ...)
+ ((_ eq? (=> cache) mvalue object keys ...)
(%mcached (lambda () mvalue)
object (list keys ...)
+ #:cache cache
#:vhash-cons vhash-consq
#:vhash-fold* vhash-foldq*))
- ((_ equal? mvalue object keys ...)
+ ((_ equal? (=> cache) mvalue object keys ...)
(%mcached (lambda () mvalue)
object (list keys ...)
+ #:cache cache
#:vhash-cons vhash-cons
#:vhash-fold* vhash-fold*))
+ ((_ eq? mvalue object keys ...)
+ (mcached eq? (=> %object-cache-id)
+ mvalue object keys ...))
+ ((_ equal? mvalue object keys ...)
+ (mcached equal? (=> %object-cache-id)
+ mvalue object keys ...))
((_ mvalue object keys ...)
(mcached eq? mvalue object keys ...))))
--
2.36.0
L
L
Ludovic Courtès wrote on 13 May 2022 17:00
[PATCH 2/3] packages: Use separate package/graft cache.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-2-ludo@gnu.org
* guix/packages.scm (%package-graft-cache): New variable.
(input-graft): Add (=> %package-graft-cache).
---
guix/packages.scm | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

Toggle diff (39 lines)
diff --git a/guix/packages.scm b/guix/packages.scm
index a79b36d03d..7ee65e9b6b 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -1618,6 +1618,11 @@ (define* (package->bag package #:optional
(&package-error
(package package))))))))))))
+(define %package-graft-cache
+ ;; Cache mapping <package> records to <graft> records, for packages that
+ ;; have a replacement.
+ (allocate-store-connection-cache 'package-graft-cache))
+
(define (input-graft system)
"Return a monadic procedure that, given a package with a graft, returns a
graft, and #f otherwise."
@@ -1626,9 +1631,8 @@ (define (input-graft system)
(((? package? package) output)
(let ((replacement (package-replacement package)))
(if replacement
- ;; XXX: We should use a separate cache instead of abusing the
- ;; object cache.
- (mcached (mlet %store-monad ((orig (package->derivation package system
+ (mcached eq? (=> %package-graft-cache)
+ (mlet %store-monad ((orig (package->derivation package system
#:graft? #f))
(new (package->derivation replacement system
#:graft? #t)))
@@ -1637,7 +1641,7 @@ (define (input-graft system)
(origin-output output)
(replacement new)
(replacement-output output))))
- package 'graft output system)
+ package output system)
(return #f))))
(_
(return #f)))))
--
2.36.0
L
L
Ludovic Courtès wrote on 13 May 2022 17:00
[PATCH 3/3] store: Use a decaying cutoff in 'map/accumulate-builds'.
(address . 55398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20220513150044.11991-3-ludo@gnu.org
This reduces the wall-clock time of:

./pre-inst-env guix system vm gnu/system/examples/desktop.tmpl -n

from 2m13s to 53s (the timings depend on which derivations have already
been built and are in store; in this case, many were missing).

* guix/store.scm (default-cutoff): New variable.
(map/accumulate-builds): Use it. Parameterize it in recursive calls to
have decaying cutoff.
---
guix/store.scm | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)

Toggle diff (60 lines)
diff --git a/guix/store.scm b/guix/store.scm
index 220901f6ce..a3240eb2e0 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1362,8 +1362,12 @@ (define (build-accumulator expected-store)
(unresolved things continue)
(continue #t))))
+(define default-cutoff
+ ;; Default cutoff parameter for 'map/accumulate-builds'.
+ (make-parameter 32))
+
(define* (map/accumulate-builds store proc lst
- #:key (cutoff 30))
+ #:key (cutoff (default-cutoff)))
"Apply PROC over each element of LST, accumulating 'build-things' calls and
coalescing them into a single call.
@@ -1377,21 +1381,24 @@ (define accumulator
(build-accumulator store))
(define-values (result rest)
- (let loop ((lst lst)
- (result '())
- (unresolved 0))
- (match lst
- ((head . tail)
- (match (with-build-handler accumulator
- (proc head))
- ((? unresolved? obj)
- (if (>= unresolved cutoff)
- (values (reverse (cons obj result)) tail)
- (loop tail (cons obj result) (+ 1 unresolved))))
- (obj
- (loop tail (cons obj result) unresolved))))
- (()
- (values (reverse result) lst)))))
+ ;; Have the default cutoff decay as we go deeper in the call stack to
+ ;; avoid pessimal behavior.
+ (parameterize ((default-cutoff (quotient cutoff 2)))
+ (let loop ((lst lst)
+ (result '())
+ (unresolved 0))
+ (match lst
+ ((head . tail)
+ (match (with-build-handler accumulator
+ (proc head))
+ ((? unresolved? obj)
+ (if (>= unresolved cutoff)
+ (values (reverse (cons obj result)) tail)
+ (loop tail (cons obj result) (+ 1 unresolved))))
+ (obj
+ (loop tail (cons obj result) unresolved))))
+ (()
+ (values (reverse result) lst))))))
(match (append-map (lambda (obj)
(if (unresolved? obj)
--
2.36.0
L
L
Ludovic Courtès wrote on 19 May 2022 00:10
Re: bug#55398: [PATCH 0/3] Improve store caching; improve 'map/accumulate-builds' performance
(address . 55398-done@debbugs.gnu.org)
875ym21ohh.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> store: 'mcached' users can specify a cache ID.
> packages: Use separate package/graft cache.
> store: Use a decaying cutoff in 'map/accumulate-builds'.

Pushed as 2f170893719e6e9fc8e19cc5f0568e20a95d92b4.

Ludo’.
Closed
?