Closure copy in ‘guix system init’ is inefficient

DoneSubmitted by Ludovic Courtès.
Details
3 participants
  • Jonathan Brielmaier
  • Ludovic Courtès
  • raingloom
Owner
unassigned
Severity
important
L
L
Ludovic Courtès wrote on 20 Nov 2020 12:02
Closure copy in ‘guix system init ’ is inefficient
(address . bug-guix@gnu.org)
87h7pkffzy.fsf@inria.fr
‘guix system init’ ends by copying the system’s closure from the “host”store to the target store; it also initializes the database of thattarget store.
That copy is inefficient for several reasons. Let’s pick one file,shred.1.gz, that ends up being copied, and let’s look at its occurrencesin the strace log of ‘guix system init config.scm /tmp/os’:
Toggle snippet (31 lines)$ grep -A2 '/shred.1.gz' ,,slstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 15fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 8192) = 1490write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490--utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=1605721025, tv_nsec=616985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /* 1970-01-01T01:00:01+0100 */], 0) = 0lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", O_RDONLY) = 15--lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0--lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17lseek(17, 0, SEEK_CUR) = 0read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490--lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17lseek(17, 0, SEEK_CUR) = 0read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490--link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl") = 0lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) = 17
First, /tmp/os/…/shred.1.gz is read entirely twice: once in‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and asecond time for deduplication (the ‘deduplicate’ call in there.)
The ‘nar-sha256’ call could be avoided because the database of/gnu/store contains that value. As for deduplication, we could perhapscreate those ‘.links’ entries as we copy files instead of re-traversingthe whole thing afterwards.
Second, all of /tmp/os is traversed to reset timestamps, although wecould have cleared those timestamps when we created those files in thefirst place (https://issues.guix.gnu.org/44741 prevents that though,unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)
Third, in the case of the installer, we’re really copying from/mnt/guix-inst/store to /mnt/gnu/store, which is likely the samedevice. In this case we could create hard links instead of actuallycopying files.
Fourth, we’re adding items one by one in the target store database, butit may be more efficient to more or less dump the subset of the sourcedatabase in bulk.
Surely we can do better.
Ludo’.
L
L
Ludovic Courtès wrote on 21 Nov 2020 12:01
control message for bug #44760
(address . control@debbugs.gnu.org)
87o8jrc6sa.fsf@gnu.org
severity 44760 importantquit
R
R
raingloom wrote on 22 Nov 2020 20:46
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44760@debbugs.gnu.org)
20201122204634.2730df12@riseup.net
On Fri, 20 Nov 2020 12:02:25 +0100Ludovic Courtès <ludo@gnu.org> wrote:
Toggle quote (87 lines)> ‘guix system init’ ends by copying the system’s closure from the> “host” store to the target store; it also initializes the database of> that target store.> > That copy is inefficient for several reasons. Let’s pick one file,> shred.1.gz, that ends up being copied, and let’s look at its> occurrences in the strace log of ‘guix system init config.scm> /tmp/os’:> > --8<---------------cut here---------------start------------->8---> $ grep -A2 '/shred.1.gz' ,,s> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> O_RDONLY) = 15 fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) => 0 openat(AT_FDCWD,> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16 read(15,> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,> 8192) = 1490 write(16,> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,> 1490) = 1490 -- utimensat(AT_FDCWD,> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> [{tv_sec=1605721025, tv_nsec=616985411} /*> 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /*> 1970-01-01T01:00:01+0100 */], 0) = 0> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 openat(AT_FDCWD,> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",> O_RDONLY) = 15 --> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 --> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR) = 0 read(17,> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,> 1490) = 1490 --> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR) = 0 read(17,> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,> 1490) = 1490 --> link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",> "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl")> = 0> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0 openat(AT_FDCWD,> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",> O_RDONLY) = 17 --8<---------------cut> here---------------end--------------->8---> > First, /tmp/os/…/shred.1.gz is read entirely twice: once in> ‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a> second time for deduplication (the ‘deduplicate’ call in there.)> > The ‘nar-sha256’ call could be avoided because the database of> /gnu/store contains that value. As for deduplication, we could> perhaps create those ‘.links’ entries as we copy files instead of> re-traversing the whole thing afterwards.> > Second, all of /tmp/os is traversed to reset timestamps, although we> could have cleared those timestamps when we created those files in the> first place (<https://issues.guix.gnu.org/44741> prevents that though,> unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)> > Third, in the case of the installer, we’re really copying from> /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same> device. In this case we could create hard links instead of actually> copying files.> > Fourth, we’re adding items one by one in the target store database,> but it may be more efficient to more or less dump the subset of the> source database in bulk.> > Surely we can do better.> > Ludo’.> > >
Also, if a store is already present (eg.: because of a partialinstall), it could make sense to (optionally) keep its contents. AFAIKthis is still not possible. It was one the bigger time sinks while Iwas working on the F2FS support.
L
L
Ludovic Courtès wrote on 22 Nov 2020 22:10
(name . raingloom)(address . raingloom@riseup.net)(address . 44760@debbugs.gnu.org)
87v9dxkshz.fsf@gnu.org
Hi,
raingloom <raingloom@riseup.net> skribis:
Toggle quote (5 lines)> Also, if a store is already present (eg.: because of a partial> install), it could make sense to (optionally) keep its contents. AFAIK> this is still not possible. It was one the bigger time sinks while I> was working on the F2FS support.
It’s on purpose that ‘guix system init’ forcefully wipes store itemsalready present before copying them; some time ago it was not the caseand that led to problems:
https://issues.guix.gnu.org/20722
I think it’s safer to keep it unchanged.
Ludo’.
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 00/15] Speed up 'guix system init' & co.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-1-ludo@gnu.org
Hi there!
Here’s a long and rather boring patch series to addresshttps://issues.guix.gnu.org/44760 and a bit more.
To avoid traversing store items repeatedly as described in theissue above, the strategy here is to gradually move thereset-timestamps and deduplicate phases as part of the filecopying process, such that each file is accessed only once.Consequently, the kitchen sink that ‘register-items’ once wasis now very focused.
Furthermore, it changes ‘guix system init’ so that it reusesthe already-known store item hashes when populating the targetdatabase instead of re-traversing store items.
On my laptop (SSD, warm cache, derivations already built), thecommand:
guix system init gnu/system/examples/bare-bones.tmpl /tmp/sys
goes from 32s to 22s, a 33% improvement.
Feedback welcome!
Ludo’.
Ludovic Courtès (15): serialization: 'fold-archive' notifies about directory processing completion. serialization: 'restore-file' sets canonical timestamp and permissions. nar: Deduplicate files right as they are restored. store-copy: 'populate-store' resets timestamps. image: 'register-closure' assumes already-reset timestamps. database: Remove #:reset-timestamps? from 'register-items'. store-copy: 'populate-store' can optionally deduplicate files. image: 'register-closure' leaves it up to the caller to deduplicate. database: Remove #:deduplicate? from 'register-items'. guix system: 'init' copies, resets timestamps, and deduplicates at once. database: Remove #:deduplicate? and #:reset-timestamps? from 'register-path'. system: 'init' does not recompute the hash of each store item. database: Remove 'register-path'. database: Honor 'SOURCE_DATE_EPOCH'. deduplicate: Create the '.links' directory lazily.
.dir-locals.el | 1 + gnu/build/image.scm | 16 +- gnu/build/install.scm | 3 +- gnu/build/linux-initrd.scm | 3 +- gnu/build/vm.scm | 14 +- gnu/system/install.scm | 12 +- gnu/system/linux-initrd.scm | 10 +- guix/build/store-copy.scm | 133 ++++++++++++---- guix/nar.scm | 8 +- guix/scripts/archive.scm | 2 + guix/scripts/challenge.scm | 1 + guix/scripts/pack.scm | 276 +++++++++++++++++----------------- guix/scripts/system.scm | 64 ++++---- guix/serialization.scm | 36 +++-- guix/store/database.scm | 58 ++----- guix/store/deduplication.scm | 167 ++++++++++++++------ tests/gexp.scm | 20 ++- tests/guix-archive.sh | 4 +- tests/nar.scm | 21 ++- tests/store-database.scm | 18 ++- tests/store-deduplication.scm | 20 ++- 21 files changed, 544 insertions(+), 343 deletions(-)
-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 01/15] serialization: 'fold-archive' notifies about directory processing completion.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-2-ludo@gnu.org
* guix/serialization.scm (fold-archive): Call PROC with a'directory-complete tag when done with a directory.(restore-file): Handle it.* guix/scripts/archive.scm (list-contents): Likewise.* guix/scripts/challenge.scm (archive-contents): Likewise.* tests/nar.scm ("write-file-tree + fold-archive"): Adjust accordingly.--- guix/scripts/archive.scm | 2 ++ guix/scripts/challenge.scm | 1 + guix/serialization.scm | 5 ++++- tests/nar.scm | 6 ++++-- 4 files changed, 11 insertions(+), 3 deletions(-)
Toggle diff (73 lines)diff --git a/guix/scripts/archive.scm b/guix/scripts/archive.scmindex c04baf9784..1f73fff711 100644--- a/guix/scripts/archive.scm+++ b/guix/scripts/archive.scm@@ -347,6 +347,8 @@ output port." (match type ('directory (format #t "D ~a~%" file))+ ('directory-complete+ #t) ('symlink (format #t "S ~a -> ~a~%" file content)) ((or 'regular 'executable)diff --git a/guix/scripts/challenge.scm b/guix/scripts/challenge.scmindex 39bd2c1c0f..d0a456ac1d 100644--- a/guix/scripts/challenge.scm+++ b/guix/scripts/challenge.scm@@ -210,6 +210,7 @@ taken since we do not import the archives." (cons `(,file ,type ,(port-sha256* port size)) result)))) ('directory result)+ ('directory-complete result) ('symlink (cons `(,file ,type ,contents) result)))) '()diff --git a/guix/serialization.scm b/guix/serialization.scmindex 836ad06caf..cc56134ef4 100644--- a/guix/serialization.scm+++ b/guix/serialization.scm@@ -444,7 +444,8 @@ depends on TYPE." (file file) (token x)))))) (loop (read-string port) result)))))- (")" result) ;done with DIR+ (")" ;done with DIR+ (proc file 'directory-complete #f result)) (x (raise (condition@@ -463,6 +464,8 @@ Restore it as FILE." (match type ('directory (mkdir file))+ ('directory-complete+ #t) ('symlink (symlink content file)) ((or 'regular 'executable)diff --git a/tests/nar.scm b/tests/nar.scmindex aeff3d3330..b542ebd47c 100644--- a/tests/nar.scm+++ b/tests/nar.scm@@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org> ;;; ;;; This file is part of GNU Guix. ;;;@@ -218,8 +218,10 @@ '(("R" directory #f) ("R/dir" directory #f) ("R/dir/exe" executable "1234")+ ("R/dir" directory-complete #f) ("R/foo" regular "abcdefg")- ("R/lnk" symlink "foo"))+ ("R/lnk" symlink "foo")+ ("R" directory-complete #f)) (let () (define-values (port get-bytevector)-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 02/15] serialization: 'restore-file' sets canonical timestamp and permissions.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-3-ludo@gnu.org
* guix/serialization.scm (restore-file): Set the permissions and mtimeof FILE.* guix/nar.scm (finalize-store-file): Pass #:reset-timestamps? #f to'register-items'.* tests/nar.scm (rm-rf): Add 'chmod' calls to ensure files are writable.("write-file + restore-file with symlinks"): Ensure every file in OUTPUTpasses 'canonical-file?'.* tests/guix-archive.sh: Run "chmod -R +w" before "rm -rf".--- guix/nar.scm | 8 +++++--- guix/serialization.scm | 14 +++++++++----- tests/guix-archive.sh | 4 ++-- tests/nar.scm | 12 ++++++++++-- 4 files changed, 26 insertions(+), 12 deletions(-)
Toggle diff (110 lines)diff --git a/guix/nar.scm b/guix/nar.scmindex a23af2e5de..edfcc9aab5 100644--- a/guix/nar.scm+++ b/guix/nar.scm@@ -114,10 +114,12 @@ held." ;; Install the new TARGET. (rename-file source target) - ;; Register TARGET. As a side effect, it resets the timestamps of all- ;; its files, recursively, and runs a deduplication pass.+ ;; Register TARGET. As a side effect, run a deduplication pass.+ ;; Timestamps and permissions are already correct thanks to+ ;; 'restore-file'. (register-items db- (list (store-info target deriver references))))+ (list (store-info target deriver references))+ #:reset-timestamps? #f)) (when lock? (delete-file (string-append target ".lock"))diff --git a/guix/serialization.scm b/guix/serialization.scmindex cc56134ef4..677ca60b66 100644--- a/guix/serialization.scm+++ b/guix/serialization.scm@@ -459,23 +459,27 @@ depends on TYPE." (define (restore-file port file) "Read a file (possibly a directory structure) in Nar format from PORT.-Restore it as FILE."+Restore it as FILE with canonical permissions and timestamps." (fold-archive (lambda (file type content result) (match type ('directory (mkdir file)) ('directory-complete- #t)+ (chmod file #o555)+ (utime file 1 1 0 0)) ('symlink- (symlink content file))+ (symlink content file)+ (utime file 1 1 0 0 AT_SYMLINK_NOFOLLOW)) ((or 'regular 'executable) (match content ((input . size) (call-with-output-file file (lambda (output) (dump input output size)- (when (eq? type 'executable)- (chmod output #o755)))))))))+ (chmod output (if (eq? type 'executable)+ #o555+ #o444))))+ (utime file 1 1 0 0)))))) #t port file))diff --git a/tests/guix-archive.sh b/tests/guix-archive.shindex e796c62f9a..00b87ff0ac 100644--- a/tests/guix-archive.sh+++ b/tests/guix-archive.sh@@ -1,5 +1,5 @@ # GNU Guix --- Functional package management for GNU-# Copyright © 2013, 2014, 2015, 2019 Ludovic Courtès <ludo@gnu.org>+# Copyright © 2013, 2014, 2015, 2019, 2020 Ludovic Courtès <ludo@gnu.org> # # This file is part of GNU Guix. #@@ -28,7 +28,7 @@ tmpdir="t-archive-dir-$$" rm -f "$archive" "$archive_alt" rm -rf "$tmpdir" -trap 'rm -f "$archive" "$archive_alt"; rm -rf "$tmpdir"' EXIT+trap 'rm -f "$archive" "$archive_alt"; chmod -R +w "$tmpdir"; rm -rf "$tmpdir"' EXIT guix archive --export guile-bootstrap > "$archive" guix archive --export guile-bootstrap:out > "$archive_alt"diff --git a/tests/nar.scm b/tests/nar.scmindex b542ebd47c..59616659c8 100644--- a/tests/nar.scm+++ b/tests/nar.scm@@ -136,8 +136,11 @@ (define (rm-rf dir) (file-system-fold (const #t) ; enter? (lambda (file stat result) ; leaf+ (unless (eq? 'symlink (stat:type stat))+ (chmod file #o644)) (delete-file file))- (const #t) ; down+ (lambda (dir stat result) ; down+ (chmod dir #o755)) (lambda (dir stat result) ; up (rmdir dir)) (const #t) ; skip@@ -363,7 +366,12 @@ (cut write-file input <>)) (call-with-input-file nar (cut restore-file <> output))- (file-tree-equal? input output))++ (and (file-tree-equal? input output)+ (every (lambda (file)+ (canonical-file?+ (string-append output "/" file)))+ '("root" "root/reg" "root/exe")))) (lambda () (false-if-exception (delete-file nar)) (false-if-exception (rm-rf output)))))))-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 03/15] nar: Deduplicate files right as they are restored.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-4-ludo@gnu.org
This avoids having to traverse and re-read the files that we have justrestored, thereby reducing I/O.
* guix/serialization.scm (dump-file): New procedure.(restore-file): Add #:dump-file parameter and honor it.* guix/store/deduplication.scm (tee, dump-file/deduplicate): Newprocedures.* guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'.(finalize-store-file): Pass #:deduplicate? #f to 'register-items'.* tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".--- guix/nar.scm | 12 ++++---- guix/serialization.scm | 27 ++++++++++++----- guix/store/deduplication.scm | 57 +++++++++++++++++++++++++++++++++++- tests/nar.scm | 3 ++ 4 files changed, 85 insertions(+), 14 deletions(-)
Toggle diff (178 lines)diff --git a/guix/nar.scm b/guix/nar.scmindex edfcc9aab5..ba035ca6dc 100644--- a/guix/nar.scm+++ b/guix/nar.scm@@ -27,6 +27,7 @@ ;; (guix store) since this is "daemon-side" code. #:use-module (guix store) #:use-module (guix store database)+ #:use-module ((guix store deduplication) #:select (dump-file/deduplicate)) #:use-module ((guix build store-copy) #:select (store-info)) #:use-module (guix i18n)@@ -114,12 +115,12 @@ held." ;; Install the new TARGET. (rename-file source target) - ;; Register TARGET. As a side effect, run a deduplication pass.- ;; Timestamps and permissions are already correct thanks to- ;; 'restore-file'.+ ;; Register TARGET. The 'restore-file' call took care of+ ;; deduplication, timestamps, and permissions. (register-items db (list (store-info target deriver references))- #:reset-timestamps? #f))+ #:reset-timestamps? #f+ #:deduplicate? #f)) (when lock? (delete-file (string-append target ".lock"))@@ -212,7 +213,8 @@ s-expression")) (let-values (((port get-hash) (open-sha256-input-port port))) (with-temporary-store-file temp- (restore-file port temp)+ (restore-file port temp+ #:dump-file dump-file/deduplicate) (let ((magic (read-int port))) (unless (= magic %export-magic)diff --git a/guix/serialization.scm b/guix/serialization.scmindex 677ca60b66..9e2dce8bb0 100644--- a/guix/serialization.scm+++ b/guix/serialization.scm@@ -457,9 +457,22 @@ depends on TYPE." (&message (message "unsupported nar entry type")) (&nar-read-error (port port) (file file) (token x))))))))) -(define (restore-file port file)+(define (dump-file file input size type)+ "Dump SIZE bytes from INPUT to FILE."+ (call-with-output-file file+ (lambda (output)+ (dump input output size))))++(define* (restore-file port file+ #:key (dump-file dump-file)) "Read a file (possibly a directory structure) in Nar format from PORT.-Restore it as FILE with canonical permissions and timestamps."+Restore it as FILE with canonical permissions and timestamps. To write a+regular or executable file, call:++ (DUMP-FILE FILE INPUT SIZE TYPE)++The default is to dump SIZE bytes from INPUT to FILE, but callers can provide+a custom procedure, for instance to deduplicate FILE on the fly." (fold-archive (lambda (file type content result) (match type ('directory@@ -473,12 +486,10 @@ Restore it as FILE with canonical permissions and timestamps." ((or 'regular 'executable) (match content ((input . size)- (call-with-output-file file- (lambda (output)- (dump input output size)- (chmod output (if (eq? type 'executable)- #o555- #o444))))+ (dump-file file input size type)+ (chmod file (if (eq? type 'executable)+ #o555+ #o444)) (utime file 1 1 0 0)))))) #t portdiff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scmindex 0655ceb890..b4d37d4525 100644--- a/guix/store/deduplication.scm+++ b/guix/store/deduplication.scm@@ -26,12 +26,15 @@ #:use-module (guix build syscalls) #:use-module (guix base32) #:use-module (srfi srfi-11)+ #:use-module (srfi srfi-34)+ #:use-module (srfi srfi-35) #:use-module (rnrs io ports) #:use-module (ice-9 ftw) #:use-module (ice-9 match) #:use-module (guix serialization) #:export (nar-sha256- deduplicate))+ deduplicate+ dump-file/deduplicate)) ;; XXX: This port is used as a workaround on Guile <= 2.2.4 where ;; 'port-position' throws to 'out-of-range' when the offset is great than or@@ -201,3 +204,55 @@ under STORE." ;; that's OK: we just can't deduplicate it more. #f) (else (apply throw args)))))))))))++(define (tee input len output)+ "Return a port that reads up to LEN bytes from INPUT and writes them to+OUTPUT as it goes."+ (define bytes-read 0)++ (define (fail)+ ;; Reached EOF before we had read LEN bytes from INPUT.+ (raise (condition+ (&nar-error (port input)+ (file (port-filename output))))))++ (define (read! bv start count)+ ;; Read at most LEN bytes in total.+ (let ((count (min count (- len bytes-read))))+ (let loop ((ret (get-bytevector-n! input bv start count)))+ (cond ((eof-object? ret)+ (if (= bytes-read len)+ 0 ; EOF+ (fail)))+ ((and (zero? ret) (> count 0))+ ;; Do not return zero since zero means EOF, so try again.+ (loop (get-bytevector-n! input bv start count)))+ (else+ (put-bytevector output bv start ret)+ (set! bytes-read (+ bytes-read ret))+ ret)))))++ (make-custom-binary-input-port "tee input port" read! #f #f #f))++(define* (dump-file/deduplicate file input size type+ #:key (store (%store-directory)))+ "Write SIZE bytes read from INPUT to FILE. TYPE is a symbol, either+'regular or 'executable.++This procedure is suitable as a #:dump-file argument to 'restore-file'. When+used that way, it deduplicates files on the fly as they are restored, thereby+removing the need to a deduplication pass that would re-read all the files+down the road."+ (define hash+ (call-with-output-file file+ (lambda (output)+ (let-values (((hash-port get-hash)+ (open-hash-port (hash-algorithm sha256))))+ (write-file-tree file hash-port+ #:file-type+size (lambda (_) (values type size))+ #:file-port+ (const (tee input size output)))+ (close-port hash-port)+ (get-hash)))))++ (deduplicate file hash #:store store))diff --git a/tests/nar.scm b/tests/nar.scmindex 59616659c8..ba4881caaa 100644--- a/tests/nar.scm+++ b/tests/nar.scm@@ -452,6 +452,9 @@ (false-if-exception (rm-rf %test-dir)) (setlocale LC_ALL locale))))) +;; XXX: Tell the 'deduplicate' procedure what store we're actually using.+(setenv "NIX_STORE" (%store-prefix))+ (test-assert "restore-file-set (signed, valid)" (with-store store (let* ((texts (unfold (cut >= <> 10)-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 04/15] store-copy: 'populate-store' resets timestamps.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-5-ludo@gnu.org
Until now, 'populate-store' would reset permissions but not timestamps,so callers would resort to going through an extra directory traversal toreset timestamps.
* guix/build/store-copy.scm (reset-permissions): Remove.(copy-recursively): New procedure.(populate-store): Pass #:keep-permissions? to 'copy-recursively'.Remove call to 'reset-permissions'.* tests/gexp.scm ("gexp->derivation, store copy"): In BUILD-DRV, checkwhether 'populate-store' canonicalizes permissions and timestamps.* gnu/build/image.scm (initialize-root-partition): Pass #:reset-timestamps? #fto 'register-closure'.* gnu/build/vm.scm (root-partition-initializer): Likewise.--- gnu/build/image.scm | 5 +- gnu/build/vm.scm | 2 +- guix/build/store-copy.scm | 103 +++++++++++++++++++++++++++----------- tests/gexp.scm | 19 ++++++- 4 files changed, 95 insertions(+), 34 deletions(-)
Toggle diff (202 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 640a784204..2857362914 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -196,9 +196,8 @@ register-closure." (when register-closures? (for-each (lambda (closure)- (register-closure root- closure- #:reset-timestamps? #t+ (register-closure root closure+ #:reset-timestamps? #f #:deduplicate? deduplicate? #:wal-mode? wal-mode?)) references-graphs))diff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex 287d099f79..30feaf800f 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -414,7 +414,7 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure)- #:reset-timestamps? copy-closures?+ #:reset-timestamps? #f #:deduplicate? deduplicate?)) closures) (unless copy-closures?diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scmindex ad551bca98..95dcb8e114 100644--- a/guix/build/store-copy.scm+++ b/guix/build/store-copy.scm@@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU-;;; Copyright © 2013, 2014, 2017, 2018 Ludovic Courtès <ludo@gnu.org>+;;; Copyright © 2013, 2014, 2017, 2018, 2020 Ludovic Courtès <ludo@gnu.org> ;;; ;;; This file is part of GNU Guix. ;;;@@ -17,7 +17,7 @@ ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. (define-module (guix build store-copy)- #:use-module (guix build utils)+ #:use-module ((guix build utils) #:hide (copy-recursively)) #:use-module (guix sets) #:use-module (guix progress) #:use-module (srfi srfi-1)@@ -169,32 +169,83 @@ REFERENCE-GRAPHS, a list of reference-graph files." (reduce + 0 (map file-size items))) -(define (reset-permissions file)- "Reset the permissions on FILE and its sub-directories so that they are all-read-only."- ;; XXX: This procedure exists just to work around the inability of- ;; 'copy-recursively' to preserve permissions.- (file-system-fold (const #t) ;enter?- (lambda (file stat _) ;leaf- (unless (eq? 'symlink (stat:type stat))- (chmod file- (if (zero? (logand (stat:mode stat)- #o100))- #o444- #o555))))- (const #t) ;down- (lambda (directory stat _) ;up- (chmod directory #o555))- (const #f) ;skip- (const #f) ;error+;; TODO: Remove when the one in (guix build utils) has #:keep-permissions?,+;; the fix for <https://bugs.gnu.org/44741>, and when #:keep-mtime? works for+;; symlinks.+(define* (copy-recursively source destination+ #:key+ (log (current-output-port))+ (follow-symlinks? #f)+ (copy-file copy-file)+ keep-mtime? keep-permissions?)+ "Copy SOURCE directory to DESTINATION. Follow symlinks if FOLLOW-SYMLINKS?+is true; otherwise, just preserve them. Call COPY-FILE to copy regular files.+When KEEP-MTIME? is true, keep the modification time of the files in SOURCE on+those of DESTINATION. When KEEP-PERMISSIONS? is true, preserve file+permissions. Write verbose output to the LOG port."+ (define AT_SYMLINK_NOFOLLOW+ ;; Guile 2.0 did not define this constant, hence this hack.+ (let ((variable (module-variable the-root-module 'AT_SYMLINK_NOFOLLOW)))+ (if variable+ (variable-ref variable)+ 256))) ;for GNU/Linux++ (define (set-file-time file stat)+ (utime file+ (stat:atime stat)+ (stat:mtime stat)+ (stat:atimensec stat)+ (stat:mtimensec stat)+ AT_SYMLINK_NOFOLLOW))++ (define strip-source+ (let ((len (string-length source)))+ (lambda (file)+ (substring file len))))++ (file-system-fold (const #t) ; enter?+ (lambda (file stat result) ; leaf+ (let ((dest (string-append destination+ (strip-source file))))+ (format log "`~a' -> `~a'~%" file dest)+ (case (stat:type stat)+ ((symlink)+ (let ((target (readlink file)))+ (symlink target dest)))+ (else+ (copy-file file dest)+ (when keep-permissions?+ (chmod dest (stat:perms stat)))))+ (when keep-mtime?+ (set-file-time dest stat))))+ (lambda (dir stat result) ; down+ (let ((target (string-append destination+ (strip-source dir))))+ (mkdir-p target)))+ (lambda (dir stat result) ; up+ (let ((target (string-append destination+ (strip-source dir))))+ (when keep-mtime?+ (set-file-time target stat))+ (when keep-permissions?+ (chmod target (stat:perms stat)))))+ (const #t) ; skip+ (lambda (file stat errno result)+ (format (current-error-port) "i/o error: ~a: ~a~%"+ file (strerror errno))+ #f) #t- file- lstat))+ source++ (if follow-symlinks?+ stat+ lstat))) (define* (populate-store reference-graphs target #:key (log-port (current-error-port))) "Populate the store under directory TARGET with the items specified in-REFERENCE-GRAPHS, a list of reference-graph files."+REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET+maintain timestamps and permissions." (define store (string-append target (%store-directory))) @@ -221,12 +272,8 @@ REFERENCE-GRAPHS, a list of reference-graph files." (copy-recursively thing (string-append target thing) #:keep-mtime? #t+ #:keep-permissions? #t #:log (%make-void-port "w"))-- ;; XXX: Since 'copy-recursively' doesn't allow us to- ;; preserve permissions, we have to traverse TARGET to- ;; make sure everything is read-only.- (reset-permissions (string-append target thing)) (report)) things))))) diff --git a/tests/gexp.scm b/tests/gexp.scmindex 686334af61..a0e55178fa 100644--- a/tests/gexp.scm+++ b/tests/gexp.scm@@ -723,10 +723,25 @@ (lambda (port) (display "This is the second one." port)))))) (build-drv #~(begin- (use-modules (guix build store-copy))+ (use-modules (guix build store-copy)+ (guix build utils)+ (srfi srfi-1))++ (define (canonical-file? file)+ ;; Copied from (guix tests).+ (let ((st (lstat file)))+ (or (not (string-prefix? (%store-directory) file))+ (eq? 'symlink (stat:type st))+ (and (= 1 (stat:mtime st))+ (zero? (logand #o222 (stat:mode st))))))) (mkdir #$output)- (populate-store '("graph") #$output))))+ (populate-store '("graph") #$output)++ ;; Check whether 'populate-store' canonicalizes+ ;; permissions and timestamps.+ (unless (every canonical-file? (find-files #$output))+ (error "not canonical!" #$output))))) (mlet* %store-monad ((one (gexp->derivation "one" build-one)) (two (gexp->derivation "two" (build-two one))) (drv (gexp->derivation "store-copy" build-drv-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 05/15] image: 'register-closure' assumes already-reset timestamps.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-6-ludo@gnu.org
* gnu/build/image.scm (register-closure): Remove #:reset-timestamps?parameter. Pass #:reset-timestamps? #f to 'register-items'.(initialize-root-partition): Adjust accordingly.* gnu/build/vm.scm (register-closure, root-partition-initializer):Likewise.--- gnu/build/image.scm | 8 +++----- gnu/build/vm.scm | 8 +++----- 2 files changed, 6 insertions(+), 10 deletions(-)
Toggle diff (75 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 2857362914..4f80a1964f 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -140,13 +140,12 @@ given CONFIG file." (define* (register-closure prefix closure #:key- (deduplicate? #t) (reset-timestamps? #t)+ (deduplicate? #t) (schema (sql-schema)) (wal-mode? #t)) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as-produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is-true, reset timestamps on store files and, if DEDUPLICATE? is true,+produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE? to call-with-database." (let ((items (call-with-input-file closure read-reference-graph)))@@ -156,7 +155,7 @@ to call-with-database." (register-items db items #:prefix prefix #:deduplicate? deduplicate?- #:reset-timestamps? reset-timestamps?+ #:reset-timestamps? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root@@ -197,7 +196,6 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure- #:reset-timestamps? #f #:deduplicate? deduplicate? #:wal-mode? wal-mode?)) references-graphs))diff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex 30feaf800f..f700e08b25 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -215,12 +215,11 @@ the #:references-graphs parameter of 'derivation'." (define* (register-closure prefix closure #:key- (deduplicate? #t) (reset-timestamps? #t)+ (deduplicate? #t) (schema (sql-schema))) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as-produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is-true, reset timestamps on store files and, if DEDUPLICATE? is true,+produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, deduplicates files common to CLOSURE and the rest of PREFIX." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema))@@ -228,7 +227,7 @@ deduplicates files common to CLOSURE and the rest of PREFIX." (register-items db items #:prefix prefix #:deduplicate? deduplicate?- #:reset-timestamps? reset-timestamps?+ #:reset-timestamps? #f #:registration-time %epoch))))) @@ -414,7 +413,6 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure)- #:reset-timestamps? #f #:deduplicate? deduplicate?)) closures) (unless copy-closures?-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 06/15] database: Remove #:reset-timestamps? from 'register-items'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-7-ludo@gnu.org
The assumption now is that the caller took care of resetting timestampsand permissions.
* guix/store/database.scm (register-items): Remove #:reset-timestamps?parameter and the call to 'reset-timestamps'.(register-path): Adjust accordingly and add call to 'reset-timestamps'.* gnu/build/image.scm (register-closure): Remove #:reset-timestamps?parameter to 'register-items'.* gnu/build/vm.scm (register-closure): Likewise.* guix/nar.scm (finalize-store-file): Adjust accordingly.* guix/scripts/pack.scm (store-database)[build]: Likewise.--- gnu/build/image.scm | 1 - gnu/build/vm.scm | 1 - guix/nar.scm | 1 - guix/scripts/pack.scm | 1 - guix/store/database.scm | 13 ++++++++----- 5 files changed, 8 insertions(+), 9 deletions(-)
Toggle diff (100 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 4f80a1964f..0deea10a9d 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -155,7 +155,6 @@ to call-with-database." (register-items db items #:prefix prefix #:deduplicate? deduplicate?- #:reset-timestamps? #f #:registration-time %epoch))))) (define* (initialize-efi-partition rootdiff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex f700e08b25..abb0317faf 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -227,7 +227,6 @@ deduplicates files common to CLOSURE and the rest of PREFIX." (register-items db items #:prefix prefix #:deduplicate? deduplicate?- #:reset-timestamps? #f #:registration-time %epoch))))) diff --git a/guix/nar.scm b/guix/nar.scmindex ba035ca6dc..947b393d84 100644--- a/guix/nar.scm+++ b/guix/nar.scm@@ -119,7 +119,6 @@ held." ;; deduplication, timestamps, and permissions. (register-items db (list (store-info target deriver references))- #:reset-timestamps? #f #:deduplicate? #f)) (when lock?diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scmindex ba9a6dc1b2..1612ec8f04 100644--- a/guix/scripts/pack.scm+++ b/guix/scripts/pack.scm@@ -168,7 +168,6 @@ dependencies are registered." (with-database db-file db (register-items db items #:deduplicate? #f- #:reset-timestamps? #f #:registration-time %epoch))))))) (computed-file "store-database" builddiff --git a/guix/store/database.scm b/guix/store/database.scmindex b36b127630..0ed66a6e2c 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -392,7 +392,8 @@ references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to initialize; if STATE-DIRECTORY is given, it must be a string containing the absolute file name to the state directory of the store being initialized.-Return #t on success.+Return #t on success. As a side effect, reset timestamps on PATH, unless+RESET-TIMESTAMPS? is false. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook.@@ -403,12 +404,17 @@ by adding it as a temp-root." (store-database-file #:prefix prefix #:state-directory state-directory)) + (define real-file-name+ (string-append (or prefix "") path))++ (when reset-timestamps?+ (reset-timestamps real-file-name))+ (parameterize ((sql-schema schema)) (with-database db-file db (register-items db (list (store-info path deriver references)) #:prefix prefix #:deduplicate? deduplicate?- #:reset-timestamps? reset-timestamps? #:log-port (%make-void-port "w"))))) (define %epoch@@ -418,7 +424,6 @@ by adding it as a temp-root." (define* (register-items db items #:key prefix (deduplicate? #t)- (reset-timestamps? #t) registration-time (log-port (current-error-port))) "Register all of ITEMS, a list of <store-info> records as returned by@@ -452,8 +457,6 @@ typically by adding them as temp-roots." ;; significant differences when 'register-closures' is called ;; consecutively for overlapping closures such as 'system' and 'bootcfg'. (unless (path-id db to-register)- (when reset-timestamps?- (reset-timestamps real-file-name)) (let-values (((hash nar-size) (nar-sha256 real-file-name))) (call-with-retrying-transaction db (lambda ()-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 07/15] store-copy: 'populate-store' can optionally deduplicate files.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-8-ludo@gnu.org
Until now deduplication was performed as an additional pass aftercopying files, which involve re-traversing all the files that had justbeen copied.
* guix/store/deduplication.scm (copy-file/deduplicate): New procedure.* tests/store-deduplication.scm ("copy-file/deduplicate"): New test.* guix/build/store-copy.scm (populate-store): Add #:deduplicate?parameter and honor it.* tests/gexp.scm ("gexp->derivation, store copy"): Pass #:deduplicate? #fto 'populate-store'.* gnu/build/image.scm (initialize-root-partition): Pass #:deduplicate?to 'populate-store'. Pass #:deduplicate? #f to 'register-closure'.* gnu/build/vm.scm (root-partition-initializer): Likewise.* gnu/build/install.scm (populate-single-profile-directory): Pass #:deduplicate? #f to 'populate-store'.* gnu/build/linux-initrd.scm (build-initrd): Likewise.* guix/scripts/pack.scm (self-contained-tarball)[import-module?]: Newprocedure.[build]: Pass it as an argument to 'source-module-closure'.* guix/scripts/pack.scm (squashfs-image)[build]: Wrap in'with-extensions'.* gnu/system/linux-initrd.scm (expression->initrd)[import-module?]: Newprocedure.[builder]: Pass it to 'source-module-closure'.* gnu/system/install.scm (cow-store-service-type)[import-module?]: Newprocedure. Pass it to 'source-module-closure'.--- gnu/build/image.scm | 5 +- gnu/build/install.scm | 3 +- gnu/build/linux-initrd.scm | 3 +- gnu/build/vm.scm | 5 +- gnu/system/install.scm | 12 +- gnu/system/linux-initrd.scm | 10 +- guix/build/store-copy.scm | 13 +- guix/scripts/pack.scm | 274 +++++++++++++++++----------------- guix/store/deduplication.scm | 16 +- tests/gexp.scm | 3 +- tests/store-deduplication.scm | 18 ++- 11 files changed, 215 insertions(+), 147 deletions(-)
Toggle diff (558 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 0deea10a9d..8f50f27f78 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -186,7 +186,8 @@ rest of the store when registering the closures. SYSTEM-DIRECTORY is the name of the directory of the 'system' derivation. Pass WAL-MODE? to register-closure." (populate-root-file-system system-directory root)- (populate-store references-graphs root)+ (populate-store references-graphs root+ #:deduplicate? deduplicate?) ;; Populate /dev. (when make-device-nodes@@ -195,7 +196,7 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure- #:deduplicate? deduplicate?+ #:deduplicate? #f #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/install.scm b/gnu/build/install.scmindex 63995e1d09..f5c8407b89 100644--- a/gnu/build/install.scm+++ b/gnu/build/install.scm@@ -214,7 +214,8 @@ This is used to create the self-contained tarballs with 'guix pack'." (symlink old (scope new))) ;; Populate the store.- (populate-store (list closure) directory)+ (populate-store (list closure) directory+ #:deduplicate? #f) (when database (install-database-and-gc-roots directory database profilediff --git a/gnu/build/linux-initrd.scm b/gnu/build/linux-initrd.scmindex 99796adba6..bb2ed0db0c 100644--- a/gnu/build/linux-initrd.scm+++ b/gnu/build/linux-initrd.scm@@ -127,7 +127,8 @@ REFERENCES-GRAPHS." (mkdir "contents") ;; Copy the closures of all the items referenced in REFERENCES-GRAPHS.- (populate-store references-graphs "contents")+ (populate-store references-graphs "contents"+ #:deduplicate? #f) (with-directory-excursion "contents" ;; Make '/init'.diff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex abb0317faf..03be5697b7 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -395,7 +395,8 @@ system that is passed to 'populate-root-file-system'." (when copy-closures? ;; Populate the store. (populate-store (map (cut string-append "/xchg/" <>) closures)- target))+ target+ #:deduplicate? deduplicate?)) ;; Populate /dev. (make-device-nodes target)@@ -412,7 +413,7 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure)- #:deduplicate? deduplicate?))+ #:deduplicate? #f)) closures) (unless copy-closures? (umount target-store)))diff --git a/gnu/system/install.scm b/gnu/system/install.scmindex 7701297411..06f8043bb7 100644--- a/gnu/system/install.scm+++ b/gnu/system/install.scm@@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>+;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org> ;;; Copyright © 2015 Mark H Weaver <mhw@netris.org> ;;; Copyright © 2016 Andreas Enge <andreas@enge.fr> ;;; Copyright © 2017 Marius Bakke <mbakke@fastmail.com>@@ -175,6 +175,13 @@ manual." (shepherd-service-type 'cow-store (lambda _+ (define (import-module? module)+ ;; Since we don't use deduplication support in 'populate-store', don't+ ;; import (guix store deduplication) and its dependencies, which+ ;; includes Guile-Gcrypt.+ (and (guix-module-name? module)+ (not (equal? module '(guix store deduplication)))))+ (shepherd-service (requirement '(root-file-system user-processes)) (provision '(cow-store))@@ -189,7 +196,8 @@ the given target.") ,@%default-modules)) (start (with-imported-modules (source-module-closure- '((gnu build install)))+ '((gnu build install))+ #:select? import-module?) #~(case-lambda ((target) (mount-cow-store target #$%backing-directory)diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scmindex 4fb1d863c9..c6ba9bb560 100644--- a/gnu/system/linux-initrd.scm+++ b/gnu/system/linux-initrd.scm@@ -76,12 +76,20 @@ the derivations referenced by EXP are automatically copied to the initrd." (define init (program-file "init" exp #:guile guile)) + (define (import-module? module)+ ;; Since we don't use deduplication support in 'populate-store', don't+ ;; import (guix store deduplication) and its dependencies, which includes+ ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'.+ (and (guix-module-name? module)+ (not (equal? module '(guix store deduplication)))))+ (define builder ;; Do not use "guile-zlib" extension here, otherwise it would drag the ;; non-static "zlib" package to the initrd closure. It is not needed ;; anyway because the modules are stored uncompressed within the initrd. (with-imported-modules (source-module-closure- '((gnu build linux-initrd)))+ '((gnu build linux-initrd))+ #:select? import-module?) #~(begin (use-modules (gnu build linux-initrd)) diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scmindex 95dcb8e114..7f0672cd9d 100644--- a/guix/build/store-copy.scm+++ b/guix/build/store-copy.scm@@ -20,6 +20,7 @@ #:use-module ((guix build utils) #:hide (copy-recursively)) #:use-module (guix sets) #:use-module (guix progress)+ #:autoload (guix store deduplication) (copy-file/deduplicate) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) #:use-module (srfi srfi-26)@@ -242,10 +243,13 @@ permissions. Write verbose output to the LOG port." lstat))) (define* (populate-store reference-graphs target- #:key (log-port (current-error-port)))+ #:key+ (deduplicate? #t)+ (log-port (current-error-port))) "Populate the store under directory TARGET with the items specified in REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET-maintain timestamps and permissions."+maintain timestamps and permissions. When DEDUPLICATE? is true, deduplicate+regular files as they are copied to TARGET." (define store (string-append target (%store-directory))) @@ -273,6 +277,11 @@ maintain timestamps and permissions." (string-append target thing) #:keep-mtime? #t #:keep-permissions? #t+ #:copy-file+ (if deduplicate?+ (cut copy-file/deduplicate <> <>+ #:store store)+ copy-file) #:log (%make-void-port "w")) (report)) things)))))diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scmindex 1612ec8f04..440c4b0903 100644--- a/guix/scripts/pack.scm+++ b/guix/scripts/pack.scm@@ -203,12 +203,19 @@ added to the pack." #+(file-append glibc-utf8-locales "/lib/locale")) (setlocale LC_ALL "en_US.utf8")))) + (define (import-module? module)+ ;; Since we don't use deduplication support in 'populate-store', don't+ ;; import (guix store deduplication) and its dependencies, which includes+ ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'.+ (and (not-config? module)+ (not (equal? '(guix store deduplication) module))))+ (define build (with-imported-modules (source-module-closure `((guix build utils) (guix build union) (gnu build install))- #:select? not-config?)+ #:select? import-module?) #~(begin (use-modules (guix build utils) ((guix build union) #:select (relative-file-name))@@ -382,138 +389,139 @@ added to the pack." `(("/bin" -> "bin") ,@symlinks))) (define build- (with-imported-modules (source-module-closure- '((guix build utils)- (guix build store-copy)- (guix build union)- (gnu build install))- #:select? not-config?)- #~(begin- (use-modules (guix build utils)- (guix build store-copy)- ((guix build union) #:select (relative-file-name))- (gnu build install)- (srfi srfi-1)- (srfi srfi-26)- (ice-9 match))-- (define database #+database)- (define entry-point #$entry-point)-- (define (mksquashfs args)- (apply invoke "mksquashfs"- `(,@args-- ;; Do not create a "recovery file" when appending to the- ;; file system since it's useless in this case.- "-no-recovery"-- ;; Do not attempt to store extended attributes.- ;; See <https://bugs.gnu.org/40043>.- "-no-xattrs"-- ;; Set file times and the file system creation time to- ;; one second after the Epoch.- "-all-time" "1" "-mkfs-time" "1"-- ;; Reset all UIDs and GIDs.- "-force-uid" "0" "-force-gid" "0")))-- (setenv "PATH" #+(file-append archiver "/bin"))-- ;; We need an empty file in order to have a valid file argument when- ;; we reparent the root file system. Read on for why that's- ;; necessary.- (with-output-to-file ".empty" (lambda () (display "")))-- ;; Create the squashfs image in several steps.- ;; Add all store items. Unfortunately mksquashfs throws away all- ;; ancestor directories and only keeps the basename. We fix this- ;; in the following invocations of mksquashfs.- (mksquashfs `(,@(map store-info-item- (call-with-input-file "profile"- read-reference-graph))- #$environment- ,#$output-- ;; Do not perform duplicate checking because we- ;; don't have any dupes.- "-no-duplicates"- "-comp"- ,#+(compressor-name compressor)))-- ;; Here we reparent the store items. For each sub-directory of- ;; the store prefix we need one invocation of "mksquashfs".- (for-each (lambda (dir)- (mksquashfs `(".empty"- ,#$output- "-root-becomes" ,dir)))- (reverse (string-tokenize (%store-directory)- (char-set-complement (char-set #\/)))))-- ;; Add symlinks and mount points.- (mksquashfs- `(".empty"- ,#$output- ;; Create SYMLINKS via pseudo file definitions.- ,@(append-map- (match-lambda- ((source '-> target)- ;; Create relative symlinks to work around a bug in- ;; Singularity 2.x:- ;; https://bugs.gnu.org/34913- ;; https://github.com/sylabs/singularity/issues/1487- (let ((target (string-append #$profile "/" target)))- (list "-p"- (string-join- ;; name s mode uid gid symlink- (list source- "s" "777" "0" "0"- (relative-file-name (dirname source)- target)))))))- '#$symlinks*)-- "-p" "/.singularity.d d 555 0 0"-- ;; Create the environment file.- "-p" "/.singularity.d/env d 555 0 0"- "-p" ,(string-append- "/.singularity.d/env/90-environment.sh s 777 0 0 "- (relative-file-name "/.singularity.d/env"- #$environment))-- ;; Create /.singularity.d/actions, and optionally the 'run'- ;; script, used by 'singularity run'.- "-p" "/.singularity.d/actions d 555 0 0"-- ,@(if entry-point- `(;; This one if for Singularity 2.x.- "-p"- ,(string-append- "/.singularity.d/actions/run s 777 0 0 "- (relative-file-name "/.singularity.d/actions"- (string-append #$profile "/"- entry-point)))-- ;; This one is for Singularity 3.x.- "-p"- ,(string-append- "/.singularity.d/runscript s 777 0 0 "- (relative-file-name "/.singularity.d"- (string-append #$profile "/"- entry-point))))- '())-- ;; Create empty mount points.- "-p" "/proc d 555 0 0"- "-p" "/sys d 555 0 0"- "-p" "/dev d 555 0 0"- "-p" "/home d 555 0 0"))-- (when database- ;; Initialize /var/guix.- (install-database-and-gc-roots "var-etc" database #$profile)- (mksquashfs `("var-etc" ,#$output))))))+ (with-extensions (list guile-gcrypt)+ (with-imported-modules (source-module-closure+ '((guix build utils)+ (guix build store-copy)+ (guix build union)+ (gnu build install))+ #:select? not-config?)+ #~(begin+ (use-modules (guix build utils)+ (guix build store-copy)+ ((guix build union) #:select (relative-file-name))+ (gnu build install)+ (srfi srfi-1)+ (srfi srfi-26)+ (ice-9 match))++ (define database #+database)+ (define entry-point #$entry-point)++ (define (mksquashfs args)+ (apply invoke "mksquashfs"+ `(,@args++ ;; Do not create a "recovery file" when appending to the+ ;; file system since it's useless in this case.+ "-no-recovery"++ ;; Do not attempt to store extended attributes.+ ;; See <https://bugs.gnu.org/40043>.+ "-no-xattrs"++ ;; Set file times and the file system creation time to+ ;; one second after the Epoch.+ "-all-time" "1" "-mkfs-time" "1"++ ;; Reset all UIDs and GIDs.+ "-force-uid" "0" "-force-gid" "0")))++ (setenv "PATH" #+(file-append archiver "/bin"))++ ;; We need an empty file in order to have a valid file argument when+ ;; we reparent the root file system. Read on for why that's+ ;; necessary.+ (with-output-to-file ".empty" (lambda () (display "")))++ ;; Create the squashfs image in several steps.+ ;; Add all store items. Unfortunately mksquashfs throws away all+ ;; ancestor directories and only keeps the basename. We fix this+ ;; in the following invocations of mksquashfs.+ (mksquashfs `(,@(map store-info-item+ (call-with-input-file "profile"+ read-reference-graph))+ #$environment+ ,#$output++ ;; Do not perform duplicate checking because we+ ;; don't have any dupes.+ "-no-duplicates"+ "-comp"+ ,#+(compressor-name compressor)))++ ;; Here we reparent the store items. For each sub-directory of+ ;; the store prefix we need one invocation of "mksquashfs".+ (for-each (lambda (dir)+ (mksquashfs `(".empty"+ ,#$output+ "-root-becomes" ,dir)))+ (reverse (string-tokenize (%store-directory)+ (char-set-complement (char-set #\/)))))++ ;; Add symlinks and mount points.+ (mksquashfs+ `(".empty"+ ,#$output+ ;; Create SYMLINKS via pseudo file definitions.+ ,@(append-map+ (match-lambda+ ((source '-> target)+ ;; Create relative symlinks to work around a bug in+ ;; Singularity 2.x:+ ;; https://bugs.gnu.org/34913+ ;; https://github.com/sylabs/singularity/issues/1487+ (let ((target (string-append #$profile "/" target)))+ (list "-p"+ (string-join+ ;; name s mode uid gid symlink+ (list source+ "s" "777" "0" "0"+ (relative-file-name (dirname source)+ target)))))))+ '#$symlinks*)++ "-p" "/.singularity.d d 555 0 0"++ ;; Create the environment file.+ "-p" "/.singularity.d/env d 555 0 0"+ "-p" ,(string-append+ "/.singularity.d/env/90-environment.sh s 777 0 0 "+ (relative-file-name "/.singularity.d/env"+ #$environment))++ ;; Create /.singularity.d/actions, and optionally the 'run'+ ;; script, used by 'singularity run'.+ "-p" "/.singularity.d/actions d 555 0 0"++ ,@(if entry-point+ `( ;; This one if for Singularity 2.x.+ "-p"+ ,(string-append+ "/.singularity.d/actions/run s 777 0 0 "+ (relative-file-name "/.singularity.d/actions"+ (string-append #$profile "/"+ entry-point)))++ ;; This one is for Singularity 3.x.+ "-p"+ ,(string-append+ "/.singularity.d/runscript s 777 0 0 "+ (relative-file-name "/.singularity.d"+ (string-append #$profile "/"+ entry-point))))+ '())++ ;; Create empty mount points.+ "-p" "/proc d 555 0 0"+ "-p" "/sys d 555 0 0"+ "-p" "/dev d 555 0 0"+ "-p" "/home d 555 0 0"))++ (when database+ ;; Initialize /var/guix.+ (install-database-and-gc-roots "var-etc" database #$profile)+ (mksquashfs `("var-etc" ,#$output))))))) (gexp->derivation (string-append name (compressor-extension compressor)diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scmindex b4d37d4525..8564f12107 100644--- a/guix/store/deduplication.scm+++ b/guix/store/deduplication.scm@@ -34,7 +34,8 @@ #:use-module (guix serialization) #:export (nar-sha256 deduplicate- dump-file/deduplicate))+ dump-file/deduplicate+ copy-file/deduplicate)) ;; XXX: This port is used as a workaround on Guile <= 2.2.4 where ;; 'port-position' throws to 'out-of-range' when the offset is great than or@@ -256,3 +257,16 @@ down the road." (get-hash))))) (deduplicate file hash #:store store))++(define* (copy-file/deduplicate source target+ #:key (store (%store-directory)))+ "Like 'copy-file', but additionally deduplicate TARGET in STORE."+ (call-with-input-file source+ (lambda (input)+ (let ((stat (stat input)))+ (dump-file/deduplicate target input (stat:size stat)+ (if (zero? (logand (stat:mode stat)+ #o100))+ 'regular+ 'executable)+ #:store store)))))diff --git a/tests/gexp.scm b/tests/gexp.scmindex a0e55178fa..6e92f0e4b3 100644--- a/tests/gexp.scm+++ b/tests/gexp.scm@@ -736,7 +736,8 @@ (zero? (logand #o222 (stat:mode st))))))) (mkdir #$output)- (populate-store '("graph") #$output)+ (populate-store '("graph") #$output+ #:deduplicate? #f) ;; Check whether 'populate-store' canonicalizes ;; permissions and timestamps.diff --git a/tests/store-deduplication.scm b/tests/store-deduplication.scmindex e2870a363d..7b01acae24 100644--- a/tests/store-deduplication.scm+++ b/tests/store-deduplication.scm@@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU-;;; Copyright © 2018 Ludovic Courtès <ludo@gnu.org>+;;; Copyright © 2018, 2020 Ludovic Courtès <ludo@gnu.org> ;;; ;;; This file is part of GNU Guix. ;;;@@ -25,6 +25,7 @@ #:use-module (rnrs bytevectors) #:use-module (ice-9 binary-ports) #:use-module (srfi srfi-1)+ #:use-module (srfi srfi-26) #:use-module (srfi srfi-64)) (test-begin "store-deduplication")@@ -106,4 +107,19 @@ (cons (apply = (map (compose stat:ino stat) identical)) (map (compose stat:nlink stat) identical)))))) +(test-assert "copy-file/deduplicate"+ (call-with-temporary-directory+ (lambda (store)+ (let ((source (search-path %load-path "gnu/packages/emacs-xyz.scm")))+ (for-each (lambda (target)+ (copy-file/deduplicate source+ (string-append store target)+ #:store store))+ '("/a" "/b" "/c"))+ (and (directory-exists? (string-append store "/.links"))+ (file=? source (string-append store "/a"))+ (apply = (map (compose stat:ino stat+ (cut string-append store <>))+ '("/a" "/b" "/c"))))))))+ (test-end "store-deduplication")-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 08/15] image: 'register-closure' leaves it up to the caller to deduplicate.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-9-ludo@gnu.org
* gnu/build/image.scm (register-closure): Remove #:deduplicate?parameter and pass #:deduplicate? #f to 'register-items'.(initialize-root-partition): Adjust accordingly.* gnu/build/vm.scm (register-closure, root-partition-initializer):Likewise.--- gnu/build/image.scm | 8 ++------ gnu/build/vm.scm | 9 +++------ 2 files changed, 5 insertions(+), 12 deletions(-)
Toggle diff (73 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 8f50f27f78..8d5fc603d9 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -140,21 +140,18 @@ given CONFIG file." (define* (register-closure prefix closure #:key- (deduplicate? #t) (schema (sql-schema)) (wal-mode? #t)) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as-produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,-deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE?-to call-with-database."+produced by #:references-graphs. Pass WAL-MODE? to call-with-database." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema)) (with-database (store-database-file #:prefix prefix) db #:wal-mode? wal-mode? (register-items db items #:prefix prefix- #:deduplicate? deduplicate?+ #:deduplicate? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root@@ -196,7 +193,6 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure- #:deduplicate? #f #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex 03be5697b7..8c6ab648ac 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -215,18 +215,16 @@ the #:references-graphs parameter of 'derivation'." (define* (register-closure prefix closure #:key- (deduplicate? #t) (schema (sql-schema))) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as-produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,-deduplicates files common to CLOSURE and the rest of PREFIX."+produced by #:references-graphs." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema)) (with-database (store-database-file #:prefix prefix) db (register-items db items #:prefix prefix- #:deduplicate? deduplicate?+ #:deduplicate? #f #:registration-time %epoch))))) @@ -412,8 +410,7 @@ system that is passed to 'populate-root-file-system'." (display "registering closures...\n") (for-each (lambda (closure) (register-closure target- (string-append "/xchg/" closure)- #:deduplicate? #f))+ (string-append "/xchg/" closure))) closures) (unless copy-closures? (umount target-store)))-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 09/15] database: Remove #:deduplicate? from 'register-items'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-10-ludo@gnu.org
It is now up to the caller to deduplicate store contents.
* guix/store/database.scm (register-items): Remove #:deduplicate?parameter and call to 'deduplicate'.(register-path): Call 'deduplicate' when #:deduplicate? is true.* gnu/build/image.scm (register-closure): Adjust call accordingly.* gnu/build/vm.scm (register-closure): Likewise.* guix/nar.scm (finalize-store-file): Likewise.* guix/scripts/pack.scm (store-database): Likewise.--- gnu/build/image.scm | 1 - gnu/build/vm.scm | 1 - guix/nar.scm | 3 +-- guix/scripts/pack.scm | 1 - guix/store/database.scm | 11 ++++++----- 5 files changed, 7 insertions(+), 10 deletions(-)
Toggle diff (95 lines)diff --git a/gnu/build/image.scm b/gnu/build/image.scmindex 8d5fc603d9..f6e5cb42f6 100644--- a/gnu/build/image.scm+++ b/gnu/build/image.scm@@ -151,7 +151,6 @@ produced by #:references-graphs. Pass WAL-MODE? to call-with-database." #:wal-mode? wal-mode? (register-items db items #:prefix prefix- #:deduplicate? #f #:registration-time %epoch))))) (define* (initialize-efi-partition rootdiff --git a/gnu/build/vm.scm b/gnu/build/vm.scmindex 8c6ab648ac..bd59916bf3 100644--- a/gnu/build/vm.scm+++ b/gnu/build/vm.scm@@ -224,7 +224,6 @@ produced by #:references-graphs." (with-database (store-database-file #:prefix prefix) db (register-items db items #:prefix prefix- #:deduplicate? #f #:registration-time %epoch))))) diff --git a/guix/nar.scm b/guix/nar.scmindex 947b393d84..a817b56007 100644--- a/guix/nar.scm+++ b/guix/nar.scm@@ -118,8 +118,7 @@ held." ;; Register TARGET. The 'restore-file' call took care of ;; deduplication, timestamps, and permissions. (register-items db- (list (store-info target deriver references))- #:deduplicate? #f))+ (list (store-info target deriver references)))) (when lock? (delete-file (string-append target ".lock"))diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scmindex 440c4b0903..8ecdcb823f 100644--- a/guix/scripts/pack.scm+++ b/guix/scripts/pack.scm@@ -167,7 +167,6 @@ dependencies are registered." (let ((items (append-map read-closure '#$labels))) (with-database db-file db (register-items db items- #:deduplicate? #f #:registration-time %epoch))))))) (computed-file "store-database" builddiff --git a/guix/store/database.scm b/guix/store/database.scmindex 0ed66a6e2c..31ea9add78 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -407,6 +407,11 @@ by adding it as a temp-root." (define real-file-name (string-append (or prefix "") path)) + (when deduplicate?+ (deduplicate real-file-name (nar-sha256 real-file-name)+ #:store (string-append (or prefix "")+ %store-directory)))+ (when reset-timestamps? (reset-timestamps real-file-name)) @@ -414,7 +419,6 @@ by adding it as a temp-root." (with-database db-file db (register-items db (list (store-info path deriver references)) #:prefix prefix- #:deduplicate? deduplicate? #:log-port (%make-void-port "w"))))) (define %epoch@@ -423,7 +427,6 @@ by adding it as a temp-root." (define* (register-items db items #:key prefix- (deduplicate? #t) registration-time (log-port (current-error-port))) "Register all of ITEMS, a list of <store-info> records as returned by@@ -467,9 +470,7 @@ typically by adding them as temp-roots." "sha256:" (bytevector->base16-string hash)) #:nar-size nar-size- #:time registration-time)))- (when deduplicate?- (deduplicate real-file-name hash #:store store-dir)))))+ #:time registration-time)))))) (let* ((prefix (format #f "registering ~a items" (length items))) (progress (progress-reporter/bar (length items)-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 11/15] database: Remove #:deduplicate? and #:reset-timestamps? from 'register-path'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-2-ludo@gnu.org
* guix/store/database.scm (register-path): Remove #:deduplicate?and #:reset-timestamps?.* guix/scripts/system.scm (copy-item): Adjust accordingly.* tests/store-database.scm ("register-path")("register-path, directory"): Call 'reset-timestamps'.--- guix/scripts/system.scm | 6 +----- guix/store/database.scm | 17 ++--------------- tests/store-database.scm | 5 +++-- 3 files changed, 6 insertions(+), 22 deletions(-)
Toggle diff (90 lines)diff --git a/guix/scripts/system.scm b/guix/scripts/system.scmindex c08929066b..0e543d9460 100644--- a/guix/scripts/system.scm+++ b/guix/scripts/system.scm@@ -158,11 +158,7 @@ REFERENCES as its set of references." (unless (register-path item #:prefix target #:state-directory state- #:references references-- ;; Those are taken care of by 'copy-store-item'.- #:reset-timestamps? #f- #:deduplicate? #f)+ #:references references) (leave (G_ "failed to register '~a' under '~a'~%") item target)))) diff --git a/guix/store/database.scm b/guix/store/database.scmindex 31ea9add78..c0010b72b9 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -384,16 +384,14 @@ is true." (define* (register-path path #:key (references '()) deriver prefix- state-directory (deduplicate? #t)- (reset-timestamps? #t)+ state-directory (schema (sql-schema))) "Register PATH as a valid store file, with REFERENCES as its list of references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to initialize; if STATE-DIRECTORY is given, it must be a string containing the absolute file name to the state directory of the store being initialized.-Return #t on success. As a side effect, reset timestamps on PATH, unless-RESET-TIMESTAMPS? is false.+Return #t on success. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook.@@ -404,17 +402,6 @@ by adding it as a temp-root." (store-database-file #:prefix prefix #:state-directory state-directory)) - (define real-file-name- (string-append (or prefix "") path))-- (when deduplicate?- (deduplicate real-file-name (nar-sha256 real-file-name)- #:store (string-append (or prefix "")- %store-directory)))-- (when reset-timestamps?- (reset-timestamps real-file-name))- (parameterize ((sql-schema schema)) (with-database db-file db (register-items db (list (store-info path deriver references))diff --git a/tests/store-database.scm b/tests/store-database.scmindex 3b4ef43f6d..33fd6cfbad 100644--- a/tests/store-database.scm+++ b/tests/store-database.scm@@ -34,8 +34,7 @@ (test-begin "store-database") -(test-equal "register-path"- '(1 1)+(test-assert "register-path" (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake"))) (when (valid-path? %store file)@@ -46,6 +45,7 @@ (drv (string-append file ".drv"))) (call-with-output-file file (cut display "This is a fake store item.\n" <>))+ (reset-timestamps file) (register-path file #:references (list ref) #:deriver drv)@@ -69,6 +69,7 @@ (mkdir-p (string-append file "/a")) (call-with-output-file (string-append file "/a/b") (const #t))+ (reset-timestamps file) (register-path file #:deriver drv) (and (valid-path? %store file)-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 10/15] guix system: 'init' copies, resets timestamps, and deduplicates at once.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-1-ludo@gnu.org
Partly fixes https://bugs.gnu.org/44760.
* guix/build/store-copy.scm (copy-store-item): New procedure.(populate-store): Use it instead of the inline 'copy-recursively' call.* guix/scripts/system.scm (copy-item): Likewise.Pass #:reset-timestamps? and #:deduplicate? to 'register-path'.--- guix/build/store-copy.scm | 31 +++++++++++++++++++++---------- guix/scripts/system.scm | 11 ++++++++--- 2 files changed, 29 insertions(+), 13 deletions(-)
Toggle diff (94 lines)diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scmindex 7f0672cd9d..01e1f41870 100644--- a/guix/build/store-copy.scm+++ b/guix/build/store-copy.scm@@ -38,6 +38,7 @@ file-size closure-size+ copy-store-item populate-store)) ;;; Commentary:@@ -242,6 +243,24 @@ permissions. Write verbose output to the LOG port." stat lstat))) +(define* (copy-store-item item target+ #:key+ (deduplicate? #t)+ (log-port (%make-void-port "w")))+ "Copy ITEM, a store item, to the store under TARGET, the target root+directory. When DEDUPLICATE? is true, deduplicate it within TARGET."+ (define store+ (string-append target (%store-directory)))++ (copy-recursively item (string-append target item)+ #:keep-mtime? #t+ #:keep-permissions? #t+ #:copy-file+ (if deduplicate?+ (cut copy-file/deduplicate <> <> #:store store)+ copy-file)+ #:log log-port))+ (define* (populate-store reference-graphs target #:key (deduplicate? #t)@@ -273,16 +292,8 @@ regular files as they are copied to TARGET." (call-with-progress-reporter progress (lambda (report) (for-each (lambda (thing)- (copy-recursively thing- (string-append target thing)- #:keep-mtime? #t- #:keep-permissions? #t- #:copy-file- (if deduplicate?- (cut copy-file/deduplicate <> <>- #:store store)- copy-file)- #:log (%make-void-port "w"))+ (copy-store-item thing target+ #:deduplicate? deduplicate?) (report)) things))))) diff --git a/guix/scripts/system.scm b/guix/scripts/system.scmindex db80e0be8f..c08929066b 100644--- a/guix/scripts/system.scm+++ b/guix/scripts/system.scm@@ -30,6 +30,7 @@ #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix store) #:autoload (guix store database) (register-path)+ #:autoload (guix build store-copy) (copy-store-item) #:use-module (guix describe) #:use-module (guix grafts) #:use-module (guix gexp)@@ -147,8 +148,8 @@ REFERENCES as its set of references." #:directories? #t)) (delete-file-recursively dest)) - (copy-recursively item dest- #:log (%make-void-port "w"))+ (copy-store-item item target+ #:deduplicate? #t) ;; Register ITEM; as a side-effect, it resets timestamps, etc. ;; Explicitly use "TARGET/var/guix" as the state directory, to avoid@@ -157,7 +158,11 @@ REFERENCES as its set of references." (unless (register-path item #:prefix target #:state-directory state- #:references references)+ #:references references++ ;; Those are taken care of by 'copy-store-item'.+ #:reset-timestamps? #f+ #:deduplicate? #f) (leave (G_ "failed to register '~a' under '~a'~%") item target)))) -- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 13/15] database: Remove 'register-path'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-4-ludo@gnu.org
* guix/store/database.scm (register-path): Remove.* tests/store-database.scm ("register-path"): Rename to...("register-items"): ... this, and use 'register-items' instead of'register-path'.("register-path, directory"): Rename to...("register-items, directory"): ... this, and use 'register-items'instead of 'register-path'.("register-path with unregistered references"): Rename to...("sqlite-register with unregistered references"): ... this.--- guix/store/database.scm | 27 --------------------------- tests/store-database.scm | 15 ++++++++------- 2 files changed, 8 insertions(+), 34 deletions(-)
Toggle diff (108 lines)diff --git a/guix/store/database.scm b/guix/store/database.scmindex 9d5bc531bb..4579b05261 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -43,7 +43,6 @@ with-database path-id sqlite-register- register-path register-items %epoch reset-timestamps))@@ -383,32 +382,6 @@ is true." (chmod file (if (executable-file? file) #o555 #o444))) (utime file 1 1 0 0))))) -(define* (register-path path- #:key (references '()) deriver prefix- state-directory- (schema (sql-schema)))- "Register PATH as a valid store file, with REFERENCES as its list of-references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is-given, it must be the name of the directory containing the new store to-initialize; if STATE-DIRECTORY is given, it must be a string containing the-absolute file name to the state directory of the store being initialized.-Return #t on success.--Use with care as it directly modifies the store! This is primarily meant to-be used internally by the daemon's build hook.--PATH must be protected from GC and locked during execution of this, typically-by adding it as a temp-root."- (define db-file- (store-database-file #:prefix prefix- #:state-directory state-directory))-- (parameterize ((sql-schema schema))- (with-database db-file db- (register-items db (list (store-info path deriver references))- #:prefix prefix- #:log-port (%make-void-port "w")))))- (define %epoch ;; When it all began. (make-time time-utc 0 1))diff --git a/tests/store-database.scm b/tests/store-database.scmindex 33fd6cfbad..17eea38c63 100644--- a/tests/store-database.scm+++ b/tests/store-database.scm@@ -20,6 +20,7 @@ #:use-module (guix tests) #:use-module (guix store) #:use-module (guix store database)+ #:use-module (guix build store-copy) #:use-module ((guix utils) #:select (call-with-temporary-output-file)) #:use-module ((guix build utils) #:select (mkdir-p delete-file-recursively))@@ -34,7 +35,7 @@ (test-begin "store-database") -(test-assert "register-path"+(test-assert "register-items" (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake"))) (when (valid-path? %store file)@@ -46,9 +47,8 @@ (call-with-output-file file (cut display "This is a fake store item.\n" <>)) (reset-timestamps file)- (register-path file- #:references (list ref)- #:deriver drv)+ (with-database (store-database-file) db+ (register-items db (list (store-info file drv (list ref))))) (and (valid-path? %store file) (equal? (references %store file) (list ref))@@ -57,7 +57,7 @@ (list (stat:mtime (lstat file)) (stat:mtime (lstat ref))))))) -(test-equal "register-path, directory"+(test-equal "register-items, directory" '(1 1 1) (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake-directory")))@@ -70,7 +70,8 @@ (call-with-output-file (string-append file "/a/b") (const #t)) (reset-timestamps file)- (register-path file #:deriver drv)+ (with-database (store-database-file) db+ (register-items db (list (store-info file drv '())))) (and (valid-path? %store file) (null? (references %store file))@@ -102,7 +103,7 @@ (list (path-id db "/gnu/foo") (path-id db "/gnu/bar"))))))) -(test-assert "register-path with unregistered references"+(test-assert "sqlite-register with unregistered references" ;; Make sure we get a "NOT NULL constraint failed: Refs.reference" error ;; when we try to add references that are not registered yet. Better safe ;; than sorry.-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 12/15] system: 'init' does not recompute the hash of each store item.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-3-ludo@gnu.org
Fixes https://bugs.gnu.org/44760.
Previously, the 'register-path' call would re-traverse ITEM to computeits nar hash, even though that hash is already known in the initialstore. This patch also avoids repeated opening/closing of thedatabase.
* guix/store/database.scm (call-with-database): Export.* guix/scripts/system.scm (copy-item): Add 'db' parameter. Call'sqlite-register' instead of 'register-path'.(copy-closure): Remove redundant call to 'references*'. Call'call-with-database' and pass the database to 'copy-item'.--- .dir-locals.el | 1 + guix/scripts/system.scm | 59 ++++++++++++++++++++++------------------- guix/store/database.scm | 1 + 3 files changed, 34 insertions(+), 27 deletions(-)
Toggle diff (125 lines)diff --git a/.dir-locals.el b/.dir-locals.elindex 4eb27d8b1b..8f07a08eb5 100644--- a/.dir-locals.el+++ b/.dir-locals.el@@ -121,6 +121,7 @@ (eval . (put 'let-system 'scheme-indent-function 1)) (eval . (put 'with-database 'scheme-indent-function 2))+ (eval . (put 'call-with-database 'scheme-indent-function 1)) (eval . (put 'call-with-transaction 'scheme-indent-function 1)) (eval . (put 'with-statement 'scheme-indent-function 3)) (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 1))diff --git a/guix/scripts/system.scm b/guix/scripts/system.scmindex 0e543d9460..5427f875ec 100644--- a/guix/scripts/system.scm+++ b/guix/scripts/system.scm@@ -29,7 +29,9 @@ #:use-module (guix ui) #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix store)- #:autoload (guix store database) (register-path)+ #:autoload (guix base16) (bytevector->base16-string)+ #:autoload (guix store database)+ (sqlite-register store-database-file call-with-database) #:autoload (guix build store-copy) (copy-store-item) #:use-module (guix describe) #:use-module (guix grafts)@@ -130,12 +132,11 @@ BODY..., and restore them." (store-lift topologically-sorted)) -(define* (copy-item item references target+(define* (copy-item item info target db #:key (log-port (current-error-port)))- "Copy ITEM to the store under root directory TARGET and register it with-REFERENCES as its set of references."- (let ((dest (string-append target item))- (state (string-append target "/var/guix")))+ "Copy ITEM to the store under root directory TARGET and populate DB with the+given INFO, a <path-info> record."+ (let ((dest (string-append target item))) (format log-port "copying '~a'...~%" item) ;; Remove DEST if it exists to make sure that (1) we do not fail badly@@ -151,41 +152,45 @@ REFERENCES as its set of references." (copy-store-item item target #:deduplicate? #t) - ;; Register ITEM; as a side-effect, it resets timestamps, etc.- ;; Explicitly use "TARGET/var/guix" as the state directory, to avoid- ;; reproducing the user's current settings; see- ;; <http://bugs.gnu.org/18049>.- (unless (register-path item- #:prefix target- #:state-directory state- #:references references)- (leave (G_ "failed to register '~a' under '~a'~%")- item target))))+ (sqlite-register db+ #:path item+ #:references (path-info-references info)+ #:deriver (path-info-deriver info)+ #:hash (string-append+ "sha256:"+ (bytevector->base16-string (path-info-hash info)))+ #:nar-size (path-info-nar-size info)))) (define* (copy-closure item target #:key (log-port (current-error-port))) "Copy ITEM and all its dependencies to the store under root directory TARGET, and register them." (mlet* %store-monad ((to-copy (topologically-sorted* (list item)))- (refs (mapm %store-monad references* to-copy))- (info (mapm %store-monad query-path-info*- (delete-duplicates- (append to-copy (concatenate refs)))))+ (info (mapm %store-monad query-path-info* to-copy)) (size -> (reduce + 0 (map path-info-nar-size info)))) (define progress-bar (progress-reporter/bar (length to-copy) (format #f (G_ "copying to '~a'...") target))) + (define state+ (string-append target "/var/guix"))+ (check-available-space size target) - (call-with-progress-reporter progress-bar- (lambda (report)- (let ((void (%make-void-port "w")))- (for-each (lambda (item refs)- (copy-item item refs target #:log-port void)- (report))- to-copy refs))))+ ;; Explicitly use "TARGET/var/guix" as the state directory to avoid+ ;; reproducing the user's current settings; see+ ;; <http://bugs.gnu.org/18049>.+ (call-with-database (store-database-file #:prefix target+ #:state-directory state)+ (lambda (db)+ (call-with-progress-reporter progress-bar+ (lambda (report)+ (let ((void (%make-void-port "w")))+ (for-each (lambda (item info)+ (copy-item item info target db #:log-port void)+ (report))+ to-copy info)))))) (return *unspecified*))) diff --git a/guix/store/database.scm b/guix/store/database.scmindex c0010b72b9..9d5bc531bb 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -39,6 +39,7 @@ #:export (sql-schema %default-database-file store-database-file+ call-with-database with-database path-id sqlite-register-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 14/15] database: Honor 'SOURCE_DATE_EPOCH'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-5-ludo@gnu.org
* guix/store/database.scm (timestamp): New procedure.(sqlite-register): Use it as the default for #:time.(register-items): Likewise for #:registeration-time.--- guix/store/database.scm | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
Toggle diff (47 lines)diff --git a/guix/store/database.scm b/guix/store/database.scmindex 4579b05261..0a84bbddb9 100644--- a/guix/store/database.scm+++ b/guix/store/database.scm@@ -323,8 +323,19 @@ ids of items referred to." (sqlite-fold cons '() stmt)) references))) +(define (timestamp)+ "Return a timestamp, either the current time of SOURCE_DATE_EPOCH."+ (match (getenv "SOURCE_DATE_EPOCH")+ (#f+ (current-time time-utc))+ ((= string->number seconds)+ (if seconds+ (make-time time-utc 0 seconds)+ (current-time time-utc)))))+ (define* (sqlite-register db #:key path (references '())- deriver hash nar-size time)+ deriver hash nar-size+ (time (timestamp))) "Registers this stuff in DB. PATH is the store item to register and REFERENCES is the list of store items PATH refers to; DERIVER is the '.drv' that produced PATH, HASH is the base16-encoded Nix sha256 hash of@@ -337,9 +348,7 @@ Every store item in REFERENCES must already be registered." #:deriver deriver #:hash hash #:nar-size nar-size- #:time (time-second- (or time- (current-time time-utc))))))+ #:time (time-second time)))) ;; Call 'path-id' on each of REFERENCES. This ensures we get a ;; "non-NULL constraint" failure if one of REFERENCES is unregistered. (add-references db id@@ -388,7 +397,7 @@ is true." (define* (register-items db items #:key prefix- registration-time+ (registration-time (timestamp)) (log-port (current-error-port))) "Register all of ITEMS, a list of <store-info> records as returned by 'read-reference-graph', in DB. ITEMS must be in topological order (with-- 2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 15/15] deduplicate: Create the '.links' directory lazily.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-6-ludo@gnu.org
This avoids repeated (mkdir-p "/gnu/store/.links") calls whendeduplicating lots of files.
* guix/store/deduplication.scm (deduplicate): Remove initial call to'mkdir-p'. Add ENOENT case in 'link' exception handler. Reindent.* tests/store-deduplication.scm ("deduplicate, ENOSPC"): Checkfor (<= links 4) to account for the initial 'link' call.--- guix/store/deduplication.scm | 96 ++++++++++++++++++----------------- tests/store-deduplication.scm | 2 +- 2 files changed, 51 insertions(+), 47 deletions(-)
Toggle diff (122 lines)diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scmindex 8564f12107..a72a43bf79 100644--- a/guix/store/deduplication.scm+++ b/guix/store/deduplication.scm@@ -159,52 +159,56 @@ under STORE." (define links-directory (string-append store "/.links")) - (mkdir-p links-directory)- (let loop ((path path)- (type (stat:type (lstat path)))- (hash hash))- (if (eq? 'directory type)- ;; Can't hardlink directories, so hardlink their atoms.- (for-each (match-lambda- ((file . properties)- (unless (member file '("." ".."))- (let* ((file (string-append path "/" file))- (type (match (assoc-ref properties 'type)- ((or 'unknown #f)- (stat:type (lstat file)))- (type type))))- (loop file type- (and (not (eq? 'directory type))- (nar-sha256 file)))))))- (scandir* path))- (let ((link-file (string-append links-directory "/"- (bytevector->nix-base32-string hash))))- (if (file-exists? link-file)- (replace-with-link link-file path- #:swap-directory links-directory- #:store store)- (catch 'system-error- (lambda ()- (link path link-file))- (lambda args- (let ((errno (system-error-errno args)))- (cond ((= errno EEXIST)- ;; Someone else put an entry for PATH in- ;; LINKS-DIRECTORY before we could. Let's use it.- (replace-with-link path link-file- #:swap-directory- links-directory- #:store store))- ((= errno ENOSPC)- ;; There's not enough room in the directory index for- ;; more entries in .links, but that's fine: we can- ;; just stop.- #f)- ((= errno EMLINK)- ;; PATH has reached the maximum number of links, but- ;; that's OK: we just can't deduplicate it more.- #f)- (else (apply throw args)))))))))))+ (let loop ((path path)+ (type (stat:type (lstat path)))+ (hash hash))+ (if (eq? 'directory type)+ ;; Can't hardlink directories, so hardlink their atoms.+ (for-each (match-lambda+ ((file . properties)+ (unless (member file '("." ".."))+ (let* ((file (string-append path "/" file))+ (type (match (assoc-ref properties 'type)+ ((or 'unknown #f)+ (stat:type (lstat file)))+ (type type))))+ (loop file type+ (and (not (eq? 'directory type))+ (nar-sha256 file)))))))+ (scandir* path))+ (let ((link-file (string-append links-directory "/"+ (bytevector->nix-base32-string hash))))+ (if (file-exists? link-file)+ (replace-with-link link-file path+ #:swap-directory links-directory+ #:store store)+ (catch 'system-error+ (lambda ()+ (link path link-file))+ (lambda args+ (let ((errno (system-error-errno args)))+ (cond ((= errno EEXIST)+ ;; Someone else put an entry for PATH in+ ;; LINKS-DIRECTORY before we could. Let's use it.+ (replace-with-link path link-file+ #:swap-directory+ links-directory+ #:store store))+ ((= errno ENOENT)+ ;; This most likely means that LINKS-DIRECTORY does+ ;; not exist. Attempt to create it and try again.+ (mkdir-p links-directory)+ (loop path type hash))+ ((= errno ENOSPC)+ ;; There's not enough room in the directory index for+ ;; more entries in .links, but that's fine: we can+ ;; just stop.+ #f)+ ((= errno EMLINK)+ ;; PATH has reached the maximum number of links, but+ ;; that's OK: we just can't deduplicate it more.+ #f)+ (else (apply throw args))))))))))) (define (tee input len output) "Return a port that reads up to LEN bytes from INPUT and writes them todiff --git a/tests/store-deduplication.scm b/tests/store-deduplication.scmindex 7b01acae24..b1c2d93bbd 100644--- a/tests/store-deduplication.scm+++ b/tests/store-deduplication.scm@@ -95,7 +95,7 @@ (lambda () (set! link (lambda (old new) (set! links (+ links 1))- (if (<= links 3)+ (if (<= links 4) (true-link old new) (throw 'system-error "link" "~A" '("Whaaat?!") (list ENOSPC))))))-- 2.29.2
L
L
Ludovic Courtès wrote on 15 Dec 2020 17:33
Re: bug#44760: [PATCH 00/15] Speed up 'guix system init' & co.
(address . 44760-done@debbugs.gnu.org)
873607m3ka.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:
Toggle quote (20 lines)> serialization: 'fold-archive' notifies about directory processing> completion.> serialization: 'restore-file' sets canonical timestamp and> permissions.> nar: Deduplicate files right as they are restored.> store-copy: 'populate-store' resets timestamps.> image: 'register-closure' assumes already-reset timestamps.> database: Remove #:reset-timestamps? from 'register-items'.> store-copy: 'populate-store' can optionally deduplicate files.> image: 'register-closure' leaves it up to the caller to deduplicate.> database: Remove #:deduplicate? from 'register-items'.> guix system: 'init' copies, resets timestamps, and deduplicates at> once.> database: Remove #:deduplicate? and #:reset-timestamps? from> 'register-path'.> system: 'init' does not recompute the hash of each store item.> database: Remove 'register-path'.> database: Honor 'SOURCE_DATE_EPOCH'.> deduplicate: Create the '.links' directory lazily.
Pushed as 7530e491b517497b7b8166b5ccecdc3d4cdb468d!
Ludo'.
Closed
L
L
Ludovic Courtès wrote on 15 Dec 2020 17:38
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(address . 44760@debbugs.gnu.org)
87y2hzkos7.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:
Toggle quote (4 lines)> ‘guix system init’ ends by copying the system’s closure from the “host”> store to the target store; it also initializes the database of that> target store.
There’s one ‘guix system init’ optimization that would be nice to havein the installation image: instead of copying store items at all, simplymove the store from its temporary location in /mnt/tmp/guix-inst to/mnt/gnu/store. That’d make it zero-copy.
It would probably require a separate code path, but it’s worthconsidering.
Thoughts?
Ludo’.
J
J
Jonathan Brielmaier wrote on 16 Dec 2020 22:53
Closure copy in ‘guix system init’ is ine fficient
(address . 44760@debbugs.gnu.org)
a5e9cb2c-16b8-f880-3c71-34b0f03b7696@web.de
Hi Ludo,
there seems to be another regression in this series.
Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives:```[ 1/20] Loading './guix/base32.scm'...
[ 2/20] Loading './guix/build/store-copy.scm'...
[ 3/20] Loading './guix/build/syscalls.scm'...
[ 4/20] Loading './guix/build/utils.scm'...
[ 5/20] Loading './guix/combinators.scm'...
[ 6/20] Loading './guix/progress.scm'...
[ 7/20] Loading './guix/records.scm'...
[ 8/20] Loading './guix/serialization.scm'...
[ 9/20] Loading './guix/sets.scm'...
[10/20] Loading './guix/store/deduplication.scm'...
;;; Failed to autoload copy-file/deduplicate in (guix store deduplication):
;;; no code for module (gcrypt hash)
Backtrace:
17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?")
In ice-9/eval.scm:
619:8 16 (_ #f)
In srfi/srfi-1.scm:
460:18 15 (fold #<procedure 7fffeeb741e0 at ice-9/eval.scm:336:1?> ?)
460:18 14 (fold #<procedure 7fffeeb81ac0 at ice-9/eval.scm:336:1?> ?)
460:18 13 (fold #<procedure 7fffeecdaf20 at ice-9/eval.scm:336:1?> ?)
In ice-9/eval.scm:
619:8 12 (_ #(#(#<directory (guix build utils) 7fffef187d20>) # ?))
In ice-9/boot-9.scm:
2806:4 11 (save-module-excursion #<procedure 7fffeecdad40 at ice-?>)
In unknown file:
10 (primitive-load "./guix/store/deduplication.scm")
In ice-9/eval.scm:
721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?))
In ice-9/psyntax.scm:
1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?)
1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)
285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?)
In ice-9/eval.scm:
293:34 5 (_ #<directory (guix build utils) 7fffef187d20>)
In ice-9/boot-9.scm:
3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?)
2565:24 3 (call-with-deferred-observers #<procedure 7fffeec53cd0 ?>)
3393:24 2 (_)
222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?))
3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?)


ice-9/boot-9.scm:3300:6: In procedure resolve-interface:
no code for module (gcrypt hash)
```All commits before in this series are fine. The fix ind88ff09ea3138fc85c1463b0b345bd6ba71ca568 does not really help.
L
L
Ludovic Courtès wrote on 17 Dec 2020 14:24
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(name . Jonathan Brielmaier)(address . jonathan.brielmaier@web.de)(address . 44760@debbugs.gnu.org)
87v9d0imzq.fsf@gnu.org
Hi,
Jonathan Brielmaier <jonathan.brielmaier@web.de> skribis:
Toggle quote (6 lines)> there seems to be another regression in this series.>> Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives:> ```> [ 1/20] Loading './guix/base32.scm'...
[...]
Toggle quote (68 lines)> [10/20] Loading './guix/store/deduplication.scm'...>> ;;; Failed to autoload copy-file/deduplicate in (guix store deduplication):>> ;;; no code for module (gcrypt hash)>> Backtrace:>> 17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?")>> In ice-9/eval.scm:>> 619:8 16 (_ #f)>> In srfi/srfi-1.scm:>> 460:18 15 (fold #<procedure 7fffeeb741e0 at ice-9/eval.scm:336:1?> ?)>> 460:18 14 (fold #<procedure 7fffeeb81ac0 at ice-9/eval.scm:336:1?> ?)>> 460:18 13 (fold #<procedure 7fffeecdaf20 at ice-9/eval.scm:336:1?> ?)>> In ice-9/eval.scm:>> 619:8 12 (_ #(#(#<directory (guix build utils) 7fffef187d20>) # ?))>> In ice-9/boot-9.scm:>> 2806:4 11 (save-module-excursion #<procedure 7fffeecdad40 at ice-?>)>> In unknown file:>> 10 (primitive-load "./guix/store/deduplication.scm")>> In ice-9/eval.scm:>> 721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?))>> In ice-9/psyntax.scm:>> 1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?)>> 1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)>> 285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?)>> In ice-9/eval.scm:>> 293:34 5 (_ #<directory (guix build utils) 7fffef187d20>)>> In ice-9/boot-9.scm:>> 3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?)>> 2565:24 3 (call-with-deferred-observers #<procedure 7fffeec53cd0 ?>)>> 3393:24 2 (_)>> 222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?))>> 3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?)>>>> ice-9/boot-9.scm:3300:6: In procedure resolve-interface:>> no code for module (gcrypt hash)
As discussed on IRC, the following command works for me:
guix time-machine \ --commit=6a060ff27ff68384d7c90076baa36c349fff689d -- \ pull -p /tmp/test
However, I don’t have enough information about the issue.
Could you provide information on how to reproduce the issue?
Thanks in advance,Ludo’.
?