Closure copy in ‘guix system init’ is inefficient

  • Done
  • quality assurance status badge
Details
3 participants
  • Jonathan Brielmaier
  • Ludovic Courtès
  • raingloom
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
important
L
L
Ludovic Courtès wrote on 20 Nov 2020 12:02
Closure copy in ‘guix system init ’ is inefficient
(address . bug-guix@gnu.org)
87h7pkffzy.fsf@inria.fr
‘guix system init’ ends by copying the system’s closure from the “host”
store to the target store; it also initializes the database of that
target store.

That copy is inefficient for several reasons. Let’s pick one file,
shred.1.gz, that ends up being copied, and let’s look at its occurrences
in the strace log of ‘guix system init config.scm /tmp/os’:

Toggle snippet (31 lines)
$ grep -A2 '/shred.1.gz' ,,s
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 15
fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16
read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 8192) = 1490
write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=1605721025, tv_nsec=616985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /* 1970-01-01T01:00:01+0100 */], 0) = 0
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", O_RDONLY) = 15
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR) = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR) = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl") = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) = 17

First, /tmp/os/…/shred.1.gz is read entirely twice: once in
‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
second time for deduplication (the ‘deduplicate’ call in there.)

The ‘nar-sha256’ call could be avoided because the database of
/gnu/store contains that value. As for deduplication, we could perhaps
create those ‘.links’ entries as we copy files instead of re-traversing
the whole thing afterwards.

Second, all of /tmp/os is traversed to reset timestamps, although we
could have cleared those timestamps when we created those files in the
first place (https://issues.guix.gnu.org/44741 prevents that though,
unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)

Third, in the case of the installer, we’re really copying from
/mnt/guix-inst/store to /mnt/gnu/store, which is likely the same
device. In this case we could create hard links instead of actually
copying files.

Fourth, we’re adding items one by one in the target store database, but
it may be more efficient to more or less dump the subset of the source
database in bulk.

Surely we can do better.

Ludo’.
L
L
Ludovic Courtès wrote on 21 Nov 2020 12:01
control message for bug #44760
(address . control@debbugs.gnu.org)
87o8jrc6sa.fsf@gnu.org
severity 44760 important
quit
R
R
raingloom wrote on 22 Nov 2020 20:46
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44760@debbugs.gnu.org)
20201122204634.2730df12@riseup.net
On Fri, 20 Nov 2020 12:02:25 +0100
Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (87 lines)
> ‘guix system init’ ends by copying the system’s closure from the
> “host” store to the target store; it also initializes the database of
> that target store.
>
> That copy is inefficient for several reasons. Let’s pick one file,
> shred.1.gz, that ends up being copied, and let’s look at its
> occurrences in the strace log of ‘guix system init config.scm
> /tmp/os’:
>
> --8<---------------cut here---------------start------------->8---
> $ grep -A2 '/shred.1.gz' ,,s
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 15 fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) =
> 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16 read(15,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 8192) = 1490 write(16,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 -- utimensat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> [{tv_sec=1605721025, tv_nsec=616985411} /*
> 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /*
> 1970-01-01T01:00:01+0100 */], 0) = 0
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 openat(AT_FDCWD,
> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> O_RDONLY) = 15 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR) = 0 read(17,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR) = 0 read(17,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 --
> link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl")
> = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> O_RDONLY) = 17 --8<---------------cut
> here---------------end--------------->8---
>
> First, /tmp/os/…/shred.1.gz is read entirely twice: once in
> ‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
> second time for deduplication (the ‘deduplicate’ call in there.)
>
> The ‘nar-sha256’ call could be avoided because the database of
> /gnu/store contains that value. As for deduplication, we could
> perhaps create those ‘.links’ entries as we copy files instead of
> re-traversing the whole thing afterwards.
>
> Second, all of /tmp/os is traversed to reset timestamps, although we
> could have cleared those timestamps when we created those files in the
> first place (<https://issues.guix.gnu.org/44741> prevents that though,
> unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)
>
> Third, in the case of the installer, we’re really copying from
> /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same
> device. In this case we could create hard links instead of actually
> copying files.
>
> Fourth, we’re adding items one by one in the target store database,
> but it may be more efficient to more or less dump the subset of the
> source database in bulk.
>
> Surely we can do better.
>
> Ludo’.
>
>
>

Also, if a store is already present (eg.: because of a partial
install), it could make sense to (optionally) keep its contents. AFAIK
this is still not possible. It was one the bigger time sinks while I
was working on the F2FS support.
L
L
Ludovic Courtès wrote on 22 Nov 2020 22:10
(name . raingloom)(address . raingloom@riseup.net)(address . 44760@debbugs.gnu.org)
87v9dxkshz.fsf@gnu.org
Hi,

raingloom <raingloom@riseup.net> skribis:

Toggle quote (5 lines)
> Also, if a store is already present (eg.: because of a partial
> install), it could make sense to (optionally) keep its contents. AFAIK
> this is still not possible. It was one the bigger time sinks while I
> was working on the F2FS support.

It’s on purpose that ‘guix system init’ forcefully wipes store items
already present before copying them; some time ago it was not the case
and that led to problems:


I think it’s safer to keep it unchanged.

Ludo’.
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 00/15] Speed up 'guix system init' & co.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-1-ludo@gnu.org
Hi there!

Here’s a long and rather boring patch series to address

To avoid traversing store items repeatedly as described in the
issue above, the strategy here is to gradually move the
reset-timestamps and deduplicate phases as part of the file
copying process, such that each file is accessed only once.
Consequently, the kitchen sink that ‘register-items’ once was
is now very focused.

Furthermore, it changes ‘guix system init’ so that it reuses
the already-known store item hashes when populating the target
database instead of re-traversing store items.

On my laptop (SSD, warm cache, derivations already built), the
command:

guix system init gnu/system/examples/bare-bones.tmpl /tmp/sys

goes from 32s to 22s, a 33% improvement.

Feedback welcome!

Ludo’.

Ludovic Courtès (15):
serialization: 'fold-archive' notifies about directory processing
completion.
serialization: 'restore-file' sets canonical timestamp and
permissions.
nar: Deduplicate files right as they are restored.
store-copy: 'populate-store' resets timestamps.
image: 'register-closure' assumes already-reset timestamps.
database: Remove #:reset-timestamps? from 'register-items'.
store-copy: 'populate-store' can optionally deduplicate files.
image: 'register-closure' leaves it up to the caller to deduplicate.
database: Remove #:deduplicate? from 'register-items'.
guix system: 'init' copies, resets timestamps, and deduplicates at
once.
database: Remove #:deduplicate? and #:reset-timestamps? from
'register-path'.
system: 'init' does not recompute the hash of each store item.
database: Remove 'register-path'.
database: Honor 'SOURCE_DATE_EPOCH'.
deduplicate: Create the '.links' directory lazily.

.dir-locals.el | 1 +
gnu/build/image.scm | 16 +-
gnu/build/install.scm | 3 +-
gnu/build/linux-initrd.scm | 3 +-
gnu/build/vm.scm | 14 +-
gnu/system/install.scm | 12 +-
gnu/system/linux-initrd.scm | 10 +-
guix/build/store-copy.scm | 133 ++++++++++++----
guix/nar.scm | 8 +-
guix/scripts/archive.scm | 2 +
guix/scripts/challenge.scm | 1 +
guix/scripts/pack.scm | 276 +++++++++++++++++-----------------
guix/scripts/system.scm | 64 ++++----
guix/serialization.scm | 36 +++--
guix/store/database.scm | 58 ++-----
guix/store/deduplication.scm | 167 ++++++++++++++------
tests/gexp.scm | 20 ++-
tests/guix-archive.sh | 4 +-
tests/nar.scm | 21 ++-
tests/store-database.scm | 18 ++-
tests/store-deduplication.scm | 20 ++-
21 files changed, 544 insertions(+), 343 deletions(-)

--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 01/15] serialization: 'fold-archive' notifies about directory processing completion.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-2-ludo@gnu.org
* guix/serialization.scm (fold-archive): Call PROC with a
'directory-complete tag when done with a directory.
(restore-file): Handle it.
* guix/scripts/archive.scm (list-contents): Likewise.
* guix/scripts/challenge.scm (archive-contents): Likewise.
* tests/nar.scm ("write-file-tree + fold-archive"): Adjust accordingly.
---
guix/scripts/archive.scm | 2 ++
guix/scripts/challenge.scm | 1 +
guix/serialization.scm | 5 ++++-
tests/nar.scm | 6 ++++--
4 files changed, 11 insertions(+), 3 deletions(-)

Toggle diff (73 lines)
diff --git a/guix/scripts/archive.scm b/guix/scripts/archive.scm
index c04baf9784..1f73fff711 100644
--- a/guix/scripts/archive.scm
+++ b/guix/scripts/archive.scm
@@ -347,6 +347,8 @@ output port."
(match type
('directory
(format #t "D ~a~%" file))
+ ('directory-complete
+ #t)
('symlink
(format #t "S ~a -> ~a~%" file content))
((or 'regular 'executable)
diff --git a/guix/scripts/challenge.scm b/guix/scripts/challenge.scm
index 39bd2c1c0f..d0a456ac1d 100644
--- a/guix/scripts/challenge.scm
+++ b/guix/scripts/challenge.scm
@@ -210,6 +210,7 @@ taken since we do not import the archives."
(cons `(,file ,type ,(port-sha256* port size))
result))))
('directory result)
+ ('directory-complete result)
('symlink
(cons `(,file ,type ,contents) result))))
'()
diff --git a/guix/serialization.scm b/guix/serialization.scm
index 836ad06caf..cc56134ef4 100644
--- a/guix/serialization.scm
+++ b/guix/serialization.scm
@@ -444,7 +444,8 @@ depends on TYPE."
(file file)
(token x))))))
(loop (read-string port) result)))))
- (")" result) ;done with DIR
+ (")" ;done with DIR
+ (proc file 'directory-complete #f result))
(x
(raise
(condition
@@ -463,6 +464,8 @@ Restore it as FILE."
(match type
('directory
(mkdir file))
+ ('directory-complete
+ #t)
('symlink
(symlink content file))
((or 'regular 'executable)
diff --git a/tests/nar.scm b/tests/nar.scm
index aeff3d3330..b542ebd47c 100644
--- a/tests/nar.scm
+++ b/tests/nar.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -218,8 +218,10 @@
'(("R" directory #f)
("R/dir" directory #f)
("R/dir/exe" executable "1234")
+ ("R/dir" directory-complete #f)
("R/foo" regular "abcdefg")
- ("R/lnk" symlink "foo"))
+ ("R/lnk" symlink "foo")
+ ("R" directory-complete #f))
(let ()
(define-values (port get-bytevector)
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 02/15] serialization: 'restore-file' sets canonical timestamp and permissions.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-3-ludo@gnu.org
* guix/serialization.scm (restore-file): Set the permissions and mtime
of FILE.
* guix/nar.scm (finalize-store-file): Pass #:reset-timestamps? #f to
'register-items'.
* tests/nar.scm (rm-rf): Add 'chmod' calls to ensure files are writable.
("write-file + restore-file with symlinks"): Ensure every file in OUTPUT
passes 'canonical-file?'.
* tests/guix-archive.sh: Run "chmod -R +w" before "rm -rf".
---
guix/nar.scm | 8 +++++---
guix/serialization.scm | 14 +++++++++-----
tests/guix-archive.sh | 4 ++--
tests/nar.scm | 12 ++++++++++--
4 files changed, 26 insertions(+), 12 deletions(-)

Toggle diff (110 lines)
diff --git a/guix/nar.scm b/guix/nar.scm
index a23af2e5de..edfcc9aab5 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -114,10 +114,12 @@ held."
;; Install the new TARGET.
(rename-file source target)
- ;; Register TARGET. As a side effect, it resets the timestamps of all
- ;; its files, recursively, and runs a deduplication pass.
+ ;; Register TARGET. As a side effect, run a deduplication pass.
+ ;; Timestamps and permissions are already correct thanks to
+ ;; 'restore-file'.
(register-items db
- (list (store-info target deriver references))))
+ (list (store-info target deriver references))
+ #:reset-timestamps? #f))
(when lock?
(delete-file (string-append target ".lock"))
diff --git a/guix/serialization.scm b/guix/serialization.scm
index cc56134ef4..677ca60b66 100644
--- a/guix/serialization.scm
+++ b/guix/serialization.scm
@@ -459,23 +459,27 @@ depends on TYPE."
(define (restore-file port file)
"Read a file (possibly a directory structure) in Nar format from PORT.
-Restore it as FILE."
+Restore it as FILE with canonical permissions and timestamps."
(fold-archive (lambda (file type content result)
(match type
('directory
(mkdir file))
('directory-complete
- #t)
+ (chmod file #o555)
+ (utime file 1 1 0 0))
('symlink
- (symlink content file))
+ (symlink content file)
+ (utime file 1 1 0 0 AT_SYMLINK_NOFOLLOW))
((or 'regular 'executable)
(match content
((input . size)
(call-with-output-file file
(lambda (output)
(dump input output size)
- (when (eq? type 'executable)
- (chmod output #o755)))))))))
+ (chmod output (if (eq? type 'executable)
+ #o555
+ #o444))))
+ (utime file 1 1 0 0))))))
#t
port
file))
diff --git a/tests/guix-archive.sh b/tests/guix-archive.sh
index e796c62f9a..00b87ff0ac 100644
--- a/tests/guix-archive.sh
+++ b/tests/guix-archive.sh
@@ -1,5 +1,5 @@
# GNU Guix --- Functional package management for GNU
-# Copyright © 2013, 2014, 2015, 2019 Ludovic Courtès <ludo@gnu.org>
+# Copyright © 2013, 2014, 2015, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
#
# This file is part of GNU Guix.
#
@@ -28,7 +28,7 @@ tmpdir="t-archive-dir-$$"
rm -f "$archive" "$archive_alt"
rm -rf "$tmpdir"
-trap 'rm -f "$archive" "$archive_alt"; rm -rf "$tmpdir"' EXIT
+trap 'rm -f "$archive" "$archive_alt"; chmod -R +w "$tmpdir"; rm -rf "$tmpdir"' EXIT
guix archive --export guile-bootstrap > "$archive"
guix archive --export guile-bootstrap:out > "$archive_alt"
diff --git a/tests/nar.scm b/tests/nar.scm
index b542ebd47c..59616659c8 100644
--- a/tests/nar.scm
+++ b/tests/nar.scm
@@ -136,8 +136,11 @@
(define (rm-rf dir)
(file-system-fold (const #t) ; enter?
(lambda (file stat result) ; leaf
+ (unless (eq? 'symlink (stat:type stat))
+ (chmod file #o644))
(delete-file file))
- (const #t) ; down
+ (lambda (dir stat result) ; down
+ (chmod dir #o755))
(lambda (dir stat result) ; up
(rmdir dir))
(const #t) ; skip
@@ -363,7 +366,12 @@
(cut write-file input <>))
(call-with-input-file nar
(cut restore-file <> output))
- (file-tree-equal? input output))
+
+ (and (file-tree-equal? input output)
+ (every (lambda (file)
+ (canonical-file?
+ (string-append output "/" file)))
+ '("root" "root/reg" "root/exe"))))
(lambda ()
(false-if-exception (delete-file nar))
(false-if-exception (rm-rf output)))))))
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 03/15] nar: Deduplicate files right as they are restored.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-4-ludo@gnu.org
This avoids having to traverse and re-read the files that we have just
restored, thereby reducing I/O.

* guix/serialization.scm (dump-file): New procedure.
(restore-file): Add #:dump-file parameter and honor it.
* guix/store/deduplication.scm (tee, dump-file/deduplicate): New
procedures.
* guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'.
(finalize-store-file): Pass #:deduplicate? #f to 'register-items'.
* tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".
---
guix/nar.scm | 12 ++++----
guix/serialization.scm | 27 ++++++++++++-----
guix/store/deduplication.scm | 57 +++++++++++++++++++++++++++++++++++-
tests/nar.scm | 3 ++
4 files changed, 85 insertions(+), 14 deletions(-)

Toggle diff (178 lines)
diff --git a/guix/nar.scm b/guix/nar.scm
index edfcc9aab5..ba035ca6dc 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -27,6 +27,7 @@
;; (guix store) since this is "daemon-side" code.
#:use-module (guix store)
#:use-module (guix store database)
+ #:use-module ((guix store deduplication) #:select (dump-file/deduplicate))
#:use-module ((guix build store-copy) #:select (store-info))
#:use-module (guix i18n)
@@ -114,12 +115,12 @@ held."
;; Install the new TARGET.
(rename-file source target)
- ;; Register TARGET. As a side effect, run a deduplication pass.
- ;; Timestamps and permissions are already correct thanks to
- ;; 'restore-file'.
+ ;; Register TARGET. The 'restore-file' call took care of
+ ;; deduplication, timestamps, and permissions.
(register-items db
(list (store-info target deriver references))
- #:reset-timestamps? #f))
+ #:reset-timestamps? #f
+ #:deduplicate? #f))
(when lock?
(delete-file (string-append target ".lock"))
@@ -212,7 +213,8 @@ s-expression"))
(let-values (((port get-hash)
(open-sha256-input-port port)))
(with-temporary-store-file temp
- (restore-file port temp)
+ (restore-file port temp
+ #:dump-file dump-file/deduplicate)
(let ((magic (read-int port)))
(unless (= magic %export-magic)
diff --git a/guix/serialization.scm b/guix/serialization.scm
index 677ca60b66..9e2dce8bb0 100644
--- a/guix/serialization.scm
+++ b/guix/serialization.scm
@@ -457,9 +457,22 @@ depends on TYPE."
(&message (message "unsupported nar entry type"))
(&nar-read-error (port port) (file file) (token x)))))))))
-(define (restore-file port file)
+(define (dump-file file input size type)
+ "Dump SIZE bytes from INPUT to FILE."
+ (call-with-output-file file
+ (lambda (output)
+ (dump input output size))))
+
+(define* (restore-file port file
+ #:key (dump-file dump-file))
"Read a file (possibly a directory structure) in Nar format from PORT.
-Restore it as FILE with canonical permissions and timestamps."
+Restore it as FILE with canonical permissions and timestamps. To write a
+regular or executable file, call:
+
+ (DUMP-FILE FILE INPUT SIZE TYPE)
+
+The default is to dump SIZE bytes from INPUT to FILE, but callers can provide
+a custom procedure, for instance to deduplicate FILE on the fly."
(fold-archive (lambda (file type content result)
(match type
('directory
@@ -473,12 +486,10 @@ Restore it as FILE with canonical permissions and timestamps."
((or 'regular 'executable)
(match content
((input . size)
- (call-with-output-file file
- (lambda (output)
- (dump input output size)
- (chmod output (if (eq? type 'executable)
- #o555
- #o444))))
+ (dump-file file input size type)
+ (chmod file (if (eq? type 'executable)
+ #o555
+ #o444))
(utime file 1 1 0 0))))))
#t
port
diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm
index 0655ceb890..b4d37d4525 100644
--- a/guix/store/deduplication.scm
+++ b/guix/store/deduplication.scm
@@ -26,12 +26,15 @@
#:use-module (guix build syscalls)
#:use-module (guix base32)
#:use-module (srfi srfi-11)
+ #:use-module (srfi srfi-34)
+ #:use-module (srfi srfi-35)
#:use-module (rnrs io ports)
#:use-module (ice-9 ftw)
#:use-module (ice-9 match)
#:use-module (guix serialization)
#:export (nar-sha256
- deduplicate))
+ deduplicate
+ dump-file/deduplicate))
;; XXX: This port is used as a workaround on Guile <= 2.2.4 where
;; 'port-position' throws to 'out-of-range' when the offset is great than or
@@ -201,3 +204,55 @@ under STORE."
;; that's OK: we just can't deduplicate it more.
#f)
(else (apply throw args)))))))))))
+
+(define (tee input len output)
+ "Return a port that reads up to LEN bytes from INPUT and writes them to
+OUTPUT as it goes."
+ (define bytes-read 0)
+
+ (define (fail)
+ ;; Reached EOF before we had read LEN bytes from INPUT.
+ (raise (condition
+ (&nar-error (port input)
+ (file (port-filename output))))))
+
+ (define (read! bv start count)
+ ;; Read at most LEN bytes in total.
+ (let ((count (min count (- len bytes-read))))
+ (let loop ((ret (get-bytevector-n! input bv start count)))
+ (cond ((eof-object? ret)
+ (if (= bytes-read len)
+ 0 ; EOF
+ (fail)))
+ ((and (zero? ret) (> count 0))
+ ;; Do not return zero since zero means EOF, so try again.
+ (loop (get-bytevector-n! input bv start count)))
+ (else
+ (put-bytevector output bv start ret)
+ (set! bytes-read (+ bytes-read ret))
+ ret)))))
+
+ (make-custom-binary-input-port "tee input port" read! #f #f #f))
+
+(define* (dump-file/deduplicate file input size type
+ #:key (store (%store-directory)))
+ "Write SIZE bytes read from INPUT to FILE. TYPE is a symbol, either
+'regular or 'executable.
+
+This procedure is suitable as a #:dump-file argument to 'restore-file'. When
+used that way, it deduplicates files on the fly as they are restored, thereby
+removing the need to a deduplication pass that would re-read all the files
+down the road."
+ (define hash
+ (call-with-output-file file
+ (lambda (output)
+ (let-values (((hash-port get-hash)
+ (open-hash-port (hash-algorithm sha256))))
+ (write-file-tree file hash-port
+ #:file-type+size (lambda (_) (values type size))
+ #:file-port
+ (const (tee input size output)))
+ (close-port hash-port)
+ (get-hash)))))
+
+ (deduplicate file hash #:store store))
diff --git a/tests/nar.scm b/tests/nar.scm
index 59616659c8..ba4881caaa 100644
--- a/tests/nar.scm
+++ b/tests/nar.scm
@@ -452,6 +452,9 @@
(false-if-exception (rm-rf %test-dir))
(setlocale LC_ALL locale)))))
+;; XXX: Tell the 'deduplicate' procedure what store we're actually using.
+(setenv "NIX_STORE" (%store-prefix))
+
(test-assert "restore-file-set (signed, valid)"
(with-store store
(let* ((texts (unfold (cut >= <> 10)
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 04/15] store-copy: 'populate-store' resets timestamps.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-5-ludo@gnu.org
Until now, 'populate-store' would reset permissions but not timestamps,
so callers would resort to going through an extra directory traversal to
reset timestamps.

* guix/build/store-copy.scm (reset-permissions): Remove.
(copy-recursively): New procedure.
(populate-store): Pass #:keep-permissions? to 'copy-recursively'.
Remove call to 'reset-permissions'.
* tests/gexp.scm ("gexp->derivation, store copy"): In BUILD-DRV, check
whether 'populate-store' canonicalizes permissions and timestamps.
* gnu/build/image.scm (initialize-root-partition): Pass #:reset-timestamps? #f
to 'register-closure'.
* gnu/build/vm.scm (root-partition-initializer): Likewise.
---
gnu/build/image.scm | 5 +-
gnu/build/vm.scm | 2 +-
guix/build/store-copy.scm | 103 +++++++++++++++++++++++++++-----------
tests/gexp.scm | 19 ++++++-
4 files changed, 95 insertions(+), 34 deletions(-)

Toggle diff (202 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 640a784204..2857362914 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -196,9 +196,8 @@ register-closure."
(when register-closures?
(for-each (lambda (closure)
- (register-closure root
- closure
- #:reset-timestamps? #t
+ (register-closure root closure
+ #:reset-timestamps? #f
#:deduplicate? deduplicate?
#:wal-mode? wal-mode?))
references-graphs))
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index 287d099f79..30feaf800f 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -414,7 +414,7 @@ system that is passed to 'populate-root-file-system'."
(for-each (lambda (closure)
(register-closure target
(string-append "/xchg/" closure)
- #:reset-timestamps? copy-closures?
+ #:reset-timestamps? #f
#:deduplicate? deduplicate?))
closures)
(unless copy-closures?
diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm
index ad551bca98..95dcb8e114 100644
--- a/guix/build/store-copy.scm
+++ b/guix/build/store-copy.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2013, 2014, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2013, 2014, 2017, 2018, 2020 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -17,7 +17,7 @@
;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>.
(define-module (guix build store-copy)
- #:use-module (guix build utils)
+ #:use-module ((guix build utils) #:hide (copy-recursively))
#:use-module (guix sets)
#:use-module (guix progress)
#:use-module (srfi srfi-1)
@@ -169,32 +169,83 @@ REFERENCE-GRAPHS, a list of reference-graph files."
(reduce + 0 (map file-size items)))
-(define (reset-permissions file)
- "Reset the permissions on FILE and its sub-directories so that they are all
-read-only."
- ;; XXX: This procedure exists just to work around the inability of
- ;; 'copy-recursively' to preserve permissions.
- (file-system-fold (const #t) ;enter?
- (lambda (file stat _) ;leaf
- (unless (eq? 'symlink (stat:type stat))
- (chmod file
- (if (zero? (logand (stat:mode stat)
- #o100))
- #o444
- #o555))))
- (const #t) ;down
- (lambda (directory stat _) ;up
- (chmod directory #o555))
- (const #f) ;skip
- (const #f) ;error
+;; TODO: Remove when the one in (guix build utils) has #:keep-permissions?,
+;; the fix for <https://bugs.gnu.org/44741>, and when #:keep-mtime? works for
+;; symlinks.
+(define* (copy-recursively source destination
+ #:key
+ (log (current-output-port))
+ (follow-symlinks? #f)
+ (copy-file copy-file)
+ keep-mtime? keep-permissions?)
+ "Copy SOURCE directory to DESTINATION. Follow symlinks if FOLLOW-SYMLINKS?
+is true; otherwise, just preserve them. Call COPY-FILE to copy regular files.
+When KEEP-MTIME? is true, keep the modification time of the files in SOURCE on
+those of DESTINATION. When KEEP-PERMISSIONS? is true, preserve file
+permissions. Write verbose output to the LOG port."
+ (define AT_SYMLINK_NOFOLLOW
+ ;; Guile 2.0 did not define this constant, hence this hack.
+ (let ((variable (module-variable the-root-module 'AT_SYMLINK_NOFOLLOW)))
+ (if variable
+ (variable-ref variable)
+ 256))) ;for GNU/Linux
+
+ (define (set-file-time file stat)
+ (utime file
+ (stat:atime stat)
+ (stat:mtime stat)
+ (stat:atimensec stat)
+ (stat:mtimensec stat)
+ AT_SYMLINK_NOFOLLOW))
+
+ (define strip-source
+ (let ((len (string-length source)))
+ (lambda (file)
+ (substring file len))))
+
+ (file-system-fold (const #t) ; enter?
+ (lambda (file stat result) ; leaf
+ (let ((dest (string-append destination
+ (strip-source file))))
+ (format log "`~a' -> `~a'~%" file dest)
+ (case (stat:type stat)
+ ((symlink)
+ (let ((target (readlink file)))
+ (symlink target dest)))
+ (else
+ (copy-file file dest)
+ (when keep-permissions?
+ (chmod dest (stat:perms stat)))))
+ (when keep-mtime?
+ (set-file-time dest stat))))
+ (lambda (dir stat result) ; down
+ (let ((target (string-append destination
+ (strip-source dir))))
+ (mkdir-p target)))
+ (lambda (dir stat result) ; up
+ (let ((target (string-append destination
+ (strip-source dir))))
+ (when keep-mtime?
+ (set-file-time target stat))
+ (when keep-permissions?
+ (chmod target (stat:perms stat)))))
+ (const #t) ; skip
+ (lambda (file stat errno result)
+ (format (current-error-port) "i/o error: ~a: ~a~%"
+ file (strerror errno))
+ #f)
#t
- file
- lstat))
+ source
+
+ (if follow-symlinks?
+ stat
+ lstat)))
(define* (populate-store reference-graphs target
#:key (log-port (current-error-port)))
"Populate the store under directory TARGET with the items specified in
-REFERENCE-GRAPHS, a list of reference-graph files."
+REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET
+maintain timestamps and permissions."
(define store
(string-append target (%store-directory)))
@@ -221,12 +272,8 @@ REFERENCE-GRAPHS, a list of reference-graph files."
(copy-recursively thing
(string-append target thing)
#:keep-mtime? #t
+ #:keep-permissions? #t
#:log (%make-void-port "w"))
-
- ;; XXX: Since 'copy-recursively' doesn't allow us to
- ;; preserve permissions, we have to traverse TARGET to
- ;; make sure everything is read-only.
- (reset-permissions (string-append target thing))
(report))
things)))))
diff --git a/tests/gexp.scm b/tests/gexp.scm
index 686334af61..a0e55178fa 100644
--- a/tests/gexp.scm
+++ b/tests/gexp.scm
@@ -723,10 +723,25 @@
(lambda (port)
(display "This is the second one." port))))))
(build-drv #~(begin
- (use-modules (guix build store-copy))
+ (use-modules (guix build store-copy)
+ (guix build utils)
+ (srfi srfi-1))
+
+ (define (canonical-file? file)
+ ;; Copied from (guix tests).
+ (let ((st (lstat file)))
+ (or (not (string-prefix? (%store-directory) file))
+ (eq? 'symlink (stat:type st))
+ (and (= 1 (stat:mtime st))
+ (zero? (logand #o222 (stat:mode st)))))))
(mkdir #$output)
- (populate-store '("graph") #$output))))
+ (populate-store '("graph") #$output)
+
+ ;; Check whether 'populate-store' canonicalizes
+ ;; permissions and timestamps.
+ (unless (every canonical-file? (find-files #$output))
+ (error "not canonical!" #$output)))))
(mlet* %store-monad ((one (gexp->derivation "one" build-one))
(two (gexp->derivation "two" (build-two one)))
(drv (gexp->derivation "store-copy" build-drv
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 05/15] image: 'register-closure' assumes already-reset timestamps.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-6-ludo@gnu.org
* gnu/build/image.scm (register-closure): Remove #:reset-timestamps?
parameter. Pass #:reset-timestamps? #f to 'register-items'.
(initialize-root-partition): Adjust accordingly.
* gnu/build/vm.scm (register-closure, root-partition-initializer):
Likewise.
---
gnu/build/image.scm | 8 +++-----
gnu/build/vm.scm | 8 +++-----
2 files changed, 6 insertions(+), 10 deletions(-)

Toggle diff (75 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 2857362914..4f80a1964f 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -140,13 +140,12 @@ given CONFIG file."
(define* (register-closure prefix closure
#:key
- (deduplicate? #t) (reset-timestamps? #t)
+ (deduplicate? #t)
(schema (sql-schema))
(wal-mode? #t))
"Register CLOSURE in PREFIX, where PREFIX is the directory name of the
target store and CLOSURE is the name of a file containing a reference graph as
-produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is
-true, reset timestamps on store files and, if DEDUPLICATE? is true,
+produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,
deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE?
to call-with-database."
(let ((items (call-with-input-file closure read-reference-graph)))
@@ -156,7 +155,7 @@ to call-with-database."
(register-items db items
#:prefix prefix
#:deduplicate? deduplicate?
- #:reset-timestamps? reset-timestamps?
+ #:reset-timestamps? #f
#:registration-time %epoch)))))
(define* (initialize-efi-partition root
@@ -197,7 +196,6 @@ register-closure."
(when register-closures?
(for-each (lambda (closure)
(register-closure root closure
- #:reset-timestamps? #f
#:deduplicate? deduplicate?
#:wal-mode? wal-mode?))
references-graphs))
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index 30feaf800f..f700e08b25 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -215,12 +215,11 @@ the #:references-graphs parameter of 'derivation'."
(define* (register-closure prefix closure
#:key
- (deduplicate? #t) (reset-timestamps? #t)
+ (deduplicate? #t)
(schema (sql-schema)))
"Register CLOSURE in PREFIX, where PREFIX is the directory name of the
target store and CLOSURE is the name of a file containing a reference graph as
-produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is
-true, reset timestamps on store files and, if DEDUPLICATE? is true,
+produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,
deduplicates files common to CLOSURE and the rest of PREFIX."
(let ((items (call-with-input-file closure read-reference-graph)))
(parameterize ((sql-schema schema))
@@ -228,7 +227,7 @@ deduplicates files common to CLOSURE and the rest of PREFIX."
(register-items db items
#:prefix prefix
#:deduplicate? deduplicate?
- #:reset-timestamps? reset-timestamps?
+ #:reset-timestamps? #f
#:registration-time %epoch)))))
@@ -414,7 +413,6 @@ system that is passed to 'populate-root-file-system'."
(for-each (lambda (closure)
(register-closure target
(string-append "/xchg/" closure)
- #:reset-timestamps? #f
#:deduplicate? deduplicate?))
closures)
(unless copy-closures?
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 06/15] database: Remove #:reset-timestamps? from 'register-items'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-7-ludo@gnu.org
The assumption now is that the caller took care of resetting timestamps
and permissions.

* guix/store/database.scm (register-items): Remove #:reset-timestamps?
parameter and the call to 'reset-timestamps'.
(register-path): Adjust accordingly and add call to 'reset-timestamps'.
* gnu/build/image.scm (register-closure): Remove #:reset-timestamps?
parameter to 'register-items'.
* gnu/build/vm.scm (register-closure): Likewise.
* guix/nar.scm (finalize-store-file): Adjust accordingly.
* guix/scripts/pack.scm (store-database)[build]: Likewise.
---
gnu/build/image.scm | 1 -
gnu/build/vm.scm | 1 -
guix/nar.scm | 1 -
guix/scripts/pack.scm | 1 -
guix/store/database.scm | 13 ++++++++-----
5 files changed, 8 insertions(+), 9 deletions(-)

Toggle diff (100 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 4f80a1964f..0deea10a9d 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -155,7 +155,6 @@ to call-with-database."
(register-items db items
#:prefix prefix
#:deduplicate? deduplicate?
- #:reset-timestamps? #f
#:registration-time %epoch)))))
(define* (initialize-efi-partition root
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index f700e08b25..abb0317faf 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -227,7 +227,6 @@ deduplicates files common to CLOSURE and the rest of PREFIX."
(register-items db items
#:prefix prefix
#:deduplicate? deduplicate?
- #:reset-timestamps? #f
#:registration-time %epoch)))))
diff --git a/guix/nar.scm b/guix/nar.scm
index ba035ca6dc..947b393d84 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -119,7 +119,6 @@ held."
;; deduplication, timestamps, and permissions.
(register-items db
(list (store-info target deriver references))
- #:reset-timestamps? #f
#:deduplicate? #f))
(when lock?
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index ba9a6dc1b2..1612ec8f04 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -168,7 +168,6 @@ dependencies are registered."
(with-database db-file db
(register-items db items
#:deduplicate? #f
- #:reset-timestamps? #f
#:registration-time %epoch)))))))
(computed-file "store-database" build
diff --git a/guix/store/database.scm b/guix/store/database.scm
index b36b127630..0ed66a6e2c 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -392,7 +392,8 @@ references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is
given, it must be the name of the directory containing the new store to
initialize; if STATE-DIRECTORY is given, it must be a string containing the
absolute file name to the state directory of the store being initialized.
-Return #t on success.
+Return #t on success. As a side effect, reset timestamps on PATH, unless
+RESET-TIMESTAMPS? is false.
Use with care as it directly modifies the store! This is primarily meant to
be used internally by the daemon's build hook.
@@ -403,12 +404,17 @@ by adding it as a temp-root."
(store-database-file #:prefix prefix
#:state-directory state-directory))
+ (define real-file-name
+ (string-append (or prefix "") path))
+
+ (when reset-timestamps?
+ (reset-timestamps real-file-name))
+
(parameterize ((sql-schema schema))
(with-database db-file db
(register-items db (list (store-info path deriver references))
#:prefix prefix
#:deduplicate? deduplicate?
- #:reset-timestamps? reset-timestamps?
#:log-port (%make-void-port "w")))))
(define %epoch
@@ -418,7 +424,6 @@ by adding it as a temp-root."
(define* (register-items db items
#:key prefix
(deduplicate? #t)
- (reset-timestamps? #t)
registration-time
(log-port (current-error-port)))
"Register all of ITEMS, a list of <store-info> records as returned by
@@ -452,8 +457,6 @@ typically by adding them as temp-roots."
;; significant differences when 'register-closures' is called
;; consecutively for overlapping closures such as 'system' and 'bootcfg'.
(unless (path-id db to-register)
- (when reset-timestamps?
- (reset-timestamps real-file-name))
(let-values (((hash nar-size) (nar-sha256 real-file-name)))
(call-with-retrying-transaction db
(lambda ()
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 07/15] store-copy: 'populate-store' can optionally deduplicate files.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-8-ludo@gnu.org
Until now deduplication was performed as an additional pass after
copying files, which involve re-traversing all the files that had just
been copied.

* guix/store/deduplication.scm (copy-file/deduplicate): New procedure.
* tests/store-deduplication.scm ("copy-file/deduplicate"): New test.
* guix/build/store-copy.scm (populate-store): Add #:deduplicate?
parameter and honor it.
* tests/gexp.scm ("gexp->derivation, store copy"): Pass #:deduplicate? #f
to 'populate-store'.
* gnu/build/image.scm (initialize-root-partition): Pass #:deduplicate?
to 'populate-store'. Pass #:deduplicate? #f to 'register-closure'.
* gnu/build/vm.scm (root-partition-initializer): Likewise.
* gnu/build/install.scm (populate-single-profile-directory): Pass
#:deduplicate? #f to 'populate-store'.
* gnu/build/linux-initrd.scm (build-initrd): Likewise.
* guix/scripts/pack.scm (self-contained-tarball)[import-module?]: New
procedure.
[build]: Pass it as an argument to 'source-module-closure'.
* guix/scripts/pack.scm (squashfs-image)[build]: Wrap in
'with-extensions'.
* gnu/system/linux-initrd.scm (expression->initrd)[import-module?]: New
procedure.
[builder]: Pass it to 'source-module-closure'.
* gnu/system/install.scm (cow-store-service-type)[import-module?]: New
procedure. Pass it to 'source-module-closure'.
---
gnu/build/image.scm | 5 +-
gnu/build/install.scm | 3 +-
gnu/build/linux-initrd.scm | 3 +-
gnu/build/vm.scm | 5 +-
gnu/system/install.scm | 12 +-
gnu/system/linux-initrd.scm | 10 +-
guix/build/store-copy.scm | 13 +-
guix/scripts/pack.scm | 274 +++++++++++++++++-----------------
guix/store/deduplication.scm | 16 +-
tests/gexp.scm | 3 +-
tests/store-deduplication.scm | 18 ++-
11 files changed, 215 insertions(+), 147 deletions(-)

Toggle diff (415 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 0deea10a9d..8f50f27f78 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -186,7 +186,8 @@ rest of the store when registering the closures. SYSTEM-DIRECTORY is the name
of the directory of the 'system' derivation. Pass WAL-MODE? to
register-closure."
(populate-root-file-system system-directory root)
- (populate-store references-graphs root)
+ (populate-store references-graphs root
+ #:deduplicate? deduplicate?)
;; Populate /dev.
(when make-device-nodes
@@ -195,7 +196,7 @@ register-closure."
(when register-closures?
(for-each (lambda (closure)
(register-closure root closure
- #:deduplicate? deduplicate?
+ #:deduplicate? #f
#:wal-mode? wal-mode?))
references-graphs))
diff --git a/gnu/build/install.scm b/gnu/build/install.scm
index 63995e1d09..f5c8407b89 100644
--- a/gnu/build/install.scm
+++ b/gnu/build/install.scm
@@ -214,7 +214,8 @@ This is used to create the self-contained tarballs with 'guix pack'."
(symlink old (scope new)))
;; Populate the store.
- (populate-store (list closure) directory)
+ (populate-store (list closure) directory
+ #:deduplicate? #f)
(when database
(install-database-and-gc-roots directory database profile
diff --git a/gnu/build/linux-initrd.scm b/gnu/build/linux-initrd.scm
index 99796adba6..bb2ed0db0c 100644
--- a/gnu/build/linux-initrd.scm
+++ b/gnu/build/linux-initrd.scm
@@ -127,7 +127,8 @@ REFERENCES-GRAPHS."
(mkdir "contents")
;; Copy the closures of all the items referenced in REFERENCES-GRAPHS.
- (populate-store references-graphs "contents")
+ (populate-store references-graphs "contents"
+ #:deduplicate? #f)
(with-directory-excursion "contents"
;; Make '/init'.
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index abb0317faf..03be5697b7 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -395,7 +395,8 @@ system that is passed to 'populate-root-file-system'."
(when copy-closures?
;; Populate the store.
(populate-store (map (cut string-append "/xchg/" <>) closures)
- target))
+ target
+ #:deduplicate? deduplicate?))
;; Populate /dev.
(make-device-nodes target)
@@ -412,7 +413,7 @@ system that is passed to 'populate-root-file-system'."
(for-each (lambda (closure)
(register-closure target
(string-append "/xchg/" closure)
- #:deduplicate? deduplicate?))
+ #:deduplicate? #f))
closures)
(unless copy-closures?
(umount target-store)))
diff --git a/gnu/system/install.scm b/gnu/system/install.scm
index 7701297411..06f8043bb7 100644
--- a/gnu/system/install.scm
+++ b/gnu/system/install.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2015 Mark H Weaver <mhw@netris.org>
;;; Copyright © 2016 Andreas Enge <andreas@enge.fr>
;;; Copyright © 2017 Marius Bakke <mbakke@fastmail.com>
@@ -175,6 +175,13 @@ manual."
(shepherd-service-type
'cow-store
(lambda _
+ (define (import-module? module)
+ ;; Since we don't use deduplication support in 'populate-store', don't
+ ;; import (guix store deduplication) and its dependencies, which
+ ;; includes Guile-Gcrypt.
+ (and (guix-module-name? module)
+ (not (equal? module '(guix store deduplication)))))
+
(shepherd-service
(requirement '(root-file-system user-processes))
(provision '(cow-store))
@@ -189,7 +196,8 @@ the given target.")
,@%default-modules))
(start
(with-imported-modules (source-module-closure
- '((gnu build install)))
+ '((gnu build install))
+ #:select? import-module?)
#~(case-lambda
((target)
(mount-cow-store target #$%backing-directory)
diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scm
index 4fb1d863c9..c6ba9bb560 100644
--- a/gnu/system/linux-initrd.scm
+++ b/gnu/system/linux-initrd.scm
@@ -76,12 +76,20 @@ the derivations referenced by EXP are automatically copied to the initrd."
(define init
(program-file "init" exp #:guile guile))
+ (define (import-module? module)
+ ;; Since we don't use deduplication support in 'populate-store', don't
+ ;; import (guix store deduplication) and its dependencies, which includes
+ ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'.
+ (and (guix-module-name? module)
+ (not (equal? module '(guix store deduplication)))))
+
(define builder
;; Do not use "guile-zlib" extension here, otherwise it would drag the
;; non-static "zlib" package to the initrd closure. It is not needed
;; anyway because the modules are stored uncompressed within the initrd.
(with-imported-modules (source-module-closure
- '((gnu build linux-initrd)))
+ '((gnu build linux-initrd))
+ #:select? import-module?)
#~(begin
(use-modules (gnu build linux-initrd))
diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm
index 95dcb8e114..7f0672cd9d 100644
--- a/guix/build/store-copy.scm
+++ b/guix/build/store-copy.scm
@@ -20,6 +20,7 @@
#:use-module ((guix build utils) #:hide (copy-recursively))
#:use-module (guix sets)
#:use-module (guix progress)
+ #:autoload (guix store deduplication) (copy-file/deduplicate)
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-9)
#:use-module (srfi srfi-26)
@@ -242,10 +243,13 @@ permissions. Write verbose output to the LOG port."
lstat)))
(define* (populate-store reference-graphs target
- #:key (log-port (current-error-port)))
+ #:key
+ (deduplicate? #t)
+ (log-port (current-error-port)))
"Populate the store under directory TARGET with the items specified in
REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET
-maintain timestamps and permissions."
+maintain timestamps and permissions. When DEDUPLICATE? is true, deduplicate
+regular files as they are copied to TARGET."
(define store
(string-append target (%store-directory)))
@@ -273,6 +277,11 @@ maintain timestamps and permissions."
(string-append target thing)
#:keep-mtime? #t
#:keep-permissions? #t
+ #:copy-file
+ (if deduplicate?
+ (cut copy-file/deduplicate <> <>
+ #:store store)
+ copy-file)
#:log (%make-void-port "w"))
(report))
things)))))
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index 1612ec8f04..440c4b0903 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -203,12 +203,19 @@ added to the pack."
#+(file-append glibc-utf8-locales "/lib/locale"))
(setlocale LC_ALL "en_US.utf8"))))
+ (define (import-module? module)
+ ;; Since we don't use deduplication support in 'populate-store', don't
+ ;; import (guix store deduplication) and its dependencies, which includes
+ ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'.
+ (and (not-config? module)
+ (not (equal? '(guix store deduplication) module))))
+
(define build
(with-imported-modules (source-module-closure
`((guix build utils)
(guix build union)
(gnu build install))
- #:select? not-config?)
+ #:select? import-module?)
#~(begin
(use-modules (guix build utils)
((guix build union) #:select (relative-file-name))
@@ -382,138 +389,139 @@ added to the pack."
`(("/bin" -> "bin") ,@symlinks)))
(define build
- (with-imported-modules (source-module-closure
- '((guix build utils)
- (guix build store-copy)
- (guix build union)
- (gnu build install))
- #:select? not-config?)
- #~(begin
- (use-modules (guix build utils)
- (guix build store-copy)
- ((guix build union) #:select (relative-file-name))
- (gnu build install)
- (srfi srfi-1)
- (srfi srfi-26)
- (ice-9 match))
-
- (define database #+database)
- (define entry-point #$entry-point)
-
- (define (mksquashfs args)
- (apply invoke "mksquashfs"
- `(,@args
-
- ;; Do not create a "recovery file" when appending to the
- ;; file system since it's useless in this case.
- "-no-recovery"
-
- ;; Do not attempt to store extended attributes.
- ;; See <https://bugs.gnu.org/40043>.
- "-no-xattrs"
-
- ;; Set file times and the file system creation time to
- ;; one second after the Epoch.
- "-all-time" "1" "-mkfs-time" "1"
-
- ;; Reset all UIDs and GIDs.
- "-force-uid" "0" "-force-gid" "0")))
-
- (setenv "PATH" #+(file-append archiver "/bin"))
-
- ;; We need an empty file in order to have a valid file argument when
- ;; we reparent the root file system. Read on for why that's
- ;; necessary.
- (with-output-to-file ".empty" (lambda () (display "")))
-
- ;; Create the squashfs image in several steps.
- ;; Add all store items. Unfortunately mksquashfs throws away all
- ;; ancestor directories and only keeps the basename. We fix this
- ;; in the following invocations of mksquashfs.
- (mksquashfs `(,@(map store-info-item
- (call-with-input-file "profile"
- read-reference-graph))
- #$environment
- ,#$output
-
- ;; Do not perform duplicate checking because we
- ;; don't have any dupes.
- "-no-duplicates"
- "-comp"
- ,#+(compressor-name compressor)))
-
- ;; Here we reparent the store items. For each sub-directory of
- ;; the store prefix we need one invocation of "mksquashfs".
- (for-each (lambda (dir)
- (mksquashfs `(".empty"
- ,#$output
- "-root-becomes" ,dir)))
- (reverse (string-tokenize (%store-directory)
- (char-set-complement (char-set #\/)))))
-
- ;; Add symlinks and mount points.
- (mksquashfs
- `(".empty"
- ,#$output
- ;; Create SYMLINKS via pseudo file definitions.
- ,@(append-map
- (match-lambda
- ((source '-> target)
- ;; Create relative symlinks to work around a bug in
- ;; Singularity 2.x:
- ;; https://bugs.gnu.org/34913
- ;; https://github.com/sylabs/singularity/issues/1487
- (let ((target (string-append #$profile "/" target)))
- (list "-p"
- (string-join
- ;; name s mode uid gid symlink
- (list source
- "s" "777" "0" "0"
- (relative-file-name (dirname source)
- target)))))))
- '#$symlinks*)
-
- "-p" "/.singularity.d d 555 0 0"
-
- ;; Create the environment file.
- "-p" "/.singularity.d/env d 555 0 0"
- "-p" ,(string-append
- "/.singularity.d/env/90-environment.sh s 777 0 0 "
- (relative-file-name "/.singularity.d/env"
- #$environment))
-
- ;; Create /.singularity.d/actions, and optionally the 'run'
- ;; script, used by 'singularity run'.
- "-p" "/.singularity.d/actions d 555 0 0"
-
- ,@(if entry-point
- `(;; This one if for Singularity 2.x.
- "-p"
- ,(string-append
- "/.singularity.d/actions/run s 777 0 0 "
- (relative-file-name "/.singularity.d/actions"
- (string-append #$profile "/"
- entry-point)))
-
- ;; This one is for Singularity 3.x.
- "-p"
- ,(string-append
- "/.singularity.d/runscript s 777 0 0 "
- (relative-file-name "/.singularity.d"
- (string-append #$profile "/"
- entry-point))))
- '())
-
- ;; Create empty mount points.
- "-p" "/proc d 555 0 0"
- "-p" "/sys d 555 0 0"
- "-p" "/dev d 555 0 0"
- "-p" "/home d 555 0 0"))
-
- (when database
- ;; Initialize /var/guix.
- (install-database-and-gc-roots "var-etc" database #$profile)
- (mksquashfs `("var-etc" ,#$output))))))
+ (with-extensions (list guile-gcrypt)
+ (with-imported-modules (source-module-closure
+ '((guix build utils)
+ (guix build store-copy)
+ (guix build union)
+ (gnu build install))
+ #:select? not-config?)
+ #~(begin
+ (use-modules (guix build utils)
+ (guix build store-copy)
+ ((guix build union) #:select (relative-file-name))
+ (gnu build install)
+ (srfi srfi-1)
+ (srfi srfi-26)
+ (ice-9 match))
+
+ (define database #+database)
+ (define entry-point #$entry-point)
+
+ (define (mksquashfs args)
+ (apply invoke "mksquashfs"
+ `(,@args
+
+ ;; Do not create a "recovery file" when appending to the
+ ;; file system since it's useless in this case.
+ "-no-recovery"
+
+ ;; Do not attempt to store extended attributes.
+ ;; See <https://bugs.gnu.org/40043>.
+ "-no-xattrs"
+
+ ;; Set file times and the file system creation time to
+ ;; one second after the Epoch.
+ "-all-time" "1" "-mkfs-time" "1"
+
+ ;; Reset all UIDs and GIDs.
+ "-force-uid" "0" "-force-gid" "0")))
+
+ (setenv "PATH" #+(file-append archiver "/bin"))
+
+ ;; We need an empty file in order to have a valid file argument when
+ ;; we reparent the root file system. Read on for why that's
+ ;; necessary.
+ (with-output-to-file ".empty" (lambda () (display "")))
+
+ ;; Create the squashfs image in several steps.
+ ;; Add all store items. Unfortunately mksquashfs throws away all
+ ;; ancestor directories and only keeps the basename. We fix this
+ ;; in the following invocations of mksquashfs.
+ (mksquashfs `(,@(map store-info-item
+ (call-with-input-file "profile"
+ read-reference-graph))
+ #$environment
+ ,#$output
+
+ ;; Do not perform duplicate checking because we
+ ;; don't have any dupes.
+ "-no-duplicates"
+ "-comp"
+ ,#+(compressor-name compressor)))
+
+ ;; Here we reparent the store items. For each sub-directory of
+ ;; the store prefix we need one invocation of "mksquashfs".
+ (for-each (lambda (dir)
+ (mksquashfs `(".empty"
+ ,#$output
+ "-root-becomes" ,dir)))
+ (reverse (string-tokenize (%store-directory)
+ (char-set-complement (char-set #\/)))))
+
+ ;; Add symlinks and mount points.
+ (mksquashfs
+ `(".empty"
+ ,#$output
+ ;; Create SYMLINKS via pseudo file definitions.
+ ,@(append-map
+ (match-lambda
+ ((source '-> target)
+ ;; Cr
This message was truncated. Download the full message here.
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 08/15] image: 'register-closure' leaves it up to the caller to deduplicate.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-9-ludo@gnu.org
* gnu/build/image.scm (register-closure): Remove #:deduplicate?
parameter and pass #:deduplicate? #f to 'register-items'.
(initialize-root-partition): Adjust accordingly.
* gnu/build/vm.scm (register-closure, root-partition-initializer):
Likewise.
---
gnu/build/image.scm | 8 ++------
gnu/build/vm.scm | 9 +++------
2 files changed, 5 insertions(+), 12 deletions(-)

Toggle diff (73 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 8f50f27f78..8d5fc603d9 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -140,21 +140,18 @@ given CONFIG file."
(define* (register-closure prefix closure
#:key
- (deduplicate? #t)
(schema (sql-schema))
(wal-mode? #t))
"Register CLOSURE in PREFIX, where PREFIX is the directory name of the
target store and CLOSURE is the name of a file containing a reference graph as
-produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,
-deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE?
-to call-with-database."
+produced by #:references-graphs. Pass WAL-MODE? to call-with-database."
(let ((items (call-with-input-file closure read-reference-graph)))
(parameterize ((sql-schema schema))
(with-database (store-database-file #:prefix prefix) db
#:wal-mode? wal-mode?
(register-items db items
#:prefix prefix
- #:deduplicate? deduplicate?
+ #:deduplicate? #f
#:registration-time %epoch)))))
(define* (initialize-efi-partition root
@@ -196,7 +193,6 @@ register-closure."
(when register-closures?
(for-each (lambda (closure)
(register-closure root closure
- #:deduplicate? #f
#:wal-mode? wal-mode?))
references-graphs))
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index 03be5697b7..8c6ab648ac 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -215,18 +215,16 @@ the #:references-graphs parameter of 'derivation'."
(define* (register-closure prefix closure
#:key
- (deduplicate? #t)
(schema (sql-schema)))
"Register CLOSURE in PREFIX, where PREFIX is the directory name of the
target store and CLOSURE is the name of a file containing a reference graph as
-produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true,
-deduplicates files common to CLOSURE and the rest of PREFIX."
+produced by #:references-graphs."
(let ((items (call-with-input-file closure read-reference-graph)))
(parameterize ((sql-schema schema))
(with-database (store-database-file #:prefix prefix) db
(register-items db items
#:prefix prefix
- #:deduplicate? deduplicate?
+ #:deduplicate? #f
#:registration-time %epoch)))))
@@ -412,8 +410,7 @@ system that is passed to 'populate-root-file-system'."
(display "registering closures...\n")
(for-each (lambda (closure)
(register-closure target
- (string-append "/xchg/" closure)
- #:deduplicate? #f))
+ (string-append "/xchg/" closure)))
closures)
(unless copy-closures?
(umount target-store)))
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 09/15] database: Remove #:deduplicate? from 'register-items'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150919.18435-10-ludo@gnu.org
It is now up to the caller to deduplicate store contents.

* guix/store/database.scm (register-items): Remove #:deduplicate?
parameter and call to 'deduplicate'.
(register-path): Call 'deduplicate' when #:deduplicate? is true.
* gnu/build/image.scm (register-closure): Adjust call accordingly.
* gnu/build/vm.scm (register-closure): Likewise.
* guix/nar.scm (finalize-store-file): Likewise.
* guix/scripts/pack.scm (store-database): Likewise.
---
gnu/build/image.scm | 1 -
gnu/build/vm.scm | 1 -
guix/nar.scm | 3 +--
guix/scripts/pack.scm | 1 -
guix/store/database.scm | 11 ++++++-----
5 files changed, 7 insertions(+), 10 deletions(-)

Toggle diff (95 lines)
diff --git a/gnu/build/image.scm b/gnu/build/image.scm
index 8d5fc603d9..f6e5cb42f6 100644
--- a/gnu/build/image.scm
+++ b/gnu/build/image.scm
@@ -151,7 +151,6 @@ produced by #:references-graphs. Pass WAL-MODE? to call-with-database."
#:wal-mode? wal-mode?
(register-items db items
#:prefix prefix
- #:deduplicate? #f
#:registration-time %epoch)))))
(define* (initialize-efi-partition root
diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm
index 8c6ab648ac..bd59916bf3 100644
--- a/gnu/build/vm.scm
+++ b/gnu/build/vm.scm
@@ -224,7 +224,6 @@ produced by #:references-graphs."
(with-database (store-database-file #:prefix prefix) db
(register-items db items
#:prefix prefix
- #:deduplicate? #f
#:registration-time %epoch)))))
diff --git a/guix/nar.scm b/guix/nar.scm
index 947b393d84..a817b56007 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -118,8 +118,7 @@ held."
;; Register TARGET. The 'restore-file' call took care of
;; deduplication, timestamps, and permissions.
(register-items db
- (list (store-info target deriver references))
- #:deduplicate? #f))
+ (list (store-info target deriver references))))
(when lock?
(delete-file (string-append target ".lock"))
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index 440c4b0903..8ecdcb823f 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -167,7 +167,6 @@ dependencies are registered."
(let ((items (append-map read-closure '#$labels)))
(with-database db-file db
(register-items db items
- #:deduplicate? #f
#:registration-time %epoch)))))))
(computed-file "store-database" build
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 0ed66a6e2c..31ea9add78 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -407,6 +407,11 @@ by adding it as a temp-root."
(define real-file-name
(string-append (or prefix "") path))
+ (when deduplicate?
+ (deduplicate real-file-name (nar-sha256 real-file-name)
+ #:store (string-append (or prefix "")
+ %store-directory)))
+
(when reset-timestamps?
(reset-timestamps real-file-name))
@@ -414,7 +419,6 @@ by adding it as a temp-root."
(with-database db-file db
(register-items db (list (store-info path deriver references))
#:prefix prefix
- #:deduplicate? deduplicate?
#:log-port (%make-void-port "w")))))
(define %epoch
@@ -423,7 +427,6 @@ by adding it as a temp-root."
(define* (register-items db items
#:key prefix
- (deduplicate? #t)
registration-time
(log-port (current-error-port)))
"Register all of ITEMS, a list of <store-info> records as returned by
@@ -467,9 +470,7 @@ typically by adding them as temp-roots."
"sha256:"
(bytevector->base16-string hash))
#:nar-size nar-size
- #:time registration-time)))
- (when deduplicate?
- (deduplicate real-file-name hash #:store store-dir)))))
+ #:time registration-time))))))
(let* ((prefix (format #f "registering ~a items" (length items)))
(progress (progress-reporter/bar (length items)
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 11/15] database: Remove #:deduplicate? and #:reset-timestamps? from 'register-path'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-2-ludo@gnu.org
* guix/store/database.scm (register-path): Remove #:deduplicate?
and #:reset-timestamps?.
* guix/scripts/system.scm (copy-item): Adjust accordingly.
* tests/store-database.scm ("register-path")
("register-path, directory"): Call 'reset-timestamps'.
---
guix/scripts/system.scm | 6 +-----
guix/store/database.scm | 17 ++---------------
tests/store-database.scm | 5 +++--
3 files changed, 6 insertions(+), 22 deletions(-)

Toggle diff (90 lines)
diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
index c08929066b..0e543d9460 100644
--- a/guix/scripts/system.scm
+++ b/guix/scripts/system.scm
@@ -158,11 +158,7 @@ REFERENCES as its set of references."
(unless (register-path item
#:prefix target
#:state-directory state
- #:references references
-
- ;; Those are taken care of by 'copy-store-item'.
- #:reset-timestamps? #f
- #:deduplicate? #f)
+ #:references references)
(leave (G_ "failed to register '~a' under '~a'~%")
item target))))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 31ea9add78..c0010b72b9 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -384,16 +384,14 @@ is true."
(define* (register-path path
#:key (references '()) deriver prefix
- state-directory (deduplicate? #t)
- (reset-timestamps? #t)
+ state-directory
(schema (sql-schema)))
"Register PATH as a valid store file, with REFERENCES as its list of
references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is
given, it must be the name of the directory containing the new store to
initialize; if STATE-DIRECTORY is given, it must be a string containing the
absolute file name to the state directory of the store being initialized.
-Return #t on success. As a side effect, reset timestamps on PATH, unless
-RESET-TIMESTAMPS? is false.
+Return #t on success.
Use with care as it directly modifies the store! This is primarily meant to
be used internally by the daemon's build hook.
@@ -404,17 +402,6 @@ by adding it as a temp-root."
(store-database-file #:prefix prefix
#:state-directory state-directory))
- (define real-file-name
- (string-append (or prefix "") path))
-
- (when deduplicate?
- (deduplicate real-file-name (nar-sha256 real-file-name)
- #:store (string-append (or prefix "")
- %store-directory)))
-
- (when reset-timestamps?
- (reset-timestamps real-file-name))
-
(parameterize ((sql-schema schema))
(with-database db-file db
(register-items db (list (store-info path deriver references))
diff --git a/tests/store-database.scm b/tests/store-database.scm
index 3b4ef43f6d..33fd6cfbad 100644
--- a/tests/store-database.scm
+++ b/tests/store-database.scm
@@ -34,8 +34,7 @@
(test-begin "store-database")
-(test-equal "register-path"
- '(1 1)
+(test-assert "register-path"
(let ((file (string-append (%store-prefix) "/" (make-string 32 #\f)
"-fake")))
(when (valid-path? %store file)
@@ -46,6 +45,7 @@
(drv (string-append file ".drv")))
(call-with-output-file file
(cut display "This is a fake store item.\n" <>))
+ (reset-timestamps file)
(register-path file
#:references (list ref)
#:deriver drv)
@@ -69,6 +69,7 @@
(mkdir-p (string-append file "/a"))
(call-with-output-file (string-append file "/a/b")
(const #t))
+ (reset-timestamps file)
(register-path file #:deriver drv)
(and (valid-path? %store file)
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 10/15] guix system: 'init' copies, resets timestamps, and deduplicates at once.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-1-ludo@gnu.org

* guix/build/store-copy.scm (copy-store-item): New procedure.
(populate-store): Use it instead of the inline 'copy-recursively' call.
* guix/scripts/system.scm (copy-item): Likewise.
Pass #:reset-timestamps? and #:deduplicate? to 'register-path'.
---
guix/build/store-copy.scm | 31 +++++++++++++++++++++----------
guix/scripts/system.scm | 11 ++++++++---
2 files changed, 29 insertions(+), 13 deletions(-)

Toggle diff (94 lines)
diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm
index 7f0672cd9d..01e1f41870 100644
--- a/guix/build/store-copy.scm
+++ b/guix/build/store-copy.scm
@@ -38,6 +38,7 @@
file-size
closure-size
+ copy-store-item
populate-store))
;;; Commentary:
@@ -242,6 +243,24 @@ permissions. Write verbose output to the LOG port."
stat
lstat)))
+(define* (copy-store-item item target
+ #:key
+ (deduplicate? #t)
+ (log-port (%make-void-port "w")))
+ "Copy ITEM, a store item, to the store under TARGET, the target root
+directory. When DEDUPLICATE? is true, deduplicate it within TARGET."
+ (define store
+ (string-append target (%store-directory)))
+
+ (copy-recursively item (string-append target item)
+ #:keep-mtime? #t
+ #:keep-permissions? #t
+ #:copy-file
+ (if deduplicate?
+ (cut copy-file/deduplicate <> <> #:store store)
+ copy-file)
+ #:log log-port))
+
(define* (populate-store reference-graphs target
#:key
(deduplicate? #t)
@@ -273,16 +292,8 @@ regular files as they are copied to TARGET."
(call-with-progress-reporter progress
(lambda (report)
(for-each (lambda (thing)
- (copy-recursively thing
- (string-append target thing)
- #:keep-mtime? #t
- #:keep-permissions? #t
- #:copy-file
- (if deduplicate?
- (cut copy-file/deduplicate <> <>
- #:store store)
- copy-file)
- #:log (%make-void-port "w"))
+ (copy-store-item thing target
+ #:deduplicate? deduplicate?)
(report))
things)))))
diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
index db80e0be8f..c08929066b 100644
--- a/guix/scripts/system.scm
+++ b/guix/scripts/system.scm
@@ -30,6 +30,7 @@
#:use-module ((guix status) #:select (with-status-verbosity))
#:use-module (guix store)
#:autoload (guix store database) (register-path)
+ #:autoload (guix build store-copy) (copy-store-item)
#:use-module (guix describe)
#:use-module (guix grafts)
#:use-module (guix gexp)
@@ -147,8 +148,8 @@ REFERENCES as its set of references."
#:directories? #t))
(delete-file-recursively dest))
- (copy-recursively item dest
- #:log (%make-void-port "w"))
+ (copy-store-item item target
+ #:deduplicate? #t)
;; Register ITEM; as a side-effect, it resets timestamps, etc.
;; Explicitly use "TARGET/var/guix" as the state directory, to avoid
@@ -157,7 +158,11 @@ REFERENCES as its set of references."
(unless (register-path item
#:prefix target
#:state-directory state
- #:references references)
+ #:references references
+
+ ;; Those are taken care of by 'copy-store-item'.
+ #:reset-timestamps? #f
+ #:deduplicate? #f)
(leave (G_ "failed to register '~a' under '~a'~%")
item target))))
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 13/15] database: Remove 'register-path'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-4-ludo@gnu.org
* guix/store/database.scm (register-path): Remove.
* tests/store-database.scm ("register-path"): Rename to...
("register-items"): ... this, and use 'register-items' instead of
'register-path'.
("register-path, directory"): Rename to...
("register-items, directory"): ... this, and use 'register-items'
instead of 'register-path'.
("register-path with unregistered references"): Rename to...
("sqlite-register with unregistered references"): ... this.
---
guix/store/database.scm | 27 ---------------------------
tests/store-database.scm | 15 ++++++++-------
2 files changed, 8 insertions(+), 34 deletions(-)

Toggle diff (108 lines)
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 9d5bc531bb..4579b05261 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -43,7 +43,6 @@
with-database
path-id
sqlite-register
- register-path
register-items
%epoch
reset-timestamps))
@@ -383,32 +382,6 @@ is true."
(chmod file (if (executable-file? file) #o555 #o444)))
(utime file 1 1 0 0)))))
-(define* (register-path path
- #:key (references '()) deriver prefix
- state-directory
- (schema (sql-schema)))
- "Register PATH as a valid store file, with REFERENCES as its list of
-references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is
-given, it must be the name of the directory containing the new store to
-initialize; if STATE-DIRECTORY is given, it must be a string containing the
-absolute file name to the state directory of the store being initialized.
-Return #t on success.
-
-Use with care as it directly modifies the store! This is primarily meant to
-be used internally by the daemon's build hook.
-
-PATH must be protected from GC and locked during execution of this, typically
-by adding it as a temp-root."
- (define db-file
- (store-database-file #:prefix prefix
- #:state-directory state-directory))
-
- (parameterize ((sql-schema schema))
- (with-database db-file db
- (register-items db (list (store-info path deriver references))
- #:prefix prefix
- #:log-port (%make-void-port "w")))))
-
(define %epoch
;; When it all began.
(make-time time-utc 0 1))
diff --git a/tests/store-database.scm b/tests/store-database.scm
index 33fd6cfbad..17eea38c63 100644
--- a/tests/store-database.scm
+++ b/tests/store-database.scm
@@ -20,6 +20,7 @@
#:use-module (guix tests)
#:use-module (guix store)
#:use-module (guix store database)
+ #:use-module (guix build store-copy)
#:use-module ((guix utils) #:select (call-with-temporary-output-file))
#:use-module ((guix build utils)
#:select (mkdir-p delete-file-recursively))
@@ -34,7 +35,7 @@
(test-begin "store-database")
-(test-assert "register-path"
+(test-assert "register-items"
(let ((file (string-append (%store-prefix) "/" (make-string 32 #\f)
"-fake")))
(when (valid-path? %store file)
@@ -46,9 +47,8 @@
(call-with-output-file file
(cut display "This is a fake store item.\n" <>))
(reset-timestamps file)
- (register-path file
- #:references (list ref)
- #:deriver drv)
+ (with-database (store-database-file) db
+ (register-items db (list (store-info file drv (list ref)))))
(and (valid-path? %store file)
(equal? (references %store file) (list ref))
@@ -57,7 +57,7 @@
(list (stat:mtime (lstat file))
(stat:mtime (lstat ref)))))))
-(test-equal "register-path, directory"
+(test-equal "register-items, directory"
'(1 1 1)
(let ((file (string-append (%store-prefix) "/" (make-string 32 #\f)
"-fake-directory")))
@@ -70,7 +70,8 @@
(call-with-output-file (string-append file "/a/b")
(const #t))
(reset-timestamps file)
- (register-path file #:deriver drv)
+ (with-database (store-database-file) db
+ (register-items db (list (store-info file drv '()))))
(and (valid-path? %store file)
(null? (references %store file))
@@ -102,7 +103,7 @@
(list (path-id db "/gnu/foo")
(path-id db "/gnu/bar")))))))
-(test-assert "register-path with unregistered references"
+(test-assert "sqlite-register with unregistered references"
;; Make sure we get a "NOT NULL constraint failed: Refs.reference" error
;; when we try to add references that are not registered yet. Better safe
;; than sorry.
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 12/15] system: 'init' does not recompute the hash of each store item.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-3-ludo@gnu.org

Previously, the 'register-path' call would re-traverse ITEM to compute
its nar hash, even though that hash is already known in the initial
store. This patch also avoids repeated opening/closing of the
database.

* guix/store/database.scm (call-with-database): Export.
* guix/scripts/system.scm (copy-item): Add 'db' parameter. Call
'sqlite-register' instead of 'register-path'.
(copy-closure): Remove redundant call to 'references*'. Call
'call-with-database' and pass the database to 'copy-item'.
---
.dir-locals.el | 1 +
guix/scripts/system.scm | 59 ++++++++++++++++++++++-------------------
guix/store/database.scm | 1 +
3 files changed, 34 insertions(+), 27 deletions(-)

Toggle diff (125 lines)
diff --git a/.dir-locals.el b/.dir-locals.el
index 4eb27d8b1b..8f07a08eb5 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -121,6 +121,7 @@
(eval . (put 'let-system 'scheme-indent-function 1))
(eval . (put 'with-database 'scheme-indent-function 2))
+ (eval . (put 'call-with-database 'scheme-indent-function 1))
(eval . (put 'call-with-transaction 'scheme-indent-function 1))
(eval . (put 'with-statement 'scheme-indent-function 3))
(eval . (put 'call-with-retrying-transaction 'scheme-indent-function 1))
diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
index 0e543d9460..5427f875ec 100644
--- a/guix/scripts/system.scm
+++ b/guix/scripts/system.scm
@@ -29,7 +29,9 @@
#:use-module (guix ui)
#:use-module ((guix status) #:select (with-status-verbosity))
#:use-module (guix store)
- #:autoload (guix store database) (register-path)
+ #:autoload (guix base16) (bytevector->base16-string)
+ #:autoload (guix store database)
+ (sqlite-register store-database-file call-with-database)
#:autoload (guix build store-copy) (copy-store-item)
#:use-module (guix describe)
#:use-module (guix grafts)
@@ -130,12 +132,11 @@ BODY..., and restore them."
(store-lift topologically-sorted))
-(define* (copy-item item references target
+(define* (copy-item item info target db
#:key (log-port (current-error-port)))
- "Copy ITEM to the store under root directory TARGET and register it with
-REFERENCES as its set of references."
- (let ((dest (string-append target item))
- (state (string-append target "/var/guix")))
+ "Copy ITEM to the store under root directory TARGET and populate DB with the
+given INFO, a <path-info> record."
+ (let ((dest (string-append target item)))
(format log-port "copying '~a'...~%" item)
;; Remove DEST if it exists to make sure that (1) we do not fail badly
@@ -151,41 +152,45 @@ REFERENCES as its set of references."
(copy-store-item item target
#:deduplicate? #t)
- ;; Register ITEM; as a side-effect, it resets timestamps, etc.
- ;; Explicitly use "TARGET/var/guix" as the state directory, to avoid
- ;; reproducing the user's current settings; see
- ;; <http://bugs.gnu.org/18049>.
- (unless (register-path item
- #:prefix target
- #:state-directory state
- #:references references)
- (leave (G_ "failed to register '~a' under '~a'~%")
- item target))))
+ (sqlite-register db
+ #:path item
+ #:references (path-info-references info)
+ #:deriver (path-info-deriver info)
+ #:hash (string-append
+ "sha256:"
+ (bytevector->base16-string (path-info-hash info)))
+ #:nar-size (path-info-nar-size info))))
(define* (copy-closure item target
#:key (log-port (current-error-port)))
"Copy ITEM and all its dependencies to the store under root directory
TARGET, and register them."
(mlet* %store-monad ((to-copy (topologically-sorted* (list item)))
- (refs (mapm %store-monad references* to-copy))
- (info (mapm %store-monad query-path-info*
- (delete-duplicates
- (append to-copy (concatenate refs)))))
+ (info (mapm %store-monad query-path-info* to-copy))
(size -> (reduce + 0 (map path-info-nar-size info))))
(define progress-bar
(progress-reporter/bar (length to-copy)
(format #f (G_ "copying to '~a'...")
target)))
+ (define state
+ (string-append target "/var/guix"))
+
(check-available-space size target)
- (call-with-progress-reporter progress-bar
- (lambda (report)
- (let ((void (%make-void-port "w")))
- (for-each (lambda (item refs)
- (copy-item item refs target #:log-port void)
- (report))
- to-copy refs))))
+ ;; Explicitly use "TARGET/var/guix" as the state directory to avoid
+ ;; reproducing the user's current settings; see
+ ;; <http://bugs.gnu.org/18049>.
+ (call-with-database (store-database-file #:prefix target
+ #:state-directory state)
+ (lambda (db)
+ (call-with-progress-reporter progress-bar
+ (lambda (report)
+ (let ((void (%make-void-port "w")))
+ (for-each (lambda (item info)
+ (copy-item item info target db #:log-port void)
+ (report))
+ to-copy info))))))
(return *unspecified*)))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index c0010b72b9..9d5bc531bb 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -39,6 +39,7 @@
#:export (sql-schema
%default-database-file
store-database-file
+ call-with-database
with-database
path-id
sqlite-register
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 14/15] database: Honor 'SOURCE_DATE_EPOCH'.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-5-ludo@gnu.org
* guix/store/database.scm (timestamp): New procedure.
(sqlite-register): Use it as the default for #:time.
(register-items): Likewise for #:registeration-time.
---
guix/store/database.scm | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

Toggle diff (47 lines)
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 4579b05261..0a84bbddb9 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -323,8 +323,19 @@ ids of items referred to."
(sqlite-fold cons '() stmt))
references)))
+(define (timestamp)
+ "Return a timestamp, either the current time of SOURCE_DATE_EPOCH."
+ (match (getenv "SOURCE_DATE_EPOCH")
+ (#f
+ (current-time time-utc))
+ ((= string->number seconds)
+ (if seconds
+ (make-time time-utc 0 seconds)
+ (current-time time-utc)))))
+
(define* (sqlite-register db #:key path (references '())
- deriver hash nar-size time)
+ deriver hash nar-size
+ (time (timestamp)))
"Registers this stuff in DB. PATH is the store item to register and
REFERENCES is the list of store items PATH refers to; DERIVER is the '.drv'
that produced PATH, HASH is the base16-encoded Nix sha256 hash of
@@ -337,9 +348,7 @@ Every store item in REFERENCES must already be registered."
#:deriver deriver
#:hash hash
#:nar-size nar-size
- #:time (time-second
- (or time
- (current-time time-utc))))))
+ #:time (time-second time))))
;; Call 'path-id' on each of REFERENCES. This ensures we get a
;; "non-NULL constraint" failure if one of REFERENCES is unregistered.
(add-references db id
@@ -388,7 +397,7 @@ is true."
(define* (register-items db items
#:key prefix
- registration-time
+ (registration-time (timestamp))
(log-port (current-error-port)))
"Register all of ITEMS, a list of <store-info> records as returned by
'read-reference-graph', in DB. ITEMS must be in topological order (with
--
2.29.2
L
L
Ludovic Courtès wrote on 11 Dec 2020 16:09
[PATCH 15/15] deduplicate: Create the '.links' directory lazily.
(address . 44760@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201211150951.18508-6-ludo@gnu.org
This avoids repeated (mkdir-p "/gnu/store/.links") calls when
deduplicating lots of files.

* guix/store/deduplication.scm (deduplicate): Remove initial call to
'mkdir-p'. Add ENOENT case in 'link' exception handler. Reindent.
* tests/store-deduplication.scm ("deduplicate, ENOSPC"): Check
for (<= links 4) to account for the initial 'link' call.
---
guix/store/deduplication.scm | 96 ++++++++++++++++++-----------------
tests/store-deduplication.scm | 2 +-
2 files changed, 51 insertions(+), 47 deletions(-)

Toggle diff (122 lines)
diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm
index 8564f12107..a72a43bf79 100644
--- a/guix/store/deduplication.scm
+++ b/guix/store/deduplication.scm
@@ -159,52 +159,56 @@ under STORE."
(define links-directory
(string-append store "/.links"))
- (mkdir-p links-directory)
- (let loop ((path path)
- (type (stat:type (lstat path)))
- (hash hash))
- (if (eq? 'directory type)
- ;; Can't hardlink directories, so hardlink their atoms.
- (for-each (match-lambda
- ((file . properties)
- (unless (member file '("." ".."))
- (let* ((file (string-append path "/" file))
- (type (match (assoc-ref properties 'type)
- ((or 'unknown #f)
- (stat:type (lstat file)))
- (type type))))
- (loop file type
- (and (not (eq? 'directory type))
- (nar-sha256 file)))))))
- (scandir* path))
- (let ((link-file (string-append links-directory "/"
- (bytevector->nix-base32-string hash))))
- (if (file-exists? link-file)
- (replace-with-link link-file path
- #:swap-directory links-directory
- #:store store)
- (catch 'system-error
- (lambda ()
- (link path link-file))
- (lambda args
- (let ((errno (system-error-errno args)))
- (cond ((= errno EEXIST)
- ;; Someone else put an entry for PATH in
- ;; LINKS-DIRECTORY before we could. Let's use it.
- (replace-with-link path link-file
- #:swap-directory
- links-directory
- #:store store))
- ((= errno ENOSPC)
- ;; There's not enough room in the directory index for
- ;; more entries in .links, but that's fine: we can
- ;; just stop.
- #f)
- ((= errno EMLINK)
- ;; PATH has reached the maximum number of links, but
- ;; that's OK: we just can't deduplicate it more.
- #f)
- (else (apply throw args)))))))))))
+ (let loop ((path path)
+ (type (stat:type (lstat path)))
+ (hash hash))
+ (if (eq? 'directory type)
+ ;; Can't hardlink directories, so hardlink their atoms.
+ (for-each (match-lambda
+ ((file . properties)
+ (unless (member file '("." ".."))
+ (let* ((file (string-append path "/" file))
+ (type (match (assoc-ref properties 'type)
+ ((or 'unknown #f)
+ (stat:type (lstat file)))
+ (type type))))
+ (loop file type
+ (and (not (eq? 'directory type))
+ (nar-sha256 file)))))))
+ (scandir* path))
+ (let ((link-file (string-append links-directory "/"
+ (bytevector->nix-base32-string hash))))
+ (if (file-exists? link-file)
+ (replace-with-link link-file path
+ #:swap-directory links-directory
+ #:store store)
+ (catch 'system-error
+ (lambda ()
+ (link path link-file))
+ (lambda args
+ (let ((errno (system-error-errno args)))
+ (cond ((= errno EEXIST)
+ ;; Someone else put an entry for PATH in
+ ;; LINKS-DIRECTORY before we could. Let's use it.
+ (replace-with-link path link-file
+ #:swap-directory
+ links-directory
+ #:store store))
+ ((= errno ENOENT)
+ ;; This most likely means that LINKS-DIRECTORY does
+ ;; not exist. Attempt to create it and try again.
+ (mkdir-p links-directory)
+ (loop path type hash))
+ ((= errno ENOSPC)
+ ;; There's not enough room in the directory index for
+ ;; more entries in .links, but that's fine: we can
+ ;; just stop.
+ #f)
+ ((= errno EMLINK)
+ ;; PATH has reached the maximum number of links, but
+ ;; that's OK: we just can't deduplicate it more.
+ #f)
+ (else (apply throw args)))))))))))
(define (tee input len output)
"Return a port that reads up to LEN bytes from INPUT and writes them to
diff --git a/tests/store-deduplication.scm b/tests/store-deduplication.scm
index 7b01acae24..b1c2d93bbd 100644
--- a/tests/store-deduplication.scm
+++ b/tests/store-deduplication.scm
@@ -95,7 +95,7 @@
(lambda ()
(set! link (lambda (old new)
(set! links (+ links 1))
- (if (<= links 3)
+ (if (<= links 4)
(true-link old new)
(throw 'system-error "link" "~A" '("Whaaat?!")
(list ENOSPC))))))
--
2.29.2
L
L
Ludovic Courtès wrote on 15 Dec 2020 17:33
Re: bug#44760: [PATCH 00/15] Speed up 'guix system init' & co.
(address . 44760-done@debbugs.gnu.org)
873607m3ka.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (20 lines)
> serialization: 'fold-archive' notifies about directory processing
> completion.
> serialization: 'restore-file' sets canonical timestamp and
> permissions.
> nar: Deduplicate files right as they are restored.
> store-copy: 'populate-store' resets timestamps.
> image: 'register-closure' assumes already-reset timestamps.
> database: Remove #:reset-timestamps? from 'register-items'.
> store-copy: 'populate-store' can optionally deduplicate files.
> image: 'register-closure' leaves it up to the caller to deduplicate.
> database: Remove #:deduplicate? from 'register-items'.
> guix system: 'init' copies, resets timestamps, and deduplicates at
> once.
> database: Remove #:deduplicate? and #:reset-timestamps? from
> 'register-path'.
> system: 'init' does not recompute the hash of each store item.
> database: Remove 'register-path'.
> database: Honor 'SOURCE_DATE_EPOCH'.
> deduplicate: Create the '.links' directory lazily.

Pushed as 7530e491b517497b7b8166b5ccecdc3d4cdb468d!

Ludo'.
Closed
L
L
Ludovic Courtès wrote on 15 Dec 2020 17:38
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(address . 44760@debbugs.gnu.org)
87y2hzkos7.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> ‘guix system init’ ends by copying the system’s closure from the “host”
> store to the target store; it also initializes the database of that
> target store.

There’s one ‘guix system init’ optimization that would be nice to have
in the installation image: instead of copying store items at all, simply
move the store from its temporary location in /mnt/tmp/guix-inst to
/mnt/gnu/store. That’d make it zero-copy.

It would probably require a separate code path, but it’s worth
considering.

Thoughts?

Ludo’.
J
J
Jonathan Brielmaier wrote on 16 Dec 2020 22:53
Closure copy in ‘guix system init’ is ine fficient
(address . 44760@debbugs.gnu.org)
a5e9cb2c-16b8-f880-3c71-34b0f03b7696@web.de
Hi Ludo,

there seems to be another regression in this series.

Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives:
```
[ 1/20] Loading './guix/base32.scm'...

[ 2/20] Loading './guix/build/store-copy.scm'...

[ 3/20] Loading './guix/build/syscalls.scm'...

[ 4/20] Loading './guix/build/utils.scm'...

[ 5/20] Loading './guix/combinators.scm'...

[ 6/20] Loading './guix/progress.scm'...

[ 7/20] Loading './guix/records.scm'...

[ 8/20] Loading './guix/serialization.scm'...

[ 9/20] Loading './guix/sets.scm'...

[10/20] Loading './guix/store/deduplication.scm'...

;;; Failed to autoload copy-file/deduplicate in (guix store deduplication):

;;; no code for module (gcrypt hash)

Backtrace:

17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?")

In ice-9/eval.scm:

619:8 16 (_ #f)

In srfi/srfi-1.scm:

460:18 15 (fold #<procedure 7fffeeb741e0 at ice-9/eval.scm:336:1?> ?)

460:18 14 (fold #<procedure 7fffeeb81ac0 at ice-9/eval.scm:336:1?> ?)

460:18 13 (fold #<procedure 7fffeecdaf20 at ice-9/eval.scm:336:1?> ?)

In ice-9/eval.scm:

619:8 12 (_ #(#(#<directory (guix build utils) 7fffef187d20>) # ?))

In ice-9/boot-9.scm:

2806:4 11 (save-module-excursion #<procedure 7fffeecdad40 at ice-?>)

In unknown file:

10 (primitive-load "./guix/store/deduplication.scm")

In ice-9/eval.scm:

721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?))

In ice-9/psyntax.scm:

1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?)

1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)

285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?)

In ice-9/eval.scm:

293:34 5 (_ #<directory (guix build utils) 7fffef187d20>)

In ice-9/boot-9.scm:

3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?)

2565:24 3 (call-with-deferred-observers #<procedure 7fffeec53cd0 ?>)

3393:24 2 (_)

222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?))

3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?)



ice-9/boot-9.scm:3300:6: In procedure resolve-interface:

no code for module (gcrypt hash)

```
All commits before in this series are fine. The fix in
d88ff09ea3138fc85c1463b0b345bd6ba71ca568 does not really help.
L
L
Ludovic Courtès wrote on 17 Dec 2020 14:24
Re: bug#44760: Closure copy in ‘guix system init’ is inefficient
(name . Jonathan Brielmaier)(address . jonathan.brielmaier@web.de)(address . 44760@debbugs.gnu.org)
87v9d0imzq.fsf@gnu.org
Hi,

Jonathan Brielmaier <jonathan.brielmaier@web.de> skribis:

Toggle quote (6 lines)
> there seems to be another regression in this series.
>
> Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives:
> ```
> [ 1/20] Loading './guix/base32.scm'...

[...]

Toggle quote (68 lines)
> [10/20] Loading './guix/store/deduplication.scm'...
>
> ;;; Failed to autoload copy-file/deduplicate in (guix store deduplication):
>
> ;;; no code for module (gcrypt hash)
>
> Backtrace:
>
> 17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?")
>
> In ice-9/eval.scm:
>
> 619:8 16 (_ #f)
>
> In srfi/srfi-1.scm:
>
> 460:18 15 (fold #<procedure 7fffeeb741e0 at ice-9/eval.scm:336:1?> ?)
>
> 460:18 14 (fold #<procedure 7fffeeb81ac0 at ice-9/eval.scm:336:1?> ?)
>
> 460:18 13 (fold #<procedure 7fffeecdaf20 at ice-9/eval.scm:336:1?> ?)
>
> In ice-9/eval.scm:
>
> 619:8 12 (_ #(#(#<directory (guix build utils) 7fffef187d20>) # ?))
>
> In ice-9/boot-9.scm:
>
> 2806:4 11 (save-module-excursion #<procedure 7fffeecdad40 at ice-?>)
>
> In unknown file:
>
> 10 (primitive-load "./guix/store/deduplication.scm")
>
> In ice-9/eval.scm:
>
> 721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?))
>
> In ice-9/psyntax.scm:
>
> 1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?)
>
> 1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)
>
> 285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?)
>
> In ice-9/eval.scm:
>
> 293:34 5 (_ #<directory (guix build utils) 7fffef187d20>)
>
> In ice-9/boot-9.scm:
>
> 3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?)
>
> 2565:24 3 (call-with-deferred-observers #<procedure 7fffeec53cd0 ?>)
>
> 3393:24 2 (_)
>
> 222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?))
>
> 3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?)
>
>
>
> ice-9/boot-9.scm:3300:6: In procedure resolve-interface:
>
> no code for module (gcrypt hash)

As discussed on IRC, the following command works for me:

guix time-machine \
--commit=6a060ff27ff68384d7c90076baa36c349fff689d -- \
pull -p /tmp/test

However, I don’t have enough information about the issue.

Could you provide information on how to reproduce the issue?

Thanks in advance,
Ludo’.
?
Your comment

This issue is archived.

To comment on this conversation send an email to 44760@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it
mumi current 44760
Then, you may apply the latest patchset in this issue (with sign off)
mumi am -- -s
Or, compose a reply to this issue
mumi compose
Or, send patches to this issue
mumi send-email *.patch