[PATCH 0/2] guix hash: eases conversion

DoneSubmitted by zimoun.
Details
2 participants
  • Ludovic Courtès
  • zimoun
Owner
unassigned
Severity
normal
Z
Z
zimoun wrote on 20 Oct 2021 18:50
(address . guix-patches@gnu.org)(name . zimoun)(address . zimon.toutoune@gmail.com)
20211020165020.3358311-1-zimon.toutoune@gmail.com
Hi,

The first patch is a tiny improvement on the error handling.

1. The current situation does not correctly handle error because of
’with-error-handling’.
2. Using the option recursive changes the result for tarball, as with:

$ guix hash $(guix build hello -S)
0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i

$ guix hash $(guix build hello -S) --recursive
1qx3qqk86vgdvpqkhpgzq3gfcxmys29wzfizjb9asn4crbn503x9

And I am not able to imagine a case. To me, it should be a fixed-point.
That’s what the first patch correct.

Moreover, it is possible to pass several arguments,

Toggle snippet (21 lines)
$ find . -maxdepth 1 -type d | xargs guix hash -r
guix hash: erreur : nombre d'arguments incorrect

$ find . -maxdepth 1 -type d | xargs ./pre-inst-env guix hash -r
1rzh9b4b4qc5nf4mq601jr2p3xsw690q6d4137ymgq0an9xsli9v
1cgdvnjlh1ziwb12ax2wcrs7ddr44c2nhjali1v3ilsv7fsm79fq
0x64hc3jqq1jwbym5gvcbnsck4v08xxa3kr44m9961nsml1rpmld
03gzaccd1cws05sf469l9ghf9mhxqsnlkkbr859l13alba5isirb
1qmmppfg65wdzcg137hg62ic31ykzvgb26brcv77is1nszvrqm14
1ajw5s2ykyyvpaisv8xbd8rn77q1whk2fxmyfqn3qyzxjf8vw7sz
1fjnk5hsfvsyahf997f6nca5c01jh7gm590xcx2d2adjj2vm51r2
0hm8s9hc6c4x32v3ff0kz7npd1n2i3ld6p69ya68wxfhhkhwpg6r
1k1y2hax62r2jj7j8vk8wx6mhww42g77x1fp7iy151alplv6mi23
1c3dg3mfl4kg0px7rdj52qyxkpn00sdaf7z1bxib4n2wy175gd9m
15680dqbzr7dcngyqblyzqnr5s74rka4qh76n2pdfndd9gc81j0h
0hvlnas7grx69hrxbxz3zw9z80wr02m2c0lbjs0kcxv6wv3da871
1zvw0k4gl3sj3hagp415iy0dcqx8c1k3zwph3n1xcg0z2ljfqpl2



Then, working on Disarchive which uses base16 as encoding, it is annoying
twice,

a) because it requires to download when all the sources
b) because it sometimes requires to apply patches

Compare,

$ guix hash $(guix build ceph -S)
0ppd362s177cc47g75v0k27j7aaf27qc31cbbh0j2g30wmhl8gj7

with the checksum in the package definition:
0lmdri415hqczc9565s5m5568pnj97ipqxgnw6085kps0flwq5zh.

With the second patch, it becomes easy to convert the checksum from upstream:

$ ./pre-inst-env guix hash ceph -f base16
f017cca903face8280e1f6757ce349d25e644aa945175312fb0cc31248ccad52

and nothing is downloaded. Get the checksum of what Guix really builds is
done via the current way, for instance,

$ guix hash $(guix build ceph -S) -f base16
473e4461e5603c21015c8b85c1f0114ea9238f986097f30e61ec9ca08519ed5e

and the second patch allows to convert the checksum from the package
definition (without downloading).

For instance, now it is really cheap to do:

Toggle snippet (4 lines)
guix package -A | cut -f1 | grep julia | xargs ./pre-inst-env guix hash -f base16


All the best,
simon



zimoun (2):
scripts: hash: Improve error handling.
scripts: hash: Support file or package.

guix/scripts/hash.scm | 75 ++++++++++++++++++++++++++++++-------------
tests/guix-hash.sh | 10 ++++++
2 files changed, 63 insertions(+), 22 deletions(-)


base-commit: 19d3cfec72720a4a1339be3d14f4d88ae5bd59f4
--
2.32.0
Z
Z
zimoun wrote on 20 Oct 2021 18:54
[PATCH 1/2] scripts: hash: Improve error handling.
(address . 51307@debbugs.gnu.org)(name . zimoun)(address . zimon.toutoune@gmail.com)
20211020165435.3358398-1-zimon.toutoune@gmail.com
* guix/scripts/hash.scm (guix-hash): Allow several files.
[directory?]: New procedure.
[file-hash]: Catch system-error.
[hash-to-display]: New procedure.
---
guix/scripts/hash.scm | 56 +++++++++++++++++++++++++++----------------
1 file changed, 36 insertions(+), 20 deletions(-)

Toggle diff (84 lines)
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index b8622373cc..f3363549d3 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2013 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;; Copyright © 2018 Tim Gesthuizen <tim.gesthuizen@yahoo.de>
+;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -135,6 +136,11 @@ (define (vcs-file? file stat)
       (else
        #f)))
 
+  (define (directory? file)
+    (case (stat:type (stat file))
+      ((directory) #t)
+      (else #f)))
+
   (let* ((opts (parse-options))
          (args (filter-map (match-lambda
                             (('argument . value)
@@ -149,27 +155,37 @@ (define (vcs-file? file stat)
     (define (file-hash file)
       ;; Compute the hash of FILE.
       ;; Catch and gracefully report possible '&nar-error' conditions.
-      (with-error-handling
-        (if (assoc-ref opts 'recursive?)
+      (if (and (assoc-ref opts 'recursive?)
+               (directory? file))
+          (with-error-handling
             (let-values (((port get-hash)
                           (open-hash-port (assoc-ref opts 'hash-algorithm))))
               (write-file file port #:select? select?)
               (force-output port)
-              (get-hash))
-            (match file
-              ("-" (port-hash (assoc-ref opts 'hash-algorithm)
-                              (current-input-port)))
-              (_   (call-with-input-file file
-                     (cute port-hash (assoc-ref opts 'hash-algorithm)
-                           <>)))))))
-
-    (match args
-      ((file)
-       (catch 'system-error
-         (lambda ()
-           (format #t "~a~%" (fmt (file-hash file))))
-         (lambda args
-           (leave (G_ "~a~%")
-                  (strerror (system-error-errno args))))))
-      (x
-       (leave (G_ "wrong number of arguments~%"))))))
+              (get-hash)))
+          (catch 'system-error
+            (lambda _
+              (call-with-input-file file
+                (cute port-hash (assoc-ref opts 'hash-algorithm)
+                      <>)))
+            (lambda args
+              (when (directory? file)
+                (display-hint (G_ "Try @option{--recursive}.")))
+              (leave (G_ "~a ~a~%")
+                     file
+                     (strerror (system-error-errno args)))))))
+
+    (define (hash-to-display thing)
+      (match thing
+        ((? file-exists? file)
+         (fmt (file-hash file)))
+        ("-" (with-error-handling
+               (fmt (port-hash (assoc-ref opts 'hash-algorithm)
+                               (current-input-port)))))
+        (x
+         (leave (G_ "wrong argument~%")))))
+
+    (for-each
+     (lambda (arg)
+       (format #t "~a~%" (hash-to-display arg)))
+     args)))
-- 
2.32.0
Z
Z
zimoun wrote on 20 Oct 2021 18:54
[PATCH 2/2] scripts: hash: Support file or package.
(address . 51307@debbugs.gnu.org)(name . zimoun)(address . zimon.toutoune@gmail.com)
20211020165435.3358398-2-zimon.toutoune@gmail.com
* guix/scripts/hash.scm (guix-hash)[package?]: New procedure.
[hash-to-display]: Use it.
* tests/guix-hash.scm: New test.
---
guix/scripts/hash.scm | 19 +++++++++++++++++--
tests/guix-hash.sh | 10 ++++++++++
2 files changed, 27 insertions(+), 2 deletions(-)

Toggle diff (78 lines)
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index f3363549d3..4f0d41629f 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -22,6 +22,9 @@
 
 (define-module (guix scripts hash)
   #:use-module (gcrypt hash)
+  #:use-module ((gnu packages) #:select (find-best-packages-by-name))
+  #:use-module (guix packages)
+  #:use-module ((guix utils) #:select (package-name->name+version))
   #:use-module (guix serialization)
   #:use-module (guix ui)
   #:use-module (guix scripts)
@@ -48,8 +51,8 @@ (define %default-options
     (hash-algorithm . ,(hash-algorithm sha256))))
 
 (define (show-help)
-  (display (G_ "Usage: guix hash [OPTION] FILE
-Return the cryptographic hash of FILE.\n"))
+  (display (G_ "Usage: guix hash [OPTION] FILE-OR-PACKAGE
+Return the cryptographic hash of FILE-OR-PACKAGE.\n"))
   (newline)
   (display (G_ "\
 Supported formats: 'base64', 'nix-base32' (default), 'base32',
@@ -141,6 +144,12 @@ (define (directory? file)
       ((directory) #t)
       (else #f)))
 
+  (define (package? spec)
+    (let-values (((name version) (package-name->name+version spec)))
+      (match (find-best-packages-by-name name version)
+        ((package) package)
+        (_ #f))))
+
   (let* ((opts (parse-options))
          (args (filter-map (match-lambda
                             (('argument . value)
@@ -182,6 +191,12 @@ (define (hash-to-display thing)
         ("-" (with-error-handling
                (fmt (port-hash (assoc-ref opts 'hash-algorithm)
                                (current-input-port)))))
+        ((? package? spec)
+         (let* ((package (package? spec))
+                (origin (package-source package))
+                (content-hash (origin-hash origin))
+                (hash (content-hash-value content-hash)))
+           (fmt hash)))
         (x
          (leave (G_ "wrong argument~%")))))
 
diff --git a/tests/guix-hash.sh b/tests/guix-hash.sh
index c4461fa955..41bd2b1588 100644
--- a/tests/guix-hash.sh
+++ b/tests/guix-hash.sh
@@ -1,6 +1,7 @@
 # GNU Guix --- Functional package management for GNU
 # Copyright © 2013, 2014, 2016, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
 # Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
+# Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
 #
 # This file is part of GNU Guix.
 #
@@ -65,3 +66,12 @@ test `guix hash -r $tmpdir -x` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g
 # Without '-r', this should fail.
 ! guix hash "$tmpdir"
 
+cat > "$tmpdir/foo.scm"<<EOF
+(use-modules (guix packages)
+             (gnu packages base)
+             (guix base16))
+(format #t "~a~%"
+        (bytevector->base16-string
+         (content-hash-value (origin-hash (package-source hello)))))
+EOF
+test `guix hash hello -f base16` = `guix repl -- $tmpdir/foo.scm`
-- 
2.32.0
L
L
Ludovic Courtès wrote on 30 Oct 2021 16:46
Re: bug#51307: [PATCH 0/2] guix hash: eases conversion
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
875ytetvt1.fsf_-_@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (5 lines)
> * guix/scripts/hash.scm (guix-hash): Allow several files.
> [directory?]: New procedure.
> [file-hash]: Catch system-error.
> [hash-to-display]: New procedure.

Nice! Could you update guix.texi and tests/guix-hash.texi?

Toggle quote (5 lines)
> - (with-error-handling
> - (if (assoc-ref opts 'recursive?)
> + (if (and (assoc-ref opts 'recursive?)
> + (directory? file))

This change is not related to the main purpose of the patch.

More importantly, note that ‘--recursive’ is not limited to directories:
it preserves file properties (directory, executable, or regular), so
it’s also useful for executable files for instance. It can also be used
for regular files, even if it’s less useful.

Toggle quote (8 lines)
> + (define (hash-to-display thing)
> + (match thing
> + ((? file-exists? file)
> + (fmt (file-hash file)))
> + ("-" (with-error-handling
> + (fmt (port-hash (assoc-ref opts 'hash-algorithm)
> + (current-input-port)))))

I’d swap the order of the two clauses and remove the call to
‘file-exists?’: if the file doesn’t exist, ‘file-hash’ will raise an
error.

Toggle quote (7 lines)
> + (x
> + (leave (G_ "wrong argument~%")))))
> +
> + (for-each
> + (lambda (arg)
> + (format #t "~a~%" (hash-to-display arg)))

Or just (compose display hash-to-display) ?

Maybe s/hash-to-display/formatted-hash/.

Could you send an updated patch?

Thanks,
Ludo’.
L
L
Ludovic Courtès wrote on 30 Oct 2021 16:48
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
871r42tvoy.fsf_-_@gnu.org
zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (20 lines)
> * guix/scripts/hash.scm (guix-hash)[package?]: New procedure.
> [hash-to-display]: Use it.
> * tests/guix-hash.scm: New test.
> ---
> guix/scripts/hash.scm | 19 +++++++++++++++++--
> tests/guix-hash.sh | 10 ++++++++++
> 2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
> index f3363549d3..4f0d41629f 100644
> --- a/guix/scripts/hash.scm
> +++ b/guix/scripts/hash.scm
> @@ -22,6 +22,9 @@
>
> (define-module (guix scripts hash)
> #:use-module (gcrypt hash)
> + #:use-module ((gnu packages) #:select (find-best-packages-by-name))
> + #:use-module (guix packages)
> + #:use-module ((guix utils) #:select (package-name->name+version))

I think I would prefer to keep (guix scripts hash) bare-bones, not
depending on the package machinery.

Most of the time one can run:

guix hash $(guix build -S PACKAGE)

It’s not quite what you want if the package has patches or a snippet,
but that’s okay IMO.

WDYT?

Thanks,
Ludo’.
L
L
Ludovic Courtès wrote on 30 Oct 2021 16:53
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87wnlusgwy.fsf@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (11 lines)
> 2. Using the option recursive changes the result for tarball, as with:
>
> $ guix hash $(guix build hello -S)
> 0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
>
> $ guix hash $(guix build hello -S) --recursive
> 1qx3qqk86vgdvpqkhpgzq3gfcxmys29wzfizjb9asn4crbn503x9
>
> And I am not able to imagine a case. To me, it should be a fixed-point.
> That’s what the first patch correct.

That’s expected: ‘--recursive’ uses a different computation method,
including file metadata (technically, it serializes the file as a nar
and computes the hash of the nar).

[...]

Toggle quote (28 lines)
> Then, working on Disarchive which uses base16 as encoding, it is annoying
> twice,
>
> a) because it requires to download when all the sources
> b) because it sometimes requires to apply patches
>
> Compare,
>
> $ guix hash $(guix build ceph -S)
> 0ppd362s177cc47g75v0k27j7aaf27qc31cbbh0j2g30wmhl8gj7
>
> with the checksum in the package definition:
> 0lmdri415hqczc9565s5m5568pnj97ipqxgnw6085kps0flwq5zh.
>
> With the second patch, it becomes easy to convert the checksum from upstream:
>
> $ ./pre-inst-env guix hash ceph -f base16
> f017cca903face8280e1f6757ce349d25e644aa945175312fb0cc31248ccad52
>
> and nothing is downloaded. Get the checksum of what Guix really builds is
> done via the current way, for instance,
>
> $ guix hash $(guix build ceph -S) -f base16
> 473e4461e5603c21015c8b85c1f0114ea9238f986097f30e61ec9ca08519ed5e
>
> and the second patch allows to convert the checksum from the package
> definition (without downloading).

Ah yes, got it. (I should read messages in the right order, oops!)

An obvious problem with the interface you propose is that it’s
ambiguous: are you printing the hash of the ‘ceph’ package, or computing
that of the ‘ceph’ file? I’m sure the Zen of Python has something on
ambiguity. ;-)

Do you think there’s another place where we could provide helpers for
the die-hard Disarchive hackers among us? Maybe we could get ‘guix lint
-c archival’ to print Disarchive URLs upon failure, and that’d already
help?

WDYT?

Thanks!

Ludo’.
Z
Z
zimoun wrote on 30 Oct 2021 17:19
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86tugyleub.fsf@gmail.com
Hi Ludo,

On Sat, 30 Oct 2021 at 16:53, Ludovic Courtès <ludo@gnu.org> wrote:
Toggle quote (17 lines)
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> 2. Using the option recursive changes the result for tarball, as with:
>>
>> $ guix hash $(guix build hello -S)
>> 0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
>>
>> $ guix hash $(guix build hello -S) --recursive
>> 1qx3qqk86vgdvpqkhpgzq3gfcxmys29wzfizjb9asn4crbn503x9
>>
>> And I am not able to imagine a case. To me, it should be a fixed-point.
>> That’s what the first patch correct.
>
> That’s expected: ‘--recursive’ uses a different computation method,
> including file metadata (technically, it serializes the file as a nar
> and computes the hash of the nar).

Yes, but that’s odd. It should be the same computation method for
tarballs. Nothing is recursive for a tarball therefore, the option
should be skipped. This proposal is perhaps not the best approach
although I lacked of imagination about corner cases.


Toggle quote (35 lines)
>> Then, working on Disarchive which uses base16 as encoding, it is annoying
>> twice,
>>
>> a) because it requires to download when all the sources
>> b) because it sometimes requires to apply patches
>>
>> Compare,
>>
>> $ guix hash $(guix build ceph -S)
>> 0ppd362s177cc47g75v0k27j7aaf27qc31cbbh0j2g30wmhl8gj7
>>
>> with the checksum in the package definition:
>> 0lmdri415hqczc9565s5m5568pnj97ipqxgnw6085kps0flwq5zh.
>>
>> With the second patch, it becomes easy to convert the checksum from upstream:
>>
>> $ ./pre-inst-env guix hash ceph -f base16
>> f017cca903face8280e1f6757ce349d25e644aa945175312fb0cc31248ccad52
>>
>> and nothing is downloaded. Get the checksum of what Guix really builds is
>> done via the current way, for instance,
>>
>> $ guix hash $(guix build ceph -S) -f base16
>> 473e4461e5603c21015c8b85c1f0114ea9238f986097f30e61ec9ca08519ed5e
>>
>> and the second patch allows to convert the checksum from the package
>> definition (without downloading).
>
> Ah yes, got it. (I should read messages in the right order, oops!)
>
> An obvious problem with the interface you propose is that it’s
> ambiguous: are you printing the hash of the ‘ceph’ package, or computing
> that of the ‘ceph’ file? I’m sure the Zen of Python has something on
> ambiguity. ;-)

The patch is printing the hash of upstream and it is the only hash which
matters – speaking both about packaging and about Disarchive.
Therefore, there is no ambiguity here. Better said, the ambiguity is
from “guix build --source” where it is not predictable beforehand what
it will return.

For instance, can you guess what “guix build -S graphviz” returns? ;-)
And can you guess the hash?


Toggle quote (5 lines)
> Do you think there’s another place where we could provide helpers for
> the die-hard Disarchive hackers among us? Maybe we could get ‘guix lint
> -c archival’ to print Disarchive URLs upon failure, and that’d already
> help?

To me, “guix hash” is about hashing therefore it appears to me the right
place for getting the hash of something. For instance, I do not find
“guix lint -c archival” the right place for sending a request and saving
to SWH; as olasd said at the time, IIRC. :-) However, the good is that
“guix lint <pkg>” just works (for archiving). :-)

Last, I do not want Diarchive URLs upon failure, I would like hashes and
upstream URLs on request. :-)

Well, I do not know. What could be better? Another subcommand “guix
archival” doing all these plumbings: save, display hashes, upstream URL,
disarchive URL, etc.

WDYT?

Cheers,
simon
Z
Z
zimoun wrote on 30 Oct 2021 17:24
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86pmrmleml.fsf@gmail.com
Re,

On Sat, 30 Oct 2021 at 17:19, zimoun <zimon.toutoune@gmail.com> wrote:

Toggle quote (5 lines)
>> An obvious problem with the interface you propose is that it’s
>> ambiguous: are you printing the hash of the ‘ceph’ package, or computing
>> that of the ‘ceph’ file? I’m sure the Zen of Python has something on
>> ambiguity. ;-)

[...]

Toggle quote (4 lines)
> Well, I do not know. What could be better? Another subcommand “guix
> archival” doing all these plumbings: save, display hashes, upstream URL,
> disarchive URL, etc.

Ah, I forgot. Zen of Python says,

Now is better than never.

and the proposed patch does “now” the things I need and this hypothetical
other subcommand falls into “never”. :-)


Cheers,
simon
Z
Z
zimoun wrote on 30 Oct 2021 17:34
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86h7cyle5c.fsf@gmail.com
On Sat, 30 Oct 2021 at 16:48, Ludovic Courtès <ludo@gnu.org> wrote:
Toggle quote (25 lines)
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> * guix/scripts/hash.scm (guix-hash)[package?]: New procedure.
>> [hash-to-display]: Use it.
>> * tests/guix-hash.scm: New test.
>> ---
>> guix/scripts/hash.scm | 19 +++++++++++++++++--
>> tests/guix-hash.sh | 10 ++++++++++
>> 2 files changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
>> index f3363549d3..4f0d41629f 100644
>> --- a/guix/scripts/hash.scm
>> +++ b/guix/scripts/hash.scm
>> @@ -22,6 +22,9 @@
>>
>> (define-module (guix scripts hash)
>> #:use-module (gcrypt hash)
>> + #:use-module ((gnu packages) #:select (find-best-packages-by-name))
>> + #:use-module (guix packages)
>> + #:use-module ((guix utils) #:select (package-name->name+version))
>
> I think I would prefer to keep (guix scripts hash) bare-bones, not
> depending on the package machinery.

I understand but I do not have better to propose. :-)


Toggle quote (4 lines)
> Most of the time one can run:
>
> guix hash $(guix build -S PACKAGE)

First, it is not true. For instance,

$ guix hash $(guix build -S graphviz)
00skvq94xanwmprz5073mhmssz953dwf7h23p5czrpgd5s7hy444

and this hash does not correspond to the hash used by
Disarchive. Because “guix build -S” does not return what Guix downloads
but what Guix builds. The cover letter provides another example for the
package ’ceph’.

Each time a patch or a snippet is added to origin, then it is not true.

Second, it requires to download for hashing. When the hash is already
in the source. I would like to avoid unnecessary downloads. I mean, it
is ok to download for a couple of packages. But it becomes impractical
for batch of 1200 (or more).


Toggle quote (3 lines)
> It’s not quite what you want if the package has patches or a snippet,
> but that’s okay IMO.

No, that’s not OK. :-) Because, it becomes really annoying. I have to
open gnu/packages, then recompose URL, then download and hash with the
right format. Too many manual and boring steps when all is there. Just
require CLI to be displayed.

Cheers,
simon

PS:
That’s what the cover letter was explaining. Sorry if it was badly
worded or unclear.
Z
Z
zimoun wrote on 30 Oct 2021 17:40
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86cznmldv8.fsf@gmail.com
Hi Ludo,

On Sat, 30 Oct 2021 at 16:46, Ludovic Courtès <ludo@gnu.org> wrote:
Toggle quote (9 lines)
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> * guix/scripts/hash.scm (guix-hash): Allow several files.
>> [directory?]: New procedure.
>> [file-hash]: Catch system-error.
>> [hash-to-display]: New procedure.
>
> Nice! Could you update guix.texi and tests/guix-hash.texi?

I will once we agree and reach consensus. ;-)


Toggle quote (12 lines)
>> - (with-error-handling
>> - (if (assoc-ref opts 'recursive?)
>> + (if (and (assoc-ref opts 'recursive?)
>> + (directory? file))
>
> This change is not related to the main purpose of the patch.
>
> More importantly, note that ‘--recursive’ is not limited to directories:
> it preserves file properties (directory, executable, or regular), so
> it’s also useful for executable files for instance. It can also be used
> for regular files, even if it’s less useful.

I understand. Could you provide concrete examples when it is not
directory? In order to see if there is a pattern.

Toggle quote (12 lines)
>> + (define (hash-to-display thing)
>> + (match thing
>> + ((? file-exists? file)
>> + (fmt (file-hash file)))
>> + ("-" (with-error-handling
>> + (fmt (port-hash (assoc-ref opts 'hash-algorithm)
>> + (current-input-port)))))
>
> I’d swap the order of the two clauses and remove the call to
> ‘file-exists?’: if the file doesn’t exist, ‘file-hash’ will raise an
> error.

I can swap order but not remove file-exists? Otherwise, the next clause…

Toggle quote (3 lines)
>> + (x
>> + (leave (G_ "wrong argument~%")))))

will never happen. And some tests are failing.

Toggle quote (8 lines)
>> + (for-each
>> + (lambda (arg)
>> + (format #t "~a~%" (hash-to-display arg)))
>
> Or just (compose display hash-to-display) ?
>
> Maybe s/hash-to-display/formatted-hash/.

Thanks. Indeed both are better. :-)


Toggle quote (2 lines)
> Could you send an updated patch?

Yes, but before, from my understanding, we should agree on the goal of
such patch. :-) See the two other replies.

Cheers,
simon
L
L
Ludovic Courtès wrote on 31 Oct 2021 14:43
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87tugxqpgy.fsf@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (12 lines)
> On Sat, 30 Oct 2021 at 16:46, Ludovic Courtès <ludo@gnu.org> wrote:
>> zimoun <zimon.toutoune@gmail.com> skribis:
>>
>>> * guix/scripts/hash.scm (guix-hash): Allow several files.
>>> [directory?]: New procedure.
>>> [file-hash]: Catch system-error.
>>> [hash-to-display]: New procedure.
>>
>> Nice! Could you update guix.texi and tests/guix-hash.texi?
>
> I will once we agree and reach consensus. ;-)

Apparently there’s consensus on this specific patch.

Toggle quote (10 lines)
>> This change is not related to the main purpose of the patch.
>>
>> More importantly, note that ‘--recursive’ is not limited to directories:
>> it preserves file properties (directory, executable, or regular), so
>> it’s also useful for executable files for instance. It can also be used
>> for regular files, even if it’s less useful.
>
> I understand. Could you provide concrete examples when it is not
> directory? In order to see if there is a pattern.

See the executables used in (gnu packages bootstrap): to preserve their
executable bit, you need ‘--recursive’. Same for symlinks.

Anyway, this patch is about allowing for multiple arguments (which we
agree on), so I think we should discuss the ‘--recursive’ issue
separately.

Thanks!

Ludo’.
L
L
Ludovic Courtès wrote on 31 Oct 2021 15:03
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87mtmppa0a.fsf@gnu.org
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (23 lines)
> On Sat, 30 Oct 2021 at 16:53, Ludovic Courtès <ludo@gnu.org> wrote:
>> zimoun <zimon.toutoune@gmail.com> skribis:
>>
>>> 2. Using the option recursive changes the result for tarball, as with:
>>>
>>> $ guix hash $(guix build hello -S)
>>> 0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
>>>
>>> $ guix hash $(guix build hello -S) --recursive
>>> 1qx3qqk86vgdvpqkhpgzq3gfcxmys29wzfizjb9asn4crbn503x9
>>>
>>> And I am not able to imagine a case. To me, it should be a fixed-point.
>>> That’s what the first patch correct.
>>
>> That’s expected: ‘--recursive’ uses a different computation method,
>> including file metadata (technically, it serializes the file as a nar
>> and computes the hash of the nar).
>
> Yes, but that’s odd. It should be the same computation method for
> tarballs. Nothing is recursive for a tarball therefore, the option
> should be skipped. This proposal is perhaps not the best approach
> although I lacked of imagination about corner cases.

The way I see it, ‘guix hash’ is a low-level tool and it should do what
I ask for and not try to second-guess.

Toggle quote (39 lines)
>>> Then, working on Disarchive which uses base16 as encoding, it is annoying
>>> twice,
>>>
>>> a) because it requires to download when all the sources
>>> b) because it sometimes requires to apply patches
>>>
>>> Compare,
>>>
>>> $ guix hash $(guix build ceph -S)
>>> 0ppd362s177cc47g75v0k27j7aaf27qc31cbbh0j2g30wmhl8gj7
>>>
>>> with the checksum in the package definition:
>>> 0lmdri415hqczc9565s5m5568pnj97ipqxgnw6085kps0flwq5zh.
>>>
>>> With the second patch, it becomes easy to convert the checksum from upstream:
>>>
>>> $ ./pre-inst-env guix hash ceph -f base16
>>> f017cca903face8280e1f6757ce349d25e644aa945175312fb0cc31248ccad52
>>>
>>> and nothing is downloaded. Get the checksum of what Guix really builds is
>>> done via the current way, for instance,
>>>
>>> $ guix hash $(guix build ceph -S) -f base16
>>> 473e4461e5603c21015c8b85c1f0114ea9238f986097f30e61ec9ca08519ed5e
>>>
>>> and the second patch allows to convert the checksum from the package
>>> definition (without downloading).
>>
>> Ah yes, got it. (I should read messages in the right order, oops!)
>>
>> An obvious problem with the interface you propose is that it’s
>> ambiguous: are you printing the hash of the ‘ceph’ package, or computing
>> that of the ‘ceph’ file? I’m sure the Zen of Python has something on
>> ambiguity. ;-)
>
> The patch is printing the hash of upstream and it is the only hash which
> matters – speaking both about packaging and about Disarchive.
> Therefore, there is no ambiguity here.

Sorry, I think I wasn’t clear. Consider this:

touch ceph
guix hash ceph

What does it print?

If the result depends on external context (the presence or not of a
‘ceph’ file in $PWD), that’s a brittle interface IMO.

This could be addressed by requiring users to be explicit, along these
lines:

guix hash ceph # compute the hash of the file called ‘ceph’
guix hash -P ceph # print the hash of the ‘ceph’ package


But there’s another issue with the interface: ‘guix hash -P ceph’ would
merely print the hash as it appears in the package definition. Thus
‘-H’ and ‘-r’ would have no effect, which can be confusing.

Toggle quote (2 lines)
> For instance, can you guess what “guix build -S graphviz” returns? ;-)

I’m aware it returns the source after applying patches and snippets; I
understand the shortcomings you mention quite well and I don’t deny
there’s a need. :-)

My comment is on the interface.

Toggle quote (18 lines)
>> Do you think there’s another place where we could provide helpers for
>> the die-hard Disarchive hackers among us? Maybe we could get ‘guix lint
>> -c archival’ to print Disarchive URLs upon failure, and that’d already
>> help?
>
> To me, “guix hash” is about hashing therefore it appears to me the right
> place for getting the hash of something. For instance, I do not find
> “guix lint -c archival” the right place for sending a request and saving
> to SWH; as olasd said at the time, IIRC. :-) However, the good is that
> “guix lint <pkg>” just works (for archiving). :-)
>
> Last, I do not want Diarchive URLs upon failure, I would like hashes and
> upstream URLs on request. :-)
>
> Well, I do not know. What could be better? Another subcommand “guix
> archival” doing all these plumbings: save, display hashes, upstream URL,
> disarchive URL, etc.

Yes, maybe? I don’t know. I think it’s important to take a step back:
perhaps we’re in need of a better tool around SWH and Disarchive, rather
than just a tool that displays a hash. We already have all the APIs to
do these things anyway, so if we clarify the use case, we can surely
glue things together to build a tool that will be more convenient.
(Maybe you’ve already written scripts to help you?)

For example, if the use case is “is this tarball in Disarchive”, this is
answered by ‘guix lint -c archival’, but perhaps we need a more
low-level or more focused tool in that area?

Regarding base16: that too isn’t set in stone. Commit
3cb5ae8577db28b2c6013b9d9ecf99cb696e3432 provides more flexibility, so
we don’t have to stick to base16.

I hope this perspective is useful!

Ludo’.
L
L
Ludovic Courtès wrote on 31 Oct 2021 15:48
Content hashes and file tree serialization methods
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87mtmpntcf.fsf_-_@gnu.org
zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (7 lines)
>> That’s expected: ‘--recursive’ uses a different computation method,
>> including file metadata (technically, it serializes the file as a nar
>> and computes the hash of the nar).
>
> Yes, but that’s odd. It should be the same computation method for
> tarballs. Nothing is recursive for a tarball

Thinking more about it, I think confusion stems from the term
“recursive” (inherited from Nix) because, as you write, it doesn’t
necessarily have to do with recursion and directory traversal.

Instead, it has to do with the serialization method.

Thus, probably, ‘--recursive’ could be replaced by a ‘-S’ flag:

guix hash -S nar something

or:

guix hash -S none something
guix hash -S git something
guix hash -S swh something

‘-S none’ would be like not passing ‘-r’; ‘-S git’ would serialize the
file/directory as a Git tree; ‘-S swh’ would serialize it the SWH way,
which is like Git except it preserves empty directories (Disarchive
implements the Git/SWH methods already.)

Thoughts?

As mentioned towards the end of
being able to deal with different tree serialization methods would be
useful going forward; for instance, if we had the option to store
SWH-style content hashes for origins, that’d help a lot. Allowing ‘guix
hash’ to deal with that is a first step in that direction.

(Apologies for slipping a bit off-topic…)

Ludo’.
Z
Z
zimoun wrote on 9 Nov 2021 10:18
Re: bug#51307: [PATCH 0/2] guix hash: eases conversion
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86zgqdbsc6.fsf@gmail.com
Hi Ludo,

On Sun, 31 Oct 2021 at 15:03, Ludovic Courtès <ludo@gnu.org> wrote:


[...]

Toggle quote (11 lines)
>> The patch is printing the hash of upstream and it is the only hash which
>> matters – speaking both about packaging and about Disarchive.
>> Therefore, there is no ambiguity here.
>
> Sorry, I think I wasn’t clear. Consider this:
>
> touch ceph
> guix hash ceph
>
> What does it print?

It would print the first clause. Two things: 1. How many times do you
run “guix hash foo” inside a folder where there is a folder or file
’foo’? and 2. It is easy to document this corner case and “guix hash
./ceph” fixes the issue.

Well, the root is that I disagree with your comment, I guess. :-)

The way I see it, ‘guix hash’ is a low-level tool and it should
do what I ask for and not try to second-guess.

Bah it is similar as Garbage Collector debate; Pythonista says: devs are
too dumb for managing memory by themselves, it has to be done
automatically; C devs says: managing memory is too important for
second-guessing dev intent. ;-)


Note that from my understanding, “guix hash” and “guix download” are
somehow redundant, i.e., “guix download” should be included to “guix
hash”. Another story… but I was not drifting yet. ;-)


Toggle quote (3 lines)
> If the result depends on external context (the presence or not of a
> ‘ceph’ file in $PWD), that’s a brittle interface IMO.

I trust your experience on designing interfaces. :-)


Toggle quote (6 lines)
> This could be addressed by requiring users to be explicit, along these
> lines:
>
> guix hash ceph # compute the hash of the file called ‘ceph’
> guix hash -P ceph # print the hash of the ‘ceph’ package

Well, let’s go for that. One last question about bikeshedding, what
should do

guix hash -P ceph ceph
? Print twice hash of ceph package? Or print hash of ceph package and
hash of ceph file?


Toggle quote (4 lines)
> But there’s another issue with the interface: ‘guix hash -P ceph’ would
> merely print the hash as it appears in the package definition. Thus
> ‘-H’ and ‘-r’ would have no effect, which can be confusing.

Wow, many many options of many many Guix commands cannot be composed.

Aside, these two still open bugs,


for instance,

guix package --list-installed --show=hello
guix package --show=hello --list-installed

guix package --list-available --list-installed
guix package --list-installed --list-available

And many more,

guix pull --commit=1234 --branch=core-updates

and so “guix time-machine” too. And I am not speaking about build
transformations.


Bah, ok let’s avoid to add another one. :-) It seems possible to detect
and display a warning that -H or -r does not take effect because -P.


Toggle quote (7 lines)
> Yes, maybe? I don’t know. I think it’s important to take a step back:
> perhaps we’re in need of a better tool around SWH and Disarchive, rather
> than just a tool that displays a hash. We already have all the APIs to
> do these things anyway, so if we clarify the use case, we can surely
> glue things together to build a tool that will be more convenient.
> (Maybe you’ve already written scripts to help you?)

I will start to collect my needs and what I am doing when playing with
that. And I will try to put that inside an extension, such as “guix
archival”. It will be a basis for judging if it is worth or not.

No, I do not have scripts. I mean, each time I work on that topic, I
write again and again some quick and dirty stuff coupled to ugly Bash
glue code.

This patch is because I have been annoyed to repeat again and again. :-)


Well, I am going to send another version adding multi FILE, first patch
which is making consensus, and second patch the option --package/-P.


Cheers,
simon
Z
Z
zimoun wrote on 18 Nov 2021 01:20
[PATCH v2 1/3] scripts: hash: Support several files.
(address . 51307@debbugs.gnu.org)
20211118002023.3323307-2-zimon.toutoune@gmail.com
* guix/scripts/hash.scm (guix-hash): Allow several files.
[file-hash]: Catch system-error.
[formatted-hash]: New procedure.
---
guix/scripts/hash.scm | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)

Toggle diff (65 lines)
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index b8622373cc..12f542929b 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2013 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;; Copyright © 2018 Tim Gesthuizen <tim.gesthuizen@yahoo.de>
+;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -149,27 +150,31 @@ (define (vcs-file? file stat)
     (define (file-hash file)
       ;; Compute the hash of FILE.
       ;; Catch and gracefully report possible '&nar-error' conditions.
-      (with-error-handling
-        (if (assoc-ref opts 'recursive?)
+      (if (assoc-ref opts 'recursive?)
+          (with-error-handling
             (let-values (((port get-hash)
                           (open-hash-port (assoc-ref opts 'hash-algorithm))))
               (write-file file port #:select? select?)
               (force-output port)
-              (get-hash))
-            (match file
-              ("-" (port-hash (assoc-ref opts 'hash-algorithm)
-                              (current-input-port)))
-              (_   (call-with-input-file file
-                     (cute port-hash (assoc-ref opts 'hash-algorithm)
-                           <>)))))))
+              (get-hash)))
+          (catch 'system-error
+            (lambda _
+              (call-with-input-file file
+                (cute port-hash (assoc-ref opts 'hash-algorithm)
+                      <>)))
+            (lambda args
+              (leave (G_ "~a ~a~%")
+                     file
+                     (strerror (system-error-errno args)))))))
 
-    (match args
-      ((file)
-       (catch 'system-error
-         (lambda ()
-           (format #t "~a~%" (fmt (file-hash file))))
-         (lambda args
-           (leave (G_ "~a~%")
-                  (strerror (system-error-errno args))))))
-      (x
-       (leave (G_ "wrong number of arguments~%"))))))
+    (define (formatted-hash thing)
+      (match thing
+        ("-" (with-error-handling
+               (fmt (port-hash (assoc-ref opts 'hash-algorithm)
+                               (current-input-port)))))
+        (_
+         (fmt (file-hash thing)))))
+
+    (for-each
+     (compose (cute format #t "~a~%" <>) formatted-hash)
+     args)))
-- 
2.33.1
Z
Z
zimoun wrote on 18 Nov 2021 01:20
[PATCH v2 3/3] scripts: hash: Add git serializer.
(address . 51307@debbugs.gnu.org)
20211118002023.3323307-4-zimon.toutoune@gmail.com
* guix/scripts/hash.scm (git-hash): New procedure.
(%options): Use it.
* tests/guix-hash.sh: Test it.
* doc/guix.texi: Update.
---
doc/guix.texi | 4 +++-
guix/scripts/hash.scm | 15 +++++++++++++++
tests/guix-hash.sh | 6 ++++++
3 files changed, 24 insertions(+), 1 deletion(-)

Toggle diff (98 lines)
diff --git a/doc/guix.texi b/doc/guix.texi
index 20041c20b7..af2e99903b 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -11638,7 +11638,7 @@ legacy alias for @var{type} sets to @code{nar}.
 @itemx -S
 Compute the hash on @var{file} using @var{type} serialization.
 
-Supported types: @code{none} and @code{nar}.
+Supported types: @code{none}, @code{nar} and @code{git}.
 
 When using @code{nar}, the hash is computed on an archive containing
 @var{file}, including its children if it is a directory.  Some of the
@@ -11649,6 +11649,8 @@ impact on the hash (@pxref{Invoking guix archive}).
 @c FIXME: Replace xref above with xref to an ``Archive'' section when
 @c it exists.
 
+Using @code{git} serializes the file or directory as a Git tree.
+
 @item --exclude-vcs
 @itemx -x
 When combined with @option{--recursive}, exclude version control system
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index d05ecb80ba..80581f2340 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -35,6 +35,8 @@ (define-module (guix scripts hash)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-37)
+  #:use-module ((disarchive git-hash) #:select (git-hash-file
+                                                git-hash-directory))
   #:export (guix-hash))
 
 
@@ -60,6 +62,17 @@ (define* (default-hash file #:optional
      (call-with-input-file file
        (cute port-hash algorithm <>)))))
 
+(define* (git-hash file #:optional
+                       (algorithm (assoc-ref %default-options 'hash-algorithm))
+                       select?)
+  (define directory?
+    (case (stat:type (stat file))
+      ((directory) #t)
+      (else #f)))
+  (if directory?
+      (git-hash-directory file algorithm)
+      (git-hash-file file algorithm)))
+
 
 ;;;
 ;;; Command-line options.
@@ -138,6 +151,8 @@ (define serializer-proc
                        default-hash)
                       ("nar"
                        nar-hash)
+                      ("git"
+                       git-hash)
                       (x
                        (leave (G_ "unsupported serializer type: ~a~%")
                               arg))))
diff --git a/tests/guix-hash.sh b/tests/guix-hash.sh
index cdcfac19bc..bb3973771f 100644
--- a/tests/guix-hash.sh
+++ b/tests/guix-hash.sh
@@ -34,6 +34,9 @@ test `guix hash -f base32 /dev/null` = 4oymiquy7qobjgx36tejs35zeqt24qpemsnzgtfes
 test `guix hash -H sha512 -f hex /dev/null` = cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
 test `guix hash -H sha1 -f base64 /dev/null` = "2jmj7l5rSw0yVb/vlWAYkK/YBwk="
 
+# idem as `cat /dev/null | git hash-object --stdin`
+test `guix hash -S git -H sha1 -f hex  /dev/null` = e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
+
 ! guix hash -H abcd1234 /dev/null
 
 mkdir "$tmpdir"
@@ -44,6 +47,7 @@ mkdir "$tmpdir/subdir"
 
 test `guix hash -S nar "$tmpdir"` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
 test `guix hash -S nar "$tmpdir" -H sha512` = 301ra58c2vahczzxiyfin41mpyb0ljh4dh9zn3ijvwviaw1j40sfzw5skh9x945da88n3785ggifzig7acd6k72h0mpsc20m1f66m9n
+test `guix hash -S git "$tmpdir" -H sha512` = 158b10d1bsdk4pm8ym9cg9ckfak1b0cgpw7365cl6s341ir380mh2f4ylicyh8khyrfnwq5cn9766d7m8fbfwwl94ndkv456v6a8knr
 
 # Deprecated --recursive option
 test `guix hash -r "$tmpdir" 2>/dev/null` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
@@ -62,9 +66,11 @@ touch "$tmpdir/.git/foo"
 
 # ...changes the hash
 test `guix hash -S nar $tmpdir` = 0a50z04zyzf7pidwxv0nwbj82pgzbrhdy9562kncnvkcfvb48m59
+test `guix hash -S git $tmpdir` = 0ghlpca9xaswa1ay1g55dknwd9q899mi3ahfr43pq083v8wisjc7
 
 # ...but remains the same when using `-x'
 test `guix hash -S nar $tmpdir -x` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
+test `guix hash -S git $tmpdir -x` = 0ghlpca9xaswa1ay1g55dknwd9q899mi3ahfr43pq083v8wisjc7
 
 # Without '-r', this should fail.
 ! guix hash "$tmpdir"
-- 
2.33.1
Z
Z
zimoun wrote on 18 Nov 2021 01:20
[PATCH v2 2/3] scripts: hash: Add 'serializer' option.
(address . 51307@debbugs.gnu.org)
20211118002023.3323307-3-zimon.toutoune@gmail.com
* guix/scripts/hash.scm (%options): Deprecate 'recursive', add 'serializer'.
(%default-options): Add 'serializer'.
(nar-hash): New procedure.
(default-hash): New procedure.
(guix-hash)[file-hash]: Use them.
(show-help): Adjust.
* tests/guix-hash.scm: Adjust.
* doc/guix.texi: Update.
---
doc/guix.texi | 25 ++++++++-----
guix/scripts/hash.scm | 82 +++++++++++++++++++++++++++++--------------
tests/guix-hash.sh | 14 +++++---
3 files changed, 81 insertions(+), 40 deletions(-)

Toggle diff (211 lines)
diff --git a/doc/guix.texi b/doc/guix.texi
index 89a970908d..20041c20b7 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -71,7 +71,7 @@ Copyright @copyright{} 2019 Kyle Andrews@*
 Copyright @copyright{} 2019 Alex Griffin@*
 Copyright @copyright{} 2019, 2020, 2021 Guillaume Le Vaillant@*
 Copyright @copyright{} 2020 Liliana Marie Prikler@*
-Copyright @copyright{} 2019, 2020 Simon Tournier@*
+Copyright @copyright{} 2019, 2020, 2021 Simon Tournier@*
 Copyright @copyright{} 2020 Wiktor Żelazny@*
 Copyright @copyright{} 2020 Damien Cassou@*
 Copyright @copyright{} 2020 Jakub Kądziołka@*
@@ -11631,14 +11631,21 @@ in the definitions of packages.
 
 @item --recursive
 @itemx -r
-Compute the hash on @var{file} recursively.
-
-In this case, the hash is computed on an archive containing @var{file},
-including its children if it is a directory.  Some of the metadata of
-@var{file} is part of the archive; for instance, when @var{file} is a
-regular file, the hash is different depending on whether @var{file} is
-executable or not.  Metadata such as time stamps has no impact on the
-hash (@pxref{Invoking guix archive}).
+This option is deprecated in favor of @option{--serializer}.  It is a
+legacy alias for @var{type} sets to @code{nar}.
+
+@item --serializer=@var{type}
+@itemx -S
+Compute the hash on @var{file} using @var{type} serialization.
+
+Supported types: @code{none} and @code{nar}.
+
+When using @code{nar}, the hash is computed on an archive containing
+@var{file}, including its children if it is a directory.  Some of the
+metadata of @var{file} is part of the archive; for instance, when
+@var{file} is a regular file, the hash is different depending on whether
+@var{file} is executable or not.  Metadata such as time stamps has no
+impact on the hash (@pxref{Invoking guix archive}).
 @c FIXME: Replace xref above with xref to an ``Archive'' section when
 @c it exists.
 
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index 12f542929b..d05ecb80ba 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -37,6 +37,29 @@ (define-module (guix scripts hash)
   #:use-module (srfi srfi-37)
   #:export (guix-hash))
 
+
+;;;
+;;; Serializers
+;;;
+
+(define* (nar-hash file #:optional
+                   (algorithm (assoc-ref %default-options 'hash-algorithm))
+                   select?)
+  (let-values (((port get-hash)
+                (open-hash-port algorithm)))
+    (write-file file port #:select? select?)
+    (force-output port)
+    (get-hash)))
+
+(define* (default-hash file #:optional
+                       (algorithm (assoc-ref %default-options 'hash-algorithm))
+                       select?)
+  (match file
+    ("-" (port-hash algorithm (current-input-port)))
+    (_
+     (call-with-input-file file
+       (cute port-hash algorithm <>)))))
+
 
 ;;;
 ;;; Command-line options.
@@ -45,7 +68,8 @@ (define-module (guix scripts hash)
 (define %default-options
   ;; Alist of default option values.
   `((format . ,bytevector->nix-base32-string)
-    (hash-algorithm . ,(hash-algorithm sha256))))
+    (hash-algorithm . ,(hash-algorithm sha256))
+    (serializer . ,default-hash)))
 
 (define (show-help)
   (display (G_ "Usage: guix hash [OPTION] FILE
@@ -61,7 +85,7 @@ (define (show-help)
   (format #t (G_ "
   -f, --format=FMT       write the hash in the given format"))
   (format #t (G_ "
-  -r, --recursive        compute the hash on FILE recursively"))
+  -S, --serializer=TYPE  compute the hash on FILE according to TYPE serialization"))
   (newline)
   (display (G_ "
   -h, --help             display this help and exit"))
@@ -102,7 +126,24 @@ (define fmt-proc
                               (alist-delete 'format result))))
         (option '(#\r "recursive") #f #f
                 (lambda (opt name arg result)
-                  (alist-cons 'recursive? #t result)))
+                  (warning (G_ "'--recursive' is deprecated, \
+use '--serializer' instead~%"))
+                  (alist-cons 'serializer nar-hash
+                              (alist-delete 'serializer result))))
+        (option '(#\S "serializer") #t #f
+                (lambda (opt name arg result)
+                  (define serializer-proc
+                    (match arg
+                      ("none"
+                       default-hash)
+                      ("nar"
+                       nar-hash)
+                      (x
+                       (leave (G_ "unsupported serializer type: ~a~%")
+                              arg))))
+
+                  (alist-cons 'serializer serializer-proc
+                              (alist-delete 'serializer result))))
         (option '(#\h "help") #f #f
                 (lambda args
                   (show-help)
@@ -145,35 +186,24 @@ (define (vcs-file? file stat)
          (fmt  (assq-ref opts 'format))
          (select? (if (assq-ref opts 'exclude-vcs?)
                       (negate vcs-file?)
-                      (const #t))))
+                      (const #t)))
+         (algorithm (assoc-ref opts 'hash-algorithm))
+         (serializer (assoc-ref opts 'serializer)))
 
     (define (file-hash file)
       ;; Compute the hash of FILE.
-      ;; Catch and gracefully report possible '&nar-error' conditions.
-      (if (assoc-ref opts 'recursive?)
+     ;; Catch and gracefully report possible error
+      (catch 'system-error
+        (lambda _
           (with-error-handling
-            (let-values (((port get-hash)
-                          (open-hash-port (assoc-ref opts 'hash-algorithm))))
-              (write-file file port #:select? select?)
-              (force-output port)
-              (get-hash)))
-          (catch 'system-error
-            (lambda _
-              (call-with-input-file file
-                (cute port-hash (assoc-ref opts 'hash-algorithm)
-                      <>)))
-            (lambda args
-              (leave (G_ "~a ~a~%")
-                     file
-                     (strerror (system-error-errno args)))))))
+            (serializer file algorithm select?)))
+        (lambda args
+          (leave (G_ "~a ~a~%")
+                 file
+                 (strerror (system-error-errno args))))))
 
     (define (formatted-hash thing)
-      (match thing
-        ("-" (with-error-handling
-               (fmt (port-hash (assoc-ref opts 'hash-algorithm)
-                               (current-input-port)))))
-        (_
-         (fmt (file-hash thing)))))
+      (fmt (file-hash thing)))
 
     (for-each
      (compose (cute format #t "~a~%" <>) formatted-hash)
diff --git a/tests/guix-hash.sh b/tests/guix-hash.sh
index c4461fa955..cdcfac19bc 100644
--- a/tests/guix-hash.sh
+++ b/tests/guix-hash.sh
@@ -42,25 +42,29 @@ chmod +x "$tmpdir/exe"
 ( cd "$tmpdir" ; ln -s exe symlink )
 mkdir "$tmpdir/subdir"
 
-test `guix hash -r "$tmpdir"` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
-test `guix hash -r "$tmpdir" -H sha512` = 301ra58c2vahczzxiyfin41mpyb0ljh4dh9zn3ijvwviaw1j40sfzw5skh9x945da88n3785ggifzig7acd6k72h0mpsc20m1f66m9n
+test `guix hash -S nar "$tmpdir"` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
+test `guix hash -S nar "$tmpdir" -H sha512` = 301ra58c2vahczzxiyfin41mpyb0ljh4dh9zn3ijvwviaw1j40sfzw5skh9x945da88n3785ggifzig7acd6k72h0mpsc20m1f66m9n
+
+# Deprecated --recursive option
+test `guix hash -r "$tmpdir" 2>/dev/null` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
+test `guix hash -r "$tmpdir" -H sha512 2>/dev/null` = 301ra58c2vahczzxiyfin41mpyb0ljh4dh9zn3ijvwviaw1j40sfzw5skh9x945da88n3785ggifzig7acd6k72h0mpsc20m1f66m9n
 
 # Without '-r', this should fail.
 ! guix hash "$tmpdir"
 
 # This should fail because /dev/null is a character device, which
 # the archive format doesn't support.
-! guix hash -r /dev/null
+! guix hash -S nar /dev/null
 
 # Adding a .git directory
 mkdir "$tmpdir/.git"
 touch "$tmpdir/.git/foo"
 
 # ...changes the hash
-test `guix hash -r $tmpdir` = 0a50z04zyzf7pidwxv0nwbj82pgzbrhdy9562kncnvkcfvb48m59
+test `guix hash -S nar $tmpdir` = 0a50z04zyzf7pidwxv0nwbj82pgzbrhdy9562kncnvkcfvb48m59
 
 # ...but remains the same when using `-x'
-test `guix hash -r $tmpdir -x` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
+test `guix hash -S nar $tmpdir -x` = 10k1lw41wyrjf9mxydi0is5nkpynlsvgslinics4ppir13g7d74p
 
 # Without '-r', this should fail.
 ! guix hash "$tmpdir"
-- 
2.33.1
Z
Z
zimoun wrote on 18 Nov 2021 01:20
[PATCH v2 0/3] scripts: hash: Several files and serializer.
(address . 51307@debbugs.gnu.org)
20211118002023.3323307-1-zimon.toutoune@gmail.com
Hi,

The first patch is the one which already made consensus. Minor suggestions by
[1] has been added.

The 2 following ones add UI presented there [2]. Basically, it allows:

guix hash foo # similar as -S none
guix hash foo -S nar # similar as -r
guix hash foo -S git

Especially, compare:

$ cat /tmp/foo.txt | git hash-object --stdin
557db03de997c86a4a028e1ebd3a1ceb225be238
$ ./pre-inst-env guix hash -S git -H sha1 -f hex /tmp/foo.txt
557db03de997c86a4a028e1ebd3a1ceb225be238


Two remarks:

1. « #:use-module (disarchive) » is added. Elsewhere (guix download), the
module is "conditionally" loaded, as if disarchive is an optional
dependency. What should be done?

2. The SWH serializer is not added yet.


Cheers,
simon


zimoun (3):
scripts: hash: Support several files.
scripts: hash: Add 'serializer' option.
scripts: hash: Add git serializer.

doc/guix.texi | 27 +++++++----
guix/scripts/hash.scm | 108 ++++++++++++++++++++++++++++++------------
tests/guix-hash.sh | 20 ++++++--
3 files changed, 112 insertions(+), 43 deletions(-)


base-commit: 99084abd80d7c81e83263ffc6fd3699aeb8899c5
--
2.33.1
Z
Z
zimoun wrote on 18 Nov 2021 01:29
Re: Content hashes and file tree serialization methods
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86wnl6qpaw.fsf@gmail.com
Hi Ludo,

On Sun, 31 Oct 2021 at 15:48, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (21 lines)
> Thinking more about it, I think confusion stems from the term
> “recursive” (inherited from Nix) because, as you write, it doesn’t
> necessarily have to do with recursion and directory traversal.
>
> Instead, it has to do with the serialization method.
>
> Thus, probably, ‘--recursive’ could be replaced by a ‘-S’ flag:
>
> guix hash -S nar something
>
> or:
>
> guix hash -S none something
> guix hash -S git something
> guix hash -S swh something
>
> ‘-S none’ would be like not passing ‘-r’; ‘-S git’ would serialize the
> file/directory as a Git tree; ‘-S swh’ would serialize it the SWH way,
> which is like Git except it preserves empty directories (Disarchive
> implements the Git/SWH methods already.)

Well, v2 is an attempt for this UI. It does not solve my initial
problem but it is a first step in that direction. ;-)

Note that SWH serializer is not added. Because I have probably missed
to reach the implementation. :-)

Well, next step could be to add an option “--list-serializers“. Let
as an exercise for the reader. ;-)


Toggle quote (7 lines)
> As mentioned towards the end of
> <https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/>,
> being able to deal with different tree serialization methods would be
> useful going forward; for instance, if we had the option to store
> SWH-style content hashes for origins, that’d help a lot. Allowing ‘guix
> hash’ to deal with that is a first step in that direction.

Well, this series does not add SWH-style but it is a tiny step in that
direction. Somehow, after v2-patch#2, it is easy to add as many
serializers as we want. :-)

Toggle quote (2 lines)
> (Apologies for slipping a bit off-topic…)

Thanks for slipping off-topic. :-)

Cheers,
simon

*serializer: it is not an English word but I lacked imagination.
Z
Z
zimoun wrote on 15 Dec 2021 09:06
Re: [bug#51307] [PATCH v2 0/3] scripts: hash: Several files and serializer.
(address . ludo@gnu.org)(address . 51307@debbugs.gnu.org)
86r1ae5m32.fsf@gmail.com
Hi,

What do you think about patch#51307 [1] for “guix hash”?

As explained and discussed earlier in this thread, it renames the option
’-r’ to ’-S nar’ – keeping the old one for backward compatibility* – and
adds ’-S git’.


*backward compatibility: we can come later with a plan to remove and/or
deprecate.


On Thu, 18 Nov 2021 at 01:20, zimoun <zimon.toutoune@gmail.com> wrote:

Toggle quote (45 lines)
> The first patch is the one which already made consensus. Minor suggestions by
> [1] has been added.
>
> The 2 following ones add UI presented there [2]. Basically, it allows:
>
> guix hash foo # similar as -S none
> guix hash foo -S nar # similar as -r
> guix hash foo -S git
>
> Especially, compare:
>
> $ cat /tmp/foo.txt | git hash-object --stdin
> 557db03de997c86a4a028e1ebd3a1ceb225be238
> $ ./pre-inst-env guix hash -S git -H sha1 -f hex /tmp/foo.txt
> 557db03de997c86a4a028e1ebd3a1ceb225be238
>
>
> Two remarks:
>
> 1. « #:use-module (disarchive) » is added. Elsewhere (guix download), the
> module is "conditionally" loaded, as if disarchive is an optional
> dependency. What should be done?
>
> 2. The SWH serializer is not added yet.
>
>
> Cheers,
> simon
>
> 1: <http://issues.guix.gnu.org/issue/51307#3>
> 2: <http://issues.guix.gnu.org/issue/51307#12>
>
> zimoun (3):
> scripts: hash: Support several files.
> scripts: hash: Add 'serializer' option.
> scripts: hash: Add git serializer.
>
> doc/guix.texi | 27 +++++++----
> guix/scripts/hash.scm | 108 ++++++++++++++++++++++++++++++------------
> tests/guix-hash.sh | 20 ++++++--
> 3 files changed, 112 insertions(+), 43 deletions(-)
>
>
> base-commit: 99084abd80d7c81e83263ffc6fd3699aeb8899c5

Cheers,
simon
L
L
Ludovic Courtès wrote on 15 Dec 2021 11:05
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87k0g6xjxe.fsf@gnu.org
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (2 lines)
> What do you think about patch#51307 [1] for “guix hash”?

I haven’t forgotten about it but I’m lagging behind.

I’m getting there, apologies for the delay!

Ludo’.
L
L
Ludovic Courtès wrote on 17 Dec 2021 17:17
Re: [PATCH v2 1/3] scripts: hash: Support several files.
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87pmpvp5o7.fsf@gnu.org
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (4 lines)
> * guix/scripts/hash.scm (guix-hash): Allow several files.
> [file-hash]: Catch system-error.
> [formatted-hash]: New procedure.

Applied with minor tweaks:

• doc update;

• tests in ‘tests/guix-hash.sh’;

• error out when passed zero arguments, as was the case before.

Thanks,
Ludo’.
L
L
Ludovic Courtès wrote on 17 Dec 2021 17:17
Re: [PATCH v2 2/3] scripts: hash: Add 'serializer' option.
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307@debbugs.gnu.org)
87lf0jp5nn.fsf@gnu.org
zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (9 lines)
> * guix/scripts/hash.scm (%options): Deprecate 'recursive', add 'serializer'.
> (%default-options): Add 'serializer'.
> (nar-hash): New procedure.
> (default-hash): New procedure.
> (guix-hash)[file-hash]: Use them.
> (show-help): Adjust.
> * tests/guix-hash.scm: Adjust.
> * doc/guix.texi: Update.

Applied!
Z
Z
zimoun wrote on 17 Dec 2021 17:39
control message for bug #51307
(address . control@debbugs.gnu.org)
87wnk3nq2j.fsf@gmail.com
close 51307
quit
L
L
Ludovic Courtès wrote on 21 Dec 2021 10:09
Re: [PATCH v2 3/3] scripts: hash: Add git serializer.
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 51307-done@debbugs.gnu.org)
87bl1ab9zm.fsf@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

Toggle quote (25 lines)
> On Mon, 20 Dec 2021 at 14:40, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> > + #:use-module ((disarchive git-hash) #:select (git-hash-file
>> > + git-hash-directory))
>>
>> Use #:autoload (that makes Disarchive a soft dependency and also reduces
>> start-up time.)
>
> Thanks. As I asked in the cover-letter, these are confusing for me.
> For instance, why not autoloading all the used modules? And is it
> better to lazily load with something like:
>
> (match (and=> (resolve-module '(disarchive) #:ensure #f)
> (lambda (disarchive)
> (cons (module-ref disarchive '%disarchive-log-port)
> (module-ref disarchive 'disarchive-assemble))))
> (#f (format #t "could not load Disarchive~%")
> #f)
> ((%disarchive-log-port . disarchive-assemble)
>
> ? From guix/build/download.scm, or any other "lazy reference" in the code code.
>
> What are the differences? I have read the documentation but it is not
> clear for me the difference between '#:autoload' and 'module-ref'.

There are several reasons for autoloading: one is startup time/memory
footprint, and another one is having “soft dependencies” (you can still
build and use the thing when the dependency is missing; only that
specific feature is unavailable.)

Usually we’d use #:autoload, but in cases where we want full control
over the process, as in the Disarchive snippet you show above, we’d use
the programmatic interface.

There are also cases of “lazy references” where we use the programmatic
interface to load things at run time. This is used for instance to keep
(guix …) modules independent of (gnu …) modules.

I hope this sheds some light!

Toggle quote (6 lines)
>> I was wondering whether we should keep the deprecation warning for ‘-r’;
>> we might consider it a handy shorthand for ‘-S nar’?
>
> Ah, yes. Deprecate '--recursive' but keep '-r' as shorthand for '-S
> nar'. LGTM.

I’d treat ‘-r’ and ‘--recursive’ in the same way, no? That is, keep
them both without a deprecation warning, but document them as aliases
for ‘-S nar’. WDYT?

Thanks,
Ludo’.
Closed
?
Your comment

This issue is archived.

To comment on this conversation send email to 51307@debbugs.gnu.org