zimoun wrote on Wed 05-01-2022 at 13:58 [+0100]:
> [...]
> > > --8<---------------cut here---------------start------------->8---
> > >   "Compute the hash of FILE with ALGORITHM.  If NAR-SERIALIZER? is
> > >   #true, compute the combined hash (NAR hash) of FILE for which (SELECT?
> > >   FILE STAT) returns true.
> > > 
> > >   If NAR-SERIALIZER? is #false, compute the regular hash using the
> > >   default serializer.  It is meant to be used for a regular file.
> > > 
> > >   If NAR-SERIALIZER? is 'auto', when FILE is a directory, compute the
> > >   combined hash (NAR hash).  When FILE is a regular file, compute the
> > >   regular hash using the default serializer.  The option ’auto’ is meant
> > >   to apply by default the expected hash computation.
> > > 
> > >   Symbolic links are not dereferenced unless NAR-SERIALIZER? is false.
> > > 
> > >   This procedure must only be used under controlled circumstances; the
> > >   detection of symbolic links in FILE is racy.
> > > --8<---------------cut here---------------end--------------->8---
> 
> > The nar hash / regular hash difference seems a very low-level detail to
> > me, that most (all?) users don't need to be bothered about. Except
> > maybe if FILE denotes an executable regular file, but file-hash* is
> > currently only used on tarballs/zip files/git checkouts, which aren't
> > executable files unless weirdness or some kind of attack is happening.
> > 
> > I think that, the ‘least astonishing’ thing to do here, is computing
> > the hash that would go into the 'hash' / 'sha256' field of 'origin'
> > objects by default, and not the nar hash for regular files that's
> > almost never used.
> 
> I do not understand what you mean here.  ’file-hash*’ is a low-level
> detail, no?  Whatever. :-)

I don't see why it matters whether 'file-hash*' is classified as
low-level or high-level.  What I do care about is how easy file-hash*
is to use.

A low-level argument like #:nar-hash? #true/#false would make
file-hash* much more complicated to use: this patch series uses
file-hash* to compute the hash for 'origin' records, yet the
documentation of 'origin' doesn't mention 'nar' anywhere, and a search
for 'nar hash' in the manual gives zero results.

Instead, file-hash* talks about directories, regular files, recursion
and claims that the default value of #:recursive? usually does the
right thing, so I don't have to look up any complicated terminology
to figure out how to use file-hash* to compute hashes for 'origin'
records.

And in the rare situation where file-hash* doesn't do the right thing,
the documentation tells me I can set #:recursive? #true/#false.
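
As a rough illustration of that behaviour, here is a minimal sketch of
the decision that #:recursive? controls.  It is written in Python
rather than Scheme, it is not the actual implementation, and
'choose_hash_kind' is a hypothetical helper, not a Guix procedure.  It
assumes that under 'auto' only a plain regular file gets the regular
hash and everything else gets the nar hash.

```python
import os
import stat
import tempfile


def choose_hash_kind(path, recursive="auto"):
    """Return "nar" or "regular" for PATH (hypothetical helper, not Guix API)."""
    if recursive is True:
        return "nar"
    if recursive is False:
        return "regular"
    if recursive != "auto":
        raise ValueError("recursive must be True, False or 'auto'")
    # Assumption: under 'auto', only a plain regular file gets the regular
    # hash; directories (and anything else) get the nar hash.
    mode = os.lstat(path).st_mode
    return "regular" if stat.S_ISREG(mode) else "nar"


# A stand-in for a downloaded tarball and for a git checkout directory.
with tempfile.TemporaryDirectory() as checkout:
    tarball = os.path.join(checkout, "foo.tar.gz")
    with open(tarball, "wb") as port:
        port.write(b"not really a tarball")
    kinds = (choose_hash_kind(tarball), choose_hash_kind(checkout))
    print(kinds)  # prints ('regular', 'nar')
```

A caller only passes True or False explicitly in the rare case where
the default picks the wrong kind of hash.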
 
> Just, to be sure, I am proposing:
> 
>  1) It is v4 and ready, I guess.  About ’auto’, I could have waken up
>  earlier. :-) And it can be still improved later as you are saying in
>  the other answer.  So, we are done, right?

I think so, yes, except for a docstring change I'll send as a v5.
I'm also out of bikeshed paint.
Anyway, keep in mind that I'm not a committer.

>  2) From my point of view, ’#:recursive?’ needs to be adapted in
>  agreement with the discussion [1], quoting Ludo:
> 
>         Thinking more about it, I think confusion stems from the term
>         “recursive” (inherited from Nix) because, as you write, it
>         doesn’t necessarily have to do with recursion and directory
>         traversal.
> 
>         Instead, it has to do with the serialization method.
> 
>         1: <http://issues.guix.gnu.org/issue/51307>
> 
>    And I do not have a strong opinion.  Just a naive remark.

I don't think the arguments for (guix scripts hash) apply directly
to (guix hash) -- (guix scripts hash) supports multiple serialisers:

 * none (regular in (guix hash) terminology)
 * git
 * nar
 * swh

so something like -S nar makes a lot of sense there. But (guix hash)
is only for computing the hash of something that would become a store
item after interning; more specifically, it is currently only used for
computing the hash that would go into an (origin ...) object
(though I suppose it could be extended to support git/swh/... if
someone wants to do that).

Possibly some name like
#:treat-it-as-a-directory-or-an-executable-file-or-a-symlink-and-
compute-the-alternative-hash-even-if-it-is-regular?
would be clearer and technically more accurate than #:recursive?, but
that's a bit of a mouthful.

>  3) Whatever the keyword for the current v4 ’#:recursive?’ is picked, I
>   still find the current docstring wording unclear.  In fact, reading
>   the code is more helpful. :-) I am just proposing a reword which
>   appears to me clearer than the current v4 one.  Maybe, I am missing
>   the obvious.  Or maybe this proposed rewording is not clearer. :-)

I've reworded it a bit; the v4 docstring falsely claimed that the nar
hash was always computed when recursive? is 'auto' (even if FILE is a
regular file). The new wording also mentions executable files and
SELECT?.

Greetings,
Maxime.