git download defaults to origin/master

DoneSubmitted by Ricardo Wurmus.
Details
4 participants
  • Kyle Meyer
  • Ludovic Courtès
  • Marius Bakke
  • Ricardo Wurmus
Owner
unassigned
Severity
normal
R
R
Ricardo Wurmus wrote on 11 Dec 2020 22:02
(address . bug-guix@gnu.org)
878sa4vyxb.fsf@elephly.net
importer fails, because the git repository does not have an
origin/master branch. This repository only has a “main” branch.

Arguably, this shouldn’t matter, but (guix git) has the “master” name
set up as the default. When cloning a repository it may be better to
fetch everything and select the default branch — whichever name it may
have.

Here is a reproducer:

Toggle snippet (41 lines)
$ guix import cran -r -a git https://github.com/immunogenomics/scpost
Backtrace:
In ice-9/boot-9.scm:
1736:10 19 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
1731:15 18 (with-exception-handler #<procedure 7f65768171e0 at ice-9/boot-9.scm:1815:7…> …)
In guix/scripts/import/cran.scm:
103:26 17 (_)
In guix/import/utils.scm:
458:31 16 (recursive-import "https://github.com/immunogenomics/scpost" # _ #:guix-name …)
449:33 15 (lookup-node "https://github.com/immunogenomics/scpost" #f)
In guix/memoization.scm:
98:0 14 (mproc "https://github.com/immunogenomics/scpost" #:version #f #:repo git)
In unknown file:
13 (_ #<procedure 7f6576843ae0 at guix/memoization.scm:179:32 ()> #<procedure …> …)
In guix/import/cran.scm:
576:24 12 (_ "https://github.com/immunogenomics/scpost" #:repo _ #:version _)
269:25 11 (fetch-description _ "https://github.com/immunogenomics/scpost")
In guix/memoization.scm:
98:0 10 (mproc "https://github.com/immunogenomics/scpost" #:method git)
In unknown file:
9 (_ #<procedure 7f6576843a80 at guix/memoization.scm:179:32 ()> #<procedure …> …)
In ice-9/boot-9.scm:
1736:10 8 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/store.scm:
632:37 7 (thunk)
In guix/git.scm:
424:8 6 (latest-repository-commit #<store-connection 256.99 7f6577201460> "https://…" …)
240:2 5 (update-cached-checkout _ #:ref _ #:recursive? _ #:check-out? _ # _ # _ # _)
208:19 4 (resolve _)
In git/branch.scm:
101:8 3 (_ _ _ _)
In git/bindings.scm:
77:2 2 (raise-git-error _)
In ice-9/boot-9.scm:
1669:16 1 (raise-exception _ #:continuable? _)
1669:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1669:16: In procedure raise-exception:
Git error: cannot locate remote-tracking branch 'origin/master'

--
Ricardo
K
K
Kyle Meyer wrote on 12 Dec 2020 00:02
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 45187@debbugs.gnu.org)
87v9d8dk0r.fsf@kyleam.com
Ricardo Wurmus writes:

Toggle quote (9 lines)
> importer fails, because the git repository does not have an
> origin/master branch. This repository only has a “main” branch.
>
> Arguably, this shouldn’t matter, but (guix git) has the “master” name
> set up as the default. When cloning a repository it may be better to
> fetch everything and select the default branch — whichever name it may
> have.

One option may be to use the remote HEAD symref. That's probably the
best indicator of what the primary branch is. In a clone, it doesn't
necessarily match HEAD on the remote, because users may change it to
another branch they're interested in, but that isn't really relevant to
these behind-the-scenes checkouts.

Here's a quick and dirty demo that makes your reproducer work. A real
patch in this direction would of course look very different.

Toggle diff (32 lines)
diff --git a/guix/git.scm b/guix/git.scm
index ca77b9f54b..7320c0d6c8 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -207,6 +207,9 @@ (define (resolve-reference repository ref)
        (let ((oid (reference-target
                    (branch-lookup repository branch BRANCH-REMOTE))))
          (object-lookup repository oid)))
+      (('symref . symref)
+       (let ((oid (reference-name->oid repository symref)))
+         (object-lookup repository oid)))
       (('commit . commit)
        (let ((len (string-length commit)))
          ;; 'object-lookup-prefix' appeared in Guile-Git in Mar. 2018, so we
@@ -320,7 +323,7 @@ (define (reference-available? repository ref)
 
 (define* (update-cached-checkout url
                                  #:key
-                                 (ref '(branch . "master"))
+                                 (ref '(symref . "refs/remotes/origin/HEAD"))
                                  recursive?
                                  (check-out? #t)
                                  starting-commit
@@ -395,7 +398,7 @@ (define* (latest-repository-commit store url
                                    (log-port (%make-void-port "w"))
                                    (cache-directory
                                     (%repository-cache-directory))
-                                   (ref '(branch . "master")))
+                                   (ref '(symref . "refs/remotes/origin/HEAD")))
   "Return two values: the content of the git repository at URL copied into a
 store directory and the sha1 of the top level commit in this directory.  The
 reference to be checkout, once the repository is fetched, is specified by REF.
M
M
Marius Bakke wrote on 13 Dec 2020 22:52
(address . 45187@debbugs.gnu.org)
87lfe19xw1.fsf@gnu.org
Kyle Meyer <kyle@kyleam.com> skriver:

Toggle quote (20 lines)
> Ricardo Wurmus writes:
>
>> Importing https://github.com/immunogenomics/scpost with the CRAN
>> importer fails, because the git repository does not have an
>> origin/master branch. This repository only has a “main” branch.
>>
>> Arguably, this shouldn’t matter, but (guix git) has the “master” name
>> set up as the default. When cloning a repository it may be better to
>> fetch everything and select the default branch — whichever name it may
>> have.
>
> One option may be to use the remote HEAD symref. That's probably the
> best indicator of what the primary branch is. In a clone, it doesn't
> necessarily match HEAD on the remote, because users may change it to
> another branch they're interested in, but that isn't really relevant to
> these behind-the-scenes checkouts.
>
> Here's a quick and dirty demo that makes your reproducer work. A real
> patch in this direction would of course look very different.

Another quick and dirty patch to make this specific example work ...
Toggle diff (110 lines)
diff --git a/guix/git.scm b/guix/git.scm
index ca77b9f54b..0c49859e42 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -198,47 +198,11 @@ of SHA1 string."
     (last (string-split url #\/)) ".git" "")
    "-" (string-take sha1 7)))
 
-(define (resolve-reference repository ref)
-  "Resolve the branch, commit or tag specified by REF, and return the
-corresponding Git object."
-  (let resolve ((ref ref))
-    (match ref
-      (('branch . branch)
-       (let ((oid (reference-target
-                   (branch-lookup repository branch BRANCH-REMOTE))))
-         (object-lookup repository oid)))
-      (('commit . commit)
-       (let ((len (string-length commit)))
-         ;; 'object-lookup-prefix' appeared in Guile-Git in Mar. 2018, so we
-         ;; can't be sure it's available.  Furthermore, 'string->oid' used to
-         ;; read out-of-bounds when passed a string shorter than 40 chars,
-         ;; which is why we delay calls to it below.
-         (if (< len 40)
-             (object-lookup-prefix repository (string->oid commit) len)
-             (object-lookup repository (string->oid commit)))))
-      (('tag-or-commit . str)
-       (if (or (> (string-length str) 40)
-               (not (string-every char-set:hex-digit str)))
-           (resolve `(tag . ,str))              ;definitely a tag
-           (catch 'git-error
-             (lambda ()
-               (resolve `(tag . ,str)))
-             (lambda _
-               ;; There's no such tag, so it must be a commit ID.
-               (resolve `(commit . ,str))))))
-      (('tag    . tag)
-       (let ((oid (reference-name->oid repository
-                                       (string-append "refs/tags/" tag))))
-         ;; OID may point to a "tag" object, but it can also point directly
-         ;; to a "commit" object, as surprising as it may seem.  Return that
-         ;; object, whatever that is.
-         (object-lookup repository oid))))))
-
 (define (switch-to-ref repository ref)
   "Switch to REPOSITORY's branch, commit or tag specified by REF.  Return the
 OID (roughly the commit hash) corresponding to REF."
   (define obj
-    (resolve-reference repository ref))
+    (revparse-single repository (cdr ref)))
 
   (reset repository obj RESET_HARD)
   (object-id obj))
@@ -320,7 +284,7 @@ definitely available in REPOSITORY, false otherwise."
 
 (define* (update-cached-checkout url
                                  #:key
-                                 (ref '(branch . "master"))
+                                 (ref '(branch . "HEAD"))
                                  recursive?
                                  (check-out? #t)
                                  starting-commit
@@ -349,7 +313,9 @@ it unchanged."
       (('branch . branch)
        `(branch . ,(if (string-prefix? "origin/" branch)
                        branch
-                       (string-append "origin/" branch))))
+                       (if (string=? "HEAD" branch)
+                           branch
+                           (string-append "origin/" branch)))))
       (_ ref)))
 
   (with-libgit2
@@ -370,8 +336,7 @@ it unchanged."
      ;; than letting users re-open the checkout later on.
      (let* ((oid      (if check-out?
                           (switch-to-ref repository canonical-ref)
-                          (object-id
-                           (resolve-reference repository canonical-ref))))
+                          (revparse-single repository (cdr canonical-ref))))
             (new      (and starting-commit
                            (commit-lookup repository oid)))
             (old      (and starting-commit
@@ -395,7 +360,7 @@ it unchanged."
                                    (log-port (%make-void-port "w"))
                                    (cache-directory
                                     (%repository-cache-directory))
-                                   (ref '(branch . "master")))
+                                   (ref '(branch . "HEAD")))
   "Return two values: the content of the git repository at URL copied into a
 store directory and the sha1 of the top level commit in this directory.  The
 reference to be checkout, once the repository is fetched, is specified by REF.
@@ -510,7 +475,7 @@ objects: 'ancestor (meaning that OLD is an ancestor of NEW), 'descendant, or
   git-checkout make-git-checkout
   git-checkout?
   (url     git-checkout-url)
-  (branch  git-checkout-branch (default "master"))
+  (branch  git-checkout-branch (default "HEAD"))
   (commit  git-checkout-commit (default #f))      ;#f | tag | commit
   (recursive? git-checkout-recursive? (default #f)))
 
@@ -550,7 +515,7 @@ objects: 'ancestor (meaning that OLD is an ancestor of NEW), 'descendant, or
     (($ <git-checkout> url branch commit recursive?)
      (latest-repository-commit* url
                                 #:ref (if commit
-                                          `(tag-or-commit . ,commit)
+                                          `(commit . ,commit)
                                           `(branch . ,branch))
                                 #:recursive? recursive?
                                 #:log-port (current-error-port)))))
I wonder if there was a specific reason to not use 'revparse-single',
and instead mostly reinvent it with (resolve-reference ...).

This approach can be simplified further by deprecating the 'commit' and
'branch' fields, and just use a single reference string.
-----BEGIN PGP SIGNATURE-----

iQFDBAEBCgAtFiEEu7At3yzq9qgNHeZDoqBt8qM6VPoFAl/WjS4PHG1hcml1c0Bn
bnUub3JnAAoJEKKgbfKjOlT6FykIAIodtjZIVzwsRRb+7+1b7MRWQmnxQoy2QQs6
0exMoFaYIX/G3NOfE+RWVjcCgIyIc1iua18GKIe990+5uI8dGezOqBkg3/KqrDZc
90jfz2o/IoudaTKEcH3Hp52h4JDkRLnbhiB8b08pZ2NxzsiBWDSaQoWf21m5y0AD
GBtZqcX/n3diNkoJ9giSx+xVLlnGqxKXzugTxmnGc/A5AwRLpa+vO7Z15EwCBZ3K
ZYh6zGXkFP1hS08Gk41KPjG8JKbtuZx5zxu+iV7K9hXwptOrEepFX9Lio4YNpigi
hB31CyW2ITV891Ut/nmqOgihRCChKvGOIwgPN7z6jiAd0gbjriM=
=q4BH
-----END PGP SIGNATURE-----

K
K
Kyle Meyer wrote on 14 Dec 2020 05:49
(name . Marius Bakke)(address . marius@gnu.org)
87lfe1t2jd.fsf@kyleam.com
Marius Bakke writes:

Toggle quote (3 lines)
> Kyle Meyer <kyle@kyleam.com> skriver:
>
>> One option may be to use the remote HEAD symref.
[...]
Toggle quote (5 lines)
>> Here's a quick and dirty demo that makes your reproducer work. A real
>> patch in this direction would of course look very different.
>
> Another quick and dirty patch to make this specific example work ...

Thanks.

Toggle quote (1 lines)
> diff --git a/guix/git.scm b/guix/git.scm
[...]
Toggle quote (17 lines)
> (define* (update-cached-checkout url
> #:key
> - (ref '(branch . "master"))
> + (ref '(branch . "HEAD"))
> recursive?
> (check-out? #t)
> starting-commit
> @@ -349,7 +313,9 @@ it unchanged."
> (('branch . branch)
> `(branch . ,(if (string-prefix? "origin/" branch)
> branch
> - (string-append "origin/" branch))))
> + (if (string=? "HEAD" branch)
> + branch
> + (string-append "origin/" branch)))))
> (_ ref)))

I think using HEAD, rather than refs/remotes/origin/HEAD, will be
problematic when the repository is already present in the cache. If I'm
reading guix/git.scm correctly, a fetch is done in that case.
refs/remotes/origin/HEAD will track any updates on the remote ref that
it points to, while HEAD will stay on the same commit that it landed on
at clone time.
L
L
Ludovic Courtès wrote on 14 Dec 2020 11:27
(name . Marius Bakke)(address . marius@gnu.org)
874kkovg1r.fsf@gnu.org
Hi!

Marius Bakke <marius@gnu.org> skribis:

Toggle quote (3 lines)
> I wonder if there was a specific reason to not use 'revparse-single',
> and instead mostly reinvent it with (resolve-reference ...).

It must have been ignorance on my side, but also the fact that something
like ‘revparse-single’ has to guess what it is you’re looking for (is
this string a commit ID? a tag name? a branch name?), whereas here we
can explicitly state what we want and thus reduce the search space and
ambiguity.

Does that make sense?

Ludo’.
L
L
Ludovic Courtès wrote on 14 Dec 2020 11:28
(name . Kyle Meyer)(address . kyle@kyleam.com)
87zh2gu1ef.fsf@gnu.org
Hi,

Kyle Meyer <kyle@kyleam.com> skribis:

Toggle quote (30 lines)
> diff --git a/guix/git.scm b/guix/git.scm
> index ca77b9f54b..7320c0d6c8 100644
> --- a/guix/git.scm
> +++ b/guix/git.scm
> @@ -207,6 +207,9 @@ (define (resolve-reference repository ref)
> (let ((oid (reference-target
> (branch-lookup repository branch BRANCH-REMOTE))))
> (object-lookup repository oid)))
> + (('symref . symref)
> + (let ((oid (reference-name->oid repository symref)))
> + (object-lookup repository oid)))
> (('commit . commit)
> (let ((len (string-length commit)))
> ;; 'object-lookup-prefix' appeared in Guile-Git in Mar. 2018, so we
> @@ -320,7 +323,7 @@ (define (reference-available? repository ref)
>
> (define* (update-cached-checkout url
> #:key
> - (ref '(branch . "master"))
> + (ref '(symref . "refs/remotes/origin/HEAD"))
> recursive?
> (check-out? #t)
> starting-commit
> @@ -395,7 +398,7 @@ (define* (latest-repository-commit store url
> (log-port (%make-void-port "w"))
> (cache-directory
> (%repository-cache-directory))
> - (ref '(branch . "master")))
> + (ref '(symref . "refs/remotes/origin/HEAD")))

Do we really need to add “remotes/origin” in there? Or is there a way
to just say HEAD and later specify that we’re talking about the remote
head, as is done fro branches?

We also need to change the defaults in <git-checkout> & co., like Marius did.

Thanks,
Ludo’.
K
K
Kyle Meyer wrote on 15 Dec 2020 07:07
(name . Ludovic Courtès)(address . ludo@gnu.org)
87wnxj7gbw.fsf@kyleam.com
Ludovic Courtès writes:

Toggle quote (18 lines)
>> (define* (update-cached-checkout url
>> #:key
>> - (ref '(branch . "master"))
>> + (ref '(symref . "refs/remotes/origin/HEAD"))
>> recursive?
>> (check-out? #t)
>> starting-commit
>> @@ -395,7 +398,7 @@ (define* (latest-repository-commit store url
>> (log-port (%make-void-port "w"))
>> (cache-directory
>> (%repository-cache-directory))
>> - (ref '(branch . "master")))
>> + (ref '(symref . "refs/remotes/origin/HEAD")))
>
> Do we really need to add “remotes/origin” in there? Or is there a way
> to just say HEAD and later specify that we’re talking about the remote
> head, as is done fro branches?

Thanks for the feedback. Sure, that sounds fine to me. As far as I can
tell, in this context it'd make sense to assume HEAD should always be
the one under refs/remotes/origin. (It really was a "quick check that
my suggestion to use the remote HEAD symref works" sort of patch :))

Another thing that doesn't feel suitable for the proper patch is the
introduction of the `symref' symbol...

Toggle quote (2 lines)
> We also need to change the defaults in <git-checkout> & co., like Marius did.

...which didn't seem particularly nice to add as a field to
<git-checkout>.

However, without the introduction of something like `symref', I'm not
spotting a straightforward way to deal with refs/remotes/origin/HEAD in
resolve-reference.

FWIW if we were to go in the direction Marius suggested, I believe the
main change needed on top of Marius's patch in order to use the remote
symref would be

Toggle diff (15 lines)
diff --git a/guix/git.scm b/guix/git.scm
index 0c49859e42..5caf715916 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -313,9 +313,7 @@ (define* (update-cached-checkout url
       (('branch . branch)
        `(branch . ,(if (string-prefix? "origin/" branch)
                        branch
-                       (if (string=? "HEAD" branch)
-                           branch
-                           (string-append "origin/" branch)))))
+                       (string-append "origin/" branch))))
       (_ ref)))
 
   (with-libgit2
R
R
Ricardo Wurmus wrote on 8 Apr 2021 18:21
(name . Kyle Meyer)(address . kyle@kyleam.com)
87ft00ycg8.fsf@elephly.net
Hi,

I’d just like to say that I successfully used Kyle’s first patch to get
past the problem. Thank you!

I’m a bit lost: is there something we need to decide on to move this
forward? (The issue has become a bit more pressing as origin/main is
the new default on Github since a while.)

--
Ricardo
K
K
Kyle Meyer wrote on 9 Apr 2021 07:10
[PATCH] git: Update cached checkout to the remote HEAD by default.
(name . Ricardo Wurmus)(address . rekado@elephly.net)
875z0wt54x.fsf@kyleam.com
Ricardo Wurmus writes:

Toggle quote (3 lines)
> I’m a bit lost: is there something we need to decide on to move this
> forward?

I guess the main open question is whether to rework things to use
revparse-single as Marius suggested.

And if we don't, another question (raised by me) is whether there's a
better approach than adding the additional symref key/field. As I said
upthread, "without the introduction of something like `symref', I'm not
spotting a straightforward way to deal with refs/remotes/origin/HEAD in
resolve-reference".

Toggle quote (3 lines)
> (The issue has become a bit more pressing as origin/main is the new
> default on Github since a while.)

Right, thanks for bumping this. Here's my attempt to put together a
complete patch. I've tested this with

$ guix import cran -r -a git https://github.com/immunogenomics/scpost

as well as a few `guix build --with-(branch|commit)=...' invocations.

Running `make check', I don't see any new failures, though I see the
same two failures ("channel-news, one entry" and
"tests/guix-git-authenticate.sh") on master and with this patch. I
haven't yet looked more closely at those.

-- >8 --
Subject: [PATCH] git: Update cached checkout to the remote HEAD by default.

Reported by Ricardo Wurmus <rekado@elephly.net>.

update-cached-checkout hard codes "master" as the default branch, leading to a
failure when the clone doesn't have a "master" branch. Instead use the remote
HEAD symref as an indicator of what the primary branch is.

* guix/git.scm (resolve-reference): Support resolving symrefs.
(update-cached-checkout, latest-repository-commit): Default to the remote HEAD
symref.
(<git-checkout>): Add symref field that defaults to "HEAD", and change branch
field's default to #f.
(git-checkout-compiler): Handle symref field of <git-checkout>.
---
guix/git.scm | 35 ++++++++++++++++++++++++-----------
1 file changed, 24 insertions(+), 11 deletions(-)

Toggle diff (103 lines)
diff --git a/guix/git.scm b/guix/git.scm
index 1820036f25..6b410ed42f 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2017, 2020 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2021 Kyle Meyer <kyle@kyleam.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -209,6 +210,9 @@ (define (resolve-reference repository ref)
        (let ((oid (reference-target
                    (branch-lookup repository branch BRANCH-REMOTE))))
          (object-lookup repository oid)))
+      (('symref . symref)
+       (let ((oid (reference-name->oid repository symref)))
+         (object-lookup repository oid)))
       (('commit . commit)
        (let ((len (string-length commit)))
          ;; 'object-lookup-prefix' appeared in Guile-Git in Mar. 2018, so we
@@ -340,7 +344,7 @@ (define (delete-checkout directory)
 
 (define* (update-cached-checkout url
                                  #:key
-                                 (ref '(branch . "master"))
+                                 (ref '(symref . "HEAD"))
                                  recursive?
                                  (check-out? #t)
                                  starting-commit
@@ -354,8 +358,9 @@ (define* (update-cached-checkout url
 to REF, and the relation of the new commit relative to STARTING-COMMIT (if
 provided) as returned by 'commit-relation'.
 
-REF is pair whose key is [branch | commit | tag | tag-or-commit ] and value
-the associated data: [<branch name> | <sha1> | <tag name> | <string>].
+REF is pair whose key is [branch | commit | tag | tag-or-commit | symref] and
+value the associated data:
+[<branch name> | <sha1> | <tag name> | <string> | <symref>].
 
 When RECURSIVE? is true, check out submodules as well, if any.
 
@@ -378,6 +383,10 @@ (define* (update-cached-checkout url
        `(branch . ,(if (string-prefix? "origin/" branch)
                        branch
                        (string-append "origin/" branch))))
+      (('symref . symref)
+       `(symref . ,(if (string-prefix? "refs/remotes/origin/" symref)
+                       symref
+                       (string-append "refs/remotes/origin/" symref))))
       (_ ref)))
 
   (with-libgit2
@@ -433,12 +442,12 @@ (define* (latest-repository-commit store url
                                    (log-port (%make-void-port "w"))
                                    (cache-directory
                                     (%repository-cache-directory))
-                                   (ref '(branch . "master")))
+                                   (ref '(symref . "HEAD")))
   "Return two values: the content of the git repository at URL copied into a
 store directory and the sha1 of the top level commit in this directory.  The
 reference to be checkout, once the repository is fetched, is specified by REF.
-REF is pair whose key is [branch | commit | tag] and value the associated
-data, respectively [<branch name> | <sha1> | <tag name>].
+REF is pair whose key is [branch | commit | tag | symref] and value the
+associated data, respectively [<branch name> | <sha1> | <tag name> | <symref>].
 
 When RECURSIVE? is true, check out submodules as well, if any.
 
@@ -548,7 +557,8 @@ (define-record-type* <git-checkout>
   git-checkout make-git-checkout
   git-checkout?
   (url     git-checkout-url)
-  (branch  git-checkout-branch (default "master"))
+  (branch  git-checkout-branch (default #f))
+  (symref  git-checkout-symref (default "HEAD"))
   (commit  git-checkout-commit (default #f))      ;#f | tag | commit
   (recursive? git-checkout-recursive? (default #f)))
 
@@ -585,11 +595,14 @@ (define-gexp-compiler (git-checkout-compiler (checkout <git-checkout>)
   ;; "Compile" CHECKOUT by updating the local checkout and adding it to the
   ;; store.
   (match checkout
-    (($ <git-checkout> url branch commit recursive?)
+    (($ <git-checkout> url branch symref commit recursive?)
      (latest-repository-commit* url
-                                #:ref (if commit
-                                          `(tag-or-commit . ,commit)
-                                          `(branch . ,branch))
+                                #:ref (cond (commit
+                                             `(tag-or-commit . ,commit))
+                                            (branch
+                                             `(branch . ,branch))
+                                            (else
+                                             `(symref . ,symref)))
                                 #:recursive? recursive?
                                 #:log-port (current-error-port)))))
 

base-commit: 43c55856c876c76200cdccc1211868b92352c4ae
-- 
2.31.1
L
L
Ludovic Courtès wrote on 9 Apr 2021 15:50
Re: bug#45187: git download defaults to origin/master
(name . Kyle Meyer)(address . kyle@kyleam.com)
87fszz7ej2.fsf_-_@gnu.org
Hi!

Kyle Meyer <kyle@kyleam.com> skribis:

Toggle quote (16 lines)
> Subject: [PATCH] git: Update cached checkout to the remote HEAD by default.
>
> Fixes <https://bugs.gnu.org/45187>.
> Reported by Ricardo Wurmus <rekado@elephly.net>.
>
> update-cached-checkout hard codes "master" as the default branch, leading to a
> failure when the clone doesn't have a "master" branch. Instead use the remote
> HEAD symref as an indicator of what the primary branch is.
>
> * guix/git.scm (resolve-reference): Support resolving symrefs.
> (update-cached-checkout, latest-repository-commit): Default to the remote HEAD
> symref.
> (<git-checkout>): Add symref field that defaults to "HEAD", and change branch
> field's default to #f.
> (git-checkout-compiler): Handle symref field of <git-checkout>.

[...]

Toggle quote (7 lines)
> git-checkout make-git-checkout
> git-checkout?
> (url git-checkout-url)
> - (branch git-checkout-branch (default "master"))
> + (branch git-checkout-branch (default #f))
> + (symref git-checkout-symref (default "HEAD"))

I know it’s established Git jargon, but “symref” looks obscure to me.
I find it OK for ‘update-cached-checkout’, because it’s an “internal”
procedure for die-hard hackers, but a bit ugly here.

Another option would be to not add this ‘symref’ field and instead, when
‘branch’ and ‘commit’ are both #f, translate that to '(symref . "HEAD").

WDYT?

The rest of the patch LGTM, thanks!

Ludo’.
K
K
Kyle Meyer wrote on 10 Apr 2021 05:50
(name . Ludovic Courtès)(address . ludo@gnu.org)
87v98ussqy.fsf@kyleam.com
Ludovic Courtès writes:

Toggle quote (1 lines)
> Kyle Meyer <kyle@kyleam.com> skribis:
[...]
Toggle quote (11 lines)
>> git-checkout make-git-checkout
>> git-checkout?
>> (url git-checkout-url)
>> - (branch git-checkout-branch (default "master"))
>> + (branch git-checkout-branch (default #f))
>> + (symref git-checkout-symref (default "HEAD"))
>
> I know it’s established Git jargon, but “symref” looks obscure to me.
> I find it OK for ‘update-cached-checkout’, because it’s an “internal”
> procedure for die-hard hackers, but a bit ugly here.

Agreed. As mentioned upthread, I don't like it either.

Toggle quote (5 lines)
> Another option would be to not add this ‘symref’ field and instead, when
> ‘branch’ and ‘commit’ are both #f, translate that to '(symref . "HEAD").
>
> WDYT?

Thanks for the suggestion. Dropping the field and translating
downstream sounds good to me.

Working through the patch update, I ended up moving things even more
internally, making update-cached-checkout translate the empty list into
(symref . "refs/remotes/origin/HEAD") for resolve-reference. I don't
think there will be a use for symrefs other than HEAD, so it seemed
reasonable to add the special case (or rather shift the special case all
the way down into update-cached-checkout). Let me know if you'd rather
use a (symref . "HEAD") default for update-cached-checkout.

Toggle quote (2 lines)
> The rest of the patch LGTM, thanks!

Thanks for the review.

-- >8 --
Subject: [PATCH v2] git: Update cached checkout to the remote HEAD by default.

Reported by Ricardo Wurmus <rekado@elephly.net>.

update-cached-checkout hard codes "master" as the default branch, leading to a
failure when the clone doesn't have a "master" branch. Instead use the remote
HEAD symref as an indicator of what the primary branch is.

* guix/git.scm (resolve-reference): Support resolving symrefs.
(update-cached-checkout, latest-repository-commit): Change the default for REF
to the empty list and translate it to the remote HEAD symref.
(<git-checkout>): Change branch field's default to #f.
(git-checkout-compiler): When branch and commit fields are both #f, call
latest-repository-commit* with the empty list as the ref.
---
guix/git.scm | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)

Toggle diff (91 lines)
diff --git a/guix/git.scm b/guix/git.scm
index 1820036f25..5d9f0bf205 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -1,6 +1,7 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2017, 2020 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2021 Kyle Meyer <kyle@kyleam.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -209,6 +210,9 @@ (define (resolve-reference repository ref)
        (let ((oid (reference-target
                    (branch-lookup repository branch BRANCH-REMOTE))))
          (object-lookup repository oid)))
+      (('symref . symref)
+       (let ((oid (reference-name->oid repository symref)))
+         (object-lookup repository oid)))
       (('commit . commit)
        (let ((len (string-length commit)))
          ;; 'object-lookup-prefix' appeared in Guile-Git in Mar. 2018, so we
@@ -340,7 +344,7 @@ (define (delete-checkout directory)
 
 (define* (update-cached-checkout url
                                  #:key
-                                 (ref '(branch . "master"))
+                                 (ref '())
                                  recursive?
                                  (check-out? #t)
                                  starting-commit
@@ -356,6 +360,7 @@ (define* (update-cached-checkout url
 
 REF is pair whose key is [branch | commit | tag | tag-or-commit ] and value
 the associated data: [<branch name> | <sha1> | <tag name> | <string>].
+IF REF is the empty list, the remote HEAD is used.
 
 When RECURSIVE? is true, check out submodules as well, if any.
 
@@ -374,6 +379,7 @@ (define* (update-cached-checkout url
     ;; made little sense since the cache should be transparent to them.  So
     ;; here we append "origin/" if it's missing and otherwise keep it.
     (match ref
+      (() '(symref . "refs/remotes/origin/HEAD"))
       (('branch . branch)
        `(branch . ,(if (string-prefix? "origin/" branch)
                        branch
@@ -433,12 +439,13 @@ (define* (latest-repository-commit store url
                                    (log-port (%make-void-port "w"))
                                    (cache-directory
                                     (%repository-cache-directory))
-                                   (ref '(branch . "master")))
+                                   (ref '()))
   "Return two values: the content of the git repository at URL copied into a
 store directory and the sha1 of the top level commit in this directory.  The
 reference to be checkout, once the repository is fetched, is specified by REF.
 REF is pair whose key is [branch | commit | tag] and value the associated
-data, respectively [<branch name> | <sha1> | <tag name>].
+data, respectively [<branch name> | <sha1> | <tag name>].  IF REF is the empty
+list, the remote HEAD is used.
 
 When RECURSIVE? is true, check out submodules as well, if any.
 
@@ -548,7 +555,7 @@ (define-record-type* <git-checkout>
   git-checkout make-git-checkout
   git-checkout?
   (url     git-checkout-url)
-  (branch  git-checkout-branch (default "master"))
+  (branch  git-checkout-branch (default #f))
   (commit  git-checkout-commit (default #f))      ;#f | tag | commit
   (recursive? git-checkout-recursive? (default #f)))
 
@@ -587,9 +594,11 @@ (define-gexp-compiler (git-checkout-compiler (checkout <git-checkout>)
   (match checkout
     (($ <git-checkout> url branch commit recursive?)
      (latest-repository-commit* url
-                                #:ref (if commit
-                                          `(tag-or-commit . ,commit)
-                                          `(branch . ,branch))
+                                #:ref (cond (commit
+                                             `(tag-or-commit . ,commit))
+                                            (branch
+                                             `(branch . ,branch))
+                                            (else '()))
                                 #:recursive? recursive?
                                 #:log-port (current-error-port)))))
 

base-commit: f68bcc1bc3aa0a8d1829e2eb5d9ef256c817c17c
-- 
2.31.1
K
K
Kyle Meyer wrote on 10 Apr 2021 07:51
(name . Ludovic Courtès)(address . ludo@gnu.org)
87zgy6ofg3.fsf@kyleam.com
Kyle Meyer writes:

Toggle quote (6 lines)
> @@ -356,6 +360,7 @@ (define* (update-cached-checkout url
>
> REF is pair whose key is [branch | commit | tag | tag-or-commit ] and value
> the associated data: [<branch name> | <sha1> | <tag name> | <string>].
> +IF REF is the empty list, the remote HEAD is used.

Sorry, here and ...

Toggle quote (14 lines)
> @@ -433,12 +439,13 @@ (define* (latest-repository-commit store url
> (log-port (%make-void-port "w"))
> (cache-directory
> (%repository-cache-directory))
> - (ref '(branch . "master")))
> + (ref '()))
> "Return two values: the content of the git repository at URL copied into a
> store directory and the sha1 of the top level commit in this directory. The
> reference to be checkout, once the repository is fetched, is specified by REF.
> REF is pair whose key is [branch | commit | tag] and value the associated
> -data, respectively [<branch name> | <sha1> | <tag name>].
> +data, respectively [<branch name> | <sha1> | <tag name>]. IF REF is the empty
> +list, the remote HEAD is used.

... here should say "If" rather than "IF".
L
L
Ludovic Courtès wrote on 10 Apr 2021 21:33
(name . Kyle Meyer)(address . kyle@kyleam.com)
874kge2avn.fsf@gnu.org
Hi Kyle,

Kyle Meyer <kyle@kyleam.com> skribis:

Toggle quote (19 lines)
> Subject: [PATCH v2] git: Update cached checkout to the remote HEAD by default.
>
> Fixes <https://bugs.gnu.org/45187>.
> Reported by Ricardo Wurmus <rekado@elephly.net>.
>
> update-cached-checkout hard codes "master" as the default branch, leading to a
> failure when the clone doesn't have a "master" branch. Instead use the remote
> HEAD symref as an indicator of what the primary branch is.
>
> * guix/git.scm (resolve-reference): Support resolving symrefs.
> (update-cached-checkout, latest-repository-commit): Change the default for REF
> to the empty list and translate it to the remote HEAD symref.
> (<git-checkout>): Change branch field's default to #f.
> (git-checkout-compiler): When branch and commit fields are both #f, call
> latest-repository-commit* with the empty list as the ref.
> ---
> guix/git.scm | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)

Applied, thanks!

Ludo’.
Closed
?
Your comment

This issue is archived.

To comment on this conversation send email to 45187@debbugs.gnu.org