[PATCH] git-download: Download a bare Git repository from SWH.

  • Done
  • quality assurance status badge
Details
2 participants
  • Ludovic Courtès
  • Simon Tournier
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
L
L
Ludovic Courtès wrote on 2 Mar 2023 10:12
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20230302091233.28226-1-ludo@gnu.org
Fixes a bug whereby CR/LF convention would not be applied on Git
repositories retrieved from SWH:


Reported by Simon Tournier <simon.tournier@inserm.fr>.
Suggested by Valentin Lorentz <valentin.lorentz@inria.fr>.

* guix/git-download.scm (git-fetch)[build]: Pass #:archive-type to
'swh-download' and invoke "git clone" on the result.
---
guix/git-download.scm | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

Hi there!

I just saw the discussion between Simon and Valentin (Cc’d) and
realized this could easily be fixed on our side because (guix swh)
is already equipped to fetch bare Git repositories.

To test it, comment out the ‘git-fetch’ and ‘download-nar’ calls in
(guix git-download) and run, say:

./pre-inst-env guix build -S guile-bash --check

I don’t know of a repository that has this CRLF problem though, so
we should check that it actually works as advertised on such a repo.

Thanks,
Ludo’.

Toggle diff (34 lines)
diff --git a/guix/git-download.scm b/guix/git-download.scm
index a1566bed4d..9f57b4170d 100644
--- a/guix/git-download.scm
+++ b/guix/git-download.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2014-2021, 2023 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2017 Mathieu Lirzin <mthl@gnu.org>
;;; Copyright © 2017 Christopher Baines <mail@cbaines.net>
;;; Copyright © 2020 Jakub K?dzio?ka <kuba@kadziolka.net>
@@ -162,8 +162,17 @@ (define recursive?
(parameterize ((%verify-swh-certificate? #f))
(format (current-error-port)
"Trying to download from Software Heritage...~%")
+
+ ;; Fetch the source as a bare Git repository (rather
+ ;; than 'flat') so Git can post-process it in the usual
+ ;; way, for instance to perform CR/LF conversion.
(swh-download (getenv "git url") (getenv "git commit")
- #$output))))))))
+ "git-repo"
+ #:archive-type 'git-bare)
+ (invoke #+(file-append git "/bin/git")
+ "clone" "git-repo" #$output)
+ (delete-file-recursively
+ (string-append #$output "/.git")))))))))
(mlet %store-monad ((guile (package->derivation guile system)))
(gexp->derivation (or name "git-checkout") build

base-commit: ff5fbcc19bce6e94ead0cc79b27ae8ed0307463d
--
2.39.1
S
S
Simon Tournier wrote on 2 Mar 2023 11:30
(name . Ludovic Courtès)(address . ludo@gnu.org)
877cvzfnst.fsf@gmail.com
Hi,

On jeu., 02 mars 2023 at 10:12, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (3 lines)
> I don’t know of a repository that has this CRLF problem though, so
> we should check that it actually works as advertised on such a repo.

Here my test:

+ disable network and only allow archive.softwareheritage.org,
+ run,

./pre-inst-env guix build -L /tmp/pkg hidapi@0.9.0 -S --check

where /tmp/pkg.test.scm is just the old version of hidapi.

The patch fixes the discrepancy but it seems more work on SWH side and
this will not scale, as Valentin reported in [1].

Since it is a corner case, it would be best if first let try with the
flat method and if it fails, let use git-bare. Well, it would mean add
some integrity check in the last resort attempt.

WDYT?



Toggle snippet (82 lines)
$ cat /tmp/pkg/test.scm
(define-module (test)
#:use-module (guix packages)
#:use-module (guix git-download)
#:use-module (gnu packages libusb)
)

(define-public hidapi-0.9
(package
(inherit hidapi)
(name "hidapi")
(version "0.9.0")
(source
(origin
(method git-fetch)
(uri (git-reference
(url "https://github.com/libusb/hidapi")
(commit (string-append "hidapi-" version))))
(file-name (git-file-name name version))
(sha256
(base32 "1p4g8lgwj4rki6lbn5l6rvwj0xlbn1xfh4d255bg5pvgczmwmc4i"))))))

$ ./pre-inst-env guix build -L /tmp/pkg hidapi@0.9.0 -S --check
The following derivation will be built:
/gnu/store/cg34yy4pd7b246qpi946x5lhcg1x80sw-hidapi-0.9.0-checkout.drv
building /gnu/store/cg34yy4pd7b246qpi946x5lhcg1x80sw-hidapi-0.9.0-checkout.drv...
guile: warning: failed to install locale
environment variable `PATH' set to `/gnu/store/0c1yfbxyv877mlgychfgvmk5ha2jqh52-gzip-1.10/bin:/gnu/store/8z1q7vjxylm6l4ibsywd4h6m6jv6zqm4-tar-1.34/bin'
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /gnu/store/n6rvypb3z386v0ml6n0xyga2x4q9ij58-hidapi-0.9.0-checkout/.git/
fatal: unable to access 'https://github.com/libusb/hidapi/': Could not resolve host: github.com
Failed to do a shallow fetch; retrying a full fetch...
fatal: unable to access 'https://github.com/libusb/hidapi/': Could not resolve host: github.com
git-fetch: '/gnu/store/5qcj54m8smhp9gd0i8zq1y4c08cmrfii-git-minimal-2.39.1/bin/git fetch origin' failed with exit code 128
Trying content-addressed mirror at berlin.guix.gnu.org...
Trying content-addressed mirror at berlin.guix.gnu.org...
Trying to download from Software Heritage...
SWH: found revision 7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f with directory at 'https://archive.softwareheritage.org/api/1/directory/c95078f3645abc050536d3aaf76cb5fb6eb6b288/'
SWH vault: requested bundle cooking, waiting for completion...
SWH vault: Processing... 1334 objects processed
Over 403 remaining
SWH vault: Processing... 1484 objects processed
Over 301 remaining
SWH vault: Processing... 1603 objects processed
Over 212 remaining
SWH vault: Processing... 1712 objects processed
Over 115 remaining
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/HEAD
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/branches/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/config
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/description
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/hooks/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/info/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/info/exclude
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/info/refs
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/info/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/info/packs
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/pack/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/pack/pack-f10e443dcb1d90e42a4fb9519c0fc5429ba49ba9.idx
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/objects/pack/pack-f10e443dcb1d90e42a4fb9519c0fc5429ba49ba9.pack
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/refs/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/refs/heads/
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/refs/heads/master
swh:1:rev:7da5cc91fc0d2dbe4df4f08cd31f6ca1a262418f.git/refs/tags/
Cloning into '/gnu/store/n6rvypb3z386v0ml6n0xyga2x4q9ij58-hidapi-0.9.0-checkout'...
done.
successfully built /gnu/store/cg34yy4pd7b246qpi946x5lhcg1x80sw-hidapi-0.9.0-checkout.drv
successfully built /gnu/store/cg34yy4pd7b246qpi946x5lhcg1x80sw-hidapi-0.9.0-checkout.drv
/gnu/store/n6rvypb3z386v0ml6n0xyga2x4q9ij58-hidapi-0.9.0-checkout

Cheers,
simon
S
S
Simon Tournier wrote on 2 Mar 2023 13:05
[PATCH v2] git-download: Apply CR/LF to Git repository from SWH.
(address . 61910@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20230302120546.1074630-1-zimon.toutoune@gmail.com
From: Ludovic Courtès <ludo@gnu.org>

Fixes a bug whereby CR/LF convention would not be applied on Git
repositories retrieved from SWH:


Reported by Simon Tournier <simon.tournier@inserm.fr>.
Suggested by Valentin Lorentz <valentin.lorentz@inria.fr>.
Co-authored by Simon Tournier <simon.tournier@inserm.fr>.

* guix/git-download.scm (git-fetch)[build]: Add Git operations conditioned by
.gitattributes on the result from SWH.
---
guix/git-download.scm | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

Hi,

This uses 'flat' as previously but applies some Git commands for fixing CR/LF
when .gitattributes is present or not. It works on the example hidapi.

I have not tried for other examples.

WDYT?

Cheers,
simon


Toggle diff (48 lines)
diff --git a/guix/git-download.scm b/guix/git-download.scm
index a1566bed4d..be0b13615c 100644
--- a/guix/git-download.scm
+++ b/guix/git-download.scm
@@ -1,8 +1,9 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2014-2021, 2023 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2017 Mathieu Lirzin <mthl@gnu.org>
;;; Copyright © 2017 Christopher Baines <mail@cbaines.net>
;;; Copyright © 2020 Jakub K?dzio?ka <kuba@kadziolka.net>
+;;; Copyright © 2023 Simon Tournier <zimon.toutoune@gmail.com>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -162,8 +163,27 @@ (define recursive?
(parameterize ((%verify-swh-certificate? #f))
(format (current-error-port)
"Trying to download from Software Heritage...~%")
+
(swh-download (getenv "git url") (getenv "git commit")
- #$output))))))))
+ #$output)
+ ;; Perform CR/LF conversion if specificied by .gitattributes
+ (when (find-files "." ".gitattributes")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "init")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "config" "--local" "user.email" "you@example.org")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "config" "--local" "user.name" "Your Name")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "add" ".")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "commit" "-am" "'init'")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "read-tree" "--empty")
+ (invoke #+(file-append git "/bin/git") "-C" #$output
+ "reset" "--hard")
+ (delete-file-recursively
+ (string-append #$output "/.git"))))))))))
(mlet %store-monad ((guile (package->derivation guile system)))
(gexp->derivation (or name "git-checkout") build

base-commit: af95f2d8f98eb2c8c64954bb2fd0b70838899174
--
2.38.1
L
L
Ludovic Courtès wrote on 2 Mar 2023 14:15
Re: bug#61910: [PATCH] git-download: Download a bare Git repository from SWH.
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)(address . 61910@debbugs.gnu.org)
87cz5r5m7a.fsf_-_@gnu.org
Hello!

Simon Tournier <zimon.toutoune@gmail.com> skribis:


[...]

Toggle quote (22 lines)
> (swh-download (getenv "git url") (getenv "git commit")
> - #$output))))))))
> + #$output)
> + ;; Perform CR/LF conversion if specificied by .gitattributes
> + (when (find-files "." ".gitattributes")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "init")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "config" "--local" "user.email" "you@example.org")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "config" "--local" "user.name" "Your Name")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "add" ".")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "commit" "-am" "'init'")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "read-tree" "--empty")
> + (invoke #+(file-append git "/bin/git") "-C" #$output
> + "reset" "--hard")
> + (delete-file-recursively
> + (string-append #$output "/.git"))))))))))

This is much better than fetching the whole repo, well done!

Should we replace (find-files …) with (file-exists? ".gitattributes")?

I wonder if we could achieve the same without creating a repo just to
delete it afterwards, either by invoking a low-level ‘git’ command it if
exists, or using Guile-Git, or doing it ourselves in Scheme.

WDYT?

Thanks,
Ludo’.
S
S
Simon Tournier wrote on 2 Mar 2023 14:50
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 61910@debbugs.gnu.org)
CAJ3okZ1kz-bL0pbaUHMUcg8YvEc1KMuS25wRgnS9h0oPL5UQgA@mail.gmail.com
Re,

On Thu, 2 Mar 2023 at 14:15, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (2 lines)
> Should we replace (find-files …) with (file-exists? ".gitattributes")?

Damned! It was what I was looking for. :-)

Toggle quote (4 lines)
> I wonder if we could achieve the same without creating a repo just to
> delete it afterwards, either by invoking a low-level ‘git’ command it if
> exists, or using Guile-Git, or doing it ourselves in Scheme.

Doing it ourselves in Scheme, I do not know. I guess it would be
hard. Basically, it is not clear for me what we would have to
implement. For instance, we get the ("normalized") content fetched
from SWH , so we have:

Toggle snippet (5 lines)
hidapi.swh/testgui/test.cpp: C++ source, ASCII text
hidapi.swh/testgui/testgui.sln: UTF-8 Unicode (with BOM) text
hidapi.swh/testgui/testgui.vcproj: XML 1.0 document, ASCII text

and we need:

Toggle snippet (7 lines)
hidapi.guix/testgui/test.cpp: C++ source, ASCII text
hidapi.guix/testgui/testgui.sln: UTF-8 Unicode (with BOM)
text, with CRLF line terminators
hidapi.guix/testgui/testgui.vcproj: XML 1.0 document, ASCII text,
with CRLF line terminators

So the question is how to know which files require CRLF and which not?

From my understanding, this information is provided by:

Toggle snippet (10 lines)
$ cat hidapi.guix/.gitattributes
* text=auto

*.sln text eol=crlf
*.vcproj text eol=crlf

bootstrap text eol=lf
configure.ac text eol=lf

As zack pointed yesterday, well that what I understood or
misunderstood, it is related to "smudge" filters [1].



About Guile-Git, probably the same as you answered me earlier today. ;-)

<civodul> it's just that it was easier to implement and yeah,
shallow clones
are nice
<civodul> we don't need tight integration in this context, so less of an
incentive to use Guile-Git
<civodul> (it's very different from the channel code from that
perspective)

Kidding aside, I agree that relying on Guile-Git would mean being less
"fragile". Hum, somehow this asks about the size of the binary seed
for bootstrapping.

About low-level 'git' command, maybe the same could be achieved
without "git init" + "git add" + "git commit". Well, my gitology is
not enough skilled. :-)

Cheers,
simon
S
S
Simon Tournier wrote on 2 Mar 2023 18:42
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 61910@debbugs.gnu.org)
87356nf3s7.fsf_-_@gmail.com
Hi,

Follow up after discussing on #swh-devel. :-)

On jeu., 02 mars 2023 at 14:50, Simon Tournier <zimon.toutoune@gmail.com> wrote:

Toggle quote (13 lines)
> So the question is how to know which files require CRLF and which not?
>
> From my understanding, this information is provided by:
>
> $ cat hidapi.guix/.gitattributes
> * text=auto
>
> *.sln text eol=crlf
> *.vcproj text eol=crlf
>
> bootstrap text eol=lf
> configure.ac text eol=lf

Yeah, we could parse this file. It is not that hard but I am not so
much interested right now for implementing such.


Toggle quote (3 lines)
> As zack pointed yesterday, well that what I understood or
> misunderstood, it is related to "smudge" filters [1].

No, it is not “smudge”. These attributes are much simpler than
smudge/clean filters – well they are scary!


Toggle quote (4 lines)
> Kidding aside, I agree that relying on Guile-Git would mean being less
> "fragile". Hum, somehow this asks about the size of the binary seed
> for bootstrapping.

After some quick look, Guile-Git provides some plumbing and here it is
more about porcelain. Well, I do not know how it could be implemented
with Guile-Git.


Toggle quote (4 lines)
> About low-level 'git' command, maybe the same could be achieved
> without "git init" + "git add" + "git commit". Well, my gitology is
> not enough skilled. :-)

Asking on #swh-devel, vlorentz confirmed using something similar.
Quoting: « I looked at git's normalization-related test cases; but they
all use git-add »

Therefore, I guess that’s probably the best tradeoff for now.

Well, I am going to run more tests.


Cheers,
simon
L
L
Ludovic Courtès wrote on 3 Mar 2023 12:12
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)(address . 61910@debbugs.gnu.org)
87a60uxf5h.fsf@gnu.org
Hi,

Simon Tournier <zimon.toutoune@gmail.com> skribis:

Toggle quote (2 lines)
> On Thu, 2 Mar 2023 at 14:15, Ludovic Courtès <ludo@gnu.org> wrote:

[...]

Toggle quote (6 lines)
>> I wonder if we could achieve the same without creating a repo just to
>> delete it afterwards, either by invoking a low-level ‘git’ command it if
>> exists, or using Guile-Git, or doing it ourselves in Scheme.
>
> Doing it ourselves in Scheme, I do not know. I guess it would be
> hard.
[...]

Toggle quote (11 lines)
> From my understanding, this information is provided by:
>
> $ cat hidapi.guix/.gitattributes
> * text=auto
>
> *.sln text eol=crlf
> *.vcproj text eol=crlf
>
> bootstrap text eol=lf
> configure.ac text eol=lf

Uh, tricky enough.

[...]

Toggle quote (3 lines)
> Kidding aside, I agree that relying on Guile-Git would mean being less
> "fragile".

Libgit2 has an “attr” interface¹ that Guile-Git only partially binds.
That interface works on a repo, so we’d still have to ‘git init’ just to
be able to read those attributes. Not great.


But anyway, skimming through gitattributes(5) has convinced me that we
should not try to implement it by ourselves. :-)

Toggle quote (4 lines)
> About low-level 'git' command, maybe the same could be achieved
> without "git init" + "git add" + "git commit". Well, my gitology is
> not enough skilled. :-)

I did a quick search and didn’t find anything so it looks like the
strategy you chose is the right one.

Thanks,
Ludo’.
L
L
Ludovic Courtès wrote on 3 Mar 2023 14:27
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)(address . 61910-done@debbugs.gnu.org)
87wn3yvubx.fsf_-_@gnu.org
Hi again,

I pushed a slightly modified version of your patch as commit
58f20fa8181bdcd4269671e1d3cef1268947af3a. It still passes the hidapi
test. :-)

Let me know if anything’s amiss!

Thanks,
Ludo’.
Closed
S
S
Simon Tournier wrote on 3 Mar 2023 19:52
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 61910-done@debbugs.gnu.org)
86pm9pzmzj.fsf@gmail.com
Hi,

On Fri, 03 Mar 2023 at 14:27, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (2 lines)
> Let me know if anything’s amiss!

The version using ’file-exists?’ works well for Julia packages. Well, I
will try your submitted one later next week. :-)

Cheers,
simon
Closed
?