Channel clones lack SWH fallback

Status: Done
Submitted by: zimoun
Participants: Ludovic Courtès, zimoun
Owner: unassigned
Severity: important
zimoun wrote on 24 Oct 2020 00:17
whishlist: time-machine --channel falls back to SWH
(address . bug-guix@gnu.org)
86pn581t9s.fsf@gmail.com
Dear,
Let’s describe the use case. Consider that:
  guix time-machine -C channels -- install foo
is provided in some documentation, say a scientific paper, where the channels.scm file is completely described:
  (list (channel
          (name 'kikoo)
          (url "https://example.org/that-great.git")
          (commit
           "353bdae32f72b720c7ddd706576ccc40e2b43f95")))
In the future, if https://example.org/that-great.git disappears, then building or installing the package ’foo’ becomes difficult, if not impossible.
However, suppose the repo ’that-great’ had been saved in SWH (say, manually), since it is a regular Git repo. Guix should be able to fall back to it transparently.

Obviously, another wishlist item is to have something that eases the save request of the channel on SWH. Maybe the latter could be part of the several-times-discussed “guix channel” subcommand. :-)
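As an illustration of what such a save request could involve, here is a minimal sketch in plain Guile that only builds the URL to POST to. The helper name `save-request-url` is hypothetical, and the endpoint path reflects the public SWH "save code now" API as I understand it, so it should be checked against the SWH documentation before being relied upon.

```scheme
(define %swh-base-url
  ;; Public SWH instance.  The endpoint path used below is an assumption
  ;; to be verified against the SWH API documentation.
  "https://archive.softwareheritage.org")

(define (save-request-url origin-url)
  "Return the URL to POST to in order to ask SWH to archive the Git
repository at ORIGIN-URL.  Hypothetical helper, not part of Guix."
  (string-append %swh-base-url
                 "/api/1/origin/save/git/url/"
                 origin-url "/"))

;; Print the request URL for the channel of the example above.
(display (save-request-url "https://example.org/that-great.git"))
(newline)
```

The sketch performs no network access; actually sending the request (and handling rate limits) is left out on purpose.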

All the best,
simon
Ludovic Courtès wrote on 5 Mar 15:14 +0100
control message for bug #44187
(address . control@debbugs.gnu.org)
878s71lmbq.fsf@gnu.org
severity 44187 important
quit
Ludovic Courtès wrote on 5 Mar 15:14 +0100
(address . control@debbugs.gnu.org)
877dmllmas.fsf@gnu.org
retitle 44187 Channel clones lack SWH fallback
quit
Ludovic Courtès wrote on 5 Mar 15:51 +0100
Re: bug#44187: whishlist: time-machine --channel falls back to SWH
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 44187@debbugs.gnu.org)
87pn0dk61v.fsf@gnu.org
Hi,
zimoun <zimon.toutoune@gmail.com> writes:
> Let’s describe the use case. Consider that:
>
> guix time-machine -C channels -- install foo
>
> is provided in some documentation, say a scientific paper, where the
> channels.scm file is completely described:
>
> (list (channel
>         (name 'kikoo)
>         (url "https://example.org/that-great.git")
>         (commit
>          "353bdae32f72b720c7ddd706576ccc40e2b43f95")))
>
> In the future, if https://example.org/that-great.git disappears, then
> building or installing the package ’foo’ becomes difficult, if not impossible.
>
> However, suppose the repo ’that-great’ had been saved in SWH
> (say, manually), since it is a regular Git repo. Guix should be able to
> fall back to it transparently.
I went head-down to add SWH fallback to ‘latest-repository-commit’… but that’s of no use because (guix channels) wants a complete clone so that it can determine commit relations (to detect downgrades).
The SWH vault gives access to checkouts primarily, but it’s also possible to get a full repo in ‘git fast-import’ format, which is what we need:
  https://archive.softwareheritage.org/api/1/vault/revision/gitfast/doc/
However, SWH developers say this API will eventually be replaced by some other solution, possibly a bare Git repo export, so it may not be a good idea to build upon it.
If we were able, using the SWH API, to map “revisions” to “origins”, we could find potential mirrors hosting a given commit, but apparently that’s not possible.
To be continued…
Ludo’.
diff --git a/guix/git.scm b/guix/git.scm
index a5103547d3..449011c51a 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -32,6 +32,7 @@
   #:use-module (guix records)
   #:use-module (guix gexp)
   #:use-module (guix sets)
+  #:autoload (guix swh) (swh-download)
   #:use-module ((guix diagnostics) #:select (leave))
   #:use-module (guix progress)
   #:use-module (rnrs bytevectors)
@@ -459,22 +460,43 @@ Log progress and checkout info to LOG-PORT."
          (eq? 'regular (stat:type stat))))))
 
   (format log-port "updating checkout of '~a'...~%" url)
-  (let*-values
-      (((checkout commit _)
-        (update-cached-checkout url
-                                #:recursive? recursive?
-                                #:ref ref
-                                #:cache-directory
-                                (url-cache-directory url cache-directory
-                                                     #:recursive?
-                                                     recursive?)
-                                #:log-port log-port))
-       ((name)
-        (url+commit->name url commit)))
-    (format log-port "retrieved commit ~a~%" commit)
-    (values (add-to-store store name #t "sha256" checkout
-                          #:select? (negate dot-git?))
-            commit)))
+
+  (catch 'git-error
+    (lambda ()
+      (let*-values
+          (((checkout commit _)
+            (update-cached-checkout (pk 'l-r-c url)
+                                    #:recursive? recursive?
+                                    #:ref ref
+                                    #:cache-directory
+                                    (url-cache-directory url cache-directory
+                                                         #:recursive?
+                                                         recursive?)
+                                    #:log-port log-port))
+           ((name)
+            (url+commit->name url commit)))
+        (format log-port "retrieved commit ~a~%" commit)
+        (values (add-to-store store name #t "sha256" checkout
+                              #:select? (negate dot-git?))
+                commit)))
+    (lambda (key err . rest)
+      ;; XXX: 'swh-download' currently doesn't support submodules.
+      (when recursive?
+        (apply throw key err rest))
+
+      (pk 'err key err rest)
+      (match ref
+        (('commit . commit)
+         ;; Attempt to fetch COMMIT from SWH.
+         (call-with-temporary-directory
+          (lambda (directory)
+            (unless (swh-download url commit directory)
+              (apply throw key err rest))
+            (values (add-to-store store (url+commit->name url commit)
+                                  #t "sha256" directory)
+                    commit))))
+        (_
+         (apply throw key err rest))))))
 
 (define (print-git-error port key args default-printer)
   (match args
Ludovic Courtès wrote on 10 Sep 16:34 +0200
[PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones
(address . 44187@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20210910143415.14783-1-ludo@gnu.org
Hi!
A bit of context: we already had automatic SWH fallback for Git checkouts, which is to say that any origin that uses ‘git-fetch’ would have its checkout transparently fetched from SWH if upstream vanished (this dates back to commit 608d3dca89d73fe7260e97a284a8aeea756a3e11, Nov. 2018).
What this patch series provides is SWH fallback for full Git clones (as opposed to flat checkouts). It works for anything that uses (guix git). That includes <git-checkout>, used by transformation options:
  $ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=1eabc563ca5692b3e08d84f1f0e6fd2283284469 -n
  updating checkout of 'http://example.org/sdf'...
  SWH: found revision 1eabc563ca5692b3e08d84f1f0e6fd2283284469 with directory at 'https://archive.softwareheritage.org/api/1/directory/ad8976564375ee55f645387bbcdf4b66e6582fbf/'
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/HEAD
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/branches/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/config
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/description
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/applypatch-msg.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/commit-msg.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/fsmonitor-watchman.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/post-update.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-applypatch.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-commit.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-push.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-rebase.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-receive.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/prepare-commit-msg.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/update.sample
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/exclude
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/refs
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/packs
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.idx
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.pack
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/master
  swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/tags/
  retrieved commit 1eabc563ca5692b3e08d84f1f0e6fd2283284469
  substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
  substitute: updating substitutes from 'https://bayfront.guix.gnu.org'... 100.0%
  The following derivation would be built:
     /gnu/store/39kzsy5kgj5150q6zgckc2hbxp999adw-footswitch-git.1eabc56.drv
In the example above, we pass a bogus Git URL, but since the target commit is known, (guix git) automatically fetches a bare Git repository from the SWH vault.
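The control flow just described can be sketched in plain Guile as follows. This is only a toy model, not the code from the patches: `clone-upstream` and `clone-from-archive` are hypothetical stand-ins for the real network operations, and the `network-error` key is made up for the illustration.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-ins for the real network operations.
(define (clone-upstream url)
  ;; Pretend that every upstream URL has vanished.
  (throw 'network-error url))

(define (clone-from-archive url commit)
  ;; Pretend the archive knows a single commit; return #f otherwise.
  (and (string=? commit "1eabc563ca5692b3e08d84f1f0e6fd2283284469")
       (list 'repository-from-archive url commit)))

(define (clone/fallback url ref)
  "Clone URL; if that fails and REF pins a commit, try the archive.
Re-throw the original error when the fallback does not apply or fails."
  (catch 'network-error
    (lambda ()
      (clone-upstream url))
    (lambda (key . args)
      (match ref
        (('commit . commit)
         (or (clone-from-archive url commit)
             (apply throw key args)))
        (_
         ;; A branch or tag name alone is not enough to query the archive.
         (apply throw key args))))))
```

With these stubs, `(clone/fallback "http://example.org/sdf" '(commit . "1eabc563ca5692b3e08d84f1f0e6fd2283284469"))` returns the archived copy, whereas passing `'(branch . "master")` re-throws the original error, since nothing identifies which revision to look up.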
It also works for channels, which is what zimoun reported here:
  $ cat /tmp/chan.scm
  (list (channel
          (name 'guix)
          (url "https://git.savannah.gnu.org/git/guix.git")
          (commit
           "f91ae9425bb385b60396a544afe27933896b8fa3")
          (introduction
           (make-channel-introduction
            "9edb3f66fd807b096b48283debdcddccfea34bad"
            (openpgp-fingerprint
             "BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA"))))
        (channel
          (name 'guix-past)
          (url "https://does-not-exist.inria.fr/guix-hpc/guix-past")
          (commit "77e183dc7ade307ad3409fad4b71f12e266de910")
          #;(introduction
           (make-channel-introduction
            "0c119db2ea86a389769f4d2b9c6f5c41c027e336"
            (openpgp-fingerprint
             "3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5")))))
  $ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe
  Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
  Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'...
  SWH: found revision 77e183dc7ade307ad3409fad4b71f12e266de910 with directory at 'https://archive.softwareheritage.org/api/1/directory/7c6aa10e1e0fa54199566145c6a453731872b87d/'
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/HEAD
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/branches/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/config
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/description
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/hooks/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/exclude
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/refs
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/packs
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.idx
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.pack
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/master
  swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/tags/
  Computing Guix derivation for 'x86_64-linux'... \
    C-c C-c
Here, the ‘guix-past’ channel is transparently cloned from SWH. This is pretty cool, because having the whole repo around is what permits things like downgrade prevention¹ and news support².
Finally we can enjoy content-addressability and brittle URLs are becoming a thing of the past!*

Limitations
~~~~~~~~~~~~
Yes, there’s a couple of them.
First, fallback is implemented only for fresh clones, not for updates. Thus, if I rerun the first example, having now the clone in ~/.cache/guix/checkouts, with a different commit, I get:
  $ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa -n
  updating checkout of 'http://example.org/sdf'...
  guix build: error: Git failure while fetching http://example.org/sdf: unexpected http status code: 404
Second, clones from SWH only contain the one branch that the revision is on. For channels, that means that the ‘keyring’ branch is not fetched, which is why I commented out ‘introduction’ in /tmp/chan.scm above. If I uncomment it, I get:
  $ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe
  Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
  Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'...
  guix time-machine: error: Git error: cannot locate remote-tracking branch 'origin/keyring'
The SWH folks tell me it’ll eventually be possible to map a revision to its containing snapshot(s) via the HTTP API, and to obtain entire snapshots (i.e., the repo and all its branches) from the vault. That’s what we need to fix this issue.
*Third, and this answers the asterisk above, we must keep in mind that this is content-addressability *with SHA1*. Generating a chosen-prefix collision is becoming affordable³, so users absolutely need an additional mechanism to authenticate code they fetched.
For origins, we have the content SHA256, so we’re fine. For channels, we have Guix’s authentication mechanism¹, except it’s not available yet via SWH, as I wrote above. For the footswitch example above using ‘--with-commit’, we don’t have any authentication method, but in fact, that’s the situation of Git repositories in general: they can rarely be authenticated.
Overall, I think it’s a step in the right direction.
Thoughts?
Thanks to vlorentz and olasd on #swh-devel for their support!
Thanks,
Ludo’.
¹ https://guix.gnu.org/en/blog/2020/securing-updates/
² https://guix.gnu.org/en/blog/2019/spreading-the-news/
³ https://sha-mbles.github.io/
Ludovic Courtès (3):
  swh: Support downloads of bare Git repositories.
  git: 'update-cached-checkout' can fall back to SWH when cloning.
  git: 'reference-available?' recognizes 'tag-or-commit'.
 guix/git.scm | 45 +++++++++++++++++++++++++++++++++++++++++++--
 guix/swh.scm | 52 ++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 83 insertions(+), 14 deletions(-)
-- 
2.33.0
Ludovic Courtès wrote on 10 Sep 16:34 +0200
[PATCH 1/3] swh: Support downloads of bare Git repositories.
(address . 44187@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludovic.courtes@inria.fr)
20210910143415.14783-2-ludo@gnu.org
From: Ludovic Courtès <ludovic.courtes@inria.fr>
* guix/swh.scm (swh-download-archive): New procedure.
(swh-download-directory): Rewrite in terms of 'swh-download-archive'.
(swh-download): Add #:archive-type and honor it. Use
'swh-download-archive' instead of 'swh-download-directory'.
---
 guix/swh.scm | 52 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 12 deletions(-)
diff --git a/guix/swh.scm b/guix/swh.scm
index 3d5d2a410a..707551a799 100644
--- a/guix/swh.scm
+++ b/guix/swh.scm
@@ -645,20 +645,29 @@ delete it when leaving the dynamic extent of this call."
         (lambda ()
           (false-if-exception (delete-file-recursively tmp-dir))))))
 
-(define* (swh-download-directory id output
-                                 #:key (log-port (current-error-port)))
-  "Download from Software Heritage the directory with the given ID, and
-unpack it to OUTPUT. Return #t on success and #f on failure"
+(define* (swh-download-archive swhid output
+                               #:key
+                               (archive-type 'flat)
+                               (log-port (current-error-port)))
+  "Download from Software Heritage the directory or revision with the given
+SWID, in the ARCHIVE-TYPE format (one of 'flat or 'git-bare), and unpack it to
+OUTPUT. Return #t on success and #f on failure."
   (call-with-temporary-directory
    (lambda (directory)
-     (match (vault-fetch id 'directory #:log-port log-port)
+     (match (vault-fetch swhid
+                         #:archive-type archive-type
+                         #:log-port log-port)
        (#f
         (format log-port
-                "SWH: directory ~a could not be fetched from the vault~%"
-                id)
+                "SWH: object ~a could not be fetched from the vault~%"
+                swhid)
         #f)
        ((? port? input)
-        (let ((tar (open-pipe* OPEN_WRITE "tar" "-C" directory "-xzvf" "-")))
+        (let ((tar (open-pipe* OPEN_WRITE "tar" "-C" directory
+                               (match archive-type
+                                 ('flat "-xzvf") ;gzipped
+                                 ('git-bare "-xvf")) ;uncompressed
+                               "-")))
           (dump-port input tar)
           (close-port input)
           (let ((status (close-pipe tar)))
@@ -672,6 +681,14 @@ unpack it to OUTPUT. Return #t on success and #f on failure"
                                 #:log (%make-void-port "w"))
               #t))))))))
 
+(define* (swh-download-directory id output
+                                 #:key (log-port (current-error-port)))
+  "Download from Software Heritage the directory with the given ID, and
+unpack it to OUTPUT. Return #t on success and #f on failure."
+  (swh-download-archive (string-append "swh:1:dir:" id) output
+                        #:archive-type 'flat
+                        #:log-port log-port))
+
 (define (commit-id? reference)
   "Return true if REFERENCE is likely a commit ID, false otherwise---e.g., if
 it is a tag name. This is based on a simple heuristic so use with care!"
@@ -679,8 +696,11 @@ it is a tag name. This is based on a simple heuristic so use with care!"
        (string-every char-set:hex-digit reference)))
 
 (define* (swh-download url reference output
-                       #:key (log-port (current-error-port)))
-  "Download from Software Heritage a checkout of the Git tag or commit
+                       #:key
+                       (archive-type 'flat)
+                       (log-port (current-error-port)))
+  "Download from Software Heritage a checkout (if ARCHIVE-TYPE is 'flat) or a
+full Git repository (if ARCHIVE-TYPE is 'git-bare) of the Git tag or commit
 REFERENCE originating from URL, and unpack it in OUTPUT. Return #t on success
 and #f on failure.
 
@@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes."
        (format log-port "SWH: found revision ~a with directory at '~a'~%"
                (revision-id revision)
                (swh-url (revision-directory-url revision)))
-       (swh-download-directory (revision-directory revision) output
-                               #:log-port log-port))
+       (swh-download-archive (match archive-type
+                               ('flat
+                                (string-append
+                                 "swh:1:dir:" (revision-directory revision)))
+                               ('git-bare
+                                (string-append
+                                 "swh:1:rev:" (revision-id revision))))
+                             output
+                             #:archive-type archive-type
+                             #:log-port log-port))
       (#f
        #f)))
-- 
2.33.0
Ludovic Courtès wrote on 10 Sep 16:34 +0200
[PATCH 3/3] git: 'reference-available?' recognizes 'tag-or-commit'.
(address . 44187@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20210910143415.14783-4-ludo@gnu.org
* guix/git.scm (reference-available?): Handle 'tag-or-commit' with a
40-digit hex string.
---
 guix/git.scm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/guix/git.scm b/guix/git.scm
index 377e09888a..33a111b84a 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -36,7 +36,7 @@
   #:use-module (guix sets)
   #:use-module ((guix diagnostics) #:select (leave))
   #:use-module (guix progress)
-  #:autoload (guix swh) (swh-download)
+  #:autoload (guix swh) (swh-download commit-id?)
   #:use-module (rnrs bytevectors)
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
@@ -340,7 +340,8 @@ dynamic extent of EXP."
   "Return true if REF, a reference such as '(commit . \"cabba9e\"), is
 definitely available in REPOSITORY, false otherwise."
   (match ref
-    (('commit . commit)
+    ((or ('commit . commit)
+         ('tag-or-commit . (? commit-id? commit)))
      (let ((len (string-length commit))
            (oid (string->oid commit)))
        (false-if-git-not-found
-- 
2.33.0
Ludovic Courtès wrote on 10 Sep 16:34 +0200
[PATCH 2/3] git: 'update-cached-checkout' can fall back to SWH when cloning.
(address . 44187@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludovic.courtes@inria.fr)
20210910143415.14783-3-ludo@gnu.org
From: Ludovic Courtès <ludovic.courtes@inria.fr>
Fixes https://issues.guix.gnu.org/44187.
Reported by zimoun <zimon.toutoune@gmail.com>.
* guix/git.scm (GITERR_HTTP): New variable.
(clone-from-swh, clone/swh-fallback): New procedures.
(update-cached-checkout): Use 'clone/swh-fallback' instead of 'clone*'.
---
 guix/git.scm | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)
diff --git a/guix/git.scm b/guix/git.scm
index acc48fd12f..377e09888a 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -36,6 +36,7 @@
   #:use-module (guix sets)
   #:use-module ((guix diagnostics) #:select (leave))
   #:use-module (guix progress)
+  #:autoload (guix swh) (swh-download)
   #:use-module (rnrs bytevectors)
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
@@ -180,6 +181,13 @@ the 'SSL_CERT_FILE' and 'SSL_CERT_DIR' environment variables."
       (lambda args
         (make-fetch-options auth-method)))))
 
+(define GITERR_HTTP
+  ;; Guile-Git <= 0.5.2 lacks this constant.
+  (let ((errors (resolve-interface '(git errors))))
+    (if (module-defined? errors 'GITERR_HTTP)
+        (module-ref errors 'GITERR_HTTP)
+        34)))
+
 (define (clone* url directory)
   "Clone git repository at URL into DIRECTORY. Upon failure,
 make sure no empty directory is left behind."
@@ -342,6 +350,38 @@ definitely available in REPOSITORY, false otherwise."
     (_
      #f)))
 
+(define (clone-from-swh url tag-or-commit output)
+  "Attempt to clone TAG-OR-COMMIT (a string), which originates from URL, using
+a copy archived at Software Heritage."
+  (call-with-temporary-directory
+   (lambda (bare)
+     (and (swh-download url tag-or-commit bare
+                        #:archive-type 'git-bare)
+          (let ((repository (clone* bare output)))
+            (remote-set-url! repository "origin" url)
+            repository)))))
+
+(define (clone/swh-fallback url ref cache-directory)
+  "Like 'clone', but fallback to Software Heritage if the repository cannot be
+found at URL."
+  (define (inaccessible-url-error? err)
+    (let ((class (git-error-class err))
+          (code (git-error-code err)))
+      (or (= class GITERR_HTTP) ;404 or similar
+          (= class GITERR_NET)))) ;unknown host, etc.
+
+  (catch 'git-error
+    (lambda ()
+      (clone* url cache-directory))
+    (lambda (key err)
+      (match ref
+        (((or 'commit 'tag-or-commit) . commit)
+         (if (inaccessible-url-error? err)
+             (or (clone-from-swh url commit cache-directory)
+                 (throw key err))
+             (throw key err)))
+        (_ (throw key err))))))
+
 (define cached-checkout-expiration
   ;; Return the expiration time procedure for a cached checkout.
   ;; TODO: Honor $GUIX_GIT_CACHE_EXPIRATION.
@@ -408,7 +448,7 @@ it unchanged."
   (let* ((cache-exists? (openable-repository? cache-directory))
          (repository (if cache-exists?
                          (repository-open cache-directory)
-                         (clone* url cache-directory))))
+                         (clone/swh-fallback url ref cache-directory))))
     ;; Only fetch remote if it has not been cloned just before.
     (when (and cache-exists?
                (not (reference-available? repository ref)))
-- 
2.33.0
zimoun wrote on 13 Sep 18:07 +0200
Re: bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44187@debbugs.gnu.org)
CAJ3okZ3-6dXBjampGu9UTtY8f3xnmKoqGsJvsfu1-wrSZ2zUZQ@mail.gmail.com
Hi Ludo,
Cool! However, the patch does not apply on top of 53f54d4aa2. That's why the option '--base' of "git format-patch" is really helpful. ;-)
Which commit does the patch set apply onto? I would like to try it and review it. :-)
Cheers,
simon
Ludovic Courtès wrote on 14 Sep 15:37 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 44187@debbugs.gnu.org)
871r5r5ldr.fsf@gnu.org
Hello,
zimoun <zimon.toutoune@gmail.com> writes:
> Cool! However, the patch does not apply on top of 53f54d4aa2.
> That's why the option '--base' of "git format-patch" is really helpful. ;-)
Ah! It should apply on top of ff613c2b68aac539262822490448e637d8f315ba.
If not, I can rebase it and send an updated patch (I’ve been fiddling with code in this area lately…).
Thanks,
Ludo’.
zimoun wrote on 17 Sep 10:02 +0200
Re: bug#44187: Channel clones lack SWH fallback
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44187@debbugs.gnu.org)
86o88r1vfe.fsf@gmail.com
Hi,
On Fri, 10 Sep 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote:
> Finally we can enjoy content-addressability and brittle URLs
> are becoming a thing of the past!*
Yeah, it is awesome!
The original URL of the channel was https://github.com/zimoun/channel-example.git. This channel defines a package whose upstream, https://github.com/zimoun/hello-example.git, has also disappeared. Note that the URL in the package definition is not bogus… but using a bogus one was already working. :-)
All is saved on SWH, so now all is transparent! From my point of view, this is a killer feature for scientific folks. :-)
  $ cat /tmp/channels.scm
  (list (channel
          (name 'guix)
          (url "/home/sitour/src/guix/guix")
          (branch "fix-44187")
          (commit
           "cdea76a2fdaf7705583a02081a6468d436b8df05"))
        (channel
          (name 'example)
          (url "https://example.org/foo.git")
          (commit "67c9f2143aa6f545419ae913b4ae02af4cd3effc")))
  $ ./pre-inst-env guix time-machine -C /tmp/channels.scm --disable-authentication -- build hi
  Updating channel 'guix' from Git repository at '/home/sitour/src/guix/guix'...
  guix time-machine: warning: channel authentication disabled
  Updating channel 'example' from Git repository at 'https://example.org/foo.git'...
  SWH: found revision 67c9f2143aa6f545419ae913b4ae02af4cd3effc with directory at 'https://archive.softwareheritage.org/api/1/directory/fe423e88ce277d3fc230c88d408e42b14a3a458c/'
  SWH vault: requested bundle cooking, waiting for completion...
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/HEAD
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/branches/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/config
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/description
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/hooks/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/exclude
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/refs
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/info/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/info/packs
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/pack-4e9279a1b64e4dda7bd9d84bb6b50bb1f80def08.idx
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/pack-4e9279a1b64e4dda7bd9d84bb6b50bb1f80def08.pack
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/heads/
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/heads/master
  swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/tags/
  guix time-machine: warning: channel authentication disabled
  [...]
  Computing Guix derivation for 'x86_64-linux'... -
  [...]
  building /gnu/store/6g9qlysbbk7p4609xrv82j0wzbib1y4r-git-checkout.drv...
  guile: warning: failed to install locale
  environment variable `PATH' set to `/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10/bin:/gnu/store/sf3rbvb6iqcphgm1afbplcs72hsywg25-tar-1.32/bin'
  hint: Using 'master' as the name for the initial branch. This default branch name
  hint: is subject to change. To configure the initial branch name to use in all
  hint: of your new repositories, which will suppress this warning, call:
  hint:
  hint: git config --global init.defaultBranch <name>
  hint:
  hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
  hint: 'development'. The just-created branch can be renamed via this command:
  hint:
  hint: git branch -m <name>
  Initialized empty Git repository in /gnu/store/884nsva9r8wkp40kbqyvpj1ad57jc5dd-git-checkout/.git/
  fatal: could not read Username for 'https://github.com': No such device or address
  Failed to do a shallow fetch; retrying a full fetch...
  fatal: could not read Username for 'https://github.com': No such device or address
  git-fetch: '/gnu/store/5vai7bfrfkzv22dx13bxpszjrqyi78x6-git-minimal-2.33.0/bin/git fetch origin' failed with exit code 128
  Trying content-addressed mirror at berlin.guix.gnu.org...
  Trying content-addressed mirror at berlin.guix.gnu.org...
  Trying to download from Software Heritage...
  SWH: found revision e1eefd033b8a2c4c81babc6fde08ebb116c6abb8 with directory at 'https://archive.softwareheritage.org/api/1/directory/c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/'
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/ABOUT-NLS
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/AUTHORS
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/COPYING
  [...]
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/hello-1
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/last-1
  swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/traditional-1
  successfully built /gnu/store/6g9qlysbbk7p4609xrv82j0wzbib1y4r-git-checkout.drv
  building /gnu/store/jx1r7w8xaw768176pjl0j0q1l1529w75-hi-2.10.drv...
  starting phase `set-SOURCE-DATE-EPOCH'
  phase `set-SOURCE-DATE-EPOCH' succeeded after 0.0 seconds
  [...]
  successfully built /gnu/store/jx1r7w8xaw768176pjl0j0q1l1529w75-hi-2.10.drv
  /gnu/store/jn8d031zx4znxy7s5zhj4dbr6xjsfq9v-hi-2.10
Well, only the fallback for tarballs and non-Git fetch methods is still missing, and then the story will be more than awesome! :-)
> Limitations
> ~~~~~~~~~~~~
>
> Yes, there’s a couple of them.
Well, yes, some limitations, but not that many. ;-)

> First, fallback is implemented only for fresh clones, not for updates.
> Thus, if I rerun the first example, having now the clone in
> ~/.cache/guix/checkouts, with a different commit, I get:
SWH is not a forge but an archive. :-) Therefore, this update case does not make sense to me. I mean,
  $ git -C ~/.cache/guix/checkouts/6k7wvrcpbdsw3pje5b4squybw3jfn3viyrj7gcl7fipa5yjflaza fetch
  fatal: repository 'http://example.org/sdf/' not found
Well, maybe this cache could be removed if the commit is not found inside it, and then the fetch could be retried from SWH. Obviously, the downgrade case works.
Note that on a fresh clone, the error message could be improved:
  $ ./pre-inst-env guix build guix --with-git-url=guix=https://example.org --with-commit=guix=ff613c2b68aac539262822490448e637d8f315ba -n
  updating checkout of 'https://example.org'...
  guix build: error: Git failure while fetching https://example.org: unexpected http status code: 404
where https://example.org is bogus and ff613c2b68aac539262822490448e637d8f315ba is not yet archived on SWH. It would be nice to warn, in addition to the 404, that the commit is not found in SWH. WDYT?
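To make the suggestion concrete, a combined message could be assembled along these lines. This is a minimal sketch in plain Guile; `fallback-failure-message` is a hypothetical helper and the wording is only a proposal, not what Guix prints.

```scheme
(define (fallback-failure-message url commit upstream-reason)
  "Return an error message reporting both the upstream failure
(UPSTREAM-REASON) and the fact that COMMIT was not found in the archive.
Hypothetical helper; the wording is not Guix's actual message."
  (format #f "Git failure while fetching ~a: ~a; commit ~a was not found on Software Heritage either"
          url upstream-reason commit))

;; Reproduce the situation of the example above.
(display (fallback-failure-message
          "https://example.org"
          "ff613c2b68aac539262822490448e637d8f315ba"
          "unexpected http status code: 404"))
(newline)
```

Keeping the original Git error first preserves the information users already rely on, while the second clause tells them that the archive was tried as well.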

> Second, clones from SWH only contain the one branch that the revision
> is on. For channels, that means that the ‘keyring’ branch is not fetched,
> which is why I commented out ‘introduction’ in /tmp/chan.scm above.
To me, it is not an issue, because you reach a commit from the past knowing its hash.
Aside from my opinion, I wanted to know which kind of metadata we get back from the Git repo, so I tried:
  $ guix build guix --with-git-url=guix=https://example.org --with-commit=guix=c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada -n
  updating checkout of 'https://example.org'...
  SWH: found revision c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada with directory at 'https://archive.softwareheritage.org/api/1/directory/ca2e8a7222b4850c7bea935dff86b9c2a905efd6/'
  SWH vault: requested bundle cooking, waiting for completion...
  SWH vault: Processing...
  [...]
then after several hours, I get this:
  SWH vault: failure: Internal Server Error. This incident will be reported.
  SWH vault: retrying...
  SWH vault: requested bundle cooking, waiting for completion...
  SWH vault: Processing...
and after more than 12h, the status is still «SWH vault: Processing...» and nothing is complete.
About this ’keyring’ branch: it could just as well live in a separate repo, so why not actually do that? :-) I mean, take the branch as it is and mirror it in another Git repo saved on SWH, then fall back to that repo if the ’keyring’ branch is not there. I do not know… Or simply wait for SWH to improve things on their side. :-)

> *Third, and this answers the asterisk above, we must keep in mind that
> this is content-addressibility *with SHA1*. Generating a chosen-prefix
> collision is becoming affordable³, so users absolutely need an additional
> mechanism to authenticate code they fetched.
>
> For origins, we have the content SHA256, so we’re fine. For channels,
> we have Guix’s authentication mechanism¹, except it’s not available yet
> via SWH, as I wrote above. For the footswitch example above using
> ‘--with-commit’, we don’t have any authentication method, but in fact,
> that’s the situation of Git repositories in general: they can rarely be
> authenticated.
How could a chosen-prefix attack work here? I understand why a second-preimage attack is an issue. But I do not see how a SHA-1 chosen-prefix attack could be exploited here to compromise the user, because the hash is provided by this very same user.

> Ludovic Courtès (3):
> swh: Support downloads of bare Git repositories.
> git: 'update-cached-checkout' can fall back to SWH when cloning.
> git: 'reference-available?' recognizes 'tag-or-commit'.
LGTM!
Cheers,
simon
zimoun wrote on 17 Sep 19:31 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
874kajglbo.fsf_-_@gmail.com
Hi Ludo,
The patch LGTM although there is a redundancy, from my understanding.
On Fri, 10 Sep 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote:
> @@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes."
> (format log-port "SWH: found revision ~a with directory at '~a'~%"
> (revision-id revision)
> (swh-url (revision-directory-url revision)))
> - (swh-download-directory (revision-directory revision) output
> - #:log-port log-port))
> + (swh-download-archive (match archive-type
> + ('flat
> + (string-append
> + "swh:1:dir:" (revision-directory revision)))
> + ('git-bare
> + (string-append
> + "swh:1:rev:" (revision-id revision))))
Here the ’swhid’ depends on the ’archive-type’…
> + output
> + #:archive-type archive-type
…which is also passed. Then this is propagated. For instance, ’swh-download-directory’:
> +(define* (swh-download-directory id output
> + #:key (log-port (current-error-port)))
> + "Download from Software Heritage the directory with the given ID, and
> +unpack it to OUTPUT. Return #t on success and #f on failure."
> + (swh-download-archive (string-append "swh:1:dir:" id) output
> + #:archive-type 'flat
> + #:log-port log-port))
> +
Does it make sense to pass this ’swhid’ equal to ’swh:1:rev’ with the ’flat’ archive-type? Another instance is,
> + (match (vault-fetch swhid
> + #:archive-type archive-type
> + #:log-port log-port)
and from my understanding, again ’swhid’ depends on ’archive-type’. Therefore, it is error-prone. The best option seems to be passing ’(archive-type . swhid)’ and pattern-matching on that. Yeah, it potentially breaks the public API… but there is no claim about stability (and I am not convinced this (guix swh) module is used outside Guix :-)).
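The "one unified argument" idea can be sketched as follows. This is only an illustration in Python, not the Guile code of (guix swh): the names `ArchiveRequest`, `make_request` and `OBJECT_TYPE_FOR_ARCHIVE` are hypothetical, and the strict pairing (’flat’ with a directory, ’git-bare’ with a revision) is an assumption taken from the `match` in the quoted patch.

```python
from dataclasses import dataclass

# Hypothetical mapping, mirroring the pairing used in the quoted patch:
# a 'flat' archive is requested for a directory, 'git-bare' for a revision.
OBJECT_TYPE_FOR_ARCHIVE = {"flat": "dir", "git-bare": "rev"}


@dataclass(frozen=True)
class ArchiveRequest:
    """One value carrying both the archive type and the matching SWHID."""
    archive_type: str
    swhid: str


def make_request(archive_type: str, object_id: str) -> ArchiveRequest:
    """Derive the SWHID from the archive type so the two cannot disagree."""
    try:
        object_type = OBJECT_TYPE_FOR_ARCHIVE[archive_type]
    except KeyError:
        raise ValueError(f"unknown archive type: {archive_type!r}") from None
    return ArchiveRequest(archive_type, f"swh:1:{object_type}:{object_id}")


request = make_request("git-bare", "c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada")
print(request.swhid)
```

Because callers never build the ’swh:1:…’ string themselves, a mismatched pair such as ’swh:1:dir:…’ with ’git-bare’ cannot be expressed at all.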


Cheers,
simon
Ludovic Courtès wrote on 18 Sep 12:05 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 44187@debbugs.gnu.org)
8735q2urjv.fsf@gnu.org
Hi!
zimoun <zimon.toutoune@gmail.com> skribis:
> The patch LGTM although there is a redundancy, from my understanding.
>
> On Fri, 10 Sep 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> @@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes."
>> (format log-port "SWH: found revision ~a with directory at '~a'~%"
>> (revision-id revision)
>> (swh-url (revision-directory-url revision)))
>> - (swh-download-directory (revision-directory revision) output
>> - #:log-port log-port))
>> + (swh-download-archive (match archive-type
>> + ('flat
>> + (string-append
>> + "swh:1:dir:" (revision-directory revision)))
>> + ('git-bare
>> + (string-append
>> + "swh:1:rev:" (revision-id revision))))
>
> Here the ’swid’ depends on the ’archive-type’…
>
>> + output
>> + #:archive-type archive-type
>
> …which is also passed. Then this is propagated. For instance,
> ’swh-download-directory’:
>
>> +(define* (swh-download-directory id output
>> + #:key (log-port (current-error-port)))
>> + "Download from Software Heritage the directory with the given ID, and
>> +unpack it to OUTPUT. Return #t on success and #f on failure."
>> + (swh-download-archive (string-append "swh:1:dir:" id) output
>> + #:archive-type 'flat
>> + #:log-port log-port))
>> +
>
> Does it make sense to pass this ’swhid’ equal to ’swh:1:rev’ with the
> ’flat’ archive-type? Another instance is,
>
>> + (match (vault-fetch swhid
>> + #:archive-type archive-type
>> + #:log-port log-port)
>
> and from my understanding, again ’swhid’ depends on ’archive-type’.
> Therefore, it prone error.
‘git-bare’ only makes sense for a revision, not a directory, but I wonder if ‘flat’ can be used for a revision (in which case it’d be equivalent to getting the corresponding directory)?
I agree there’s some redundancy between directory/revision and flat/git-bare, but it’s the SWH API that looks like this, so I’d be tempted to just keep it as is. Maybe we could ask for guidance on #swh-devel.
Thanks!
Ludo’.
zimoun wrote on 18 Sep 12:27 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44187@debbugs.gnu.org)
CAJ3okZ3cVgLopG5GMFXJVomxKWCfgjmWxVt1cC0oV_S32ewOTw@mail.gmail.com
Hi,
On Sat, 18 Sept 2021 at 12:05, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
>
> > Does it make sense to pass this ’swhid’ equal to ’swh:1:rev’ with the
> > ’flat’ archive-type? Another instance is,
[...]
> > and from my understanding, again ’swhid’ depends on ’archive-type’.
> > Therefore, it prone error.
>
> ‘git-bare’ only makes sense for a revision, not a directory, but I
So it does not seem possible to form a 'swhid' as "swh:1:dir" and pass 'archive-type' as 'git-bare'. And conversely with 'swh:1:rev' and 'flat'. Right?
I have not tried though. :-)
If yes, it means both arguments, 'swhid' and 'archive-type', are linked, so the function should accept only one unified argument and not two independent ones. IMHO.
> wonder if ‘flat’ can be used for a revision (in which case it’d be
> equivalent to getting the corresponding directory)?
>
> I agree there’s some redundancy between directory/revision and
> flat/git-bare, but it’s the SWH API that looks like this, so I’d be
> tempted to just keep it as is. Maybe we could ask for guidance on
> #swh-devel.
Well, let's postpone the refactoring. :-) However, if it works as I understand, then the refactoring seems the correct way, so I would not accept a backward compatibility argument. ;-)
Have a nice week-end,
simon
Ludovic Courtès wrote on 18 Sep 23:10 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 44187-done@debbugs.gnu.org)
87pmt5si8b.fsf@gnu.org
Hello!
zimoun <zimon.toutoune@gmail.com> skribis:
> The original URL of the channel was:
> <https://github.com/zimoun/channel-example.git>. And this channel
> defines a package where the upstream has also disappeared
> <https://github.com/zimoun/hello-example.git>. Note the URL in the
> package definition is not bogus… but using one was already working. :-)
>
> All is saved on SWH, so now all is transparent! From my point of view,
> this is a killer feature for scientific folks. :-)
Yay! Great that you came up with a nice example to test it on!
>> First, fallback is implemented only for fresh clones, not for updates.
>> Thus, if I rerun the first example, having now the clone in
>> ~/.cache/guix/checkouts, with a different commit, I get:
>
> SWH is not a forge but an archive. :-) Therefore, this update case does
> not make sense to me. I mean,
>
> $ git -C ~/.cache/guix/checkouts/6k7wvrcpbdsw3pje5b4squybw3jfn3viyrj7gcl7fipa5yjflaza fetch
> fatal: dépôt 'http://example.org/sdf/' non trouvé
Right, that’s a reasonable limitation.
> Well, maybe this cache could be removed if the commit is not found
> inside this cache and retry to fetch it from SWH. Obviously, the
> downdate case works.
It’s still useful to keep it cached around in case the user is going to use it several times in a row.
> Note that on fresh clone, the error message could be improved:
>
> $ ./pre-inst-env guix build guix --with-git-url=guix=https://example.org --with-commit=guix=ff613c2b68aac539262822490448e637d8f315ba -n
> updating checkout of 'https://example.org'...
> guix build: error: Git failure while fetching https://example.org: unexpected http status code: 404
>
>
> where https://example.org is bogus and
> ff613c2b68aac539262822490448e637d8f315ba is not yet archived on SWH. It
> could be nice to warn in addition to the 404 that it is not found in
> SWH. WDYT?
Agreed; I’ve made this change (actually ‘swh-download’ prints something upon failure since commit 60b42bec8413aa9844e625fb1903257f1bc1e55c, but it looks more like a debugging message.)
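The behaviour discussed here (try upstream, fall back to the archive, and mention both failures when neither works) could look roughly like this. It is a hedged Python sketch, not what the Guix commit does: `fetch_with_fallback`, `clone_upstream` and `fetch_from_swh` are hypothetical names, and the wording of the message is invented for illustration.

```python
from typing import Callable, Optional


def fetch_with_fallback(url: str, commit: str,
                        clone_upstream: Callable[[str, str], str],
                        fetch_from_swh: Callable[[str], Optional[str]]) -> str:
    """Try upstream first, then the archive; report both failures together.

    Both callables are placeholders: CLONE_UPSTREAM returns a checkout path
    or raises OSError; FETCH_FROM_SWH returns a path, or None when the
    revision is not archived.
    """
    try:
        return clone_upstream(url, commit)
    except OSError as upstream_error:
        checkout = fetch_from_swh(commit)
        if checkout is not None:
            return checkout
        # Neither source worked: say so explicitly instead of showing
        # only the upstream error.
        raise RuntimeError(
            f"Git failure while fetching {url}: {upstream_error}; "
            f"revision {commit} was not found on Software Heritage either"
        ) from upstream_error
```

Chaining the exception keeps the original upstream error (here, the 404) visible next to the archive miss.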
> $ guix build guix --with-git-url=guix=https://example.org --with-commit=guix=c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada -n
> updating checkout of 'https://example.org'...
> SWH: found revision c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada with directory at 'https://archive.softwareheritage.org/api/1/directory/ca2e8a7222b4850c7bea935dff86b9c2a905efd6/'
> SWH vault: requested bundle cooking, waiting for completion...
> SWH vault: Processing...
> [...]
>
>
> then after several hours, I get this:
>
> SWH vault: failure: Internal Server Error. This incident will be reported.
> SWH vault: retrying...
> SWH vault: requested bundle cooking, waiting for completion...
> SWH vault: Processing...
>
> and after more than 12h, the status is still: «SWH vault: Processing...»
> and nothing is complete.
Did it eventually succeed? We obviously have no guarantee as to how long it might take to cook a bundle.
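Since cooking time is unbounded, a caller that cannot wait forever needs its own deadline. The following Python sketch shows one way to poll with a time limit; it is not how (guix swh) behaves, and `wait_for_bundle`, `get_status` and the status strings "done" and "failed" are assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_bundle(get_status: Callable[[], str],
                    timeout: float,
                    interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll GET_STATUS until it reports 'done'.

    Return True on success, False when the status is 'failed' or when
    another wait of INTERVAL seconds would exceed TIMEOUT.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "done":
            return True
        if status == "failed":
            return False
        if clock() + interval > deadline:
            return False  # give up rather than wait indefinitely
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is only there so the loop can be exercised without real waiting.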
> About this ’keyring’ branch, somehow it could be as a separated repo, so
> why not effectively do it. :-) I mean, get the branch as it is and
> mirror this branch in another Git repo saved on SWH; fallback to it if
> ’keyring’ branch is not there. I do not know… Or simply wait that SWH
> improves their things. :-)
Yeah, they’re planning to support it eventually.
>> *Third, and this answers the asterisk above, we must keep in mind that
>> this is content-addressibility *with SHA1*. Generating a chosen-prefix
>> collision is becoming affordable³, so users absolutely need an additional
>> mechanism to authenticate code they fetched.
[...]
> How a chosen-prefix attack could work here? I understand why the second
> preimage attack is an issue. But I miss how the SHA-1 chosen-prefix attack
> could be exploited here to compromise the user, because this hash is provided
> by this very same user.
I think you’re right, it’s rather second-preimage attacks that would be a serious problem. My point is: as time passes, assuming that a SHA1 resolves to a single revision on SWH is becoming more and more questionable.
>> swh: Support downloads of bare Git repositories.
>> git: 'update-cached-checkout' can fall back to SWH when cloning.
>> git: 'reference-available?' recognizes 'tag-or-commit'.
I’ve pushed this after adding the warning as you suggested:
dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'.
05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning.
6ec81c31c0 * swh: Support downloads of bare Git repositories.
Thanks a lot for reviewing and testing on real-world examples!
Ludo’.
Closed
zimoun wrote on 20 Sep 11:27 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44187-done@debbugs.gnu.org)
CAJ3okZ2XNXecc-Q0xCGcmJk99kJvk0coHSp+zpUwRGwbJSdOhg@mail.gmail.com
Hi,
On Sat, 18 Sept 2021 at 23:10, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
>
> > and after more than 12h, the status is still: «SWH vault: Processing...»
> > and nothing is complete.
>
> Did it eventually succeed? We obviously have no guarantee as to how
> long it might take to cook a bundle.
No, I stopped. And I reported it on #swh-devel. It might be something wrong on their side.
Yeah, cooking a bundle can take long... especially with a large repo such as Guix (lots of commits and a couple of files).
I think it is OK to leave the code as it is now.

> >> *Third, and this answers the asterisk above, we must keep in mind that
> >> this is content-addressibility *with SHA1*. Generating a chosen-prefix
> >> collision is becoming affordable³, so users absolutely need an additional
> >> mechanism to authenticate code they fetched.
>
> [...]
>
> > How a chosen-prefix attack could work here? I understand why the second
> > preimage attack is an issue. But I miss how the SHA-1 chosen-prefix attack
> > could be exploited here to compromise the user, because this hash is provided
> > by this very same user.
>
> I think you’re right, it’s rather second-preimage attacks that would be
> a serious problem. My point is: as time passes, assuming that a SHA1
> resolves to a single revision on SWH is becoming more and more
> questionable.
Well, SHA-1 has 2^160 (~10^48.2) possible values, to be compared with 10^50, the estimated number of atoms in Earth. Speaking about content-addressability, SHA-1 seems fine. However, for security, yeah, time flies. :-)
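The order of magnitude can be checked with a few lines of Python. The snippet also applies the standard birthday approximation to estimate accidental collisions; the figure of 10^10 archived objects is purely hypothetical, and the estimate says nothing about deliberate attacks, which is the security side of the argument.

```python
import math

HASH_BITS = 160  # SHA-1 digest size in bits

# Size of the SHA-1 value space expressed as a power of ten: log10(2^160).
log10_space = HASH_BITS * math.log10(2)


def expected_colliding_pairs(n_objects: int, bits: int = HASH_BITS) -> float:
    """Birthday approximation: n(n-1)/2 pairs, each colliding with
    probability 2^-bits, assuming uniformly random hash values."""
    return n_objects * (n_objects - 1) / 2 / 2 ** bits


print(f"2^160 ~ 10^{log10_space:.2f}")    # prints: 2^160 ~ 10^48.16
print(expected_colliding_pairs(10 ** 10))  # hypothetical object count
```

Under that assumption the expected number of accidental collisions stays vanishingly small, which matches the point that SHA-1 is fine for addressing content but not for authenticating it.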

> >> swh: Support downloads of bare Git repositories.
> >> git: 'update-cached-checkout' can fall back to SWH when cloning.
> >> git: 'reference-available?' recognizes 'tag-or-commit'.
>
> I’ve pushed this after adding the warning as you suggested:
>
> dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'.
> 05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning.
> 6ec81c31c0 * swh: Support downloads of bare Git repositories.
Cool! It would deserve a --news entry. ;-)
Cheers,
simon
Closed
Ludovic Courtès wrote 5 days ago
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 44187-done@debbugs.gnu.org)
87fstxhqq3.fsf@gnu.org
Hi,
zimoun <zimon.toutoune@gmail.com> skribis:
> On Sat, 18 Sept 2021 at 23:10, Ludovic Courtès <ludo@gnu.org> wrote:
[...]
>> > How a chosen-prefix attack could work here? I understand why the second
>> > preimage attack is an issue. But I miss how the SHA-1 chosen-prefix attack
>> > could be exploited here to compromise the user, because this hash is provided
>> > by this very same user.
>>
>> I think you’re right, it’s rather second-preimage attacks that would be
>> a serious problem. My point is: as time passes, assuming that a SHA1
>> resolves to a single revision on SWH is becoming more and more
>> questionable.
>
> Well, SHA-1 is 2^160 (~10^48.2) and compared to 10^50 which is the
> estimated number of atoms in Earth. Speaking about
> content-addressability, SHA-1 seems fine. However, for security, yeah
> time flies. :-)
True!
>> >> swh: Support downloads of bare Git repositories.
>> >> git: 'update-cached-checkout' can fall back to SWH when cloning.
>> >> git: 'reference-available?' recognizes 'tag-or-commit'.
>>
>> I’ve pushed this after adding the warning as you suggested:
>>
>> dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'.
>> 05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning.
>> 6ec81c31c0 * swh: Support downloads of bare Git repositories.
>
> Cool! I would deserve a --news entry. ;-)
That’s a good idea, I’ve added one.
Thanks,
Ludo’.
Closed
To comment on this conversation send email to 44187@debbugs.gnu.org