[PATCH 0/1] guix: Add git-fetch/impure.

Status: Done
Participants: Chris Marusich, Luis Felipe, sirgazil, zimoun
Owner: unassigned
Submitted by: Chris Marusich
Severity: normal

Chris Marusich wrote on 27 Apr 2018 10:15
(address . guix-patches@gnu.org)(name . Chris Marusich)(address . cmmarusich@gmail.com)
20180427081520.28645-1-cmmarusich@gmail.com
Hi Guix!

Sometimes, a Git repository may only be available via an authenticated
SSH connection. Even in the case of repositories that only contain
free software, this situation can arise for administrative or
compliance-related reasons. How can one define a package in such a
situation?

This patch adds a new origin method, git-fetch/impure, which solves
that problem. Specifically, git-fetch/impure fetches the Git repository
outside of a derivation, in the environment of the invoking user, and
then returns a fixed-output derivation whose output is a copy of that
checkout. In particular, this enables SSH to communicate with the
user's SSH agent, which in turn allows Git to fetch the repository over
an authenticated SSH connection. In addition, because the result is a
fixed-output derivation, the output of a successful git-fetch/impure is
guaranteed to be identical to the output of a pure git-fetch for any
given commit.

Here's a simple example:

(define-public guix-over-ssh
  (package
    (inherit guix)
    (name "guix-over-ssh")
    (source
     (origin
       (inherit (package-source guix))
       (method git-fetch/impure)
       (uri
        (git-reference
         (inherit (origin-uri (package-source guix)))
         (url "ssh://marusich@git.sv.gnu.org:/srv/git/guix.git")))))))

In this particular example, my username appears in the package
definition, but there is no reason why that has to be so. In many
systems, it is possible to grant access to multiple users with
different SSH keys under a single shared user name. And in other
systems, an automated build system might need to fetch sources using
its own unique system user name and SSH key.

All in all, I think this is pretty useful. It enables developers to
define packages in environments where authenticated access to Git
repositories is required. Please let me know what you think!

Chris Marusich (1):
guix: Add git-fetch/impure.

doc/guix.texi | 24 +++++++
guix/git-download.scm | 150 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 174 insertions(+)

--
2.17.0
Chris Marusich wrote on 27 Apr 2018 10:26
[PATCH 1/1] guix: Add git-fetch/impure.
(address . 31285@debbugs.gnu.org)(name . Chris Marusich)(address . cmmarusich@gmail.com)
20180427082642.28760-1-cmmarusich@gmail.com
* guix/git-download.scm (clone-to-store, clone-to-store*)
(git-reference->name, git-fetch/impure): New procedures. Export
git-fetch/impure.
* doc/guix.texi (origin Reference): Document it.
---
doc/guix.texi | 24 +++++++
guix/git-download.scm | 150 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 174 insertions(+)

diff --git a/doc/guix.texi b/doc/guix.texi
index 75886e94b..182e15428 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -3553,6 +3553,30 @@ specified in the @code{uri} field as a @code{git-reference} object; a
(url "git://git.debian.org/git/pkg-shadow/shadow")
(commit "v4.1.5.1"))
@end example
+
+@vindex git-fetch/impure
+@item @var{git-fetch/impure} from @code{(guix git-download)}
+This procedure is the same as @code{git-fetch} in spirit; however, it
+explicitly allows impurities from the environment in which it is
+invoked: the @code{ssh} client program currently available via the
+@code{PATH} environment variable, its SSH configuration file (usually
+found at @file{~/.ssh/config}), and any SSH agent that is currently
+running (usually made available via environment variables such as
+@code{SSH_AUTH_SOCK}). Such impurities may seem concerning at first
+blush; however, because this method will fail unless its content hash
+matches the expected value, a successful git-fetch/impure is guaranteed
+to produce the exact same output as a successful git-fetch for the same
+commit.
+
+This procedure is useful if for example you need to fetch a Git
+repository that is only available via an authenticated SSH connection.
+In this case, an example @code{git-reference} might look like this:
+
+@example
+(git-reference
+ (url "ssh://username@@git.sv.gnu.org:/srv/git/guix.git")
+ (commit "486de7377f25438b0f44fd93f97e9ef822d558b8"))
+@end example
@end table
@item @code{sha256}
diff --git a/guix/git-download.scm b/guix/git-download.scm
index 33f102bc6..04c90e448 100644
--- a/guix/git-download.scm
+++ b/guix/git-download.scm
@@ -2,6 +2,7 @@
;;; Copyright © 2014, 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2017 Mathieu Lirzin <mthl@gnu.org>
;;; Copyright © 2017 Christopher Baines <mail@cbaines.net>
+;;; Copyright © 2018 Chris Marusich <cmmarusich@gmail.com>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -24,14 +25,19 @@
#:use-module (guix store)
#:use-module (guix monads)
#:use-module (guix records)
+ #:use-module (guix derivations)
#:use-module (guix packages)
#:use-module (guix modules)
+ #:use-module (guix ui)
+ #:use-module ((guix build git)
+ #:select ((git-fetch . build:git-fetch)))
#:autoload (guix build-system gnu) (standard-packages)
#:use-module (ice-9 match)
#:use-module (ice-9 popen)
#:use-module (ice-9 rdelim)
#:use-module (ice-9 vlist)
#:use-module (srfi srfi-1)
+ #:use-module (srfi srfi-26)
#:export (git-reference
git-reference?
git-reference-url
@@ -39,6 +45,7 @@
git-reference-recursive?
git-fetch
+ git-fetch/impure
git-version
git-file-name
git-predicate))
@@ -140,6 +147,149 @@ HASH-ALGO (a symbol). Use NAME as the file name, or a generic name if #f."
#:recursive? #t
#:guile-for-build guile)))
+(define (clone-to-store store name git-reference hash runtime-dependencies)
+ "Clone a Git repository and add it to the store. STORE is an open
+connection to the store. NAME will be used as the file name. GIT-REFERENCE
+is a <git-reference> describing the Git repository to clone. HASH is the
+recursive SHA256 hash value of the Git repository, as produced by \"guix hash
+--recursive\" after the .git directories have been removed; if a fixed output
+derivation has already added content to the store with this HASH, then this
+procedure returns immediately. RUNTIME-DEPENDENCIES is a list of store paths;
+the \"bin\" directory of the RUNTIME-DEPENDENCIES will be added to the PATH
+environment variable before running the \"git\" program."
+ (define (is-source? name stat)
+ ;; It's source if and only if it isn't a .git directory.
+ (not (and (eq? (stat:type stat) 'directory)
+ (equal? name ".git"))))
+
+ (define (clean staging-directory)
+ (when (file-exists? staging-directory)
+ (info (G_ "Removing staging directory `~a'~%") staging-directory)
+ (delete-file-recursively staging-directory)))
+
+ (define (fetch staging-directory)
+ (info
+ (G_ "Downloading Git repository `~a' to staging directory `~a'~%")
+ (git-reference-url git-reference)
+ staging-directory)
+ (mkdir-p staging-directory)
+ ;; TODO: Make Git print to stderr instead of stdout.
+ (build:git-fetch
+ (git-reference-url git-reference)
+ (git-reference-commit git-reference)
+ staging-directory
+ #:recursive? (git-reference-recursive? git-reference))
+ (info (G_ "Adding `~a' to the store~%") staging-directory)
+ ;; Even when the git fetch was not done recursively, we want to
+ ;; recursively add to the store the results of the git fetch.
+ (add-to-store store name #t "sha256" staging-directory
+ #:select? is-source?))
+
+ ;; To avoid fetching the repository when it has already been added to the
+ ;; store previously, the name passed to fixed-output-path must be the same
+ ;; as the name used when calling gexp->derivation in git-fetch/impure.
+ (let* ((already-fetched? (false-if-exception
+ (valid-path? store (fixed-output-path name hash))))
+ (tmpdir (or (getenv "TMPDIR") "/tmp"))
+ (checkouts-directory (string-append tmpdir "/guix-git-ssh-checkouts"))
+ (staging-directory (string-append checkouts-directory "/" name))
+ (original-path (getenv "PATH")))
+ ;; We might need to clean up before starting. For example, we would need
+ ;; to do that if Guile crashed during a previous fetch.
+ (clean staging-directory)
+ (unless already-fetched?
+ ;; Put our Guix-managed runtime dependencies at the front of the PATH so
+ ;; they will be used in favor of whatever happens to be in the user's
+ ;; environment (except for SSH, of course). Send stdout to a void port
+ ;; to keep set-path-environment-variable from printing a misleading
+ ;; message about PATH's value, since we immediately change it.
+ (parameterize ((current-output-port (%make-void-port "w")))
+ (set-path-environment-variable "PATH" '("bin") runtime-dependencies))
+ (let ((new-path (if original-path
+ (string-append (getenv "PATH") ":" original-path)
+ (getenv "PATH"))))
+ (setenv "PATH" new-path)
+ (info (G_ "Set environment variable PATH to `~a'~%") new-path)
+ (let ((result (fetch staging-directory)))
+ (clean staging-directory)
+ result)))))
+
+(define clone-to-store* (store-lift clone-to-store))
+
+(define (git-reference->name git-reference)
+ (let ((repository-name (basename (git-reference-url git-reference) ".git"))
+ (short-commit (string-take (git-reference-commit git-reference) 9)))
+ (string-append repository-name "-" short-commit "-checkout")))
+
+(define* (git-fetch/impure ref hash-algo hash
+ #:optional name
+ #:key
+ (system (%current-system))
+ (guile (default-guile)))
+ "Return a fixed-output derivation that fetches REF, a <git-reference>
+object. The output is expected to have recursive hash HASH of type
+HASH-ALGO (a symbol). Use NAME as the file name, or a generic name if #f.
+
+This procedure is the same as git-fetch in spirit; however, it explicitly
+allows impurities from the environment in which it is invoked: the \"ssh\"
+client program currently available via the PATH environment variable, its SSH
+configuration file (usually found at ~/.ssh/config), and any SSH agent that is
+currently running (usually made available via environment variables such as
+SSH_AUTH_SOCK). Such impurities may seem concerning at first blush; however,
+because a fixed-output derivation will fail unless its content hash is
+correct, a successful git-fetch/impure is guaranteed to produce the exact same
+output as a successful git-fetch for the same commit.
+
+This procedure is useful if for example you need to fetch a Git repository
+that is only available via an authenticated SSH connection."
+ ;; Do the Git fetch in the host environment so that it has access to the
+ ;; user's SSH agent, SSH config, and other tools. This will only work if we
+ ;; are running in an environment with a properly installed and configured
+ ;; SSH. It is impure because it happens outside of a derivation, but it
+ ;; allows us to fetch a Git repository that is only available over SSH.
+ (mlet* %store-monad
+ ((name -> (or name (git-reference->name ref)))
+ (guile (package->derivation guile system))
+ (git -> `("git" ,(git-package)))
+ ;; When doing 'git clone --recursive', we need sed, grep, etc. to be
+ ;; available so that 'git submodule' works. We do not add an SSH
+ ;; client to the inputs here, since we explicitly want to use the SSH
+ ;; client, SSH agent, and SSH config from the user's environment.
+ (inputs -> `(,git ,@(if (git-reference-recursive? ref)
+ (standard-packages)
+ '())))
+ (input-packages -> (match inputs (((names packages outputs ...) ...)
+ packages)))
+ (input-derivations (sequence %store-monad
+ (map (cut package->derivation <> system)
+ input-packages)))
+ ;; The tools that clone-to-store requires (e.g., Git) must be built
+ ;; before we invoke clone-to-store.
+ (ignored (built-derivations input-derivations))
+ (input-paths -> (map derivation->output-path input-derivations))
+ (checkout (clone-to-store* name ref hash input-paths)))
+ (gexp->derivation
+ ;; To avoid fetching the repository when it's already been added to the
+ ;; store previously, the name used here must be the same as the name used
+ ;; when calling fixed-output-path in clone-to-store.
+ name
+ (with-imported-modules '((guix build utils))
+ #~(begin
+ (use-modules (guix build utils))
+ (copy-recursively #$checkout #$output)))
+ ;; Slashes are not allowed in file names.
+ #:script-name "git-download-ssh"
+ #:system system
+ ;; Fetching a Git repository is usually a network-bound operation, so
+ ;; offloading is unlikely to speed things up.
+ #:local-build? #t
+ #:hash-algo hash-algo
+ #:hash hash
+ ;; Even when the git fetch will not be done recursively, we want to
+ ;; recursively add to the store the results of the git fetch.
+ #:recursive? #t
+ #:guile-for-build guile)))
+
(define (git-version version revision commit)
"Return the version string for packages using git-download."
(string-append version "-" revision "." (string-take commit 7)))
--
2.17.0
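
To illustrate the naming scheme that git-reference->name implements in the patch above, here is a standalone sketch in plain Guile. It takes the URL and commit as strings instead of a <git-reference> record, and the helper name reference->name is made up for this illustration; it is not part of the patch.

```scheme
;; Sketch only: mirrors the string handling of git-reference->name from
;; the patch, but works on plain strings so it runs in a bare Guile REPL.
;; `reference->name' is a hypothetical helper name.
(define (reference->name url commit)
  (let ((repository-name (basename url ".git"))  ; strip directory and ".git"
        (short-commit (string-take commit 9)))   ; first 9 characters of the commit
    (string-append repository-name "-" short-commit "-checkout")))

(display
 (reference->name "ssh://username@git.sv.gnu.org:/srv/git/guix.git"
                  "486de7377f25438b0f44fd93f97e9ef822d558b8"))
(newline)
;; prints: guix-486de7377-checkout
```

The same name has to be used both for fixed-output-path in clone-to-store and for gexp->derivation in git-fetch/impure; otherwise the "already fetched" check would look for a different store path than the one the derivation produces.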
Chris Marusich wrote on 30 Apr 2018 04:49
(address . 31285@debbugs.gnu.org)
87sh7dcsss.fsf@gmail.com
Hi Mark, Ludo, and David,

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,
>
> Chris Marusich <cmmarusich@gmail.com> skribis:
>
>> You've both said that you would prefer not to add git-fetch/impure to
>> Guix. Can you help me to understand why you feel that way? I really
>> think it would be nice if Guix could fetch Git repositories over SSH
>> using public key authentication, so I'm hoping that we can talk about it
>> and figure out an acceptable way to implement it.
>
> One argument against it would be that it encourages people (or at least
> makes it very easy) to write origins that depend on external state, and
> thus may be non-reproducible by others, and that Guix itself should
> provide tools for writing reproducible build definitions.

The impurity bothers me, too. If you don't have the right SSH key
available or your SSH installation isn't configured in just the right
way, then an origin defined using git-fetch/impure won't work.

Could we eliminate the impurity by adding a feature to the guix-daemon
that allows an administrator (i.e., root) to configure an SSH key for
guix-daemon to use when fetching Git repositories over SSH? If it's
possible, I think that would be preferable. What do you think of that
idea?

Also, here's a new version of the patch, which fixes/improves some
random things I noticed.

--
Chris

sirgazil wrote on 18 Apr 2020 17:54
[PATCH 0/1] guix: Add git-fetch/impure.
(name . 31285)(address . 31285@debbugs.gnu.org)
1718dff34ac.cc32d28414025.6904503667509437602@zoho.com
Hi,

I feel the same as Chris. I started doing some packaging this year, and really felt downhearted when I found there was no support for package definitions with SSH authenticated git repositories (for private use, of course).

In my case, I need this for two reasons:

* I want to use Guix channels for experimental packages, prototypes and pre-alpha software that should be available for some people only.
* I want to use Guix channels for production-ready packages of in-house tools that are only useful for private businesses.

In both cases, the channels and software sources would be in Git repositories hosted by third-parties like GitLab, BitBucket, etc., which provide SSH authentication.

There are some comments already about Chris' patch in another bug report (issues.guix.gnu.org/issue/31284). I agree that "git-fetch/impure" must not be used in Guix's official channel(s), but I'd like Guix to include it in its API for use in private channels.

I think having this functionality would make it even easier for GNU Guix to be adopted in mainstream culture.


---
Luis Felipe wrote on 22 Oct 2020 02:44
(name . 31285@debbugs.gnu.org)(address . 31285@debbugs.gnu.org)
W3s1Sh57IvY4C8ohKgv7HotKerWbKjuqVV9ZzG1H_CgookWFqoPB7KqfMjUV8lnG8rHkxCtLEoFL_jrOVQDmDVYlerox39SEzq1iAfCKxUw=@protonmail.com
> Sometimes, a Git repository may only be available via an authenticated SSH connection. Even in the case of repositories that only contain free software, this situation can arise for administrative or compliance-related reasons. How can one define a package in such a situation?


Correct me if I'm wrong, but I think this is possible now. All you have to do is pass a git-checkout record to the package source field instead of an origin (see the (guix git) module). For example:

(source
 (git-checkout
  (url "git@gitlab.com:luis-felipe/guile-lab.git")
  (commit (string-append "v" version))))

I'm using this for my private packages, and it seems to work.
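
For context, a minimal sketch of a complete package definition built around this source field might look like the following. Everything other than the git-checkout form is assumed for illustration: the module name, version, build system, and descriptive fields are hypothetical placeholders. Note that git-checkout carries no content hash, so what gets built depends on whatever the tag points to at fetch time.

```scheme
;; Sketch only.  The module name, version, build system, home page and
;; license below are hypothetical placeholders, not taken from the thread.
(define-module (my-channel packages guile-lab)
  #:use-module (guix packages)
  #:use-module (guix git)                 ; provides the git-checkout record
  #:use-module (guix build-system gnu)
  #:use-module ((guix licenses) #:prefix license:))

(define-public guile-lab
  (package
    (name "guile-lab")
    (version "0.1.0")
    ;; A git-checkout record is used in place of an origin; the clone is
    ;; performed on the client side, in the user's environment.
    (source
     (git-checkout
      (url "git@gitlab.com:luis-felipe/guile-lab.git")
      (commit (string-append "v" version))))
    (build-system gnu-build-system)
    (synopsis "Privately hosted Guile library (placeholder)")
    (description "Placeholder description for a privately hosted package.")
    (home-page "https://example.org/guile-lab")
    (license license:gpl3+)))
```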
zimoun wrote on 1 Dec 2020 19:06
Re: [bug#31285] [PATCH 1/1] guix: Add git-fetch/impure.
(name . Chris Marusich)(address . cmmarusich@gmail.com)
86im9lv1sh.fsf@gmail.com
Hi,

Bug #31285 is mainly about allowing Git over SSH, see:


and I do not know where the discussion below happened…

On Sun, 29 Apr 2018 at 19:49, Chris Marusich <cmmarusich@gmail.com> wrote:
> Hi Mark, Ludo, and David,
>
> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Hello,
>>
>> Chris Marusich <cmmarusich@gmail.com> skribis:
>>
>>> You've both said that you would prefer not to add git-fetch/impure to
>>> Guix. Can you help me to understand why you feel that way? I really
>>> think it would be nice if Guix could fetch Git repositories over SSH
>>> using public key authentication, so I'm hoping that we can talk about it
>>> and figure out an acceptable way to implement it.
>>
>> One argument against it would be that it encourages people (or at least
>> makes it very easy) to write origins that depend on external state, and
>> thus may be non-reproducible by others, and that Guix itself should
>> provide tools for writing reproducible build definitions.
>
> The impurity bothers me, too. If you don't have the right SSH key
> available or your SSH installation isn't configured in just the right
> way, then an origin defined using git-fetch/impure won't work.
>
> Could we eliminate the impurity by adding a feature to the guix-daemon
> that allows an administrator (i.e., root) to configure an SSH key for
> guix-daemon to use when fetching Git repositories over SSH? If it's
> possible, I think that would be preferable. What do you think of that
> idea?
>
> Also, here's a new version of the patch, which fixes/improves some
> random things I noticed.

…and the question is: is it still relevant? I am not sure whether the
use case behind the initial motivation is already covered by the current
'git-fetch'. If it is not, what is the status of this patch: rejected
(and if so, for which reason) or merged?


Thanks,
simon
Chris Marusich wrote on 14 Jul 2021 11:23
Re: bug#31285: [PATCH 0/1] guix: Add git-fetch/impure.
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)(address . 31285-done@debbugs.gnu.org)
87sg0hz1sv.fsf_-_@gmail.com
Luis Felipe <luis.felipe.la@protonmail.com> writes:

>> Sometimes, a Git repository may only be available via an authenticated SSH connection. Even in the case of repositories that only contain free software, this situation can arise for administrative or compliance-related reasons. How can one define a package in such a situation?
>
>
> Correct me if I'm wrong, but I think this is possible now. All you have to do is pass a git-checkout record to the package source field instead of an origin (see the (guix git) module). For example:
>
> (source
>  (git-checkout
>   (url "git@gitlab.com:luis-felipe/guile-lab.git")
>   (commit (string-append "v" version))))
>
> I'm using this for my private packages, and it seems to work.

Yes, this does work. Combined with the fact that it is now possible to
"guix pull" channels over SSH, there is no need for this patch any more.
The "git-checkout" gexp-compiler basically does the same thing that I
was trying to do (it is still "impure" in that the fetching happens
outside the store), but it does it more elegantly.
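
As a sketch of the channel side of this, a channel fetched over SSH can be declared in the usual channels file. The channel name and URL below are hypothetical, and the sketch assumes that SSH credentials for the host are available to the user running "guix pull" (for example through a running SSH agent).

```scheme
;; ~/.config/guix/channels.scm
;; Sketch only: the channel name and URL are hypothetical placeholders.
(cons (channel
       (name 'my-private-channel)
       (url "ssh://git@example.org/my-team/guix-channel.git")
       (branch "main"))
      %default-channels)
```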

I'm closing this report.

--
Chris
