From: Maxim Cournoyer
To: JOULAUD François
Cc: Ludovic Courtès, "44178@debbugs.gnu.org" <44178@debbugs.gnu.org>, Katherine Cox-Buday
Subject: [PATCH v4] Re: bug#44178: Add a Go Module Importer
References: <87sga5kpdp.fsf@gmail.com> <20210219161737.4l266imcd24gqxwn@fjo-extia-HPdeb.example.avalenn.eu> <871rcxte52.fsf_-_@gnu.org>
Date: Thu, 04 Mar 2021 00:40:36 -0500
In-Reply-To: <871rcxte52.fsf_-_@gnu.org> ("Ludovic Courtès"'s message of "Tue, 02 Mar 2021 22:54:49 +0100")
Message-ID: <8735xbqxwr.fsf_-_@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8

Hi François, Ludovic, et al!

Sorry for butting into the review, but I have been experimenting with
this importer, and it looks promising; thanks to everyone involved!

I made a couple of changes, mostly with regard to integrating support for
the synopsis, description and license fields of the package, plus other
cosmetic changes.  I thought I should share it quickly so that it can be
used as the basis for a v5, so here's the patch, attached.
I hope you don't mind!

I tested it with:

$ ./pre-inst-env guix environment guix
$ ./pre-inst-env guix import go -r github.com/dgraph-io/badger/v2

--8<---------------cut here---------------start------------->8---
[...]
(define-public go-github-com-dgraph-io-badger-v2
  (package
    (name "go-github-com-dgraph-io-badger-v2")
    (version "2.2007.2")
    (source
      (origin
        (method git-fetch)
        (uri (git-reference
               (url "https://github.com/dgraph-io/badger.git")
               (commit (go-version->git-ref version))))
        (file-name (git-file-name name version))
        (sha256
          (base32
            "0000000000000000000000000000000000000000000000000000"))))
    (build-system go-build-system)
    (arguments
      '(#:import-path "github.com/dgraph-io/badger"))
    (inputs
      `(("go-gopkg-in-check-v1" ,go-gopkg-in-check-v1)
        ("go-golang-org-x-sys" ,go-golang-org-x-sys)
        ("go-golang-org-x-net" ,go-golang-org-x-net)
        ("go-github-com-stretchr-testify"
         ,go-github-com-stretchr-testify)
        ("go-github-com-spf13-cobra" ,go-github-com-spf13-cobra)
        ("go-github-com-spaolacci-murmur3"
         ,go-github-com-spaolacci-murmur3)
        ("go-github-com-pkg-errors" ,go-github-com-pkg-errors)
        ("go-github-com-kr-pretty" ,go-github-com-kr-pretty)
        ("go-github-com-golang-snappy" ,go-github-com-golang-snappy)
        ("go-github-com-golang-protobuf"
         ,go-github-com-golang-protobuf)
        ("go-github-com-dustin-go-humanize"
         ,go-github-com-dustin-go-humanize)
        ("go-github-com-dgryski-go-farm"
         ,go-github-com-dgryski-go-farm)
        ("go-github-com-dgraph-io-ristretto"
         ,go-github-com-dgraph-io-ristretto)
        ("go-github-com-cespare-xxhash" ,go-github-com-cespare-xxhash)
        ("go-github-com-datadog-zstd" ,go-github-com-datadog-zstd)))
    (home-page "https://github.com/dgraph-io/badger")
    (synopsis "BadgerDB")
    (description
      "Package badger implements an embeddable, simple and fast key-value database, written in pure Go. It is designed to be highly performant for both reads and writes simultaneously. Badger uses Multi-Version Concurrency Control (MVCC), and supports transactions. It runs transactions concurrently, with serializable snapshot isolation guarantees.")
    (license (license:asl2.0))))
--8<---------------cut here---------------end--------------->8---

Attached is the fixup commit, which should apply cleanly on top of your v3
patch on master, along with a (now required) commit to use a temporary
fork of guile-lib:

--=-=-=
Content-Type: text/x-patch; charset=utf-8
Content-Disposition: attachment;
 filename=0001-gnu-guile-lib-Update-to-a-temporary-fork.patch

From 16c07537375ab5d18ee76a5fdfb2b8ed7192b395 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer
Date: Wed, 3 Mar 2021 16:20:22 -0500
Subject: [PATCH] gnu: guile-lib: Update to a temporary fork.

This fork adds support for enabling stricter/more correct parsing of HTML
in htmlprag, which is used by the go importer.

* gnu/packages/guile-xyz.scm (guile-lib)[source]: Fetch from git.  Remove
snippet and modules field.
[native-inputs]: Add autoconf, automake, gettext and texinfo.
---
 gnu/packages/guile-xyz.scm | 96 ++++++++++++++++++--------------------
 1 file changed, 46 insertions(+), 50 deletions(-)

diff --git a/gnu/packages/guile-xyz.scm b/gnu/packages/guile-xyz.scm
index ce5aad8ec7..c14193921b 100644
--- a/gnu/packages/guile-xyz.scm
+++ b/gnu/packages/guile-xyz.scm
@@ -16,7 +16,7 @@
 ;;; Copyright © 2017 Theodoros Foradis
 ;;; Copyright © 2017 Nikita
 ;;; Copyright © 2017, 2018 Tobias Geerinckx-Rice
-;;; Copyright © 2018 Maxim Cournoyer
+;;; Copyright © 2018, 2021 Maxim Cournoyer
 ;;; Copyright © 2018, 2019, 2020 Arun Isaac
 ;;; Copyright © 2018 Pierre-Antoine Rouby
 ;;; Copyright © 2018 Eric Bavier
@@ -2167,59 +2167,55 @@ library.")
                      ("guile" ,guile-3.0)))))
 
 (define-public guile-lib
-  (package
-    (name "guile-lib")
-    (version "0.2.6.1")
-    (source (origin
-              (method url-fetch)
-              (uri (string-append "mirror://savannah/guile-lib/guile-lib-"
-                                  version ".tar.gz"))
-              (sha256
-               (base32
-                "0aizxdif5dpch9cvs8zz5g8ds5s4xhfnwza2il5ji7fv2h7ks7bd"))
-              (modules '((guix build utils)))
-              (snippet
-               '(begin
-                  ;; Work around miscompilation on Guile 3.0.0 at -O2:
-                  ;; .
-                  (substitute* "src/md5.scm"
-                    (("\\(define f-ash ash\\)")
-                     "(define f-ash (@ (guile) ash))\n")
-                    (("\\(define f-add \\+\\)")
-                     "(define f-add (@ (guile) +))\n"))
-                  #t))))
-    (build-system gnu-build-system)
-    (arguments
-     '(#:make-flags
-       '("GUILE_AUTO_COMPILE=0")     ; to prevent guild errors
-       #:phases
-       (modify-phases %standard-phases
-         (add-before 'configure 'patch-module-dir
-           (lambda _
-             (substitute* "src/Makefile.in"
-               (("^moddir = ([[:graph:]]+)")
-                "moddir = $(datadir)/guile/site/@GUILE_EFFECTIVE_VERSION@\n")
-               (("^godir = ([[:graph:]]+)")
-                "godir = \
-$(libdir)/guile/@GUILE_EFFECTIVE_VERSION@/site-ccache\n"))
-             #t)))))
-    (native-inputs
-     `(("guile" ,guile-3.0)
-       ("pkg-config" ,pkg-config)))
-    (inputs
-     `(("guile" ,guile-3.0)))
-    (home-page "https://www.nongnu.org/guile-lib/")
-    (synopsis "Collection of useful Guile Scheme modules")
-    (description
-     "Guile-Lib is intended as an accumulation place for pure-scheme Guile
+  (let ((revision "1")
+        (commit "c059f13e332347201eaa4a32ef27c53d064f2d17"))
+    (package
+      (name "guile-lib")
+      (version (git-version "0.2.6.1" revision commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://notabug.org/apteryx/guile-lib/")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1dl2f53p737n637n2805slci5i32s6cy0bq1j0xkmzd5piymg4f8"))))
+      (build-system gnu-build-system)
+      (arguments
+       '(#:make-flags
+         '("GUILE_AUTO_COMPILE=0")   ;to prevent guild errors
+         #:phases
+         (modify-phases %standard-phases
+           (add-before 'configure 'patch-module-dir
+             (lambda _
+               (substitute* "src/Makefile.in"
+                 (("^moddir = ([[:graph:]]+)")
+                  "moddir = $(datadir)/guile/site/@GUILE_EFFECTIVE_VERSION@\n")
+                 (("^godir = ([[:graph:]]+)")
+                  "godir = \
+$(libdir)/guile/@GUILE_EFFECTIVE_VERSION@/site-ccache\n")))))))
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)
+         ("gettext" ,gettext-minimal)
+         ("guile" ,guile-3.0)
+         ("pkg-config" ,pkg-config)
+         ("texinfo" ,texinfo)))
+      (inputs
+       `(("guile" ,guile-3.0)))
+      (home-page "https://www.nongnu.org/guile-lib/")
+      (synopsis "Collection of useful Guile Scheme modules")
+      (description
+       "Guile-Lib is intended as an accumulation place for pure-scheme Guile
 modules, allowing for people to cooperate integrating their generic Guile
 modules into a coherent library.  Think \"a down-scaled, limited-scope CPAN
 for Guile\".")
 
-    ;; The whole is under GPLv3+, but some modules are under laxer
-    ;; distribution terms such as LGPL and public domain.  See `COPYING' for
-    ;; details.
-    (license license:gpl3+)))
+      ;; The whole is under GPLv3+, but some modules are under laxer
+      ;; distribution terms such as LGPL and public domain.  See `COPYING' for
+      ;; details.
+      (license license:gpl3+))))
 
 (define-public guile2.0-lib
   (package
-- 
2.30.1

--=-=-=
Content-Type: text/x-patch; charset=utf-8
Content-Disposition: attachment;
 filename=0002-fixup-Create-importer-for-Go-modules.patch

From f3a6130577252e3d079a6209ec2e21bf5d8baf25 Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer
Date: Wed, 3 Mar 2021 16:45:11 -0500
Subject: [PATCH] fixup! Create importer for Go modules

---
 guix/build-system/go.scm |  34 ++--
 guix/import/go.scm       | 420 ++++++++++++++++++++++-----------------
 2 files changed, 257 insertions(+), 197 deletions(-)

diff --git a/guix/build-system/go.scm b/guix/build-system/go.scm
index 594e0cb4f3..d07c703a6a 100644
--- a/guix/build-system/go.scm
+++ b/guix/build-system/go.scm
@@ -34,30 +34,28 @@
             go-version->git-ref))
 
 (define (go-version->git-ref version)
-  "GO-VERSION->GIT-REF parse pseudo-versions and extract the commit
-  hash from it, defaulting to full VERSION if we don't recognise a
-  pseudo-version pattern."
-  ;; A module version like v1.2.3 is introduced by tagging a revision in
-  ;; the underlying source repository.  Untagged revisions can be referred
-  ;; to using a "pseudo-version" like v0.0.0-yyyymmddhhmmss-abcdefabcdef,
-  ;; where the time is the commit time in UTC and the final suffix is the
-  ;; prefix of the commit hash.
-  ;; cf. https://golang.org/cmd/go/#hdr-Pseudo_versions
+  "GO-VERSION->GIT-REF parse pseudo-versions and extract the commit hash from
+it, defaulting to full VERSION if a pseudo-version pattern is not recognized."
+  ;; A module version like v1.2.3 is introduced by tagging a revision in the
+  ;; underlying source repository.  Untagged revisions can be referred to
+  ;; using a "pseudo-version" like v0.0.0-yyyymmddhhmmss-abcdefabcdef, where
+  ;; the time is the commit time in UTC and the final suffix is the prefix of
+  ;; the commit hash (see: https://golang.org/cmd/go/#hdr-Pseudo_versions).
   (let* ((version
-          ;; if a source code repository has a v2.0.0 or later tag for
-          ;; a file tree with no go.mod, the version is considered to be
-          ;; part of the v1 module's available versions and is given an
-          ;; +incompatible suffix
-          ;; https://golang.org/cmd/go/#hdr-Module_compatibility_and_semantic_versioning
+          ;; If a source code repository has a v2.0.0 or later tag for a file
+          ;; tree with no go.mod, the version is considered to be part of the
+          ;; v1 module's available versions and is given an +incompatible
+          ;; suffix
+          ;; (see: https://golang.org/cmd/go/#hdr-Module_compatibility_and_semantic_versioning).
           (if (string-suffix? "+incompatible" version)
               (string-drop-right version 13)
               version))
          (re (string-concatenate
               (list
-               "(v?[0-9]\\.[0-9]\\.[0-9])" ; "v" prefix can be omitted in version prefix
-               "(-|-pre\\.0\\.|-0\\.)" ; separator
-               "([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])-" ; timestamp
-               "([0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f])"))) ; commit hash
+               "(v?[0-9]\\.[0-9]\\.[0-9])" ;"v" prefix can be omitted in version prefix
+               "(-|-pre\\.0\\.|-0\\.)"     ;separator
+               "([0-9]{14})-"              ;timestamp
+               "([0-9A-Fa-f]{12})")))      ;commit hash
          (match (string-match re version)))
     (if match
         (match:substring match 4)
diff --git a/guix/import/go.scm b/guix/import/go.scm
index fead355bd2..7bc97c5c92 100644
--- a/guix/import/go.scm
+++ b/guix/import/go.scm
@@ -2,6 +2,7 @@
 ;;; Copyright © 2020 Katherine Cox-Buday
 ;;; Copyright © 2020 Helio Machado <0x2b3bfa0+guix@googlemail.com>
 ;;; Copyright © 2021 François Joulaud
+;;; Copyright © 2021 Maxim Cournoyer
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -18,51 +19,37 @@
 ;;; You should have received a copy of the GNU General Public License
 ;;; along with GNU Guix.  If not, see .
 
-;;; (guix import golang) wants to make easier to create Guix package
-;;; declaration for Go modules.
+;;; (guix import golang) attempts to make it easier to create Guix package
+;;; declarations for Go modules.
 ;;;
-;;; Modules in Go are "collection of related Go packages" which are
-;;; "the unit of source code interchange and versioning".
-;;; Modules are generally hosted in a repository.
+;;; Modules in Go are a "collection of related Go packages" which are "the
+;;; unit of source code interchange and versioning".  Modules are generally
+;;; hosted in a repository.
 ;;;
-;;; At this point it should handle correctly modules which
-;;; have only Go dependencies and are accessible from proxy.golang.org
-;;; (or configured GOPROXY).
+;;; At this point it should handle correctly modules which have only Go
+;;; dependencies and are accessible from proxy.golang.org (or configured via
+;;; GOPROXY).
 ;;;
 ;;; We want it to work more or less this way:
 ;;; - get latest version for the module from GOPROXY
 ;;; - infer VCS root repo from which we will check-out source by
 ;;;   + recognising known patterns (like github.com)
-;;;   + or (TODO) recognising .vcs suffix
-;;;   + or parsing meta tag in html served at the URL
+;;;   + or recognizing .vcs suffix
+;;;   + or parsing meta tag in HTML served at the URL
 ;;;   + or (TODO) if nothing else works by using zip file served by GOPROXY
 ;;; - get go.mod from GOPROXY (which is able to synthetize one if needed)
 ;;; - extract list of dependencies from this go.mod
 ;;;
-;;; We translate Go module paths to a Guix package name under the
+;;; The Go module paths are translated to a Guix package name under the
 ;;; assumption that there will be no collision.
 
 ;;; TODO list
 ;;; - get correct hash in vcs->origin
 ;;; - print partial result during recursive imports (need to catch
 ;;;   exceptions)
-;;; - infer repo from module path with VCS qualifier
-;;;   (e.g. site.example/my/path/to/repo.git/and/subdir/module)
-;;; - don't print fetch messages to stdout
-;;; - pre-fill synopsis, description and license
 
 (define-module (guix import go)
-  #:use-module (ice-9 match)
-  #:use-module (ice-9 rdelim)
-  #:use-module (ice-9 receive)
-  #:use-module (ice-9 regex)
   #:use-module (guix build-system go)
-  #:use-module (htmlprag)
-  #:use-module (sxml xpath)
-  #:use-module (srfi srfi-1)
-  #:use-module (srfi srfi-9)
-  #:use-module (srfi srfi-11)
-  #:use-module (json)
   #:use-module ((guix download) #:prefix download:)
   #:use-module (guix git)
   #:use-module (guix import utils)
@@ -75,49 +62,134 @@
   #:use-module (guix base32)
   #:use-module (guix memoization)
   #:use-module ((guix build download) #:prefix build-download:)
+  #:use-module (htmlprag)
+  #:use-module (ice-9 match)
+  #:use-module (ice-9 rdelim)
+  #:use-module (ice-9 receive)
+  #:use-module (ice-9 regex)
+  #:use-module (json)
+  #:use-module (rnrs io ports)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-9)
+  #:use-module (srfi srfi-11)
+  #:use-module (srfi srfi-26)
+  #:use-module (sxml xpath)
+  #:use-module (web client)
+  #:use-module (web response)
   #:use-module (web uri)
 
   #:export (go-module->guix-package
-            go-module-recursive-import
-            infer-module-root-repo))
+            go-module-recursive-import))
 
+;;; Parameterize htmlprag to parse valid HTML more reliably.
+(%strict-tokenizer? #t)
 
 (define (go-path-escape path)
-  "Escape a module path by replacing every uppercase letter with an exclamation
-mark followed with its lowercase equivalent, as per the module Escaped Paths
-specification. https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths"
+  "Escape a module path by replacing every uppercase letter with an
+exclamation mark followed with its lowercase equivalent, as per the module
+Escaped Paths specification (see:
+https://godoc.org/golang.org/x/mod/module#hdr-Escaped_Paths)."
   (define (escape occurrence)
     (string-append "!" (string-downcase (match:substring occurrence))))
   (regexp-substitute/global #f "[A-Z]" path 'pre escape 'post))
 
-
 (define (go-module-latest-version goproxy-url module-path)
-  "Fetches the version number of the latest version for MODULE-PATH from the
+  "Fetch the version number of the latest version for MODULE-PATH from the
 given GOPROXY-URL server."
-  (assoc-ref
-   (json-fetch (format #f "~a/~a/@latest" goproxy-url
-                       (go-path-escape module-path)))
-   "Version"))
+  (assoc-ref (json-fetch (format #f "~a/~a/@latest" goproxy-url
+                                 (go-path-escape module-path)))
+             "Version"))
+
+(define (go-package-licenses name)
+  "Retrieve the list of licenses that apply to NAME, a Go package or module
+name (e.g. \"github.com/golang/protobuf/proto\").  The data is scraped from
+the https://pkg.go.dev/ web site."
+  (let*-values (((url) (string-append "https://pkg.go.dev/" name
+                                      "?tab=licenses"))
+                ((response body) (http-get url))
+                ;; Extract the text contained in a h2 child node of any
+                ;; element marked with a "License" class attribute.
+                ((select) (sxpath `(// (* (@ (equal? (class "License"))))
+                                       h2 // *text*))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           (licenses licenses)))))
+
+(define (go-package-description name)
+  "Retrieve a short description for NAME, a Go package name,
+e.g. \"google.golang.org/protobuf/proto\".  The data is scraped from the
+https://pkg.go.dev/ web site."
+  (let*-values (((url) (string-append "https://pkg.go.dev/" name))
+                ((response body) (http-get url))
+                ;; Select the first paragraph of the section marked with a
+                ;; "Documentation-overview" class attribute.
+                ((select) (sxpath
+                           `(// (section
+                                 (@ (equal? (class "Documentation-overview"))))
+                                (p 1)))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           (((p . strings))
+            ;; The paragraph text is returned as a list of strings embedding
+            ;; newline characters.  Join them and strip the newline
+            ;; characters.
+            (string-delete #\newline (string-join strings)))))))
+
+(define (go-package-synopsis module-name)
+  "Retrieve a short synopsis for a Go module named MODULE-NAME,
+e.g. \"google.golang.org/protobuf\".  The data is scraped from
+the https://pkg.go.dev/ web site."
+  ;; Note: Only the *module* (rather than package) page has the README title
+  ;; used as a synopsis on the https://pkg.go.dev web site.
+  (let*-values (((url) (string-append "https://pkg.go.dev/" module-name))
+                ((response body) (http-get url))
+                ;; Extract the text of the h3 nodes found under a div element
+                ;; marked with a "UnitReadme-content" class attribute.
+                ((select) (sxpath
+                           `(// (div (@ (equal? (class "UnitReadme-content"))))
+                                // h3 *text*))))
+    (and (eq? (response-code response) 200)
+         (match (select (html->sxml body))
+           (() #f)                      ;nothing selected
+           ((title more ...)            ;title is the first string of the list
+            (string-trim-both title))))))
 
-(define go-module-latest-version* (memoize go-module-latest-version))
+(define (list->licenses licenses)
+  "Given a list of LICENSES mostly following the SPDX conventions, return the
+corresponding Guix license or 'unknown-license!"
+  (filter-map (lambda (license)
+                (and (not (string-null? license))
+                     (not (any (cut string=? <> license)
+                               '("AND" "OR" "WITH")))
+                     ;; Adjust the license names scraped from
+                     ;; https://pkg.go.dev to an equivalent SPDX identifier,
+                     ;; if they differ (see: https://github.com/golang/pkgsite
+                     ;; /internal/licenses/licenses.go#L174).
+                     (or (spdx-string->license
+                          (match license
+                            ("BlueOak-1.0" "BlueOak-1.0.0")
+                            ("BSD-0-Clause" "0BSD")
+                            ("BSD-2-Clause" "BSD-2-Clause-FreeBSD")
+                            ("GPL2" "GPL-2.0")
+                            ("GPL3" "GPL-3.0")
+                            ("NIST" "NIST-PD")
+                            (_ license)))
+                         'unknown-license!)))
+              licenses))
 
-(define (fetch-go.mod goproxy-url module-path version file)
-  "Fetches go.mod from the given GOPROXY-URL server for the given MODULE-PATH
-and VERSION."
+(define (fetch-go.mod goproxy-url module-path version)
+  "Fetch go.mod from the given GOPROXY-URL server for the given MODULE-PATH
+and VERSION and return an input port."
   (let ((url (format #f "~a/~a/@v/~a.mod" goproxy-url
                      (go-path-escape module-path)
                      (go-path-escape version))))
-    (parameterize ((current-output-port (current-error-port)))
-      (build-download:url-fetch url
-                                file
-                                #:print-build-trace? #f))))
+    (build-download:http-fetch (string->uri url))))
 
-(define (parse-go.mod go.mod-path)
-  (parse-go.mod-port (open-input-file go.mod-path)))
-
-(define (parse-go.mod-port go.mod-port)
-  "PARSE-GO.MOD takes a filename in GO.MOD-PATH and extract a list of
-requirements from it."
+(define (parse-go.mod port)
+  "Parse the go.mod file accessible via the input PORT, returning a list of
+requirements."
   ;; We parse only a subset of https://golang.org/ref/mod#go-mod-file-grammar
   ;; which we think necessary for our use case.
   (define (toplevel results)
@@ -147,6 +219,7 @@ requirements from it."
          (#t
           ;; unrecognised line, ignore silently
           (toplevel results)))))
+
   (define (in-require results)
     (let ((line (read-line)))
       (cond
@@ -158,6 +231,7 @@ requirements from it."
         (toplevel results))
        (#t
         (in-require (require-directive results line))))))
+
   (define (in-replace results)
     (let ((line (read-line)))
       (cond
@@ -169,6 +243,7 @@ requirements from it."
         (toplevel results))
        (#t
         (in-replace (replace-directive results line))))))
+
   (define (replace-directive results line)
     "Extract replaced modules and new requirements from replace directive
     in LINE and add to RESULTS."
@@ -191,6 +266,7 @@ requirements from it."
                  requirements
                  (acons new-module-path new-version requirements))))
       (cons new-requirements new-replaced)))
+
   (define (require-directive results line)
     "Extract requirement from LINE and add it to RESULTS."
     (let* ((requirements (car results))
@@ -209,7 +285,8 @@ requirements from it."
            (module-path (string-trim-both module-path #\"))
            (version (match:substring match 2)))
       (cons (acons module-path version requirements) replaced)))
-  (with-input-from-port go.mod-port
+
+  (with-input-from-port port
     (lambda ()
       (let* ((results (toplevel '(() . ())))
              (requirements (car results))
@@ -221,120 +298,102 @@ requirements from it."
          requirements
          replaced)))))
 
-(define (infer-module-root-repo module-path)
-  "Go modules can be defined at any level of a repository's tree, but querying
-for the meta tag usually can only be done at the webpage at the root of the
-repository.  Therefore, it is sometimes necessary to try and derive a module's
-root path from its path.  For a set of well-known forges, the pattern of what
-consists of a module's root page is known before hand."
+(define (module-path->repository-root module-path)
+  "Infer the repository root from a module path.  Go modules can be
+defined at any level of a repository tree, but querying for the meta tag
+usually can only be done from the web page at the root of the repository,
+hence the need to derive this information."
   ;; See the following URL for the official Go equivalent:
   ;; https://github.com/golang/go/blob/846dce9d05f19a1f53465e62a304dea21b99f910/src/cmd/go/internal/vcs/vcs.go#L1026-L1087
-  ;;
-  ;; TODO: handle module path with VCS qualifier as described in
-  ;; https://golang.org/ref/mod#vcs-find and
-  ;; https://golang.org/cmd/go/#hdr-Remote_import_paths
+
   (define-record-type <vcs>
     (make-vcs url-prefix root-regex type)
     vcs?
     (url-prefix vcs-url-prefix)
     (root-regex vcs-root-regex)
     (type vcs-type))
-  (let* ((known-vcs
-          (list
-           (make-vcs
-            "github.com"
-            "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "bitbucket.org"
-            "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
-            'unknown)
-           (make-vcs
-            "hub.jazz.net/git/"
-            "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "git.apache.org"
-            "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
-            'git)
-           (make-vcs
-            "git.openstack.org"
-            "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?(/[A-Za-z0-9_.\\-]+)*$"
-            'git)))
-         (vcs (find (lambda (vcs) (string-prefix? (vcs-url-prefix vcs) module-path))
-                    known-vcs)))
-    (if vcs
-        (match:substring (string-match (vcs-root-regex vcs) module-path) 1)
-        module-path)))
+
+  (define known-vcs
+    (list
+     (make-vcs
+      "github.com"
+      "^(github\\.com/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "bitbucket.org"
+      "^(bitbucket\\.org/([A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+))(/[A-Za-z0-9_.\\-]+)*$"
+      'unknown)
+     (make-vcs
+      "hub.jazz.net/git/"
+      "^(hub\\.jazz\\.net/git/[a-z0-9]+/[A-Za-z0-9_.\\-]+)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "git.apache.org"
+      "^(git\\.apache\\.org/[a-z0-9_.\\-]+\\.git)(/[A-Za-z0-9_.\\-]+)*$"
+      'git)
+     (make-vcs
+      "git.openstack.org"
+      "^(git\\.openstack\\.org/[A-Za-z0-9_.\\-]+/[A-Za-z0-9_.\\-]+)(\\.git)?\
+(/[A-Za-z0-9_.\\-]+)*$"
+      'git)))
+
+  ;; For reference, see: https://golang.org/ref/mod#vcs-find.
+  (define vcs-qualifiers '(".bzr" ".fossil" ".git" ".hg" ".svn"))
+
+  (define (vcs-qualified-module-path->root-repo-url module-path)
+    (let* ((vcs-qualifiers-group (string-join vcs-qualifiers "|"))
+           (pattern (format #f "^(.*(~a))(/|$)" vcs-qualifiers-group))
+           (m (string-match pattern module-path)))
+      (and=> m (cut match:substring <> 1))))
+
+  (or (and=> (find (lambda (vcs)
+                     (string-prefix? (vcs-url-prefix vcs) module-path))
+                   known-vcs)
+             (lambda (vcs)
+               (match:substring (string-match (vcs-root-regex vcs)
+                                              module-path) 1)))
+      (vcs-qualified-module-path->root-repo-url module-path)
+      module-path))
 
 (define (go-module->guix-package-name module-path)
   "Converts a module's path to the canonical Guix format for Go packages."
-  (string-downcase
-   (string-append "go-"
-                  (string-replace-substring
-                   (string-replace-substring
-                    module-path
-                    "." "-")
-                   "/" "-"))))
+  (string-downcase (string-append "go-" (string-replace-substring
+                                         (string-replace-substring
+                                          module-path
+                                          "." "-")
+                                         "/" "-"))))
 
 (define-record-type <module-meta>
   (make-module-meta import-prefix vcs repo-root)
   module-meta?
   (import-prefix module-meta-import-prefix)
-  ;; VCS field is a symbol
-  (vcs module-meta-vcs)
+  (vcs module-meta-vcs)                 ;a symbol
   (repo-root module-meta-repo-root))
 
 (define (fetch-module-meta-data module-path)
-  "Fetches module meta-data from a module's landing page.  This is
-  necessary because goproxy servers don't currently provide all the
-  information needed to build a package."
+  "Retrieve the module meta-data from its landing page.  This is necessary
+because goproxy servers don't currently provide all the information needed to
+build a package."
   ;;
-  (define (meta-go-import->module-meta text)
-    "Takes the content of the go-import meta tag as TEXT and gives back
-     a MODULE-META record"
-    (define (get-component s start)
-      (let*
-          ((start (string-skip s char-set:whitespace start))
-           (end (string-index s char-set:whitespace start))
-           (end (if end end (string-length s)))
-           (result (substring s start end)))
-        (values result end)))
-    (let*-values (((import-prefix end) (get-component text 0))
-                  ((vcs end) (get-component text end))
-                  ((repo-root end) (get-component text end)))
-      (make-module-meta import-prefix (string->symbol vcs) repo-root)))
-  (define (html->meta-go-import port)
-    "Read PORT with HTML content.  Find the go-import meta tag and gives
-     back its content as a string."
-    (let* ((parsedhtml (html->sxml port))
-           (extract-content (node-join
-                             (select-kids (node-typeof? 'html))
-                             (select-kids (node-typeof? 'head))
-                             (select-kids (node-typeof? 'meta))
-                             (select-kids (node-typeof? '@))
-                             (node-self
-                              (node-join
-                               (select-kids (node-typeof? 'name))
-                               (select-kids (node-equal? "go-import"))))
-                             (select-kids (node-typeof? 'content))
-                             (select-kids (lambda (_) #t))))
-           (content (car (extract-content parsedhtml))))
-      content))
-  (let* ((port (build-download:http-fetch (string->uri (format #f "https://~a?go-get=1" module-path))))
-         (meta-go-import (html->meta-go-import port))
-         (module-metadata (meta-go-import->module-meta meta-go-import)))
-    (close-port port)
-    module-metadata))
+  (let* ((port (build-download:http-fetch
+                (string->uri (format #f "https://~a?go-get=1" module-path))))
+         (select (sxpath `(// head (meta (@ (equal? (name "go-import"))))
+                              // content))))
+    (match (select (call-with-port port html->sxml))
+      (() #f)                           ;nothing selected
+      (((content content-text))
+       (match (string-split content-text #\space)
+         ((root-path vcs repo-url)
+          (make-module-meta root-path (string->symbol vcs) repo-url)))))))
 
 (define (module-meta-data-repo-url meta-data goproxy-url)
-  "Return the URL where the fetcher which will be used can download the source
-control."
-  (if (member (module-meta-vcs meta-data)'(fossil mod))
+  "Return the URL where the fetcher which will be used can download the
+source."
+  (if (member (module-meta-vcs meta-data) '(fossil mod))
       goproxy-url
       (module-meta-repo-root meta-data)))
 
-(define (vcs->origin vcs-type vcs-repo-url version file)
+(define (vcs->origin vcs-type vcs-repo-url version)
   "Generate the `origin' block of a package depending on what type of source
 control system is being used."
   (case vcs-type
@@ -347,61 +406,64 @@ control system is being used."
         (file-name (git-file-name name version))
         (sha256
          (base32
-          ;; FIXME: get hash for git repo checkout
-          "0000000000000000000000000000000000000000000000000000"))))
+          ;; FIXME: populate hash for git repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     ((hg)
      `(origin
         (method hg-fetch)
         (uri (hg-reference
               (url ,vcs-repo-url)
               (changeset ,version)))
-        (file-name (format #f "~a-~a-checkout" name version))))
+        (file-name (string-append name "-" version "-checkout"))
+        (sha256
+         (base32
+          ;; FIXME: populate hash for hg repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     ((svn)
      `(origin
         (method svn-fetch)
         (uri (svn-reference
               (url ,vcs-repo-url)
-              (revision (string->number version))
-              (recursive? #f)))
-        (file-name (format #f "~a-~a-checkout" name version))
+              (revision (string->number version))))
+        (file-name (string-append name "-" version "-checkout"))
         (sha256
          (base32
-          ,(guix-hash-url file)))))
+          ;; FIXME: populate hash for svn repo checkout
+          "0000000000000000000000000000000000000000000000000000"))))
     (else
      (raise-exception (format #f "unsupported vcs type: ~a" vcs-type)))))
 
-(define* (go-module->guix-package module-path #:key (goproxy-url "https://proxy.golang.org"))
-  (call-with-temporary-output-file
-   (lambda (temp port)
-     (let* ((latest-version (go-module-latest-version* goproxy-url module-path))
-            (go.mod-path (fetch-go.mod goproxy-url module-path latest-version
-                                       temp))
-            (dependencies (map car (parse-go.mod temp)))
-            (guix-name (go-module->guix-package-name module-path))
-            (root-module-path (infer-module-root-repo module-path))
-            ;; VCS type and URL are not included in goproxy information. For
-            ;; this we need to fetch it from the official module page.
-            (meta-data (fetch-module-meta-data root-module-path))
-            (vcs-type (module-meta-vcs meta-data))
-            (vcs-repo-url (module-meta-data-repo-url meta-data goproxy-url)))
-       (values
-        `(package
-           (name ,guix-name)
-           ;; Elide the "v" prefix Go uses
-           (version ,(string-trim latest-version #\v))
-           (source
-            ,(vcs->origin vcs-type vcs-repo-url latest-version temp))
-           (build-system go-build-system)
-           (arguments
-            '(#:import-path ,root-module-path))
-           ,@(maybe-inputs (map go-module->guix-package-name dependencies))
-           ;; TODO(katco): It would be nice to make an effort to fetch this
-           ;; from known forges, e.g. GitHub
-           (home-page ,(format #f "https://~a" root-module-path))
-           (synopsis "A Go package")
-           (description ,(format #f "~a is a Go package." guix-name))
-           (license #f))
-        dependencies)))))
+(define* (go-module->guix-package module-path #:key
+                                  (goproxy-url "https://proxy.golang.org"))
+  (let* ((latest-version (go-module-latest-version goproxy-url module-path))
+         (port (fetch-go.mod goproxy-url module-path latest-version))
+         (dependencies (map car (call-with-port port parse-go.mod)))
+         (guix-name (go-module->guix-package-name module-path))
+         (root-module-path (module-path->repository-root module-path))
+         ;; The VCS type and URL are not included in goproxy information.  For
+         ;; this we need to fetch it from the official module page.
+         (meta-data (fetch-module-meta-data root-module-path))
+         (vcs-type (module-meta-vcs meta-data))
+         (vcs-repo-url (module-meta-data-repo-url meta-data goproxy-url))
+         (synopsis (go-package-synopsis root-module-path))
+         (description (go-package-description module-path))
+         (licenses (go-package-licenses module-path)))
+    (values
+     `(package
+        (name ,guix-name)
+        ;; Elide the "v" prefix Go uses
+        (version ,(string-trim latest-version #\v))
+        (source
+         ,(vcs->origin vcs-type vcs-repo-url latest-version))
+        (build-system go-build-system)
+        (arguments
+         '(#:import-path ,root-module-path))
+        ,@(maybe-inputs (map go-module->guix-package-name dependencies))
+        (home-page ,(format #f "https://~a" root-module-path))
+        (synopsis ,synopsis)
+        (description ,description)
+        (license ,(and=> licenses list->licenses)))
+     dependencies)))
 
 (define go-module->guix-package* (memoize go-module->guix-package))
 
-- 
2.30.1

--=-=-=
Content-Type: text/plain

I hope I'm not making things more difficult for you!

Thank you for working on it! :-)

Maxim

--=-=-=--