Bioconductor URI, fallback and time-machine

Open. Submitted by zimoun.
Details
3 participants
  • Ludovic Courtès
  • Ricardo Wurmus
  • zimoun
Owner
unassigned
Severity
normal
zimoun wrote on 3 Mar 2020 16:59
CAJ3okZ3dFunYgafRH6=9LsLKLf6OrZBpXqUMxZAjEhaiL93ARA@mail.gmail.com
Dear,
Currently, the URI scheme (see 'bioconductor-uri' in guix/build-system/r.scm) is:
https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
which leads to 2 issues:
1. when Bioconductor updates their release, some package versions are updated too, and so, the upstream return 404.
2. for this reason 1., the "guix time-machine" is broken for all the Bioconductor packages, at least if Berlin or SWH does not have a substitute; which is not expected for 'annotation' packages.
However, the Bioconductor archive still serves the old release, i.e.,
https://bioconductor.org/packages/3.x/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz

The ways to fix the both issues are:
a) Add the Bioconductor release (known at packaging time) to all the packages; provide as argument to 'bioconductor-uri'.
b) Add more URLs to fallback.
As discussed on IRC, Tobias seems more inclined with the option a) and I am more in favour of option b.
Attached, a quick patch showing the option b).

Please also consider #36805 which was never merged or closed. http://issues.guix.gnu.org/issue/36805

All the best,
simon
From 87e73e02202fe5e342d68f1fb17efdd4425737cd Mon Sep 17 00:00:00 2001
From: zimoun <zimon.toutoune@gmail.com>
Date: Tue, 3 Mar 2020 16:53:39 +0100
Subject: [PATCH] build-system: r: Use Bioconductor old releases to fallback.

* guix/build-system/r.scm (bioconductor-uri): Extend the fallback list.
---
 guix/build-system/r.scm | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/guix/build-system/r.scm b/guix/build-system/r.scm
index 2d328764b0..8638e1b888 100644
--- a/guix/build-system/r.scm
+++ b/guix/build-system/r.scm
@@ -54,15 +54,18 @@ release corresponding to NAME and VERSION."
                           ('annotation "/data/annotation")
                           ('experiment "/data/experiment")
                           (_ "/bioc"))))
-    (list (string-append "https://bioconductor.org/packages/release"
-                         type-url-part
-                         "/src/contrib/"
-                         name "_" version ".tar.gz")
-          ;; TODO: use %bioconductor-version from (guix import cran)
-          (string-append "https://bioconductor.org/packages/3.10"
-                         type-url-part
-                         "/src/contrib/Archive/"
-                         name "_" version ".tar.gz"))))
+    (append (list (string-append "https://bioconductor.org/packages/release"
+                                 type-url-part
+                                 "/src/contrib/"
+                                 name "_" version ".tar.gz"))
+            (map (lambda (release)
+                   (string-append "https://bioconductor.org/packages/"
+                                  release
+                                  type-url-part
+                                  "/src/contrib/"
+                                  name "_" version ".tar.gz"))
+                 (list (@@ (guix import cran) %bioconductor-version)
+                       "3.9" "3.8" "3.7")))))
 
 (define %r-build-system-modules
   ;; Build-side modules imported by default.
-- 
2.25.0
Ricardo Wurmus wrote on 23 Mar 2020 22:20
(name . zimoun)(address . zimon.toutoune@gmail.com)
87ftdylqdn.fsf@elephly.net
zimoun <zimon.toutoune@gmail.com> writes:
> 1. when Bioconductor updates their release, some package versions are
> updated too, and so, the upstream return 404.
> 2. for this reason 1., the "guix time-machine" is broken for all the
> Bioconductor packages, at least if Berlin or SWH does not have a
> substitute; which is not expected for 'annotation' packages.
>
> However, the Bioconductor archive still serves the old release, i.e.,
>
> https://bioconductor.org/packages/3.x/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
>
> The ways to fix the both issues are:
>
> a) Add the Bioconductor release (known at packaging time) to all the
> packages; provide as argument to 'bioconductor-uri'.
> b) Add more URLs to fallback.
>
> As discussed on IRC, Tobias seems more inclined with the option a) and
> I am more in favour of option b.
I think option a) is more explicit, which is probably what we generally want to future-proof the time-machine. Fallbacks are okay in the case of the CRAN URL where it’s not necessarily clear when a package tarball moves from the release location to the archive.
In the case of Bioconductor URLs it seems that we can afford to be a bit more accurate.
--
Ricardo
zimoun wrote on 22 May 2020 01:29
(name . Ricardo Wurmus)(address . rekado@elephly.net)
CAJ3okZ1Ttjh+iG3qU1a_PcK_m5-64+KLAGKYT7b5Cum7fGgkKA@mail.gmail.com
Dear Ricardo,
On Mon, 23 Mar 2020 at 22:21, Ricardo Wurmus <rekado@elephly.net> wrote:
> > a) Add the Bioconductor release (known at packaging time) to all the
> > packages; provide as argument to 'bioconductor-uri'.
> > b) Add more URLs to fallback.
> >
> > As discussed on IRC, Tobias seems more inclined with the option a) and
> > I am more in favour of option b.
>
> I think option a) is more explicit, which is probably what we generally
> want to future-proof the time-machine. Fallbacks are okay in the case
> of the CRAN URL where it’s not necessarily clear when a package tarball
> moves from the release location to the archive.
>
> In the case of Bioconductor URLs it seems that we can afford to be a bit
> more accurate.
We are going for option a) which means rename all the URLs, right?
Because it is a lot, I suggest first addressing bug#36805 [1], i.e., providing the BioConductor version as an argument to 'bioconductor-uri', and applying this policy to all new packages and to any package we update.
Moreover, I have suggested reorganising bioconductor.scm, bioinformatics.scm, cran.scm, etc., and I have not dedicated enough time to this boring task. But because I am working remotely (semi-lockdown), I plan to work on it next week, so this change of URLs could be part of the big reorganisation.
What do you think?
[1] http://issues.guix.gnu.org/issue/36805

All the best,
simon
zimoun wrote on 24 Jun 2020 13:07
Re: bug#39885: Bioconductor URI, fallback and time-machine
CAJ3okZ1Xwd-2WArzNus3xE_KOayDdXPp+ku1SxYBon4Zm0qQQg@mail.gmail.com
Dear,
The time-machine is broken for some BioConductor packages. For example, consider the package "r-genomegraphs", which was removed from BioConductor in the 3.11 release.
(Well, now the issue is mitigated because ci.guix.gnu.org serves a lot of upstream substitutes, but ci.guix.gnu.org could be down. In other words, we should use the upstream resources where they are available.)

Concretely, there are 2 issues:
a) What to do for the removed packages? For 3.11, the list is there [1]. Do we keep them in gnu/packages/bioconductor.scm, in which case 'bioconductor-uri' needs some tweaks? Or do we transfer them to the channel guix-past (for example)?
b) The fallback URI in guix/build-system/r.scm (bioconductor-uri) added by commit c586f427b4831b9b492e5b900b2226e898b8fcfa is not correct, if I do not misread:
"https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/GenomeGraphs_1.46.0.tar.gz"
404 "Not Found"
Ludovic Courtès wrote on 28 Jun 2020 22:14
(name . zimoun)(address . zimon.toutoune@gmail.com)
87lfk7gd7d.fsf@gnu.org
Hi,
zimoun <zimon.toutoune@gmail.com> skribis:
> b) The fallback URI in guix/build-system/r.scm(bioconductor-uri)
> added by commit c586f427b4831b9b492e5b900b2226e898b8fcfa is not
> correct, if I do not misread:
>
> "https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/GenomeGraphs_1.46.0.tar.gz"
> 404 "Not Found"
>
> The correct seems to be (without Archive):
>
> https://bioconductor.org/packages/3.10/bioc/src/contrib/GenomeGraphs_1.46.0.tar.gz
Could you provide a patch for this?
Thanks,
Ludo’.
zimoun wrote on 29 Jun 2020 19:36
(name . Ludovic Courtès)(address . ludo@gnu.org)
CAJ3okZ36KrALgMq69zdkDsHcfMv09Lk=DPBmzW-4SePZPRAvnA@mail.gmail.com
Hi Ludo,
On Sun, 28 Jun 2020 at 22:14, Ludovic Courtès <ludo@gnu.org> wrote:
> Could you provide a patch for this?
About the url, for sure, see attached.
But it does not address the root of the problem. Well, I will try to find a slot and propose something.

All the best,
simon
From c1c963a3b86e306a20c14626127e54d21843c22c Mon Sep 17 00:00:00 2001
From: zimoun <zimon.toutoune@gmail.com>
Date: Mon, 29 Jun 2020 19:18:20 +0200
Subject: [PATCH] build-system/r: bioconductor-uri: Fix archive URL.

* guix/build-system/r.scm (bioconductor-uri): Fix archive URL.
---
 guix/build-system/r.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/build-system/r.scm b/guix/build-system/r.scm
index c8ec9abd0d..5ef982d66a 100644
--- a/guix/build-system/r.scm
+++ b/guix/build-system/r.scm
@@ -61,7 +61,7 @@ release corresponding to NAME and VERSION."
           ;; TODO: use %bioconductor-version from (guix import cran)
           (string-append "https://bioconductor.org/packages/3.11"
                          type-url-part
-                         "/src/contrib/Archive/"
+                         "/src/contrib/"
                          name "_" version ".tar.gz"))))
 
 (define %r-build-system-modules

base-commit: 6ebf300959a58fd1eda875205c75d21137862285
-- 
2.26.2
Ludovic Courtès wrote on 29 Jun 2020 22:42
(name . zimoun)(address . zimon.toutoune@gmail.com)
87r1txa9ix.fsf@gnu.org
zimoun <zimon.toutoune@gmail.com> skribis:
> From c1c963a3b86e306a20c14626127e54d21843c22c Mon Sep 17 00:00:00 2001
> From: zimoun <zimon.toutoune@gmail.com>
> Date: Mon, 29 Jun 2020 19:18:20 +0200
> Subject: [PATCH] build-system/r: bioconductor-uri: Fix archive URL.
>
> * guix/build-system/r.scm (bioconductor-uri): Fix archive URL.
Applied, thanks!
I let the rest of you discuss the other issues. :-)
Ludo’.
zimoun wrote on 19 Nov 2020 15:22
(address . 39885@debbugs.gnu.org)
87r1ope8a4.fsf@gmail.com
Hi,
Some explanations of the issue are provided here:
http://issues.guix.gnu.org/issue/39885
Since we are currently updating to 3.12, maybe it is the occasion to fix the issue. See option a) below.

On Tue, 03 Mar 2020 at 16:59, zimoun <zimon.toutoune@gmail.com> wrote:
> Currently, the URI scheme (see 'bioconductor-uri' in
> guix/build-system/r.scm) is:
>
> https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
> which leads to 2 issues:
>
> 1. when Bioconductor updates their release, some package versions are
> updated too, and so, the upstream return 404.
>
> 2. for this reason 1., the "guix time-machine" is broken for all the
> Bioconductor packages, at least if Berlin or SWH does not have a
> substitute; which is not expected for 'annotation' packages.
An example of this issue:
$ guix time-machine --commit=aee183e -- import cran -a bioconductor CATALYST -r

Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...

Starting download of /tmp/guix-file.Nxajqh
From https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz...
download failed "https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz" 404 "Not Found"
failed to download "/tmp/guix-file.Nxajqh" from "https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz"
error: failed to retrieve package information from "https://cran.r-project.org/web/packages/CATALYST/DESCRIPTION": 404 ("Not Found")
Backtrace:
           4 (primitive-load "/home/simon/.cache/guix/inferiors/vznc…")
In guix/ui.scm:
  2117:12  3 (run-guix-command _ . _)
In guix/scripts/import.scm:
   120:11  2 (guix-import . _)
In srfi/srfi-1.scm:
   586:17  1 (map1 (#f))
In guix/import/utils.scm:
    258:2  0 (package->definition _)

guix/import/utils.scm:258:2: In procedure package->definition:
Throw to key `match-error' with args `("match" "no matching pattern" #f)'.
Aside from the ugly backtrace, which is tracked by #44115, the main issue is that Bioconductor updated to 3.12 while Guix is still at 3.11.
Concretely, the issue is that ’release’ in the URL:
https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz
now refers to 3.12 (because of the Bioconductor update) while Guix still thinks it is 3.11 (because Guix has not yet updated; work-in-progress). And CATALYST in 3.12 is at version 1.14.0 against 1.12.2 for 3.11. Hence the conflict and the error.
It means that as long as:
(define %bioconductor-version "3.11")
is not updated to 3.12, all the Bioconductor packages are broken, in the sense that they are not buildable from source.

> a) Add the Bioconductor release (known at packaging time) to all the
> packages; provide as argument to 'bioconductor-uri'.
> b) Add more URLs to fallback, e.g.:
>
> https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
> https://bioconductor.org/packages/3.11/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
> Attached, a quick patch showing the option b).
Then each time we update Bioconductor, we add a URL to the list.
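To make option b) concrete, here is a minimal Guile sketch of a fallback list that grows by one entry per Bioconductor update. The names `%known-bioconductor-releases` and `bioconductor-uri/fallback` are assumptions made for this illustration and do not exist in Guix; the `type-url-part` mapping is copied from the patches in this thread.

```scheme
(use-modules (ice-9 match))

;; Assumption: a hand-maintained list of numbered releases, newest
;; first.  It is not an existing Guix variable.
(define %known-bioconductor-releases
  '("3.11" "3.10"))

;; Hypothetical variant of 'bioconductor-uri' for option b): return the
;; "release" alias URL first, then one URL per known numbered release.
(define* (bioconductor-uri/fallback name version #:optional (type 'bioc))
  (let ((type-url-part (match type
                         ('annotation "/data/annotation")
                         ('experiment "/data/experiment")
                         (_ "/bioc"))))
    (map (lambda (release)
           (string-append "https://bioconductor.org/packages/" release
                          type-url-part
                          "/src/contrib/"
                          name "_" version ".tar.gz"))
         (cons "release" %known-bioconductor-releases))))

;; Print the candidate URLs for CATALYST 1.12.2, one per line.
(for-each (lambda (url) (display url) (newline))
          (bioconductor-uri/fallback "CATALYST" "1.12.2"))
```

With this shape, updating Bioconductor in Guix would only mean prepending the new release number to the list; whether each candidate URL actually resolves upstream is not checked here.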

> As discussed on IRC, Tobias seems more inclined with the option a) and
> I am more in favour of option b.
Tobias and Ricardo are in favor of a) (see this thread). Which means a lot of work IMHO, i.e., add 3.11 as arguments and then 3.12 to all the Bioconductor packages and fix the importer, IIUC; while b) means do nothing except merge the proposed patch (possibly re-worked).
Note that even the task of grouping in bioconductor.scm all the Bioconductor packages scattered here and there is still not done, so I think option a) is not doable by hand; I do not volunteer! :-) Else, any suggestion to script the task instead?
Since I am more in favor of b), I am less motivated to fix the a). ;-) But I am motivated to fix the issue at hand. :-)

Another option, c), is to switch all the Bioconductor packages to git-fetch instead of url-fetch. I have not yet checked what the transition would look like.

> Please also consider #36805 which was never merged or closed.
> http://issues.guix.gnu.org/issue/36805
This patch could help for option a).
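For comparison, option a) could look roughly like the following Guile sketch, where the release is a required argument and the URL never goes through the moving "release" alias. The procedure name `bioconductor-uri/pinned` and its argument order are assumptions made for this example, not necessarily the interface of the patch in #36805.

```scheme
(use-modules (ice-9 match))

;; Hypothetical variant of 'bioconductor-uri' for option a) (name and
;; signature are assumptions): RELEASE is given explicitly by each
;; package, e.g. "3.10" or "3.11".
(define* (bioconductor-uri/pinned name version release
                                  #:optional (type 'bioc))
  (let ((type-url-part (match type
                         ('annotation "/data/annotation")
                         ('experiment "/data/experiment")
                         (_ "/bioc"))))
    (list (string-append "https://bioconductor.org/packages/" release
                         type-url-part
                         "/src/contrib/"
                         name "_" version ".tar.gz"))))

;; Example: GenomeGraphs 1.46.0 pinned to Bioconductor 3.10.
(display (bioconductor-uri/pinned "GenomeGraphs" "1.46.0" "3.10"))
(newline)
```

For the example call, the single URL produced is the one reported earlier in this thread as the correct location for GenomeGraphs 1.46.0 (the 3.10 path without "Archive"). The cost, as noted above, is that every package definition would have to carry its release number.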

WDYT?
All the best,
simon