Bioconductor URI, fallback and time-machine

  • Open
Details
4 participants
  • Ludovic Courtès
  • Maxime Devos
  • Ricardo Wurmus
  • zimoun
Owner
unassigned
Submitted by
zimoun
Severity
normal
zimoun wrote on 3 Mar 2020 16:59
CAJ3okZ3dFunYgafRH6=9LsLKLf6OrZBpXqUMxZAjEhaiL93ARA@mail.gmail.com
Dear,

Currently, the URI scheme (see 'bioconductor-uri' in
guix/build-system/r.scm) is:

https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz

which leads to 2 issues:

1. When Bioconductor updates their release, some package versions are
updated too, and so upstream returns 404 for the old tarballs.
2. Because of 1., "guix time-machine" is broken for all the
Bioconductor packages, at least if Berlin or SWH does not have a
substitute; which cannot be expected for 'annotation' packages.

However, the Bioconductor archive still serves the old release, i.e.,

https://bioconductor.org/packages/3.x/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
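
As an illustration only, the two URL shapes can be sketched in Guile. The
helper name 'tarball-url' is hypothetical. The "/bioc", "/data/annotation"
and "/data/experiment" path parts and the "_" separator follow the
'bioconductor-uri' code quoted in the patch further down this thread, and
GenomeGraphs 1.46.0 is an example taken from a later message.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, not the actual 'bioconductor-uri' procedure.
;; BASE is either the moving alias "release" or a fixed release such
;; as "3.10".  TYPE is 'annotation, 'experiment, or omitted.
(define* (tarball-url base name version #:optional type)
  (let ((type-url-part (match type
                         ('annotation "/data/annotation")
                         ('experiment "/data/experiment")
                         (_ "/bioc"))))
    (string-append "https://bioconductor.org/packages/" base
                   type-url-part "/src/contrib/"
                   name "_" version ".tar.gz")))

;; The alias follows whatever upstream currently calls "release"...
(display (tarball-url "release" "GenomeGraphs" "1.46.0"))
(newline)
;; ...while the numbered form names one specific Bioconductor release.
(display (tarball-url "3.10" "GenomeGraphs" "1.46.0"))
(newline)
```

Only the BASE component differs between the two forms, which is why both
proposals below come down to deciding where the release number comes from.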


There are two ways to fix both issues:

a) Add the Bioconductor release (known at packaging time) to all the
packages; provide as argument to 'bioconductor-uri'.
b) Add more URLs to fallback.

As discussed on IRC, Tobias seems more inclined towards option a) and
I am more in favour of option b).
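
For comparison, here is a rough sketch of what option a) could look like.
It is an assumption about the shape of the interface, not an existing
procedure: 'pinned-bioconductor-uri' and its #:release and #:type-url-part
keywords are hypothetical names.

```scheme
;; Sketch of option a): the Bioconductor release is a mandatory
;; argument, so each package records the release it was packaged from.
;; Procedure and keyword names are hypothetical.
(define* (pinned-bioconductor-uri name version
                                  #:key release (type-url-part "/bioc"))
  (unless release
    (error "a Bioconductor release is required for" name))
  (list (string-append "https://bioconductor.org/packages/" release
                       type-url-part "/src/contrib/"
                       name "_" version ".tar.gz")))

;; A package definition would then carry the release explicitly:
(display (pinned-bioconductor-uri "GenomeGraphs" "1.46.0"
                                  #:release "3.10"))
(newline)
```

The cost of this approach is that every package definition has to be
touched, whereas option b) only changes the single shared procedure.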

Attached is a quick patch showing option b).


Please also consider #36805 which was never merged or closed.


All the best,
simon
From 87e73e02202fe5e342d68f1fb17efdd4425737cd Mon Sep 17 00:00:00 2001
From: zimoun <zimon.toutoune@gmail.com>
Date: Tue, 3 Mar 2020 16:53:39 +0100
Subject: [PATCH] build-system: r: Use Bioconductor old releases to fallback.

* guix/build-system/r.scm (bioconductor-uri): Extend the fallback list.
---
guix/build-system/r.scm | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/guix/build-system/r.scm b/guix/build-system/r.scm
index 2d328764b0..8638e1b888 100644
--- a/guix/build-system/r.scm
+++ b/guix/build-system/r.scm
@@ -54,15 +54,18 @@ release corresponding to NAME and VERSION."
                          ('annotation "/data/annotation")
                          ('experiment "/data/experiment")
                          (_ "/bioc"))))
-    (list (string-append "https://bioconductor.org/packages/release"
-                         type-url-part
-                         "/src/contrib/"
-                         name "_" version ".tar.gz")
-          ;; TODO: use %bioconductor-version from (guix import cran)
-          (string-append "https://bioconductor.org/packages/3.10"
-                         type-url-part
-                         "/src/contrib/Archive/"
-                         name "_" version ".tar.gz"))))
+    (append (list (string-append "https://bioconductor.org/packages/release"
+                                 type-url-part
+                                 "/src/contrib/"
+                                 name "_" version ".tar.gz"))
+            (map (lambda (release)
+                   (string-append "https://bioconductor.org/packages/"
+                                  release
+                                  type-url-part
+                                  "/src/contrib/"
+                                  name "_" version ".tar.gz"))
+                 (list (@@ (guix import cran) %bioconductor-version)
+                       "3.9" "3.8" "3.7")))))

 (define %r-build-system-modules
   ;; Build-side modules imported by default.
--
2.25.0
Ricardo Wurmus wrote on 23 Mar 2020 22:20
(name . zimoun)(address . zimon.toutoune@gmail.com)
87ftdylqdn.fsf@elephly.net
zimoun <zimon.toutoune@gmail.com> writes:

> 1. when Bioconductor updates their release, some package versions are
> updated too, and so, the upstream return 404.
> 2. for this reason 1., the "guix time-machine" is broken for all the
> Bioconductor packages, at least if Berlin or SWH does not have a
> substitute; which is not expected for 'annotation' packages.
>
> However, the Bioconductor archive still serves the old release, i.e.,
>
> https://bioconductor.org/packages/3.x/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
>
> The ways to fix the both issues are:
>
> a) Add the Bioconductor release (known at packaging time) to all the
> packages; provide as argument to 'bioconductor-uri'.
> b) Add more URLs to fallback.
>
> As discussed on IRC, Tobias seems more inclined with the option a) and
> I am more in favour of option b.

I think option a) is more explicit, which is probably what we generally
want to future-proof the time-machine. Fallbacks are okay in the case
of the CRAN URL where it’s not necessarily clear when a package tarball
moves from the release location to the archive.

In the case of Bioconductor URLs it seems that we can afford to be a bit
more accurate.

--
Ricardo
zimoun wrote on 22 May 2020 01:29
(name . Ricardo Wurmus)(address . rekado@elephly.net)
CAJ3okZ1Ttjh+iG3qU1a_PcK_m5-64+KLAGKYT7b5Cum7fGgkKA@mail.gmail.com
Dear Ricardo,

On Mon, 23 Mar 2020 at 22:21, Ricardo Wurmus <rekado@elephly.net> wrote:

> > a) Add the Bioconductor release (known at packaging time) to all the
> > packages; provide as argument to 'bioconductor-uri'.
> > b) Add more URLs to fallback.
> >
> > As discussed on IRC, Tobias seems more inclined with the option a) and
> > I am more in favour of option b.
>
> I think option a) is more explicit, which is probably what we generally
> want to future-proof the time-machine. Fallbacks are okay in the case
> of the CRAN URL where it’s not necessarily clear when a package tarball
> moves from the release location to the archive.
>
> In the case of Bioconductor URLs it seems that we can afford to be a bit
> more accurate.

We are going for option a), which means changing all the URLs, right?

Because that is a lot of packages, I suggest first addressing bug#36805,
i.e., providing the Bioconductor version as an argument to
'bioconductor-uri', and applying this policy to all new packages and to
any package update.

Moreover, I have suggested reorganising bioconductor.scm,
bioinformatics.scm, cran.scm, etc., and I have not yet dedicated enough
time to this boring task. But because I am working remotely
(semi-lockdown), I plan to work on it next week, so this change of
URLs could be part of the big reorganisation.

What do you think?



All the best,
simon
zimoun wrote on 24 Jun 2020 13:07
Re: bug#39885: Bioconductor URI, fallback and time-machine
CAJ3okZ1Xwd-2WArzNus3xE_KOayDdXPp+ku1SxYBon4Zm0qQQg@mail.gmail.com
Dear,

The time-machine is broken for some Bioconductor packages. For
example, consider the package "r-genomegraphs", which was removed
from Bioconductor in the 3.11 release.

(Well, the issue is now mitigated because ci.guix.gnu.org serves
substitutes for a lot of upstream sources, but ci.guix.gnu.org could be
down. In other words, we should use the upstream resources where they
are available.)


Concretely, there are 2 issues:

a) What to do for the removed packages? For 3.11, the list is there
[1]. Do we keep them in gnu/packages/bioconductor.scm but then
'bioconductor-uri' needs some tweaks? Or do we transfer them to the
channel guix-past (for example)?

b) The fallback URI in guix/build-system/r.scm(bioconductor-uri)
added by commit c586f427b4831b9b492e5b900b2226e898b8fcfa is not
correct, if I do not misread:

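"https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/GenomeGraphs_1.46.0.tar.gz"
404 "Not Found"

The correct seems to be (without Archive):

https://bioconductor.org/packages/3.10/bioc/src/contrib/GenomeGraphs_1.46.0.tar.gz
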
Ludovic Courtès wrote on 28 Jun 2020 22:14
(name . zimoun)(address . zimon.toutoune@gmail.com)
87lfk7gd7d.fsf@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> b) The fallback URI in guix/build-system/r.scm(bioconductor-uri)
> added by commit c586f427b4831b9b492e5b900b2226e898b8fcfa is not
> correct, if I do not misread:
>
> "https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/GenomeGraphs_1.46.0.tar.gz"
> 404 "Not Found"
>
> The correct seems to be (without Archive):
>
> https://bioconductor.org/packages/3.10/bioc/src/contrib/GenomeGraphs_1.46.0.tar.gz

Could you provide a patch for this?

Thanks,
Ludo’.
zimoun wrote on 29 Jun 2020 19:36
(name . Ludovic Courtès)(address . ludo@gnu.org)
CAJ3okZ36KrALgMq69zdkDsHcfMv09Lk=DPBmzW-4SePZPRAvnA@mail.gmail.com
Hi Ludo,

On Sun, 28 Jun 2020 at 22:14, Ludovic Courtès <ludo@gnu.org> wrote:

> Could you provide a patch for this?

About the url, for sure, see attached.

But it does not address the root of the problem. Well, I will try to
find a slot and propose something.


All the best,
simon
From c1c963a3b86e306a20c14626127e54d21843c22c Mon Sep 17 00:00:00 2001
From: zimoun <zimon.toutoune@gmail.com>
Date: Mon, 29 Jun 2020 19:18:20 +0200
Subject: [PATCH] build-system/r: bioconductor-uri: Fix archive URL.

* guix/build-system/r.scm (bioconductor-uri): Fix archive URL.
---
guix/build-system/r.scm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/build-system/r.scm b/guix/build-system/r.scm
index c8ec9abd0d..5ef982d66a 100644
--- a/guix/build-system/r.scm
+++ b/guix/build-system/r.scm
@@ -61,7 +61,7 @@ release corresponding to NAME and VERSION."
           ;; TODO: use %bioconductor-version from (guix import cran)
           (string-append "https://bioconductor.org/packages/3.11"
                          type-url-part
-                         "/src/contrib/Archive/"
+                         "/src/contrib/"
                          name "_" version ".tar.gz"))))

 (define %r-build-system-modules

base-commit: 6ebf300959a58fd1eda875205c75d21137862285
--
2.26.2
Ludovic Courtès wrote on 29 Jun 2020 22:42
(name . zimoun)(address . zimon.toutoune@gmail.com)
87r1txa9ix.fsf@gnu.org
zimoun <zimon.toutoune@gmail.com> skribis:

> From c1c963a3b86e306a20c14626127e54d21843c22c Mon Sep 17 00:00:00 2001
> From: zimoun <zimon.toutoune@gmail.com>
> Date: Mon, 29 Jun 2020 19:18:20 +0200
> Subject: [PATCH] build-system/r: bioconductor-uri: Fix archive URL.
>
> * guix/build-system/r.scm (bioconductor-uri): Fix archive URL.

Applied, thanks!

I let the rest of you discuss the other issues. :-)

Ludo’.
zimoun wrote on 19 Nov 2020 15:22
(address . 39885@debbugs.gnu.org)
87r1ope8a4.fsf@gmail.com
Hi,

Some explanations of the issue are provided here:


Since we are currently updating to 3.12, maybe it is the occasion to fix
the issue. See option a) below.


On Tue, 03 Mar 2020 at 16:59, zimoun <zimon.toutoune@gmail.com> wrote:

> Currently, the URI scheme (see 'bioconductor-uri' in
> guix/build-system/r.scm) is:
>
> https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
> which leads to 2 issues:
>
> 1. when Bioconductor updates their release, some package versions are
> updated too, and so, the upstream return 404.
>
> 2. for this reason 1., the "guix time-machine" is broken for all the
> Bioconductor packages, at least if Berlin or SWH does not have a
> substitute; which is not expected for 'annotation' packages.

An example of this issue:

$ guix time-machine --commit=aee183e -- import cran -a bioconductor CATALYST -r

Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...

Starting download of /tmp/guix-file.Nxajqh
From https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz...
download failed "https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz" 404 "Not Found"
failed to download "/tmp/guix-file.Nxajqh" from "https://bioconductor.org/packages/release/bioc/src/contrib/CATALYST_1.12.2.tar.gz"
error: failed to retrieve package information from "https://cran.r-project.org/web/packages/CATALYST/DESCRIPTION": 404 ("Not Found")
Backtrace:
4 (primitive-load "/home/simon/.cache/guix/inferiors/vznc…")
In guix/ui.scm:
2117:12 3 (run-guix-command _ . _)
In guix/scripts/import.scm:
120:11 2 (guix-import . _)
In srfi/srfi-1.scm:
586:17 1 (map1 (#f))
In guix/import/utils.scm:
258:2 0 (package->definition _)

guix/import/utils.scm:258:2: In procedure package->definition:
Throw to key `match-error' with args `("match" "no matching pattern" #f)'.

Aside from the ugly backtrace, which is tracked by #44115, the main issue
is that Bioconductor has updated to 3.12 while Guix is still at 3.11.

Concretely, the issue is that ’release’ in the URL:


now refers to 3.12 (because Bioconductor updated) while Guix still thinks
it is 3.11 (because Guix has not yet updated; work in progress). And
CATALYST is at version 1.14.0 in 3.12, against 1.12.2 in 3.11. Hence
the conflict and the error.

It means that as long as:

(define %bioconductor-version "3.11")

is not updated to 3.12, all the Bioconductor packages are broken, in the
sense that they cannot be built from source.
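
The drift of the 'release' alias can be pictured with a small,
deliberately simplified model. The CATALYST versions are the ones given
above; the procedure 'served?' and the assumption that a release
directory serves exactly one version per package are illustrative only
(the model ignores in-release updates and removed packages).

```scheme
;; CATALYST versions per Bioconductor release, as reported above.
(define %catalyst-versions
  '(("3.11" . "1.12.2")
    ("3.12" . "1.14.0")))

;; Toy model of upstream: BASE is "release" or a numbered release,
;; CURRENT is the release that the "release" alias points to.
;; Return #t when VERSION would be found under BASE.
(define (served? current base version)
  (let ((release (if (string=? base "release") current base)))
    (equal? (assoc-ref %catalyst-versions release) version)))

;; Before the upstream update, the alias serves 1.12.2.
(display (served? "3.11" "release" "1.12.2")) (newline)
;; After upstream moves to 3.12, the same URL is a 404.
(display (served? "3.12" "release" "1.12.2")) (newline)
;; The numbered 3.11 directory still answers in this model.
(display (served? "3.12" "3.11" "1.12.2")) (newline)
```

In this model the three calls print #t, #f and #t: the same request
succeeds or fails depending only on when upstream moved the alias.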


> a) Add the Bioconductor release (known at packaging time) to all the
> packages; provide as argument to 'bioconductor-uri'.
> b) Add more URLs to fallback, e.g.:
>
> https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
> https://bioconductor.org/packages/3.11/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
> Attached, a quick patch showing the option b).

Then each time we update Bioconductor, we add a URL to the list.


> As discussed on IRC, Tobias seems more inclined with the option a) and
> I am more in favour of option b.

Tobias and Ricardo are in favour of a) (see this thread), which means a
lot of work IMHO, i.e., adding 3.11 as an argument, and then 3.12, to all
the Bioconductor packages and fixing the importer, IIUC; while b) means
doing nothing except merging the proposed patch (possibly reworked).

Note that even the task of grouping in bioconductor.scm all the
Bioconductor packages scattered here and there is still not done, so I
think option a) is not doable by hand – I do not volunteer! :-) Otherwise,
any suggestion for scripting the task instead?

Since I am more in favour of b), I am less motivated to implement a). ;-)
But I am motivated to fix the issue at hand. :-)


Another option, c), is to switch all the Bioconductor packages to
git-fetch instead of url-fetch. I have not yet checked what the
transition would look like.


> Please also consider #36805 which was never merged or closed.
> http://issues.guix.gnu.org/issue/36805

This patch could help for option a).


WDYT?

All the best,
simon
Z
Z
zimoun wrote on 22 Nov 2021 20:48
(address . 39885@debbugs.gnu.org)
87wnl0q8ei.fsf@gmail.com
Hi,

On Tue, 03 Mar 2020 at 16:59, zimoun <zimon.toutoune@gmail.com> wrote:

> Currently, the URI scheme (see 'bioconductor-uri' in
> guix/build-system/r.scm) is:
>
> https://bioconductor.org/packages/release/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz
>
> which leads to 2 issues:
>
> 1. when Bioconductor updates their release, some package versions are
> updated too, and so, the upstream return 404.
> 2. for this reason 1., the "guix time-machine" is broken for all the
> Bioconductor packages, at least if Berlin or SWH does not have a
> substitute; which is not expected for 'annotation' packages.
>
> However, the Bioconductor archive still serves the old release, i.e.,
>
> https://bioconductor.org/packages/3.x/data/<type-url-part>/src/contrib/<upstream-name>-<version>.tar.gz

It is still the case; for concrete breakage, see [1]. I will not go
into detail, but each time Guix lags behind a new Bioconductor release,
it is broken. For sure, Guix upgrades more or less quickly. Each time
Bioconductor removes a package, it is broken too. Well, because a lot
of care goes into R packages, forward breakages barely happen. :-) But
backward breakages are not negligible, IMHO.


Well, this URL choice is not The Right Thing and is somehow broken by
design.



> The ways to fix the both issues are:
>
> a) Add the Bioconductor release (known at packaging time) to all the
> packages; provide as argument to 'bioconductor-uri'.
> b) Add more URLs to fallback.
>
> As discussed on IRC, Tobias seems more inclined with the option a) and
> I am more in favour of option b.
>
> Attached, a quick patch showing the option b).

We are now 1.5 years later, and we did nothing; well, we did other
things instead. ;-) Now I have a strong opinion that option a) is
not doable: I speak from my experience of janitorial moves of
Bioconductor packages.

Instead, something along the lines of the proposed patch below half-fixes
the issue now. We just have to append the releases and let the fallback
mechanism take care of the rest. It reduces the maintenance burden, IMHO.

For sure, it is not perfect, but it appears to me to be a pragmatic fix
while waiting for something better.


What this better solution is remains unknown (at least to me :-)). On one
hand, Disarchive would improve the situation for tarballs… but some work
remains (checking that SWH ingestion and rebuild are bullet-proof). On the
other hand, Bioconductor uses Git, for instance:



And Bioconductor uses ’origin/RELEASE_3_14’ as Git tag. Relying on it
would avoid the eternal in-place change fixes.

For instance, the package tximeta [2], recently updated by Ricardo.
Well, from their Bioconductor Git repo,


it is not clear that the current version is 1.12.3. Nor is it
clear whether they tagged origin/RELEASE_3_14 at 1.12.0 and did
something ugly to then get 1.12.3. Anyway, switching from url-fetch to
git-fetch is an option. However, it is as much work as option a) and I am
not convinced it is doable with the resources at hand.
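
To make the git-fetch idea a little more concrete, here is a small sketch
of how a Git location could be derived from a Bioconductor release. Both
procedure names are hypothetical, the URL prefix is an assumption that
should be checked against the Bioconductor Git server, and a branch name
alone does not pin a commit, so a real origin would still need the commit
hash.

```scheme
;; Turn a release such as "3.14" into the branch name "RELEASE_3_14".
;; Hypothetical helper, named here for illustration only.
(define (release->branch release)
  (string-append "RELEASE_"
                 (string-map (lambda (c) (if (char=? c #\.) #\_ c))
                             release)))

;; Assumed repository layout: one Git repository per package.
(define (bioconductor-git-url name)
  (string-append "https://git.bioconductor.org/packages/" name))

;; Pair the repository URL with the release branch.
(define (bioconductor-git-location name release)
  (list (bioconductor-git-url name)
        (release->branch release)))

(display (bioconductor-git-location "tximeta" "3.14"))
(newline)
```

The branch only says which Bioconductor release a commit belongs to; it
says nothing about the package version at that commit.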



What could be a plan for a bullet-proof “guix time-machine” for
Bioconductor?


Cheers,
simon


> From 87e73e02202fe5e342d68f1fb17efdd4425737cd Mon Sep 17 00:00:00 2001
> From: zimoun <zimon.toutoune@gmail.com>
> Date: Tue, 3 Mar 2020 16:53:39 +0100
> Subject: [PATCH] build-system: r: Use Bioconductor old releases to fallback.
>
> * guix/build-system/r.scm (bioconductor-uri): Extend the fallback list.
> ---
> guix/build-system/r.scm | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/guix/build-system/r.scm b/guix/build-system/r.scm
> index 2d328764b0..8638e1b888 100644
> --- a/guix/build-system/r.scm
> +++ b/guix/build-system/r.scm
> @@ -54,15 +54,18 @@ release corresponding to NAME and VERSION."
> ('annotation "/data/annotation")
> ('experiment "/data/experiment")
> (_ "/bioc"))))
> - (list (string-append "https://bioconductor.org/packages/release"
> - type-url-part
> - "/src/contrib/"
> - name "_" version ".tar.gz")
> - ;; TODO: use %bioconductor-version from (guix import cran)
> - (string-append "https://bioconductor.org/packages/3.10"
> - type-url-part
> - "/src/contrib/Archive/"
> - name "_" version ".tar.gz"))))
> + (append (list (string-append "https://bioconductor.org/packages/release"
> + type-url-part
> + "/src/contrib/"
> + name "_" version ".tar.gz"))
> + (map (lambda (release)
> + (string-append "https://bioconductor.org/packages/"
> + release
> + type-url-part
> + "/src/contrib/"
> + name "_" version ".tar.gz"))
> + (list (@@ (guix import cran) %bioconductor-version)
> + "3.9" "3.8" "3.7")))))
>
> (define %r-build-system-modules
> ;; Build-side modules imported by default.
zimoun wrote on 18 Jul 2022 18:03
(address . 39885@debbugs.gnu.org)
87lesqmmrr.fsf@gmail.com
Hi,

Since 2020, I have provided several examples of breakage for bug#39885 [1].
Here is another one:

$ guix time-machine --commit=77e2de365497bf4c8b81cbd78624f78293490485 \
-- build r-biocneighbors -S
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
The following derivation will be built:
/gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv
building /gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv...

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz...
download failed "https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz" 404 "Not Found"

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/BiocNeighbors_1.4.1.tar.gz...
download failed "https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/BiocNeighbors_1.4.1.tar.gz" 404 "Not Found"

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://ci.guix.gnu.org/file/BiocNeighbors_1.4.1.tar.gz/sha256/05vi1cij37s8wgj92k3l6a3f3dwldj8jvijdp4695zczka6kypdf...
download failed "https://ci.guix.gnu.org/file/BiocNeighbors_1.4.1.tar.gz/sha256/05vi1cij37s8wgj92k3l6a3f3dwldj8jvijdp4695zczka6kypdf" 404 "Not Found"

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://tarballs.nixos.org/sha256/05vi1cij37s8wgj92k3l6a3f3dwldj8jvijdp4695zczka6kypdf...
download failed "https://tarballs.nixos.org/sha256/05vi1cij37s8wgj92k3l6a3f3dwldj8jvijdp4695zczka6kypdf" 404 "Not Found"

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:ae5d3f8d9a9ffd920cb94dc62d916c94b7e18632744c91e4e3489f21230b7117/raw/...
download failed "https://archive.softwareheritage.org/api/1/content/sha256:ae5d3f8d9a9ffd920cb94dc62d916c94b7e18632744c91e4e3489f21230b7117/raw/" 404 "Not Found"

Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
From https://web.archive.org/web/20220718175152/https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz...
download failed "https://web.archive.org/web/20220718175152/https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz" 404 "NOT FOUND"
Trying to use Disarchive to assemble /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz...
could not find its Disarchive specification
failed to download "/gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz" from ("https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz" "https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/BiocNeighbors_1.4.1.tar.gz")
builder for `/gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv' failed to produce output path `/gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz'
build of /gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv failed
View build log at '/var/log/guix/drvs/q9/ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv.gz'.
guix build: error: build of `/gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv' failed

Well, several comments:

1. Neither Berlin nor Bordeaux has it as a substitute,
2. Disarchive does not have it,
3. Neither do the other fallback sources.

but the question in the first place is: why is Bioconductor failing?
Because they do ugly things!

Our history reads:

f431d5e299 Sun Dec 15 15:38:51 2019 +0100 guix: Upgrade to Bioconductor 3.10
12e2aa96dc Sun Dec 15 15:38:55 2019 +0100 gnu: r-biocneighbors: Update to 1.4.1.
aece78fe2f Sun Mar 1 23:38:12 2020 +0100 gnu: r-biocneighbors: Update to 1.4.2.
8e518d4802 Sat Jun 13 01:19:38 2020 +0200 guix: Update to Bioconductor 3.11.

which means that Bioconductor removed v1.4.1 from their URI scheme
(I do not even know whether the tarball is still available on their
infra), despite the fact that Bioconductor v3.10 had released v1.4.1;
so a numbered release is not stable either.

At the cost of more bandwidth, we could switch from url-fetch to
git-fetch. Or we also could examine why Disarchive is failing here.



Cheers,
simon
Ricardo Wurmus wrote on 18 Jul 2022 18:21
(name . zimoun)(address . zimon.toutoune@gmail.com)
87bktmtmol.fsf@elephly.net
zimoun <zimon.toutoune@gmail.com> writes:

> At the cost of more bandwidth, we could switch from url-fetch to
> git-fetch.

Let’s do it! I’m tired of Bioconductor archive shenanigans messing with
package availability.

--
Ricardo
Ricardo Wurmus wrote on 10 Aug 2022 20:25
(name . zimoun)(address . zimon.toutoune@gmail.com)
878rnwuemq.fsf@elephly.net
Ricardo Wurmus <rekado@elephly.net> writes:

> zimoun <zimon.toutoune@gmail.com> writes:
>
>> At the cost of more bandwidth, we could switch from url-fetch to
>> git-fetch.
>
> Let’s do it! I’m tired of Bioconductor archive shenanigans messing with
> package availability.

I have finally taken the time to review this and implement a first draft
of a change to the bioconductor importer and updater.

There are some limitations:

- we cannot use the updater to go from “url-fetch” to “git-fetch”.
That’s because “package-update” in (guix upstream) decides whether to
use package-update/url-fetch or package-update/git-fetch based on the
*current* package value’s origin fetch procedure. For the switch we
can hack around this (adding an exception for bioconductor packages),
but there is no pretty way to do this in a generic fashion that could
be committed.

Perhaps we could operate on the url included in the <upstream-source>
instead of looking at the *current* package value. We’re only
accessing “package” once in the url-fetch case, so maybe we can work
around this problem.

- the repositories at https://git.bioconductor.org/package/NAME do not
tag package versions. The only method of organization is branches
that are named after *Bioconductor releases* (not package releases),
e.g. RELEASE_3_15. We can only determine the package version by
reading its DESCRIPTION file or by looking up the version index for
all Bioconductor packages (we do that already). This means that there
could be different commits for the same package version in the same
release branch — so we have to include the commit hash and a revision
counter in the version string.

- the updater doesn’t work on version expressions like (git-version
"1.12" revision commit). It expects to be able to replace literal
strings. Because of that my changes let the importer generate a
string literal such as "1.12-0.cafebab" without a let-bound commit
string.

- “experiment” or “data” packages are not kept in Git. They only exist
as volatile tarballs that will be overwritten. Thankfully, they don’t
change all that often, so they have a good chance of making it into
our archives.

- the above exception means that we need to litter the importer and
updater code with extra checks.
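
The literal version string mentioned in the list above can be sketched as
follows. This is only an illustration of the string shape
"1.12-0.cafebab": the procedure names are hypothetical, the commit hash is
made up, and the 7-character abbreviation is an assumption taken from
that example.

```scheme
;; Build VERSION-REVISION.SHORT-COMMIT as a plain string literal,
;; e.g. "1.12-0.cafebab".  Hypothetical helper, not the importer code.
(define (literal-git-version version revision commit)
  (string-append version "-" (number->string revision) "."
                 (string-take commit (min 7 (string-length commit)))))

;; Same package version but a new commit on the release branch:
;; bump the revision counter so the two strings stay distinct.
(define (next-literal-git-version version revision new-commit)
  (literal-git-version version (+ revision 1) new-commit))

;; Made-up commit hashes, for illustration only.
(define %first-commit  "cafebabe0123456789abcdef0123456789abcdef")
(define %second-commit "0123456789abcdef0123456789abcdefcafebabe")

(display (literal-git-version "1.12" 0 %first-commit))
(newline)
(display (next-literal-git-version "1.12" 0 %second-commit))
(newline)
```

Keeping the result a plain string is what lets an updater that replaces
literal strings keep working; the revision counter is what orders two
commits that carry the same package version.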

With all these notes out of the way I’ll prepare a series of patches
next.

--
Ricardo
Maxime Devos wrote on 10 Aug 2022 21:44
4041b7dd-df1d-318c-0ca9-efe296203ea9@telenet.be
On 10-08-2022 20:25, Ricardo Wurmus wrote:
> - the updater doesn’t work on version expressions like (git-version
> "1.12" revision commit). It expects to be able to replace literal
> strings. Because of that my changes let the importer generate a
> string literal such as "1.12-0.cafebab" without a let-bound commit
> string.
I've a patch that implements replacing (revision "N") by (revision
"N+1"), apparently it's not applied yet but let me search for it ...
Maxime Devos wrote on 10 Aug 2022 21:48
0d6b6bae-4ab3-aad5-f03d-f6e369620267@telenet.be
On 10-08-2022 21:44, Maxime Devos wrote:
>
> On 10-08-2022 20:25, Ricardo Wurmus wrote:
>> - the updater doesn’t work on version expressions like (git-version
>>    "1.12" revision commit).  It expects to be able to replace literal
>>    strings.  Because of that my changes let the importer generate a
>>    string literal such as "1.12-0.cafebab" without a let-bound commit
>>    string.
> I've a patch that implements replacing (revision "N") by (revision
> "N+1"), apparently it's not applied yet but let me search for it ...
Found it:
That patch series was written with Minetest / ContentDB and a new
'latest-git' updater in mind, but the ContentDB and latest-git bits
should be separable without much trouble.
Greetings,
Maxime.
zimoun wrote on 9 Sep 2022 19:23
(name . Ricardo Wurmus)(address . rekado@elephly.net)
865yhwpim7.fsf@gmail.com
Hi Ricardo,

I am late. This message landed when I was traveling for holidays. :-)

On Wed, 10 Aug 2022 at 20:25, Ricardo Wurmus <rekado@elephly.net> wrote:

> - we cannot use the updater to go from “url-fetch” to “git-fetch”.
> That’s because “package-update” in (guix upstream) decides whether to
> use package-update/url-fetch or package-update/git-fetch based on the
> *current* package value’s origin fetch procedure. For the switch we
> can hack around this (adding an exception for bioconductor packages),
> but there is no pretty way to do this in a generic fashion that could
> be committed.

It appears to me acceptable to have an exception. Or even to do it just
once as a big replacement of Bioconductor packages.

> - the repositories at https://git.bioconductor.org/package/NAME do not
> tag package versions. The only method of organization is branches
> that are named after *Bioconductor releases* (not package releases),
> e.g. RELEASE_3_15. We can only determine the package version by
> reading its DESCRIPTION file or by looking up the version index for
> all Bioconductor packages (we do that already). This means that there
> could be different commits for the same package version in the same
> release branch — so we have to include the commit hash and a revision
> counter in the version string.

This is the most annoying part. Indeed, when I check out some
Bioconductor Git repositories, I am always confused by their Git
structure.

From my understanding, the tarball you fetch from bioconductor.org has
the same content as the commit tagged “Bioconductor release”
(RELEASE_X_Y). The content of the upstream release can mismatch the
content of the Bioconductor tarball release.

I do not know how complicated or inaccurate it would be to take the
package version from the Bioconductor index and assign this version to
the commit tagged RELEASE_X_Y. This commit would appear in the Guix
package definition though. Or maybe we could transparently use
RELEASE_X_Y to determine this commit.


> - the updater doesn’t work on version expressions like (git-version
> "1.12" revision commit). It expects to be able to replace literal
> strings. Because of that my changes let the importer generate a
> string literal such as "1.12-0.cafebab" without a let-bound commit
> string.

Maxime pointed patch#53144 [1] but I have not looked at it yet.




> - “experiment” or “data” packages are not kept in Git. They only exist
> as volatile tarballs that will be overwritten. Thankfully, they don’t
> change all that often, so they have a good chance of making it into
> our archives.

That’s an interesting question for Disarchive and Software Heritage.


Cheers,
simon