--fallback and (fallback #t) do not apply when receiving a corrupt nar

Status: Open
Participants: Richard Sent
Owner: unassigned
Submitted by: Richard Sent
Severity: normal

Richard Sent wrote on 24 May 05:28 +0200
(address . bug-guix@gnu.org)
87fru7vqfy.fsf@freakingpenguin.com
Hi Guix!

As part of building linux-libre-arm64-generic, Guix tries substituting
linux-libre-arm64-generic.tar.xz. Unfortunately it looks like a corrupt
nar snuck into bordeaux.

root@lifeline ~# guix build linux-libre-arm64-generic --system=aarch64-linux --fallback
The following derivations will be built:
/gnu/store/9626zaczwl5x4ypxmmdklvkclqx2dlpi-linux-libre-arm64-generic-6.8.10.drv
/gnu/store/1wi10rg7236ck8k5vdrdfap5l7a9s9z0-linux-libre-6.8.10-guix.tar.xz.drv
199.7 MB will be downloaded:
# Snip
/gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz
substituting /gnu/store/0jfsx4hljddyand45z7i77ynpvr0mhb5-module-import-compiled...
downloading from https://bordeaux.guix.gnu.org/nar/lzip/0jfsx4hljddyand45z7i77ynpvr0mhb5-module-import-compiled ...
module-import-compiled 171KiB 676KiB/s 00:00 ???????????????????? 100.0%

substituting /gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz...
downloading from https://bordeaux.guix.gnu.org/nar/none/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz ...
linux-libre-6.8.10-guix.tar.xz 136.5MiB 20.5MiB/s 00:06 ???????????????????? 97.6%guix substitute: error: corrupt input while restoring '/gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz' from #<input: string 7f1190966ee0>
substitution of /gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz failed
guix build: error: corrupt input while restoring archive from #<closed: file 7fea23dffaf0>
root@lifeline ~#

Trying to pass --fallback on the command line has no effect, even though
both the documentation and [1] imply that should work.
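
To make the expectation concrete, here is a toy shell model of the documented --fallback behaviour (when substituting a pre-built binary fails, fall back to building locally). It is only a sketch and not guix-daemon code: build_item and its arguments are hypothetical names invented for illustration, and the two commands stand in for "fetch the substitute" and "build the derivation".

```shell
#!/bin/sh
# Toy model of the documented --fallback behaviour. This is NOT guix-daemon
# code; build_item, substitute_cmd and build_cmd are hypothetical stand-ins.
#
# usage: build_item FALLBACK SUBSTITUTE_CMD BUILD_CMD
#   FALLBACK is "yes" or "no"; the other two are commands to run.
build_item() {
    fallback=$1
    substitute_cmd=$2
    build_cmd=$3

    # First try to fetch the pre-built item.
    if $substitute_cmd; then
        echo "substituted"
        return 0
    fi

    # Substitution failed (for example, because of a corrupt nar).
    if [ "$fallback" = yes ]; then
        echo "substitution failed, building locally"
        $build_cmd
        return $?
    fi

    echo "substitution failed, giving up"
    return 1
}
```

In this model, the session above corresponds to the "giving up" branch being taken even though --fallback was passed, whereas the documentation suggests the "building locally" branch.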

Similarly with Cuirass, I have a spec that tries to build an
operating-system with a linux-libre-arm64-generic kernel. Even though
cuirass-configuration has (fallback? #t), no attempt is made to recover
by building the derivation locally.

;; In config
(define %rsent-cuirass-service
  (service cuirass-service-type
           (cuirass-configuration
            (specifications %rsent-cuirass-specs)
            (host "0.0.0.0")
            (fallback? #t))))

;; From /var/log/cuirass.log
2024-05-23 22:23:36 Uncaught exception in task:
2024-05-23 22:23:36 In fibers.scm:
2024-05-23 22:23:36 172:8 8 (_)
2024-05-23 22:23:36 In ice-9/boot-9.scm:
2024-05-23 22:23:36 1752:10 7 (with-exception-handler _ _ #:unwind? _ # _)
2024-05-23 22:23:36 In guix/store.scm:
2024-05-23 22:23:36 684:37 6 (thunk)
2024-05-23 22:23:36 In cuirass/base.scm:
2024-05-23 22:23:36 421:14 5 (_ _)
2024-05-23 22:23:36 267:10 4 (spawn-builds #<store-connection 256.100 7ff1213670f0> _ ?)
2024-05-23 22:23:36 In ice-9/boot-9.scm:
2024-05-23 22:23:36 1752:10 3 (with-exception-handler _ _ #:unwind? _ # _)
2024-05-23 22:23:36 1685:16 2 (raise-exception _ #:continuable? _)
2024-05-23 22:23:36 1683:16 1 (raise-exception _ #:continuable? _)
2024-05-23 22:23:36 1685:16 0 (raise-exception _ #:continuable? _)
2024-05-23 22:23:36 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2024-05-23 22:23:36 ERROR:
2024-05-23 22:23:36 1. &nar-error:
2024-05-23 22:23:36 file: #f
2024-05-23 22:23:36 port: #<closed: file 7ff1213615b0>

I'm no rocket scientist, but that error looks very similar to the error
found when building from the CLI. Given that cuirass-configuration has
(fallback? #t), it should be recoverable.

Possibly related: [2] and [3]

[3]: Guix 3f59fd6d114548480c719d4b8f8509bdf3e8dcca

--
Take it easy,
Richard Sent
Making my computer weirder one commit at a time.
Richard Sent wrote on 26 May 04:06 +0200
(address . 71160@debbugs.gnu.org)
87h6els4wt.fsf@freakingpenguin.com
Richard Sent <richard@freakingpenguin.com> writes:

> Trying to pass --fallback on the command line has no effect, even though
> both the documentation and [1] imply that should work.

This issue might have some spiciness to it. I have two machines with
identical guix commits and --fallback works on one but not the other.

# Failing machine:
root@lifeline ~# guix describe
Generation 2 May 25 2024 21:03:46 (current)
guix 94c8cec
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 94c8cec99969fe9f65777637fde1f05e1c576a3f

# Good machine:
Generation 3 May 25 2024 21:58:15 (current)
guix 94c8cec
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 94c8cec99969fe9f65777637fde1f05e1c576a3f

On the failing machine, I get an error like this:

root@lifeline ~# guix build linux-libre-arm64-generic --system=aarch64-linux --fallback
# snip
downloading from https://bordeaux.guix.gnu.org/nar/none/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz ...
linux-libre-6.8.10-guix.tar.xz 136.5MiB 19.0MiB/s 00:07 ???????????????????? 99.9%guix substitute: error: corrupt input while restoring '/gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz' from #<input: string 7f558e7233f0>
substitution of /gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz failed
guix build: error: corrupt input while restoring archive from #<closed: file 7f51c615b380>
root@lifeline ~# logout

whereas on the good machine:

root@gibraltar ~# guix build linux-libre-arm64-generic --system=aarch64-linux --fallback
# snip
downloading from https://bordeaux.guix.gnu.org/nar/none/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz ...
linux-libre-6.8.10-guix.tar.xz 136.5MiB 19.6MiB/s 00:07 ???????????????????? 98.8%guix substitute: error: corrupt input while restoring '/gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz' from #<input: string 7f7b2a223b60>
substitution of /gnu/store/y813phs2n9xnb7zbcr07g0j9509bzbsb-linux-libre-6.8.10-guix.tar.xz failed
building /gnu/store/ny56fdcig9cd9bd3pssmlraz2c1q10q8-linux-libre-6.8.10-guix.tar.xz.drv...

I thought perhaps there was some hyper-odd race condition going on here
(lifeline is consistently at a higher percentage than gibraltar when the
error is detected), but I just had an outlier that seems to disprove
that: lifeline hit the same error with the progress bar at 97.2%.
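
For anyone who wants to compare failure points across repeated runs, the percentage can be pulled out of saved output with a small filter. This is a rough sketch under two assumptions: the build output was captured to a file or pipe (for example with 2>&1), and the progress percentage and the "guix substitute: error" message end up on the same line, as they do in the logs pasted above. The name failure_percent is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the progress percentage shown right before a
# "corrupt input" error in saved `guix build` output.
# Reads the files given as arguments, or standard input if there are none.
failure_percent() {
    # Keep only the lines carrying the substitute error, then extract the
    # last "NN.N" that directly precedes "%guix substitute: error".
    grep 'guix substitute: error: corrupt input' "$@" \
        | sed -n 's/.* \([0-9][0-9.]*\)%guix substitute: error.*/\1/p'
}
```

Running failure_percent on a file holding the output of several attempts prints one number per failed substitution, which makes it easier to see whether one machine really fails later in the download than the other.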

--
Take it easy,
Richard Sent
Making my computer weirder one commit at a time.