Test of guix-1.1.0-4.bdc801e is failing on aarch64

  • Open
Details
One participant
  • Stefan
Owner
unassigned
Submitted by
Stefan
Severity
normal
Stefan wrote on 2 Jun 2020 01:14
(address . bug-guix@gnu.org)
E46D3F13-D3E8-473E-AFA7-689A33053BBA@vodafonemail.de
Hi!

Building guix-1.1.0-4.bdc801e has been failing on aarch64 on ci.guix.gnu.org for about two weeks.


The same test failure occurs every time. See, for example, the raw log from http://ci.guix.gnu.org/build/2794788/details:

test-name: network-interface-names
actual-error:
+ (wrong-type-arg
+ "list-tail"
+ "Wrong type argument in position ~A (expecting ~A): ~S"
+ (1 "pair" ())
+ (()))
result: FAIL


When building locally, this particular test passes (after about three days on an SBC with 1 GB RAM).

Unfortunately, building locally gives me two different failures instead:

FAIL: tests/store.scm
FAIL: tests/guix-package.sh

Here are some more details:


test-name: verify-store + check-contents
location: /tmp/guix-build-guix-1.1.0-4.bdc801e.drv-0/source/tests/store.scm:972
actual-error:
+ (%exception
+ #<&store-protocol-error message: "path `dtmp/guix-tests/store/bpl866rgsjyzdczlngfw8ss2lkld6bim-mirrors' is not in the store" status: 1>)
result: FAIL

Note the “dtmp/” instead of “/tmp/”. I already saw the same error some weeks ago, but a substitute became available shortly afterwards. In my case the error seems to be reproducible.
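The daemon's message is consistent with a plain prefix check against the store directory. The following is only a minimal sketch of such a check, not the daemon's actual code: the helper name in_store is hypothetical, and the store directory and item name are copied from the log above. It shows why a path whose first character is no longer “/” would be reported as not being in the store:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if PATH_TO_CHECK lies below STORE_DIR.
# This mimics the kind of prefix test the error message hints at; it is
# not taken from the Guix daemon.
in_store() {
  store_dir=$1
  path_to_check=$2
  case "$path_to_check" in
    "$store_dir"/?*) return 0 ;;   # store directory, a slash, then a name
    *) return 1 ;;
  esac
}

store=/tmp/guix-tests/store                     # test store, as in the log
item=bpl866rgsjyzdczlngfw8ss2lkld6bim-mirrors   # item name from the log

for candidate in "/tmp/guix-tests/store/$item" "dtmp/guix-tests/store/$item"; do
  if in_store "$store" "$candidate"; then
    echo "in store:     $candidate"
  else
    echo "not in store: $candidate"
  fi
done
```

The sketch says nothing about where the leading “/” gets replaced by “d”; it only illustrates why the resulting string is rejected.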


guix install: error: profile t-guix-package-541/profile is locked by another process
+ true
+ kill 1387
+ rm -f t-profile-541 t-profile-541.lock t-profile-541-1-link t-guix-package-file-541
+ rm -rf t-guix-package-541 t-home-541
rm: cannot remove 't-guix-package-541': Directory not empty
FAIL tests/guix-package.sh (exit status: 1)


This may be due to a deleted but still open file on an NFS share. In such a case an intermediate hidden .nfs… file may get created (“delete on last close”, “silly rename”). However, RFC 5661 for NFS demands that, even if OPEN4_RESULT_PRESERVE_UNLINKED is supported, the directory entry of an open file must not be removed in this case, which prevents the directory from being removed.
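A small probe along these lines can show whether a given file system behaves this way. It is a sketch under assumptions: the directory name is arbitrary, and TMPDIR has to point at the share under test for the result to say anything about NFS. On a local file system the directory is expected to be empty and removable even while the file is still open; on NFS a leftover .nfs… entry would be expected to block rmdir until the descriptor is closed.

```shell
#!/bin/sh
# Create a scratch directory on the file system under test.
dir=$(mktemp -d "${TMPDIR:-/tmp}/silly-rename-XXXXXX")
: > "$dir/held"

exec 3< "$dir/held"   # keep the file open on descriptor 3
rm -f "$dir/held"     # unlink it while it is still open

# Anything listed here was left behind by the unlink (e.g. a .nfs… file).
leftovers=$(ls -A "$dir")

# Try to remove the directory while the descriptor is still open.
if rmdir "$dir" 2>/dev/null; then
  while_open=removed
else
  while_open=blocked
fi

exec 3<&-             # close the descriptor ("delete on last close")

# If the first attempt was blocked, the directory should be removable now.
rmdir "$dir" 2>/dev/null || true
if [ -d "$dir" ]; then
  after_close=present
else
  after_close=removed
fi

echo "leftovers while open: ${leftovers:-none}"
echo "rmdir while open:     $while_open"
echo "directory at the end: $after_close"
```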
Stefan wrote on 23 Nov 2020 09:35
Re: bug#41654: Acknowledgement (Test of guix-1.1.0-4.bdc801e is failing on aarch64)
(address . 41654@debbugs.gnu.org)
0DF2D52E-2544-45E3-BB0B-F2B441EBED03@vodafonemail.de
Hi!

The two errors when testing guix are still reproducible, now with guix-1.2.0rc2-1.0d4b1af; see the log for details.

I’m really puzzled by the “dtmp/” instead of “/tmp/” bug. Might this be a bug in Guile itself?


Bye

Stefan
Stefan wrote on 23 Nov 2020 09:38
[PATCH] tests: Ignoring incomplete clean-up on NFS share.
(address . 41654@debbugs.gnu.org)
AC42C906-A019-4DF9-964B-973657E1FAFA@vodafonemail.de
* tests/guix-package.sh: The 'rm -rf' for clean-up inside the trap may not
succeed on an NFS share, but this should not fail the test.
---
tests/guix-package.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/guix-package.sh b/tests/guix-package.sh
index 3e5fa71d20..b1c6eeffb8 100644
--- a/tests/guix-package.sh
+++ b/tests/guix-package.sh
@@ -33,7 +33,7 @@ profile="t-profile-$$"
tmpfile="t-guix-package-file-$$"
rm -f "$profile" "$tmpfile"
-trap 'rm -f "$profile" "$profile.lock" "$profile-"[0-9]* "$tmpfile"; rm -rf "$module_dir" t-home-'"$$" EXIT
+trap 'rm -f "$profile" "$profile.lock" "$profile-"[0-9]* "$tmpfile"; rm -rf "$module_dir" t-home-'"$$"' || echo "incomplete clean-up ignored"' EXIT
# Use `-e' with a non-package expression.
! guix package --bootstrap -e +
--
2.29.2
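The effect of the `|| echo` guard can be sketched in isolation. This is a minimal illustration, assuming the test script runs with `-e` (errexit); it does not reproduce the EXIT trap itself, and `rmdir` on a non-empty directory merely stands in for the `rm -rf` that fails on the NFS share. The directory names are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for the failing clean-up: a non-empty directory
# that `rmdir` refuses to remove.
workdir=$(mktemp -d)
mkdir "$workdir/busy"
: > "$workdir/busy/leftover"

# Unguarded: under `-e` the failing command aborts the child shell,
# which then exits with a non-zero status.
unguarded=0
sh -ec 'rmdir "$1" 2>/dev/null; echo "not reached"' sh "$workdir/busy" \
  || unguarded=$?

# Guarded: the `|| echo` fallback makes the command list succeed,
# so the child shell exits with status 0.
guarded=0
sh -ec 'rmdir "$1" 2>/dev/null || echo "incomplete clean-up ignored"' sh "$workdir/busy" \
  || guarded=$?

echo "unguarded exit status: $unguarded"
echo "guarded exit status:   $guarded"

rm -rf "$workdir"
```

The child shells are started with `sh -ec` rather than as subshells because errexit is ignored inside a subshell that sits on the left-hand side of `||`.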
Stefan wrote on 23 Nov 2020 10:20
test-name: verify-store + check-contents failing on guix-1.2.0rc2-1.0d4b1af
(address . 41654@debbugs.gnu.org)
E447536D-56DB-4B72-BBDE-0E78A835B6F5@vodafonemail.de
Hi!

The “dtmp/” issue is not new. It has already appeared in earlier bug reports, including 27032, 27034 and 29676.

In all cases it happens in tests/store.scm:

test-name: verify-store + check-contents

This bug has occurred on x86_64 and aarch64.
Issues 27032 and 27034 were reported when the installer used a unionfs. The bug was probably hidden by the switch to an overlayfs.

This issue (41654) and issue 29676 show the “dtmp/” error when running this test on an NFS share.


Bye

Stefan
Stefan wrote on 30 Dec 2020 14:50
708455FD-2A6C-4E6C-8F2C-D2E5549F5886@vodafonemail.de
Hi Ludo’!

Could you please take a look at the bug report https://issues.guix.gnu.org/41654?

There are two reproducible problems when building the guix-1.2.0 package.

There is already a patch for the first of the two problems (an incomplete clean-up) at https://issues.guix.gnu.org/41654#2.

For the other issue with a wrong path "dtmp/guix-tests/store/", there is a summary at https://issues.guix.gnu.org/41654#3. This problem has existed for years and seems to depend on the underlying file system. It seems to appear with a tmp-store on a unionfs-fuse or NFS. In the past it got hidden by switching to overlayfs or by changes on the NFS server side.

As the CI for aarch64 is always lagging behind, these issues are effectively blockers for me. Today they block the build of guix-1.2.0-8.7624ebb.


Bye

Stefan
Stefan wrote on 4 Jan 2021 02:05
B8A36638-AF42-4F6A-BF33-FDED72854552@vodafonemail.de
Hi!

Today I thought it should be possible to work around this bug on my aarch64 system by mounting /tmp as a tmpfs, backed by swap on a small SD card, to rule out any issues caused by my NFS root. This worked for the trap clean-up issue. To my surprise, however, it did not help with the “dtmp/” error.

I wondered whether using /tmp as a tmpfs on my virtual x86_64 system would make the test fail there as well, so I ran:

sudo mount -t tmpfs -o size=768M tmpfs /tmp
guix build --check --no-grafts guix@1.2.0-8.7624ebb

But there it passed.
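Before comparing the two systems it can help to confirm which file system actually backs /tmp. The following is a sketch only: the helper name fstype_of is hypothetical, the default table /proc/mounts is Linux-specific, and the sample table is made up to resemble an NFS root with a tmpfs mounted on /tmp.

```shell
#!/bin/sh
# Print the file system type of the last mount table entry for MOUNTPOINT.
# Later entries shadow earlier ones, so the last match is the visible mount.
fstype_of() {
  mountpoint=$1
  table=${2:-/proc/mounts}   # Linux-specific default
  awk -v mp="$mountpoint" \
    '$2 == mp { t = $3 } END { if (t != "") print t }' "$table"
}

# Hypothetical mount table: NFS root with a tmpfs on /tmp.
sample=$(mktemp)
cat > "$sample" <<'EOF'
server:/export / nfs4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,size=786432k 0 0
EOF

echo "/tmp is on: $(fstype_of /tmp "$sample")"
echo "/ is on:    $(fstype_of / "$sample")"

rm -f "$sample"
```

On a real system, calling `fstype_of /tmp` without a second argument reads the live mount table instead of the sample.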

This is the interesting part from the log of the aarch64 system:

test-name: verify-store + check-contents
location: /tmp/guix-build-guix-1.2.0-8.7624ebb.drv-0/source/tests/store.scm:1156
source:
+ (test-assert
+ "verify-store + check-contents"
+ (with-store
+ s
+ (let* ((text (random-text))
+ (drv (build-expression->derivation
+ s
+ "corrupt"
+ `(let ((out (assoc-ref %outputs "out")))
+ (call-with-output-file
+ out
+ (lambda (port) (display ,text port)))
+ #t)
+ #:guile-for-build
+ (package-derivation
+ s
+ %bootstrap-guile
+ (%current-system))))
+ (file (derivation->output-path drv)))
+ (with-derivation-substitute
+ drv
+ text
+ (and (build-derivations s (list drv))
+ (verify-store s #:check-contents? #t)
+ (begin
+ (chmod file 420)
+ (call-with-output-file
+ file
+ (lambda (port) (display "corrupt!" port)))
+ #t)
+ (not (verify-store s #:check-contents? #t))
+ (delete-paths s (list file)))))))
actual-value: #f
actual-error:
+ (%exception
+ #<&store-protocol-error message: "path `dtmp/guix-tests/store/rn6xq5kaipp8aanhcnz2hvyfmr3y2laa-mirrors' is not in the store" status: 1>)
result: FAIL


Bye

Stefan