ludo@gnu.org (Ludovic Courtès) writes:
> I’ve become convinced that this is due to parallelism: several
> guix-daemon processes run at the same time. In this case, I bet this
> process tries to remove an item from the ValidPaths table while another
> is trying to add it in the Refs table or something.
>
> In dc57d527 I added #:parallel-tests? #f for ‘guix-devel’. Eventually
> we should fix the makefile to run this test alone, as is done for
> ‘guix-gc.sh’.
In the 2 years and 7 months since we disabled parallel tests in commit dc57d527aee4eb18ec5fb345f90d6637bbd1a4d2 to work around this bug, we may have allowed other parallelism bugs to quietly creep in. Today, I observed a parallel test failure that seems unrelated to the original bug reported here. And anecdotally, I feel that the tests frequently fail spuriously when I run them in parallel. Until we get to the bottom of this, I agree that the best thing to do is to always run the tests in serial.
For completeness, below I'll report the failure I observed today.
On my x86_64-linux GuixSD machine, using Guix version 0ec430f79530ee343c175347952f91a78adca5ec (this is what my ~/.config/guix/latest points to), I entered a Guix development environment via "guix environment guix". In Guix's Git repository, I checked out commit 4dd91dff477b9717b3fa494b23976e4d69ab7dfc (the current tip of core-updates) and ran the following commands:

  ./bootstrap && ./configure --localstatedir=/var && make -j \
    && make -j check
The following tests failed:
  FAIL: tests/guix-hash.sh
  FAIL: tests/guix-download.sh
  FAIL: tests/guix-build.sh
  FAIL: tests/guix-package.sh
  FAIL: tests/guix-system.sh

When I immediately ran "make recheck" without making any changes, the same 5 tests passed. Note that this ran the tests in serial because I omitted -j. When I ran the same 5 tests again in parallel using the following command, they all passed:

  make -j check TESTS="tests/guix-hash.sh tests/guix-download.sh \
    tests/guix-build.sh tests/guix-package.sh tests/guix-system.sh"
I also tried running just tests/guix-hash.sh and tests/guix-download.sh together 10 times in serial and then 10 times in parallel. Unfortunately, this didn't reproduce the failure, either (i.e., all 20 test runs passed).
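A repetition like the one described above could be scripted roughly as follows. This is only a sketch, not the exact commands that were run: the helper name repeat_cmd is hypothetical, and it simply counts how many of N runs of a given command exit non-zero.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guix): run a command N times and
# print the number of runs that exited with a non-zero status.
repeat_cmd () {
    runs=$1; shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard the command's output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

From the top of the build tree, it would be invoked along these lines, once without -j and once with it:

  repeat_cmd 10 make check TESTS="tests/guix-hash.sh tests/guix-download.sh"
  repeat_cmd 10 make -j check TESTS="tests/guix-hash.sh tests/guix-download.sh"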
All in all, this seems to suggest that the failures I observed might be caused by a parallelism bug when running the entire test suite.

Regarding the cause of failure, the 5 tests all failed with a message like the following:
ERROR: In procedure canonicalize-path:
In procedure canonicalize-path: No such file or directory
+ guix download --version
Backtrace:
In ice-9/boot-9.scm:
 2875:24 19 (_)
  222:17 18 (map1 (((guix utils)) ((guix config)) ((guix #)) ((…)) …))
 2788:17 17 (resolve-interface (guix utils) #:select _ #:hide _ # _ …)
 2714:10 16 (_ (guix utils) _ _ #:ensure _)
 2982:16 15 (try-module-autoload _ _)
  2312:4 14 (save-module-excursion #<procedure 1397630 at ice-9/boo…>)
 3002:22 13 (_)
In unknown file:
         12 (primitive-load-path "guix/utils" #<procedure 130d260 a…>)
In guix/utils.scm:
    26:0 11 (_)
In ice-9/boot-9.scm:
  2862:4 10 (define-module* _ #:filename _ #:pure _ #:version _ # _ …)
 2875:24  9 (_)
  222:17  8 (map1 (((guix config)) ((srfi srfi-1)) ((srfi #)) (#) …))
 2788:17  7 (resolve-interface (guix config) #:select _ #:hide _ # _ …)
 2714:10  6 (_ (guix config) _ _ #:ensure _)
 2982:16  5 (try-module-autoload _ _)
  2312:4  4 (save-module-excursion #<procedure 13975d0 at ice-9/boo…>)
 3002:22  3 (_)
In unknown file:
          2 (primitive-load-path "guix/config" #<procedure 130d1a0 …>)
In guix/config.scm:
    86:6  1 (_)
In unknown file:
          0 (canonicalize-path "/home/marusich/guix/test-tmp/db")
All the test failures looked the same, except that instead of "guix download --version", the equivalent command (e.g., "guix system --version") was invoked.

I realize this information doesn't help solve the original bug reported here. However, it's a real failure, so I hope it'll be useful. In any case, it shows that there are probably multiple parallelism bugs lurking in our code now. We're going to have to solve all those parallelism bugs before we can reliably run the tests in parallel again.
-- Chris