Gnulib’s ‘test-lock’ fails to complete on machines with >= 32 cores
(address . bug-guix@gnu.org)(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)
With current ‘master’, Gnulib’s ‘test-lock’ (a test found in many GNU
packages that use Gnulib, such as gettext-0.19.8.1) fails to complete in
1 hour almost systematically on bayfront, which is a 32-core x86_64
machine. The backtrace is like this:
Toggle snippet (133 lines)
(gdb) thread apply all bt
Thread 21 (LWP 5636):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 20 (LWP 5635):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 19 (LWP 5634):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 18 (LWP 5633):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 17 (LWP 5632):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 16 (LWP 5631):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 15 (LWP 5630):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 14 (LWP 5629):
#0 0x00007ffff77a93d6 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 13 (LWP 5628):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 12 (LWP 5627):
#0 0x00007ffff77a9602 in pthread_rwlock_wrlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x000000000040172f in rwlock_mutator_thread (arg=<optimized out>) at test-lock.c:348
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 11 (LWP 5626):
#0 0x00007ffff74cf777 in sched_yield () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
#1 0x0000000000401810 in gl_thread_yield () at test-lock.c:104
#2 rwlock_checker_thread (arg=<optimized out>) at test-lock.c:381
#3 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#4 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 10 (LWP 5625):
#0 0x00007ffff74cf777 in sched_yield () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
#1 0x0000000000401810 in gl_thread_yield () at test-lock.c:104
#2 rwlock_checker_thread (arg=<optimized out>) at test-lock.c:381
#3 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#4 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 9 (LWP 5624):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 8 (LWP 5623):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 7 (LWP 5622):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 6 (LWP 5621):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 5 (LWP 5620):
#0 0x00007ffff74cf777 in sched_yield () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
#1 0x0000000000401810 in gl_thread_yield () at test-lock.c:104
#2 rwlock_checker_thread (arg=<optimized out>) at test-lock.c:381
#3 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#4 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 4 (LWP 5619):
#0 0x00007ffff74cf777 in sched_yield () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
#1 0x0000000000401810 in gl_thread_yield () at test-lock.c:104
#2 rwlock_checker_thread (arg=<optimized out>) at test-lock.c:381
#3 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#4 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 3 (LWP 5618):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 2 (LWP 5617):
#0 0x00007ffff77a9dd7 in pthread_rwlock_unlock () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000401807 in rwlock_checker_thread (arg=<optimized out>) at test-lock.c:378
#2 0x00007ffff77a4454 in start_thread () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#3 0x00007ffff74e67bf in clone () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libc.so.6
Thread 1 (LWP 5605):
#0 0x00007ffff77a568d in pthread_join () from target:/gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25/lib/libpthread.so.0
#1 0x0000000000400e29 in gl_thread_join (retvalp=0x0, thread=<optimized out>) at test-lock.c:99
#2 test_rwlock () at test-lock.c:408
#3 main () at test-lock.c:682
Mathieu Othacehe reported the same issue on a 32-core machine.
This may be fixed by Gnulib commit
480d374e596a0ee3fed168ab42cd84c313ad3c89 (not present in
gettext-0.19.8.1), which introduces this:
--- a/tests/test-lock.c
+++ b/tests/test-lock.c
@@ -50,6 +50,13 @@
Uncomment this to see if the operating system has a fair scheduler. */
#define EXPLICIT_YIELD 1
+/* Whether to use 'volatile' on some variables that communicate information
+ between threads. If set to 0, a lock is used to protect these variables.
+ If set to 1, 'volatile' is used; this is theoretically equivalent but can
+ lead to much slower execution (e.g. 30x slower total run time on a 40-core
+ machine. */
+#define USE_VOLATILE 0
30x slower could exceed the default max-silent-timeout, which is 1 hour.
Ludo’.