tbb: test_global_control failure

  • Done
Details
3 participants
  • Christopher Howard
  • Greg Hogan
  • Ludovic Courtès
Owner
unassigned
Submitted by
Christopher Howard
Severity
normal
Christopher Howard wrote on 3 Oct 2020 16:06
(address . bug-guix@gnu.org)
366cee92f612159fe9ab9d141ff31b6bb96f7f92.camel@qlfiles.net
Attempt to build tbb, a dependency of Octave I think, fails with:

"""
./test_task_scheduler_observer_v3.exe 1:4
done
Call stack info (12):
./test_global_control.exe(_Z16print_call_stackv+0x37)[0x407937]
./test_global_control.exe(_Z11ReportErrorPKciS0_S0_+0x1b)[0x407a3b]
./test_global_control.exe(_ZN3tbb10interface98internal9start_forINS_13blocked_rangeImEENS_8internal17parallel_for_bodyIN7Harness21ExactConcurrencyLevelEmEEKNS_18simple_partitionerEE7executeEv+0x327)[0x411067]
./libtbb.so.2(+0x2b7c2)[0x7ffff7fb17c2]
./libtbb.so.2(+0x2bb25)[0x7ffff7fb1b25]
./libtbb.so.2(+0x29810)[0x7ffff7faf810]
./test_global_control.exe(_ZN7Harness21ExactConcurrencyLevel5checkEmNS0_4ModeE+0x35f)[0x40cbcf]
./test_global_control.exe(_Z23TestParallelismRestoredv+0xd8)[0x409c28]
./test_global_control.exe(_Z8TestMainv+0x2a)[0x40afea]
./test_global_control.exe(main+0xe)[0x40751e]
/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6(__libc_start_main+0xed)[0x7ffff7ad9a6d]
./test_global_control.exe(_start+0x2a)[0x40776a]
../../src/test/harness_concurrency_tracker.h:123, assertion !myCrashOnFail: Timeout was detected.
make[1]: *** [../../build/Makefile.test:274: test_tbb_plain] Aborted
make[1]: Leaving directory '/tmp/guix-build-tbb-2020.3.drv-0/tbb-2020.3-checkout/build/guix_release'
make: *** [Makefile:42: test] Error 2

Test suite failed, dumping logs.
command "make" "test" "-j" "3" "LDFLAGS=-Wl,-rpath=/gnu/store/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3/lib" failed with status 2
"""

I attempted to build this with the -M 1 option, but the build still dies
with the same error.

My system:

christopher@nightshade ~$ guix describe
Generation 35 Oct 01 2020 20:02:29 (current)
guix 23dc21f
branch: master
commit: 23dc21f05b54ef63daaea9eb301cfddbc4c82ddb

christopher@nightshade ~$ neofetch --stdout
christopher@nightshade
----------------------
OS: Guix System 23dc21f05b54ef63daaea9eb301cfddbc4c82ddb x86_64
Host: GA-880GM-UD2H
Kernel: 5.7.15-gnu
Uptime: 5 days, 14 hours, 45 mins
Packages: 103 (guix-system), 84 (guix-user)
Shell: bash 5.0.16
Resolution: 1920x1200
DE: GNOME 3.34.2
Theme: Adwaita [GTK2/3]
Icons: Adwaita [GTK2/3]
Terminal: .gnome-terminal
CPU: AMD Athlon II X3 455 (3) @ 3.300GHz
GPU: NVIDIA GeForce 8400 GS Rev. 3
Memory: 1232MiB / 7960MiB

--
Christopher Howard
p: +1 (907) 374-0257
gpg: ADDEAADE5D607C8D (keys.gnupg.net)
Ludovic Courtès wrote on 5 Oct 2020 16:09
(name . Christopher Howard)(address . christopher.howard@qlfiles.net)
87pn5wwyqk.fsf@gnu.org
Hi,

Christopher Howard <christopher.howard@qlfiles.net> skribis:

> Attempt to build tbb, a dependency of Octave I think, fails with:
>
> """
> ./test_task_scheduler_observer_v3.exe 1:4
> done
> Call stack info (12):
> ./test_global_control.exe(_Z16print_call_stackv+0x37)[0x407937]
> ./test_global_control.exe(_Z11ReportErrorPKciS0_S0_+0x1b)[0x407a3b]
> ./test_global_control.exe(_ZN3tbb10interface98internal9start_forINS_13b
> locked_rangeImEENS_8internal17parallel_for_bodyIN7Harness21ExactConcurr
> encyLevelEmEEKNS_18simple_partitionerEE7executeEv+0x327)[0x411067]
> ./libtbb.so.2(+0x2b7c2)[0x7ffff7fb17c2]
> ./libtbb.so.2(+0x2bb25)[0x7ffff7fb1b25]
> ./libtbb.so.2(+0x29810)[0x7ffff7faf810]
> ./test_global_control.exe(_ZN7Harness21ExactConcurrencyLevel5checkEmNS0
> _4ModeE+0x35f)[0x40cbcf]
> ./test_global_control.exe(_Z23TestParallelismRestoredv+0xd8)[0x409c28]
> ./test_global_control.exe(_Z8TestMainv+0x2a)[0x40afea]
> ./test_global_control.exe(main+0xe)[0x40751e]
> /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-
> 2.31/lib/libc.so.6(__libc_start_main+0xed)[0x7ffff7ad9a6d]
> ./test_global_control.exe(_start+0x2a)[0x40776a]
> ../../src/test/harness_concurrency_tracker.h:123, assertion
> !myCrashOnFail: Timeout was detected.
> make[1]: *** [../../build/Makefile.test:274: test_tbb_plain] Aborted
> make[1]: Leaving directory '/tmp/guix-build-tbb-2020.3.drv-0/tbb-
> 2020.3-checkout/build/guix_release'
> make: *** [Makefile:42: test] Error 2
>
> Test suite failed, dumping logs.
> command "make" "test" "-j" "3" "LDFLAGS=-Wl,-
> rpath=/gnu/store/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3/lib"
> failed with status 2
> """

It looks like a regression introduced by the upgrade in commit
e9cbf43ae07f1b4c4c047e072c9aa021b64eace8. It worked on my machine but
also fails on ci.guix.gnu.org.

Greg, could you take a look? If you’re unavailable at the moment,
should we revert the upgrade in the meantime?

Thanks,
Ludo’.
Greg Hogan wrote on 5 Oct 2020 17:36
(name . Ludovic Courtès)(address . ludo@gnu.org)
0B0D193F-886F-40AB-B181-07431015E539@greghogan.com
I don’t see any build failures on ci.guix.gnu.org <http://ci.guix.gnu.org/>, only two successes. Where is x86_64, should it not at least show as pending?

Builds matching tbb-2020.3

ID       Specification  Completion time    Job                     Name        System       Log
3253827  guix-master    2 Oct 10:51 +0200  tbb-2020.3.armhf-linux  tbb-2020.3  armhf-linux
3253435  guix-master    2 Oct 02:17 +0200  tbb-2020.3.i686-linux   tbb-2020.3  i686-linux

Greg


> On Oct 5, 2020, at 10:09 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>
> Hi,
>
> Christopher Howard <christopher.howard@qlfiles.net> skribis:
>
>> Attempt to build tbb, a dependency of Octave I think, fails with:
>>
>> """
>> ./test_task_scheduler_observer_v3.exe 1:4
>> done
>> Call stack info (12):
>> ./test_global_control.exe(_Z16print_call_stackv+0x37)[0x407937]
>> ./test_global_control.exe(_Z11ReportErrorPKciS0_S0_+0x1b)[0x407a3b]
>> ./test_global_control.exe(_ZN3tbb10interface98internal9start_forINS_13b
>> locked_rangeImEENS_8internal17parallel_for_bodyIN7Harness21ExactConcurr
>> encyLevelEmEEKNS_18simple_partitionerEE7executeEv+0x327)[0x411067]
>> ./libtbb.so.2(+0x2b7c2)[0x7ffff7fb17c2]
>> ./libtbb.so.2(+0x2bb25)[0x7ffff7fb1b25]
>> ./libtbb.so.2(+0x29810)[0x7ffff7faf810]
>> ./test_global_control.exe(_ZN7Harness21ExactConcurrencyLevel5checkEmNS0
>> _4ModeE+0x35f)[0x40cbcf]
>> ./test_global_control.exe(_Z23TestParallelismRestoredv+0xd8)[0x409c28]
>> ./test_global_control.exe(_Z8TestMainv+0x2a)[0x40afea]
>> ./test_global_control.exe(main+0xe)[0x40751e]
>> /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-
>> 2.31/lib/libc.so.6(__libc_start_main+0xed)[0x7ffff7ad9a6d]
>> ./test_global_control.exe(_start+0x2a)[0x40776a]
>> ../../src/test/harness_concurrency_tracker.h:123, assertion
>> !myCrashOnFail: Timeout was detected.
>> make[1]: *** [../../build/Makefile.test:274: test_tbb_plain] Aborted
>> make[1]: Leaving directory '/tmp/guix-build-tbb-2020.3.drv-0/tbb-
>> 2020.3-checkout/build/guix_release'
>> make: *** [Makefile:42: test] Error 2
>>
>> Test suite failed, dumping logs.
>> command "make" "test" "-j" "3" "LDFLAGS=-Wl,-
>> rpath=/gnu/store/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3/lib"
>> failed with status 2
>> """
>
> It looks like a regression introduced by the upgrade in commit
> e9cbf43ae07f1b4c4c047e072c9aa021b64eace8. It worked on my machine but
> also fails on ci.guix.gnu.org.
>
> Greg, could you take a look? If you’re unavailable at the moment,
> should we revert the upgrade in the meantime?
>
> Thanks,
> Ludo’.
Ludovic Courtès wrote on 7 Oct 2020 23:30
(name . Greg Hogan)(address . code@greghogan.com)
875z7lvi40.fsf@gnu.org
Hi,

Greg Hogan <code@greghogan.com> skribis:

> I don’t see any build failures on ci.guix.gnu.org http://ci.guix.gnu.org/, only two successes. Where is x86_64, should it not at least show as pending?
>
> Builds matching tbb-2020.3
> ID Specification Completion time Job Name System Log
> 3253827 guix-master 2 Oct 10:51 +0200 tbb-2020.3.armhf-linux tbb-2020.3 armhf-linux
> 3253435 guix-master 2 Oct 02:17 +0200 tbb-2020.3.i686-linux tbb-2020.3 i686-linux

Here’s the build log of a failure on x86_64:

https://ci.guix.gnu.org/log/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3

(You can build such URLs by appending the basename of the store file
name to “https://ci.guix.gnu.org/log/”.)

It ends like this:

./test_reader_writer_lock.exe
done
./test_tbb_condition_variable.exe
done
./test_intrusive_list.exe
done
./test_concurrent_priority_queue.exe
done
./test_task_priority.exe
Known issue: priority effect is limited in case of blocking-style nesting
make[1]: *** [../../build/Makefile.test:274: test_tbb_plain] Segmentation fault
make[1]: Leaving directory '/tmp/guix-build-tbb-2020.3.drv-0/tbb-2020.3-checkout/build/guix_release'
make: *** [Makefile:42: test] Error 2

Ludo’.
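
The URL rule described above (append the basename of the store file name to the log prefix) can be sketched in a few lines of Python. This is only an illustration: the helper name `build_log_url` is hypothetical and not part of Guix or its CI, and the store file name is the one that appears in the rpath of the failing `make` command earlier in this thread.

```python
import posixpath

# Log prefix quoted in the message above.
LOG_BASE_URL = "https://ci.guix.gnu.org/log/"


def build_log_url(store_file_name: str) -> str:
    """Return the CI log URL for a store file name.

    Hypothetical helper: it only applies the rule stated in the thread,
    i.e. prefix + basename of the store file name.
    """
    # Drop a trailing slash, then keep only the last path component.
    basename = posixpath.basename(store_file_name.rstrip("/"))
    if not basename:
        raise ValueError("empty store file name")
    return LOG_BASE_URL + basename


print(build_log_url("/gnu/store/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3"))
# prints https://ci.guix.gnu.org/log/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3
```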
Greg Hogan wrote on 9 Oct 2020 13:55
(name . Ludovic Courtès)(address . ludo@gnu.org)
CA+3U0ZmQ9X9=oa95uEGZyS+Zbmx8ghfKa84c57y4tcPFoEkYKg@mail.gmail.com
I am also successfully building tbb-2020.3 and octave-5.2.0 locally. Is it
possible to retry the build on ci.guix.gnu.org?

I'm still puzzled why the 64-bit builds of tbb-2020.3 are missing from
ci.guix.gnu.org.

And previous versions of tbb have had mixed success building.

But the logs for the old failed builds are missing.

On Wed, Oct 7, 2020 at 5:31 PM Ludovic Courtès <ludo@gnu.org> wrote:

> Hi,
>
> Greg Hogan <code@greghogan.com> skribis:
>
> > I don’t see any build failures on ci.guix.gnu.org <
> http://ci.guix.gnu.org/>, only two successes. Where is x86_64, should it
> not at least show as pending?
> >
> > Builds matching tbb-2020.3
> > ID Specification Completion time Job Name System Log
> > 3253827 guix-master 2 Oct 10:51 +0200
> tbb-2020.3.armhf-linux tbb-2020.3 armhf-linux
> > 3253435 guix-master 2 Oct 02:17 +0200
> tbb-2020.3.i686-linux tbb-2020.3 i686-linux
>
> Here’s the build log of a failure on x86_64:
>
> https://ci.guix.gnu.org/log/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3
>
> (You can build such URLs by appending the basename of the store file
> name to “https://ci.guix.gnu.org/log/”.)
>
> It ends like this:
>
> --8<---------------cut here---------------start------------->8---
> ./test_reader_writer_lock.exe
> done
> ./test_tbb_condition_variable.exe
> done
> ./test_intrusive_list.exe
> done
> ./test_concurrent_priority_queue.exe
> done
> ./test_task_priority.exe
> Known issue: priority effect is limited in case of blocking-style nesting
> make[1]: *** [../../build/Makefile.test:274: test_tbb_plain] Segmentation
> fault
> make[1]: Leaving directory
> '/tmp/guix-build-tbb-2020.3.drv-0/tbb-2020.3-checkout/build/guix_release'
> make: *** [Makefile:42: test] Error 2
> --8<---------------cut here---------------end--------------->8---
>
> Ludo’.
>
Ludovic Courtès wrote on 9 Oct 2020 23:47
(name . Greg Hogan)(address . code@greghogan.com)
87k0vzhy0i.fsf@gnu.org
Hi Greg,

Greg Hogan <code@greghogan.com> skribis:

> I am also successfully building tbb-2020.3 and octave-5.2.0 locally. Is it
> possible to retry the build on ci.guix.gnu.org?

I retried and it succeeded this time:

https://ci.guix.gnu.org/log/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3

Could it be non-determinism when running tests in parallel?

Thanks,
Ludo’.
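
To make the non-determinism hypothesis concrete, here is a small Python model of a check that only passes when an exact number of workers are running at the same time before a deadline. It is not TBB's harness code and does not reproduce the actual failure. The function name and the `available` parameter are hypothetical: `available` stands in for however many workers the machine manages to schedule in time, which on a loaded build host can differ from run to run.

```python
import threading


def exact_concurrency_reached(expected: int, available: int, timeout: float) -> bool:
    """Return True if `expected` workers all meet at a barrier within `timeout` seconds.

    `available` is the number of workers that actually start (an assumption
    standing in for scheduling conditions on the build machine).
    """
    barrier = threading.Barrier(expected)
    results = []
    lock = threading.Lock()

    def worker():
        try:
            # Blocks until `expected` workers are waiting, or the timeout expires.
            barrier.wait(timeout=timeout)
            ok = True
        except threading.BrokenBarrierError:
            # Timeout: the barrier is broken for every waiting worker.
            ok = False
        with lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(available)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(results) >= expected and all(results)


# Enough workers show up in time: the check passes.
print(exact_concurrency_reached(expected=3, available=3, timeout=5.0))
# One worker never shows up: the check times out and fails.
print(exact_concurrency_reached(expected=3, available=2, timeout=0.2))
```

In this model the same check passes or fails depending only on how many workers arrive before the deadline, which is the kind of behaviour that would explain a test succeeding on a retry.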
Greg Hogan wrote on 7 Mar 17:33 +0100
(name . Ludovic Courtès)(address . ludo@gnu.org)
CA+3U0Zn9F4XH91Lbj8JU00Cwebqyza7ZobQDeJwS+EWUbt4wkw@mail.gmail.com
On Fri, Oct 9, 2020 at 5:48 PM Ludovic Courtès <ludo@gnu.org> wrote:
>
> Hi Greg,
>
> Greg Hogan <code@greghogan.com> skribis:
>
> > I am also successfully building tbb-2020.3 and octave-5.2.0 locally. Is it
> > possible to retry the build on ci.guix.gnu.org?
>
> I retried and it succeeded this time:
>
> https://ci.guix.gnu.org/log/qc926v75q54k94mwgz6gn4s02sjgrr03-tbb-2020.3
>
> Could it be non-determinism when running tests in parallel?
>
> Thanks,
> Ludo’.

Closing after 3+ years and no recent build failures:
Closed