References: <87tv4kdgyy.fsf@inria.fr>
 <874kux385m.fsf@gnu.org>
User-agent: mu4e 1.2.0; emacs 26.3
From: Pierre Langlois
To: bug-guile@gnu.org
Subject: Re: bug#39266: Finalization thread hits wrong-type-arg on weak vector (AArch64)
In-reply-to: <874kux385m.fsf@gnu.org>
Date: Mon, 09 Mar 2020 22:19:49 +0000
Message-ID: <87wo7tdvcq.fsf@gmx.com>
Cc: 39266@debbugs.gnu.org

Hi Ludo,

Ludovic Courtès writes:

> Ludovic Courtès writes:
>
>> While building the “guix-system.drv” derivation on AArch64, I got this
>> crash (not fully deterministic but quite frequent).  Here the
>> finalization thread gets a wrong-type-arg in ‘scm_i_weak_car’ (i.e.,
>> accessing a one-element weak vector):
>
> With 3.0.1, I can reproduce the bug on x86_64.  With rr (thanks, Andy!),
> I found this (starting from the point where the type cell of the weak
> vector is zeroed, and reverse-continuing until it gets its original
> value of 0x10f):
>
> --8<---------------cut here---------------start------------->8---
> (rr) frame 40
> #40 0x00007ffff7f2e66d in scm_i_weak_car (pair=0x7fffe15af690) at ../libguile/pairs.h:190
> 190       return SCM_CAR (x);
> (rr) down
> #39 0x00007ffff7f2f576 in scm_c_weak_vector_ref (wv=, k=k@entry=0) at weak-vector.c:193
> 193       SCM_VALIDATE_WEAK_VECTOR (1, wv);
> (rr)
> #38 0x00007ffff7ea7ba0 in scm_wrong_type_arg_msg (
>     subr=subr@entry=0x7ffff7f56f00 "weak-vector-ref", pos=pos@entry=1,
>     bad_value=0x7fffec472b90, szMessage=szMessage@entry=0x7ffff7f56e80 "weak vector") at error.c:282
> 282       scm_error (scm_arg_type_key,
> (rr) p *((void**)0x7fffec472b90)
> $1 = (void *) 0x0
> (rr) watch *((void**)0x7fffec472b90)
> Hardware watchpoint 1: *((void**)0x7fffec472b90)
> (rr) reverse-cont
> Continuing.
>
> Thread 1 received signal SIGCONT, Continued.
> [Switching to Thread 27074.27074]
> __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:101
> 101     ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
> (rr)
> Continuing.
>
> Thread 1 hit Hardware watchpoint 1: *((void**)0x7fffec472b90)
>
> Old value = (void *) 0x0
> New value = (void *) 0x10f
> __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> 259     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
> (rr) bt
> #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
> #1  0x00007ffff7f1d499 in set_vtable_access_fields (vtable=vtable@entry=0x7fffeb48ee80) at struct.c:143
> #2  0x00007ffff7f1dd8d in scm_i_struct_inherit_vtable_magic (vtable=vtable@entry=0x7ffff4e32fa0,
>     obj=obj@entry=0x7fffeb48ee80) at struct.c:215
> #3  0x00007ffff7f1dfea in scm_c_make_structv (vtable=0x7ffff4e32fa0, n_tail=, n_init=8,
>     init=0x7fffffff50d0) at struct.c:364
> #4  0x00007ffff7f1e0b9 in scm_make_struct_no_tail (vtable=0x7ffff4e32fa0, init=0x304) at struct.c:491
> --8<---------------cut here---------------end--------------->8---
>
> Bingo!  There’s a mismatch in struct.c:
>
> --8<---------------cut here---------------start------------->8---
>   bitmask_size = (nfields + 31U) / 32U;
>   unboxed_fields = scm_gc_malloc_pointerless (bitmask_size, "unboxed fields");
>   memset (unboxed_fields, 0, bitmask_size * sizeof(*unboxed_fields));
> --8<---------------cut here---------------end--------------->8---

Oh wow, scary! That was some nice debugging; these types of bugs can be
really hard to get to the bottom of.

> Pushed a fix as 7c17655cd3d859bf0c5a86d9782a7788205fc05a.
>
> Thanks, rr!  You made my day!  :-)
>
> Now testing Guix builds on x86_64, i686, ARMv7, and AArch64 to see if
> that addresses seemingly related issues.

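Reading the quoted struct.c snippet again: bitmask_size counts 32-bit
words, but it is passed as a byte count to the allocator, while the memset
clears bitmask_size * sizeof (*unboxed_fields) bytes. So the memset can run
past the end of the buffer and zero whatever object sits next to it, which
would match the watchpoint hit above. Below is a minimal sketch of keeping
the two sizes in agreement. It assumes the bitmask is an array of uint32_t
and uses plain malloc instead of scm_gc_malloc_pointerless; the names
bitmask_words and alloc_unboxed_bitmask are made up for illustration, and
this is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 32-bit words needed to hold one bit per field. */
size_t bitmask_words(size_t nfields)
{
    return (nfields + 31U) / 32U;
}

/* Hypothetical helper (not libguile API): allocate a zeroed bitmask.
   The byte count is computed once and used for both the allocation
   and the memset, so the two can never disagree.  The caller frees
   the result with free().  */
uint32_t *alloc_unboxed_bitmask(size_t nfields, size_t *nbytes_out)
{
    assert(nfields > 0);

    size_t nwords = bitmask_words(nfields);
    size_t nbytes = nwords * sizeof(uint32_t);

    uint32_t *mask = malloc(nbytes);
    if (mask == NULL)
        return NULL;

    memset(mask, 0, nbytes);

    if (nbytes_out != NULL)
        *nbytes_out = nbytes;
    return mask;
}
```

For example, with 8 fields the quoted code would request (8 + 31) / 32 = 1
byte but clear 1 * 4 = 4 bytes (assuming a 4-byte element type), whereas
the sketch allocates and clears 4 bytes. That is only my reading of the
snippet; the real change is the commit mentioned above.
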
I've tested the fix on AArch64 and it's looking good: I'm finally running
Guile 3! I tested by running 'guix pull --branch=wip-guile-3.0.1' on a
rockpro64 running the Guix System, then reconfigured and rebooted, and
it's all good.

Thanks so much for the fix! Hopefully it works on every platform and that
can be the end of it :-).

Pierre