From: Ludovic Courtès
To: bug-guix@gnu.org
Subject: cuirass-remote-worker crash
X-Debbugs-Cc: Mathieu Othacehe
Date: Tue, 22 Nov 2022 23:14:05 +0100
Message-ID: <87ilj6hc2a.fsf@inria.fr>

Hi,

In /var/log/cuirass-remote-worker.log on overdrive1.guix, I found this:

--8<---------------cut here---------------start------------->8---
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     619:8  7 (_ #(#(#)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #)
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #)
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24     619:8  7 (_ #(#(#)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     634:9  3 (for-each # ?)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24     634:9  3 (for-each # ?)
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
--8<---------------cut here---------------end--------------->8---

(Stuttering is due to the unprotected use of ‘primitive-fork’: a
non-local exit in the child leads it to execute the same code as its
parent.  We should fix that, but should we really fork in the first
place?  :-))

This comes from here:

--8<---------------cut here---------------start------------->8---
(define (read-server-info socket)
  (request-info socket)
  (match (zmq-get-msg-parts-bytevector socket '())   ;<-- here
    ((empty info)
     (match (zmq-read-message (bv->string info))
       (('server-info
         ('worker-address worker-address)
         ('log-port log-port)
         ('publish-port publish-port))
        (list worker-address log-port publish-port))))))
--8<---------------cut here---------------end--------------->8---

This is the version being used:

--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ cat /proc/24019/cmdline |xargs -0
/gnu/store/zpir9n73amaxrwz2k7x46l73v21vxk6s-guile-3.0.8/bin/guile --no-auto-compile -e main -s /gnu/store/rlqdzmfyamjpn6lz07yqk2hsabv3l7g5-cuirass-1.1.0-11.9f08035/bin/.cuirass-real remote-worker --workers=2 --server=10.0.0.1:5555 --systems=armhf-linux,aarch64-linux --publish-port=5558 --substitute-urls=http://10.0.0.1
ludo@overdrive1 ~$ guix system describe
Generation 36	Sep 27 2022 09:06:48	(current)
  file name: /var/guix/profiles/system-36-link
  canonical file name: /gnu/store/m04qw6f0lfd0wpn1skiys4b56wqfc3b8-system
  label: GNU with Linux-Libre 5.19.11
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/09r4wbbabskmbrnwmshpdk7vh6g87gam-linux-libre-5.19.11/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: f15a141cf35bd4188767f0e91c0654991d4c49e0
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

The sequence leading to this seems to be:

--8<---------------cut here---------------start------------->8---
22340 eventfd2(0, EFD_CLOEXEC
[…]
22340 <... eventfd2 resumed>) = 15
[…]
22340 ppoll([{fd=15, events=POLLIN}], 1, NULL, NULL, 0
[…]
22340 <... ppoll resumed>) = 1 ([{fd=15, revents=POLLIN}])
22343 epoll_pwait(8,
22340 read(15, "\1\0\0\0\0\0\0\0", 8) = 8
22340 ppoll([{fd=15, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
22340 write(2, "Backtrace:\n", 11) = 11
--8<---------------cut here---------------end--------------->8---

Does that ring a bell?  Perhaps that was fixed in the meantime?

Right now it cannot be restarted: it always fails at start up with the
error above.  10.0.0.1 is reachable though so I’m not sure what’s up.

Ludo’.
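
PS: Regarding the ‘primitive-fork’ remark above, here is a minimal
sketch of one way to keep a forked child from falling back into its
parent's code after a non-local exit.  ‘fork-and-run’ is a hypothetical
name for illustration, not a procedure in Cuirass, and the exit codes
are arbitrary.  The idea is to wrap the child's body in ‘dynamic-wind’
so that any unwinding out of it ends in ‘primitive-exit’:

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, not taken from Cuirass.
(define (fork-and-run thunk)
  "Fork and call THUNK in the child.  The child never returns to the
caller: it exits with 0 when THUNK returns, and with 127 when THUNK
leaves through a non-local exit.  Return the child's PID in the parent."
  (match (primitive-fork)
    (0
     (dynamic-wind
       (const #t)
       (lambda ()
         (thunk)
         ;; Normal return: leave right away, without unwinding.
         (primitive-exit 0))
       (lambda ()
         ;; Reached only if THUNK unwound past this point, for instance
         ;; because an exception was caught by a handler further up the
         ;; stack, one that belongs to the parent's logic.
         (primitive-exit 127))))
    (pid pid)))
```

With something along these lines, an exception in the child would show
up once in the log and the child would exit, instead of carrying on
with the parent's loop.  Whether forking is the right approach at all
remains an open question.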
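
PPS: As for the ‘match-error’ in ‘read-server-info’, the reply that
came back was a single empty frame, (#vu8()), whereas the pattern
expects two frames.  Below is a rough sketch of how the parsing could
return #f on an unexpected reply instead of throwing.  It is not a
proposed patch: ‘parse-server-info’ and ‘decode-sexp’ are hypothetical
names, PARTS stands for the value returned by
‘zmq-get-msg-parts-bytevector’, and ‘decode-sexp’ is only an assumed
stand-in for ‘zmq-read-message’ applied to the decoded string.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

;; Assumed stand-in for the real decoding step: read one s-expression
;; from the UTF-8 payload.
(define (decode-sexp bv)
  (call-with-input-string (utf8->string bv) read))

;; Hypothetical helper, not taken from Cuirass.
(define (parse-server-info parts decode)
  "Return (WORKER-ADDRESS LOG-PORT PUBLISH-PORT), or #f when PARTS is not
a two-frame reply carrying a 'server-info' message."
  (match parts
    ((_ (? bytevector? info))
     (match (decode info)
       (('server-info
         ('worker-address worker-address)
         ('log-port log-port)
         ('publish-port publish-port))
        (list worker-address log-port publish-port))
       ;; Well-formed frames, but not the message we asked for.
       (_ #f)))
    ;; Wrong number of frames, as in the (#vu8()) case from the log.
    (_ #f)))
```

A caller could then log the unexpected reply and retry, rather than
crash.  That would only hide the symptom, though: it does not explain
why the server answers with a lone empty frame in the first place.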