From: Ludovic Courtès
To: bug-guix@gnu.org
Subject: elogind startup race between shepherd and dbus-daemon
Date: Mon, 16 May 2022 10:26:15 +0200
Message-ID: <877d6lc28o.fsf@inria.fr>

Hello!

Currently (40a729a0e6f1d660b942241416c1e2c567616d4d), shepherd and
dbus-daemon compete to start elogind: shepherd tries to start it
eagerly, and dbus-daemon starts it on demand upon bus activation.

Sometimes dbus-daemon wins, and shepherd then tries a few times to
start it anyway, leading to the infamous:

  elogind is already running as PID 123

(elogind checks whether its PID file exists. Note that you may also see
that message when shepherd wins, because dbus-daemon tries to start it
anyway.) Eventually, shepherd considers that elogind cannot be started
and disables it.

In addition to being ridiculous, it’s harmful: the ‘xorg-server’ service
(from ‘gdm-service-type’ and ‘sddm-service-type’) depends on ‘elogind’,
so if shepherd loses the race, Xorg isn’t started (on my laptop,
shepherd never loses the race it seems, but I’ve seen it lose half of
the time on a slower machine).

The reason elogind is started by shepherd is explained in this comment:

    ;; Start elogind from the Shepherd rather than waiting
    ;; for bus activation.
    ;; This ensures that it can handle
    ;; events like lid close, etc.

This comes from 94a881178af9a9a918ce6de55641daa245c92e73, which was a
fix for . I believe the justification still holds.

So it would seem that the solution to this is to prevent dbus-daemon
from starting elogind. We can do that by changing
org.freedesktop.login1.service so that it has “Exec=true” instead of
“Exec=elogind --daemon”.

“Exec=true” is a bit crude because it doesn’t guarantee that elogind is
really started; if that isn’t good enough, we could instead wait for the
PID file or something (as of Shepherd 0.9.0, invoking ‘herd start
elogind’ potentially leads shepherd to start a second instance if the
first one is still being started, so we can’t really do that).

Depending on what we end up with, we might also revisit whether
xorg-server needs to explicitly depend on elogind.

Thoughts?

Ludo’.
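
PS: To make the “wait for the PID file” alternative a bit more concrete,
here is a rough sketch of what such a wait could look like. It is in
Python purely for illustration; it is not code from Guix, elogind, or the
Shepherd. The function name, the timeout and the polling interval are all
made up, and whatever ends up replacing “Exec=true” would have to be
packaged as a small program that dbus-daemon can run.

```python
import os
import time


def _read_pid(path):
    """Return the PID stored in PATH, or None if it is missing or unreadable."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # File not there yet, not readable, or only partially written.
        return None
    return pid if pid > 0 else None


def _is_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_pid_file(path, timeout=10.0, interval=0.1):
    """Poll PATH until it names a live process; return that PID or None.

    Hypothetical helper: TIMEOUT and INTERVAL are illustrative defaults.
    """
    deadline = time.monotonic() + timeout
    while True:
        pid = _read_pid(path)
        if pid is not None and _is_alive(pid):
            return pid
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A wrapper built on this would call wait_for_pid_file with the path of
elogind’s PID file and exit successfully only when a PID comes back. A
stale PID file left over from a previous run counts as “not started
yet”, which is the one thing a bare existence check would get wrong.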