GDM, GNOME Shell, etc. break when there are stale caches

Open
Submitted by Ricardo Wurmus.
Details
7 participants
  • Andreas Enge
  • Efraim Flashner
  • L p R n d n
  • Ludovic Courtès
  • Mark H Weaver
  • Ricardo Wurmus
  • Timothy Sample
Owner
unassigned
Severity
important
Ricardo Wurmus wrote on 4 Aug 2019 23:00
fixing GDM + GNOME Shell
(address . guix-devel@gnu.org)(address . bug-guix@gnu.org)
87tvawzlvq.fsf@elephly.net
Hi Guix,
Today I again couldn’t log into my workstation after upgrading the system. I’m using GDM + GNOME Shell.
At first GDM wouldn’t start. I knew what to do: remove /var/lib/gdm, because some state must have accumulated there.
GDM came up after a reboot, but I still couldn’t log in. Instead I was thrown back to the login screen without any error message. I looked in ~/.cache/gdm/session.log for information, but it only told me that gnome-shell was killed. Thanks.
After moving both .local/share and .cache out of the way I could log in again.
This happens whenever I upgrade the system. This makes the system rather frustrating to use. I don’t know if booting into an older system generation would result in the same problem, but my guess is that it would, because both GDM and GNOME Shell appear to be leaving some binary files behind that cause different versions to crash unceremoniously.
What can we do to make GDM and GNOME Shell more reliable?
--Ricardo
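For anyone who wants to try the same workaround, here is a minimal sketch that renames the directories instead of deleting them, so they can be restored if they turn out not to be the culprit. The helper name move_aside and the ".stale-<timestamp>" suffix are made up for illustration; they are not part of GDM, GNOME, or Guix.

```shell
#!/bin/sh
# Hypothetical helper (not from GDM, GNOME, or Guix): rename a directory
# so that a fresh one gets created on the next login.
move_aside() {
    dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    # Do nothing if the directory does not exist.
    if [ -e "$dir" ]; then
        mv -- "$dir" "$dir.stale-$stamp"
    fi
}

# Intended use, from a text console while logged out of the graphical session:
#   move_aside "$HOME/.cache"
#   move_aside "$HOME/.local/share"
```

Renaming rather than deleting also leaves the old files around for anyone who wants to find out which of them actually triggers the crash.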
Efraim Flashner wrote on 5 Aug 2019 09:17
(name . Ricardo Wurmus)(address . rekado@elephly.net)
20190805071719.GB15819@E2140
On Sun, Aug 04, 2019 at 11:00:41PM +0200, Ricardo Wurmus wrote:
> Hi Guix,
>
> Today I again couldn’t log into my workstation after upgrading the
> system. I’m using GDM + GNOME Shell.
>
> At first GDM wouldn’t start. I knew what to do: remove /var/lib/gdm,
> because some state must have accumulated there.
For this one, can we create a single-shot service that, on reconfigure or boot, removes this directory and recreates it? In fact, it seems this is basically what Debian does¹.
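In shell form, the single-shot idea could look roughly like the sketch below. It rests on several assumptions: reset_state_dir is a made-up name, a real Guix service would be written in Scheme rather than shell, and the ownership of the directory (GDM runs under its own user) is not handled here.

```shell
#!/bin/sh
# Hypothetical sketch: wipe a state directory and recreate it empty.
# Ownership is deliberately left out; a real service would have to
# restore it for the account that GDM runs as.
reset_state_dir() {
    dir=$1
    mode=${2:-0700}
    # Refuse obviously dangerous arguments.
    case $dir in
        /|"") echo "refusing to reset '$dir'" >&2; return 1 ;;
    esac
    rm -rf -- "$dir"
    mkdir -p -- "$dir"
    chmod "$mode" "$dir"
}

# Intended use (as root, before GDM starts):
#   reset_state_dir /var/lib/gdm
```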
> GDM came up after a reboot, but I still couldn’t log in. Instead I was
> thrown back to the login screen without any error message. I looked in
> ~/.cache/gdm/session.log for information, but it only told me that
> gnome-shell was killed. Thanks.
>
> After removing both .local/share and .cache out of the way I could log
> in again.
This part seems a little harder to automate. /etc/skel is only sourced when a user is created, so it's hard to make sweeping changes to help people in this case, if they even want automated help. I'm guessing making .cache/gdm(?) read-only would create other issues.
> This happens whenever I upgrade the system. This makes the system
> rather frustrating to use. I don’t know if booting into an older system
> generation would result in the same problem, but my guess is that it
> would because both GDM and GNOME Shell appear to be leaving some binary
> files behind that cause different versions to crash unceremoniously.
>
> What can we do to make GDM and GNOME Shell more reliable?
Modifying the logout scripts to remove a user's .cache directory seems extreme. Some of the other options, such as removing and recreating directories, would address other issues we've had (such as /var/cache/fontconfig).

¹ https://sources.debian.org/src/gdm3/3.30.2-3/debian/gdm3.postinst/#L76
--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Ricardo Wurmus wrote on 5 Aug 2019 16:36
(name . Efraim Flashner)(address . efraim@flashner.co.il)
87h86vznkt.fsf@elephly.net
Efraim Flashner <efraim@flashner.co.il> writes:
> On Sun, Aug 04, 2019 at 11:00:41PM +0200, Ricardo Wurmus wrote:
>> Hi Guix,
>>
>> Today I again couldn’t log into my workstation after upgrading the
>> system. I’m using GDM + GNOME Shell.
>>
>> At first GDM wouldn’t start. I knew what to do: remove /var/lib/gdm,
>> because some state must have accumulated there.
>
> For this one can we create a single-shot service that, on reconfigure or
> boot, removes this directory and recreates it? In fact, it seems this is
> basically what Debian does¹.
I suggested as much earlier, but it seems like a hack. Is this how GNOME expects this state directory to be handled? The fact that Debian does this is reassuring (or not…), but I would very much like to avoid adding even more hacks.
>> GDM came up after a reboot, but I still couldn’t log in. Instead I was
>> thrown back to the login screen without any error message. I looked in
>> ~/.cache/gdm/session.log for information, but it only told me that
>> gnome-shell was killed. Thanks.
>>
>> After removing both .local/share and .cache out of the way I could log
>> in again.
>
> This part seems a little harder to automate. /etc/skel is only sourced
> when a user is created, so it's hard to make sweeping changes to help
> people in this case, if they even want automated help. I'm guessing
> making .cache/gdm(?) read-only would create other issues.
Does anyone know why this happens at all? What are the cached data? Can we do without them?
>> What can we do to make GDM and GNOME Shell more reliable?
>
> Modify the logout scripts to remove a users' .cache file seems extreme.
> Some of the other options, such as removing and recreating directories
> would address other issues we've had (such as /var/cache/fontconfig).
In my opinion generating a global /var/cache/fontconfig should be prevented; removing it seems again like an avoidable hack.
--Ricardo
Mark H Weaver wrote on 6 Aug 2019 18:12
(name . Ricardo Wurmus)(address . rekado@elephly.net)
87k1bqgtn1.fsf@netris.org
Hi Ricardo,
Ricardo Wurmus <rekado@elephly.net> writes:
> Today I again couldn’t log into my workstation after upgrading the
> system. I’m using GDM + GNOME Shell.
>
> At first GDM wouldn’t start. I knew what to do: remove /var/lib/gdm,
> because some state must have accumulated there.
>
> GDM came up after a reboot, but I still couldn’t log in. Instead I was
> thrown back to the login screen without any error message. I looked in
> ~/.cache/gdm/session.log for information, but it only told me that
> gnome-shell was killed. Thanks.
>
> After removing both .local/share and .cache out of the way I could log
> in again.
>
> This happens whenever I upgrade the system. This makes the system
> rather frustrating to use. I don’t know if booting into an older system
> generation would result in the same problem, but my guess is that it
> would because both GDM and GNOME Shell appear to be leaving some binary
> files behind that cause different versions to crash unceremoniously.
>
> What can we do to make GDM and GNOME Shell more reliable?
It's interesting that I've never run into this problem, not even once, in all my years of running GNOME on Guix systems. Since recently reverting to mostly using GNOME under X and GDM (whereas for a while I was mostly launching GNOME manually under Wayland), I've run into some other problems, e.g. GDM suspending my system automatically, sometimes immediately after logging out, but I've *never* had to remove my caches.
I wonder if this is related to my use of Btrfs instead of Ext4. Whereas system crashes cause file system corruptions under Ext4 (usually in the form of some files being left empty after a crash), I've never seen any evidence of corruption from crashes under Btrfs.
Mark
Ricardo Wurmus wrote on 6 Aug 2019 20:08
(name . Mark H Weaver)(address . mhw@netris.org)
87sgqexj2k.fsf@elephly.net
Mark H Weaver <mhw@netris.org> writes:
> It's interesting that I've never run into this problem, not even once,
> in all my years of running GNOME on Guix systems. Since recently
> reverting to mostly using GNOME under X and GDM (whereas for a while I
> was mostly launching GNOME manually under Wayland), I've run into some
> other problems, e.g. GDM suspending my system automatically, sometimes
> immediately after logging out, but I've *never* had to remove my caches.
Interesting.
> I wonder if this is related to my use of Btrfs instead of Ext4. Whereas
> system crashes cause file system corruptions under Ext4 (usually in the
> form of some files being left empty after a crash), I've never seen any
> evidence of corruption from crashes under Btrfs.
I haven’t had a system crash on this machine. I didn’t use it for a month, upgraded, rebooted, and then had GDM + GNOME Shell problems.
-- Ricardo
Timothy Sample wrote on 8 Aug 2019 04:59
(name . Ricardo Wurmus)(address . rekado@elephly.net)
87d0hgpdjl.fsf@ngyro.com
Hello,
Ricardo Wurmus <rekado@elephly.net> writes:
> Mark H Weaver <mhw@netris.org> writes:
>
>> It's interesting that I've never run into this problem, not even once,
>> in all my years of running GNOME on Guix systems. Since recently
>> reverting to mostly using GNOME under X and GDM (whereas for a while I
>> was mostly launching GNOME manually under Wayland), I've run into some
>> other problems, e.g. GDM suspending my system automatically, sometimes
>> immediately after logging out, but I've *never* had to remove my caches.
>
> Interesting.
FWIW, I’m having the same good luck as Mark. Other than the upgrade from 3.24 to 3.28, I’ve never had this kind of trouble with GNOME and GDM. And even then, IIRC, the issue wasn’t really with the state files (deleting them just happened to serve as a temporary work-around).

-- Tim
Ludovic Courtès wrote on 12 Sep 2019 10:42
control message for bug #36924
(address . control@debbugs.gnu.org)
874l1hrjm6.fsf@gnu.org
severity 36924 important
quit
Ludovic Courtès wrote on 12 Sep 2019 10:43
(address . control@debbugs.gnu.org)
8736h1rjjz.fsf@gnu.org
retitle 36924 GDM, GNOME Shell, etc. break when there are stale caches
quit
Andreas Enge wrote on 12 Sep 2019 11:54
Mesa/GDM/XFCE
(address . 36924@debbugs.gnu.org)
20190912095430.GA1559@jurong
Hello,
now it is my turn to experience a problem in this area. I newly installed a machine with the graphical installer of Guix 1.0.1 (where I could use Xfce without problem), then I issued a "guix pull" and a "guix system reconfigure".
Now logging into XFCE poses problems, which I can reproduce as follows (after removing /var/lib/gdm, $HOME/{.config,.cache,.local} once):
- When I remove $HOME/.config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml I can log into XFCE once.
- The second time, various problems may occur: the terminal that was open before reopens but does not receive focus, so I cannot type, and the window decorations for closing it are absent; or the Xfce panel is absent.
- Then I remove $HOME/.config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml again, and can log in once more.
And so on.
The following lines in $HOME/.cache/gdm/session.log appear when there is a problem:
xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
(nm-applet:8046): nm-applet-WARNING **: 11:23:52.630: GDBus.Error:org.freedesktop.NetworkManager.AgentManager.PermissionDenied: An agent with this ID is already registered for this user.
xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
(nm-applet:8046): Gdk-CRITICAL **: 11:23:53.035: gdk_window_thaw_toplevel_updates: assertion 'window->update_and_descendants_freeze_count > 0' failed
xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
(nm-applet:8046): Gdk-CRITICAL **: 11:23:53.357: gdk_window_thaw_toplevel_updates: assertion 'window->update_and_descendants_freeze_count > 0' failed
xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
(nm-applet:8046): Gdk-CRITICAL **: 11:23:53.441: gdk_window_thaw_toplevel_updates: assertion 'window->update_and_descendants_freeze_count > 0' failed
(xfconfd:7966): xfconfd-CRITICAL **: 11:23:58.040: Name org.xfce.Xfconf lost on the message dbus, exiting.
(Thunar:8030): thunar-WARNING **: 11:23:58.041: Name »org.xfce.FileManager« auf dem Nachrichten-dbus verloren.
(tumblerd:8023): tumblerd-CRITICAL **: 11:23:58.041: Name org.freedesktop.thumbnails.Cache1 lost on the message dbus, exiting.
(Thunar:8030): thunar-WARNING **: 11:23:58.041: Name »org.freedesktop.FileManager1« auf dem Nachrichten-dbus verloren.
(tumblerd:8023): tumblerd-CRITICAL **: 11:23:58.041: Name org.freedesktop.thumbnails.Manager1 lost on the message dbus, exiting.
(tumblerd:8023): tumblerd-CRITICAL **: 11:23:58.041: Name org.freedesktop.thumbnails.Thumbnailer1 lost on the message dbus, exiting.
Sorry for the German, but you also have the translation:
"Zusicherung ... nicht erfüllt" = "assertion ... failed"
"auf dem Nachrichten-dbus verloren" = "lost on the message dbus"
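To see quickly whether a given session hit these assertions, the log can be searched for both wordings. This is only a sketch: the function name count_failed_assertions is made up, and the default path is simply the session log mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: count the log lines that report a failed assertion.
count_failed_assertions() {
    log=${1:-$HOME/.cache/gdm/session.log}
    # -c prints the number of matching lines; the pattern covers the German
    # "Zusicherung" and the English "assertion" wording seen in the log.
    grep -c -E 'Zusicherung|assertion' -- "$log"
}

# Intended use:
#   count_failed_assertions
#   count_failed_assertions /path/to/another/session.log
```

Note that grep exits with a non-zero status when the count is 0, so a script can also branch on the exit status alone.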
Andreas
Ludovic Courtès wrote on 12 Sep 2019 13:40
(name . Andreas Enge)(address . andreas@enge.fr)(address . 36924@debbugs.gnu.org)
874l1hpwti.fsf@gnu.org
Hallo!
Thanks for the report, Andreas!
Andreas Enge <andreas@enge.fr> skribis:
> xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
That’s the likely root cause to me (in which case it may be unrelated to this <https://issues.guix.gnu.org/issue/36924>, after all.)
I found these bug reports:
  https://bugs.freedesktop.org/show_bug.cgi?id=107117
  https://bugzilla.redhat.com/show_bug.cgi?id=1678334
In both cases, Xfce and Mesa’s i965 drivers are involved, as is the case on your machine. The 2nd bug report includes an xfwm4 patch, even.
I wonder if Xfce before the recent updates (so before 8549e0ca6fd68a57253471436de49b88b2d47e64) works better.
Andreas, if you feel like it, could you try:
  guix pull --commit=97ce5964fb5d52cf2151fea685e28fa23a98b264
  sudo guix system reconfigure …
?
Thanks,
Ludo’.
Andreas Enge wrote on 16 Sep 2019 11:44
(name . Ludovic Courtès)(address . ludo@gnu.org)
20190916094454.GA1265@jurong
Hello,
On Thu, Sep 12, 2019 at 01:40:09PM +0200, Ludovic Courtès wrote:
> Thanks for the report, Andreas!
and thanks for the time spent putting me on the good track!
> Andreas Enge <andreas@enge.fr> skribis:
> > xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293: intel_miptree_match_image: Zusicherung »image->TexObject->Target == mt->target« nicht erfüllt.
> That’s the likely root cause to me (in which case it may be unrelated to
> this <https://issues.guix.gnu.org/issue/36924>, after all.)
> I found these bug reports:
>   https://bugs.freedesktop.org/show_bug.cgi?id=107117
>   https://bugzilla.redhat.com/show_bug.cgi?id=1678334
>
> In both cases, Xfce and Mesa’s i965 drivers are involved, as is the case
> on your machine. The 2nd bug report includes an xfwm4 patch, even.
The first one also contains a patch, but it has been integrated into later Mesa releases, in particular the one we are using.
> I wonder if Xfce before the recent updates (so before
> 8549e0ca6fd68a57253471436de49b88b2d47e64) works better.
> Andreas, if you feel like it, could you try:
>   guix pull --commit=97ce5964fb5d52cf2151fea685e28fa23a98b264
>   sudo guix system reconfigure …
Indeed, the problem disappears with this commit; I can log in and out and in again with Xfce working. So I am cc-ing the author of the commits updating Xfce; maybe they have an answer!
And I will try to look at the patch in the second report you reference above.
Thanks!
Andreas
L p R n d n wrote on 16 Sep 2019 16:57
(name . Andreas Enge)(address . andreas@enge.fr)
878sqoe1bi.fsf@lprndn.info
Hello,
Andreas Enge <andreas@enge.fr> writes:
> Hello,
>
> On Thu, Sep 12, 2019 at 01:40:09PM +0200, Ludovic Courtès wrote:
>> Thanks for the report, Andreas!
>
> and thanks for the time spent putting me on the good track!
>
>> Andreas Enge <andreas@enge.fr> skribis:
>> > xfwm4: ../mesa-19.1.4/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1293:
>> intel_miptree_match_image: Zusicherung »image->TexObject->Target ==
>> mt->target« nicht erfüllt.
>> That’s the likely root cause to me (in which case it may be unrelated to
>> this <https://issues.guix.gnu.org/issue/36924>, after all.)
>> I found these bug reports:
>>   https://bugs.freedesktop.org/show_bug.cgi?id=107117
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1678334
>>
>> In both cases, Xfce and Mesa’s i965 drivers are involved, as is the case
>> on your machine. The 2nd bug report includes an xfwm4 patch, even.
>
> The first one also contains a patch, but it has been integrated into later
> mesa releases, in particular the one we are using.
>
>> I wonder if Xfce before the recent updates (so before
>> 8549e0ca6fd68a57253471436de49b88b2d47e64) works better.
>> Andreas, if you feel like it, could you try:
>>   guix pull --commit=97ce5964fb5d52cf2151fea685e28fa23a98b264
>>   sudo guix system reconfigure …
>
> Indeed, the problem disappears with this commit; I can log in and out
> and in again with xfce working. So I am cc-ing the author of the commits
> updating xfce, maybe they have an answer!
It seems some bugs have been introduced in xfwm4 between 4.12 and 4.14. (All issues previously linked are for >=4.13, which was the dev version of 4.14.) I found https://forum.xfce.org/viewtopic.php?id=13233 which seems interesting. Please let us know if it changes anything.
I don't know what would be the correct way to deal with the problem in Guix though.
> And I will try to look at the patch in the second report you reference
> above.
>
> Thanks!
>
> Andreas
Have a nice day,
L p r n d n
Andreas Enge wrote on 27 Dec 2019 08:25
Xfce not starting
(address . 36924@debbugs.gnu.org)
20191227072541.GA842@jurong
Hello,
after trying to reconfigure with commit 02b6382169192367e97a2d1bc72f8eb3ed38b0dc of December 9, I am now running into a problem where I cannot log into my Xfce session under GDM any more: according to the first tty, a session is opened and immediately closed again, and the GDM login screen reappears.
It is not enough to delete /var/lib/gdm, ~/.cache and ~/.local. Could I try anything else?
Luckily, there is "guix system rollback" to my working configuration of September, but I am not very comfortable with running such an old system...
Andreas
Ludovic Courtès wrote on 30 Dec 2019 20:00
(name . Andreas Enge)(address . andreas@enge.fr)(address . 36924@debbugs.gnu.org)
87o8vp63m0.fsf@gnu.org
Hi Andreas,
Andreas Enge <andreas@enge.fr> skribis:
> after trying to reconfigure with commit 02b6382169192367e97a2d1bc72f8eb3ed38b0dc
> of December 9, I am now running into a problem where I cannot log into my xfce
> session under gdm any more: According to the first tty, a session is opened
> and closed immediately again, and the gdm login screen reappears.
Did you try your config with the same commit in ‘guix system vm’? Does it reproduce the problem?
If it does not, that means the problem has to do with state, things like /var/lib/gdm as you mentioned.
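One possible way to run that check is sketched below. It assumes that ‘guix time-machine’ is available in the installed Guix and that ‘guix system vm’ prints the file name of the script that starts the VM; the function name run_config_in_vm and the configuration file name are placeholders.

```shell
#!/bin/sh
# Sketch only: build a VM for a system configuration at a given Guix
# commit and boot it. The VM starts without any of the host's state
# (no /var/lib/gdm, no user caches).
run_config_in_vm() {
    commit=$1
    config=$2
    # The command's standard output is taken to be the name of the script
    # that launches the VM.
    script=$(guix time-machine --commit="$commit" -- system vm "$config") || return 1
    "$script"
}

# Example (the configuration file name is hypothetical):
#   run_config_in_vm 02b6382169192367e97a2d1bc72f8eb3ed38b0dc /etc/config.scm
```

If logging in works inside the VM but not on the real machine, leftover state on the machine becomes the main suspect.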
Thanks,
Ludo’.