Guix System hangs after boot with linux-libre 5.15.17

  • Done
Details
5 participants
  • Squirrel
  • Katherine Cox-Buday
  • Leo Famulari
  • Ludovic Courtès
  • Maxim Cournoyer
Owner
unassigned
Submitted by
Leo Famulari
Severity
important
Merged with
Leo Famulari wrote on 1 Feb 2022 21:39
(address . bug-guix@gnu.org)
YfmafwdvyaZ6Q6gD@jasmine.lan
Guix System on x86_64 hangs after boot when reconfigured to use
linux-libre 5.15.17.

The system becomes unresponsive to keyboard input after boot. Sometimes
it reaches the login prompt, sometimes not. Services that take a while
to start, typically finishing after the prompt is displayed, do not
ever start.

I bisected and confirmed that the problem is introduced in Guix Git
commit aad96ed54070 "gnu: linux-libre: Update to 5.15.17."

I could not reproduce the problem with `guix system vm [...]`. Maybe
this means that the problem only occurs after reconfiguring, rather than
a fresh system, or maybe it just doesn't occur at all in a KVM VM.

I'll try with 5.15.18 and the 5.10 series. We are still unable to deploy
linux-libre 5.16 for Guix System users due to this bug:

Maxim Cournoyer wrote on 1 Feb 2022 22:11
(name . Leo Famulari)(address . leo@famulari.name)(address . 53712@debbugs.gnu.org)
87wniez400.fsf@gmail.com
Hi Leo,

Leo Famulari <leo@famulari.name> writes:

> Guix System on x86_64 hangs after boot when reconfigured to use
> linux-libre 5.15.17.
>
> The system becomes unresponsive to keyboard input after boot. Sometimes
> it reaches the login prompt, sometimes not. Services that take a while
> to start, typically finishing after the prompt is displayed, do not
> ever start.
>
> I bisected and confirmed that the problem is introduced in Guix Git
> commit aad96ed54070 "gnu: linux-libre: Update to 5.15.17."
>
> I could not reproduce the problem with `guix system vm [...]`. Maybe
> this means that the problem only occurs after reconfiguring, rather than
> a fresh system, or maybe it just doesn't occur at all in a KVM VM.
>
> I'll try with 5.15.18 and the 5.10 series. We are still unable to deploy
> linux-libre 5.16 for Guix System users due to this bug:

Not sure if that helps, but my system is running smoothly on 5.15.18 as
I write this. What are the specifics of your system?

Thanks,

Maxim
Leo Famulari wrote on 1 Feb 2022 22:44
Re: Guix System hangs after boot with linux-libre 5.15.17
(address . 53712@debbugs.gnu.org)
YfmpuOZnXn3xAmjA@jasmine.lan
On Tue, Feb 01, 2022 at 03:39:27PM -0500, Leo Famulari wrote:
> I'll try with 5.15.18 and the 5.10 series. We are still unable to deploy
> linux-libre 5.16 for Guix System users due to this bug:

Same problem with 5.10.95.
Leo Famulari wrote on 1 Feb 2022 22:58
Re: bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 53712@debbugs.gnu.org)
YfmtERSYVYOnkbmL@jasmine.lan
On Tue, Feb 01, 2022 at 04:11:27PM -0500, Maxim Cournoyer wrote:
> Not sure if that helps, but my system is running smoothly on 5.15.18 as
> I write this. What are the specifics of your system?

Interestingly, for me 5.15.18 does boot to an interactive console, but
the system fails to bring up the wifi interface, and then it fails to
halt upon command, just hanging forever. At least, it did that for the 3
times I tried using that generation.

This is a Thinkpad x200s with 3 GB RAM. Normal BIOS. I've attached the
operating-system declaration.

There are similar reports on #guix IRC from two other users:

;; This is an operating system configuration template
;; for a "bare bones" setup, with no X11 display server.

(use-modules (gnu))
(use-service-modules networking desktop dbus ssh)
(use-package-modules admin certs curl linux ntp nvi ssh rsync tmux
                     version-control wicd vim)

(operating-system
  (host-name "zamia")
  (timezone "America/New_York")
  (locale "en_US.UTF-8")

  (kernel-loadable-modules (list rtl8812au-aircrack-ng-linux-module))
  (kernel-arguments '(;; Console resolution
                      "gfxpayload=1440x900x16,1440x900"
                      ;; console cursor. stops the blinking but the colors are bad
                      "vt.cur.default=0x520032"
                      "consoleblank=120"
                      ;; ???
                      "quiet"
                      ;; Disable the PC speaker
                      "modprobe.blacklist=pcspkr,snd_pcsp"))

  ;; Assuming /dev/sdX is the target hard disk, and "my-root" is
  ;; the label of the target root file system.
  (bootloader (grub-configuration (target "/dev/sda")
                                  (terminal-outputs '(console))))
  (file-systems (cons* (file-system
                         (device (uuid "0pa2dcd8-e037-43fb-b0cc-9ec5bcc3127a"))
                         (mount-point "/")
                         (type "btrfs")
                         (options "compress-force=zstd"))
                       (file-system
                         (device (uuid "9p614cc2-af95-482a-b906-ebc958ed57b7"))
                         (mount-point "/home")
                         (type "btrfs")
                         (options "compress-force=zstd"))
                       ; This will break the boot
                       ; <https://bugs.gnu.org/35472>
                       ; (file-system
                       ;   (device "/foo/bar")
                       ;   (mount-point "/bar")
                       ;   (type "none")
                       ;   (check? #f)
                       ;   (needed-for-boot? #t)
                       ;   (flags '(bind-mount)))
                       %base-file-systems))

  ;; This is where user accounts are specified. The "root"
  ;; account is implicit, and is initially created with the
  ;; empty password.
  (users (append (list (user-account
                         (name "leo")
                         (group "users")
                         ;; Adding the account to the "wheel" group
                         ;; makes it a sudoer. Adding it to "audio"
                         ;; and "video" allows the user to play sound
                         ;; and access the webcam.
                         (supplementary-groups '("wheel" "netdev" "audio"))))
                 %base-user-accounts))

  ;; Globally-installed packages.
  (packages (append (list curl atop htop git openssh mosh nss-certs ntp
                          rsync tmux tree vim nvi)
                    %base-packages-disk-utilities
                    %base-packages))

  (services (append (list (dbus-service)
                          (service gpm-service-type)
                          (service openssh-service-type
                                   (openssh-configuration
                                     (password-authentication? #f)))
                          (service ntp-service-type)
                          (service wicd-service-type wicd)
                          (elogind-service))
                    (modify-services %base-services
                      (guix-service-type config =>
                        (guix-configuration
                          (inherit config)
                          (substitute-urls '("https://custom.example.com"))))))))
Leo Famulari wrote on 1 Feb 2022 23:09
Re: Guix System hangs after boot with linux-libre 5.15.17
(address . bug-guix@gnu.org)
YfmvsubxlWjWHc7z@jasmine.lan
On Tue, Feb 01, 2022 at 03:39:27PM -0500, Leo Famulari wrote:
> Guix System on x86_64 hangs after boot when reconfigured to use
> linux-libre 5.15.17.

More anecdata:

When I build Linux 5.15.17 (not linux-libre) using the Guix kernel
config, adapted for Debian, it works fine on Debian.

The only changes required for Debian are:

1) Switch from CONFIG_MODULE_COMPRESS_GZIP to CONFIG_MODULE_COMPRESS_XZ
2) Set CONFIG_MODPROBE_PATH=/sbin/modprobe

This Debian kernel doesn't use the config options set in ((gnu packages
linux) %default-extra-linux-options), so it's not a very useful point of
comparison.
Leo Famulari wrote on 1 Feb 2022 23:11
Re: bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(address . 53712@debbugs.gnu.org)
YfmwEEyxrhYVTi0N@jasmine.lan
On Tue, Feb 01, 2022 at 03:39:27PM -0500, Leo Famulari wrote:
> Guix System on x86_64 hangs after boot when reconfigured to use
> linux-libre 5.15.17.

More anecdata:

When I build Linux 5.15.17 (not linux-libre) using the Guix kernel
config, adapted for Debian, it works fine on Debian.

The only changes required for Debian are:

1) Switch from CONFIG_MODULE_COMPRESS_GZIP to CONFIG_MODULE_COMPRESS_XZ
2) Set CONFIG_MODPROBE_PATH=/sbin/modprobe

This Debian kernel doesn't use the config options set in ((gnu packages
linux) %default-extra-linux-options), so it's not a very useful point of
comparison.
Leo Famulari wrote on 3 Feb 2022 19:06
(address . 53712@debbugs.gnu.org)
YfwZsU5flm+OQXD+@jasmine.lan
The failure exists for me with the most recent versions of the kernel
series 5.4, 5.10, 5.15, and 5.16.

Sometimes the screen does display the end of a kernel panic, but I can
only see the last few lines.
Ludovic Courtès wrote on 5 Feb 2022 00:20
control message for bug #53712
(address . control@debbugs.gnu.org)
87r18ikymc.fsf@gnu.org
severity 53712 important
quit
Katherine Cox-Buday wrote on 7 Feb 2022 17:54
I'm also experiencing this
(address . 53712@debbugs.gnu.org)
87bkzi8vmq.fsf@gmail.com
I thought I'd chime in with another data-point:

- Lenovo ThinkPad T480s
- Normal BIOS
- Vanilla Linux kernel(s)
- kernel v5.15.16 boots fine; everything after seems to hit this bug
- Turning off bluetooth in BIOS seemed to get boot further
- Turning off wifi in BIOS allows me to boot

The spec sheet tells me it has an Intel Dual Band Wireless-AC 8265 card, so I guess it's something to do with that. I don't know whether this is an upstream kernel bug or something specific to distros.

I hope this helps narrow things down. Luckily my laptop is just used as a desktop, and I have ethernet, so I can just keep Wifi disabled.

--
Katherine
Leo Famulari wrote on 7 Feb 2022 19:14
(name . Katherine Cox-Buday)(address . cox.katherine.e@gmail.com)(address . 53712@debbugs.gnu.org)
YgFholvevfY6rdrG@jasmine.lan
On Mon, Feb 07, 2022 at 10:54:53AM -0600, Katherine Cox-Buday wrote:
> I thought I'd chime in with another data-point:
>
> - Lenovo ThinkPad T480s
> - Normal BIOS
> - Vanilla Linux kernel(s)
> - kernel v5.15.16 boots fine; everything after seems to hit this bug
> - Turning off bluetooth in BIOS seemed to get boot further
> - Turning off wifi in BIOS allows me to boot
>
> The spec sheet tells me it has an Intel Dual Band Wireless-AC 8265 card, so I guess it's something to do with that. I don't know whether this is an upstream kernel bug or something specific to distros.
>
> I hope this helps narrow things down. Luckily my laptop is just used as a desktop, and I have ethernet, so I can just keep Wifi disabled.

Thanks for the notes.

I'm using wifi supported by the drivers from Guix's
rtl8812au-aircrack-ng-linux-module with linux-libre, so it also affects
users who only use the GNU Guix channel.

I also noticed some improvement when removing the wifi dongle, although
I was still unable to properly halt or reboot.
Leo Famulari wrote on 9 Feb 2022 19:23
Re: bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(address . 53712@debbugs.gnu.org)
YgQGuaXULfi/W+OX@jasmine.lan
I'm still experiencing this problem with 5.15.21, as well as 5.16.7 and
5.4.177.
Squirrel wrote on 10 Feb 2022 21:03
bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(address . bug-guix@gnu.org)
tencent_643527A4DC5BAA14A9B802F1BC53FEBEDE0A@qq.com
It seems that this is a bug from upstream Linux, which may be patched
soon. Please see
https://lore.kernel.org/stable/164448100914.10463.9523338503936670263.kvalo@kernel.org/

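Blacklisting the iwlwifi kernel module works as a workaround for now, as
suggested by Jason Self, the maintainer of the Freesh and libeRTy apt
repositories of the linux-libre kernel. He mentioned it on the Trisquel
forum at
https://trisquel.info/en/forum/trisquel-9-linux-libre-51517-will-not-allow-login
and at https://lore.kernel.org/all/20220203161959.3edf1d6e@valencia/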

I removed the Intel AX201NGW Wi-Fi & Bluetooth card from my Acer Swift
3x laptop two days ago, so no wireless network card is installed now.
The system runs without errors, probably because the iwlwifi kernel
module is no longer loaded when there is no Wi-Fi card. Before that, the
system hung randomly with the most recent versions of the kernel series
5.4, 5.10 and 5.15.
Leo Famulari wrote on 13 Feb 2022 03:06
(name . Squirrel via Bug reports for GNU Guix)(address . bug-guix@gnu.org)(address . 53712@debbugs.gnu.org)
YghnkdV68oua7c2B@jasmine.lan
On Fri, Feb 11, 2022 at 04:03:03AM +0800, Squirrel via Bug reports for GNU Guix wrote:
> Blacklisting the iwlwifi kernel module works as a workaround for now, as is
> suggested by Jason Self, the maintainer of the Freesh and libeRTy apt
> repositories of linux-libre kernel. He mentioned it in Trisquel forum at https://trisquel.info/en/forum/trisquel-9-linux-libre-51517-will-not-allow-login
> and at https://lore.kernel.org/all/20220203161959.3edf1d6e@valencia/

Thanks for the info!

I can confirm that adding the modprobe incantation to my
operating-system declaration, reconfiguring, and rebooting does fix the
bug:

(kernel-arguments
 '("modprobe.blacklist=pcspkr,snd_pcsp,iwlwifi"))
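
For a configuration that already carries a modprobe.blacklist= entry, like the declaration attached earlier in this thread, the change amounts to appending iwlwifi to that entry. Below is a minimal sketch in plain Guile. The helper name blacklist-module and the variable %blacklist-prefix are hypothetical and not part of Guix; the sketch only manipulates a list of strings of the kind passed to kernel-arguments.

```scheme
(use-modules (srfi srfi-1))   ; for 'any'

;; Hypothetical helper, not a Guix API: the kernel command-line
;; prefix used to keep modules from being loaded.
(define %blacklist-prefix "modprobe.blacklist=")

(define (blacklist-module arguments module)
  "Return ARGUMENTS, a list of kernel argument strings, with MODULE
added to its modprobe.blacklist= entry.  Add a new entry if there is
none; leave the list unchanged if MODULE is already listed."
  (define (blacklist? arg)
    (string-prefix? %blacklist-prefix arg))
  (define (extend arg)
    ;; Split the comma-separated module names that follow the prefix.
    (let ((modules (string-split
                    (string-drop arg (string-length %blacklist-prefix))
                    #\,)))
      (if (member module modules)
          arg
          (string-append arg "," module))))
  (if (any blacklist? arguments)
      (map (lambda (arg) (if (blacklist? arg) (extend arg) arg))
           arguments)
      (append arguments
              (list (string-append %blacklist-prefix module)))))

(blacklist-module '("quiet" "modprobe.blacklist=pcspkr,snd_pcsp")
                  "iwlwifi")
;; => ("quiet" "modprobe.blacklist=pcspkr,snd_pcsp,iwlwifi")
```

The resulting list could then be given to the kernel-arguments field before reconfiguring. Machines that rely on an Intel wifi card lose that card while the module is blacklisted.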

Ludovic Courtès wrote on 16 Feb 2022 22:17
control message for bug #53712
(address . control@debbugs.gnu.org)
87iltev7e9.fsf@gnu.org
merge 53712 54010
quit
Ludovic Courtès wrote on 17 Feb 2022 10:12
Re: bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(name . Leo Famulari)(address . leo@famulari.name)
874k4xrh4e.fsf_-_@gnu.org
Hi,

Leo Famulari <leo@famulari.name> skribis:

> On Tue, Feb 15, 2022 at 03:56:27PM +0100, Ludovic Courtès wrote:
>> 2022 has left me without a working Linux-libre kernel.
>>
>> Breakage occurred sometime between:
>>
>> • 92faad0adb93b8349bfd7c67911d3d95f0505eb2
>> (Jan. 3rd; Linux-libre 5.15.12)
>>
>> • 43dd34c7777a212c99a97da7a2c237158faa9a1b
>> (Jan. 31st; Linux-libre 5.15.17)
>
> Isn't this #53712?

It is!

> I recommend trying the workaround described there, which is to blacklist
> the iwlwifi kernel module, if you are not using it but have iwlwifi
> hardware:
>
> https://issues.guix.gnu.org/53712#13

I confirm that 5.15.23-gnu works for me with:

(kernel-arguments
 '("modprobe.blacklist=usbmouse,usbkbd,iwlwifi"
   "quiet"))

Thank you!

Ludo’.
Leo Famulari wrote on 24 Feb 2022 16:08
(name . Squirrel via Bug reports for GNU Guix)(address . bug-guix@gnu.org)(address . 53712-done@debbugs.gnu.org)
Yhefdlj/hJXxEoV9@jasmine.lan
On Fri, Feb 11, 2022 at 04:03:03AM +0800, Squirrel via Bug reports for GNU Guix wrote:
> It seems that this is a bug from upstream linux, which may be patched soon.
> Please see https://lore.kernel.org/stable/164448100914.10463.9523338503936670263.kvalo@kernel.org/

Good news, this bug has been fixed in the latest group of stable Linux
releases:

5.16.11
5.15.25
5.10.102
5.4.181
4.19.231
4.14.268
4.9.303

For our default release series, 5.15, it was fixed as commit
ddd46059f7d99119b62d44c519df7a79f2e6a515:

https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.25
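
As a rough aid for checking a given kernel version against the releases listed in this message, here is a minimal sketch in plain Guile. The names %fixed-releases and fix-status are hypothetical; the version numbers are copied from the list above. It handles only plain X.Y.Z strings, so a suffix such as -gnu would need to be stripped first, and it reports 'unknown for any series that is not listed.

```scheme
(use-modules (srfi srfi-1))   ; for 'first', 'second', 'third'

;; First release of each stable series that carries the fix,
;; as (SERIES . PATCH-LEVEL) pairs taken from the list above.
(define %fixed-releases
  '(("5.16" . 11) ("5.15" . 25) ("5.10" . 102) ("5.4" . 181)
    ("4.19" . 231) ("4.14" . 268) ("4.9" . 303)))

(define (fix-status version)
  "Return 'fixed, 'not-fixed or 'unknown for VERSION, a string of the
form \"X.Y.Z\"."
  (let* ((parts (string-split version #\.))
         (entry (and (= (length parts) 3)
                     (assoc (string-append (first parts) "."
                                           (second parts))
                            %fixed-releases)))
         (patch (and entry (string->number (third parts)))))
    (cond ((not patch) 'unknown)               ; unlisted series or bad input
          ((>= patch (cdr entry)) 'fixed)
          (else 'not-fixed))))

(fix-status "5.15.23")    ; => not-fixed
(fix-status "5.10.102")   ; => fixed
(fix-status "5.17.1")     ; => unknown
```

Note that 'not-fixed only means the release predates the fix. Releases older than the regression, such as 5.15.16, which was reported in this thread as booting fine, also fall in that bucket.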

Squirrel wrote on 2 Mar 2022 16:39
bug#53712: Guix System hangs after boot with linux-libre 5.15.17
(address . bug-guix@gnu.org)
tencent_615F93BB9D049A7B0B954D9577A084F7E007@qq.com
On Thu, 24 Feb 2022 10:08:38 -0500, Leo Famulari <leo@famulari.name> wrote:
> Good news, this bug has been fixed in the latest group of stable Linux
> releases:
>
> 5.16.11
> 5.15.25
> 5.10.102
> 5.4.181
> 4.19.231
> 4.14.268
> 4.9.303
>
> For our default release series, 5.15, it was fixed as commit
> ddd46059f7d99119b62d44c519df7a79f2e6a515:
>
> https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.25
Thank you! I finally had time to reinstall the WiFi card and test the
5.10.102 kernel, and it works!