How to pass i915.enable_guc=0 in config.scm to prevent a 'wedged' GPU?

  • Done
Details
4 participants
  • Hugo Buddelmeijer
  • Maxim Cournoyer
  • Tobias Geerinckx-Rice
  • Csepp
Owner
unassigned
Submitted by
Hugo Buddelmeijer
Severity
normal
Hugo Buddelmeijer wrote on 20 Oct 2023 19:42
(address . bug-guix@gnu.org)
CA+Jv8O2G0g_022ctO4+ncVxgwShPQgV6xiQHXrh1fFP8gXBcKg@mail.gmail.com
The i915 driver will try to load the GuC firmware, at least for Iris
Xe chips. Loading the GuC firmware fails because it is non-free and
deblobbed. As a result, some software, like sway, will not work.

It is possible to manually pass the i915.enable_guc=0 kernel parameter
at boot from grub. Then everything works as intended. However, it
does not seem possible to set this parameter from config.scm.

So at the moment my system is not fully declarative, as I have to type
in a kernel parameter at boot; does anyone perhaps have advice on how
this can be done better?

Details below.

Thanks,
Hugo


guix version:

guix 27c2ebd
branch: master
commit: 27c2ebd7cebba22b7acd341d7ce402f6beb02733



### Attempt 1: out-of-the-box configuration ###

dmesg output:

[ 10.028684] i915 0000:00:02.0: [drm] *ERROR* GT0: GuC firmware /*(DEBLOBBED)*/: fetch failed -ENOENT
[ 10.028692] i915 0000:00:02.0: [drm] GT0: GuC firmware(s) can be downloaded from /*(DEBLOBBED)*/
[ 10.029541] i915 0000:00:02.0: [drm] GT0: GuC firmware /*(DEBLOBBED)*/ version 0.0.0
[ 10.029613] i915 0000:00:02.0: [drm] *ERROR* GT0: GuC initialization failed -ENOENT
[ 10.029615] i915 0000:00:02.0: [drm] *ERROR* GT0: Enabling uc failed (-5)
[ 10.029617] i915 0000:00:02.0: [drm] *ERROR* GT0: Failed to initialize GPU, declaring it wedged!
[ 10.029973] i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_set_wedged_on_init+0x38/0x50 [i915]


sway-greeter.log:

error: Kernel is too old (4.16+ required) or unusable for Iris.
Check your dmesg logs for loading failures.

libEGL warning: egl: failed to create dri2 screen
00:00:00.185 [ERROR] [wlr] [EGL] command: eglInitialize, error: EGL_NOT_INITIALIZED (0x3001), message: "DRI2: failed to create screen"



### Attempt 2: manually disabling guc ###

dmesg when passing i915.enable_guc=0 at boot (roughly instead of the
errors from attempt 1):

[ 9.233275] Setting dangerous option enable_guc - tainting kernel

sway works



### Attempt 3: through config.scm ###

attempt to set the kernel parameter in config.scm:

(modify-services %base-services
  (sysctl-service-type config =>
    (sysctl-configuration
      (settings (append '(("i915.enable_guc" . "0"))
                        %default-sysctl-settings)))))


dmesg output (in addition to the errors from attempt 1):

[ 7.759922] sysctl: cannot stat /proc/sys/i915/enable_guc: No such file or directory

sway does not work
(name . Hugo Buddelmeijer)(address . hugo@buddelmeijer.nl)
cuco7gtdtm9.fsf@riseup.net
Hugo Buddelmeijer <hugo@buddelmeijer.nl> writes:

> The i915 driver will try to load the GuC firmware, at least for Iris
> Xe chips. Loading the GuC firmware fails because it is non-free and
> deblobbed. As a result, some software, like sway, will not work.
>
> It is possible to manually pass the i915.enable_guc=0 kernel parameter
> at boot from grub. Then everything works as intended. However, it
> does not seem possible to set this parameter from config.scm.
>
> So at the moment my system is not fully declarative, as I have to type
> in a kernel parameter at boot; does anyone perhaps have advice on how
> this can be done better?
> ...

You can use the kernel-arguments option in the operating-system config.
Untested:
(kernel-arguments (cons "i915.enable_guc=0" %default-kernel-arguments))
This should work, in theory.
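
For context, here is a minimal sketch of where that field sits in config.scm. It is a fragment, not a complete configuration: the other required operating-system fields (host-name, bootloader, file-systems and so on) are elided and must come from your existing file.

```scheme
;; Fragment of config.scm; assumes the usual (gnu) module is loaded.
(use-modules (gnu))

(operating-system
  ;; ... host-name, bootloader, file-systems, etc. go here unchanged ...

  ;; Prepend the i915 option to the default kernel command line so
  ;; that it is passed by the bootloader on every boot.
  (kernel-arguments
   (cons "i915.enable_guc=0" %default-kernel-arguments)))
```

After editing, the change would take effect on the next boot following a `guix system reconfigure` of that file.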

I suspect that the sysctl thing doesn't work because it is done too late
in the boot process.
Hugo Buddelmeijer wrote on 20 Oct 2023 21:39
(address . 66651@debbugs.gnu.org)
CA+Jv8O0bpe=dpJKiCU2Znk3KXPFC8oJOzPXR-LJb87S1sO7m2A@mail.gmail.com
Csepp <raingloom <at> riseup.net> wrote:

> You can use the kernel-arguments option in the operating-system config.
> Untested:
> (kernel-arguments (cons "i915.enable_guc=0" %default-kernel-arguments))
> This should work, in theory.

Thanks, using kernel-arguments indeed works!

The idea to use i915.enable_guc came from the arch wiki [1], which states

> GuC functionality is controlled by the i915.enable_guc kernel parameter.

So I searched the Guix manual for "kernel parameter", and found the
sysctl section.

> I suspect that the sysctl thing doesn't work because it is done too late
> in the boot process.

That makes sense now that I understand the difference between kernel
arguments and parameters in Guix. From the sysctl man page:

> sysctl is used to modify kernel parameters at runtime.

And that is indeed a bit late for deciding whether to load firmware.
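
The path in the sysctl error from attempt 3 can be explained with a small Guile sketch. It assumes the usual convention that sysctl turns the dots in a key into slashes under /proc/sys; `sysctl-key->path` is a hypothetical helper written for this illustration, not something Guix provides. Module options such as enable_guc are not registered under /proc/sys, which is why the lookup fails.

```scheme
;; Hypothetical helper: map a sysctl key to the file sysctl would write.
;; Assumption: dots in the key become slashes under /proc/sys.
(define (sysctl-key->path key)
  (string-append "/proc/sys/"
                 (string-map (lambda (c) (if (char=? c #\.) #\/ c))
                             key)))

(define key "i915.enable_guc")
(define path (sysctl-key->path key))

;; Print the computed path and whether it exists on this machine.
(format #t "~a -> ~a (exists: ~a)~%" key path (file-exists? path))
```

For the key used in attempt 3 this yields /proc/sys/i915/enable_guc, the same path that appears in the dmesg error.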


About the default value: naively I would think that the default of
i915.enable_guc should be changed to 0 in the libre kernel, at least
for those chips that do not have free firmware.

At least now the term 'wedged' is part of the issue tracker, so other
affected people will hopefully find this.


This issue can be closed as far as I'm concerned, but I don't know how
to do that. Let's try this:

/close

Thanks again,

Hugo


Tobias Geerinckx-Rice wrote on 20 Oct 2023 22:56
(name . Hugo Buddelmeijer)(address . hugo@buddelmeijer.nl)(address . 66651-close@debbugs.gnu.org)
c23a1cc1de816b7c6bf81c1af6e43c6e@tobias.gr
On 2023-10-20 21:39, Hugo Buddelmeijer wrote:
> This issue can be closed as far as I'm concerned, but I don't know how
> to do that. Let's try this:
>
> /close

Add ‘-done’ or ‘-close’ to the bug address, as I've done here.

Kind regards,

T G-R

Sent from a Web browser. Excuse or enjoy my brevity.
Maxim Cournoyer wrote on 21 Oct 2023 00:35
(name . Hugo Buddelmeijer)(address . hugo@buddelmeijer.nl)
87zg0cgbw3.fsf@gmail.com
tags 66651 + notabug
thanks

Hello!

Hugo Buddelmeijer <hugo@buddelmeijer.nl> writes:

[...]

> This issue can be closed as far as I'm concerned, but I don't know how
> to do that. Let's try this:
>
> /close

You can reply to the bug via the special '66651-done@debbugs.gnu.org'
email address. I'm also CC'ing the Debbugs control server so that it
processes the directives found at the beginning of this mail (add a
'notabug' tag).

--
Thanks,
Maxim