‘guix system vm’ spawns QEMU and hangs

  • Done
Details
4 participants
  • Josselin Poiret
  • Efraim Flashner
  • Leo Famulari
  • Ludovic Courtès
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
important
Ludovic Courtès wrote on 22 Jan 2023 22:36
(address . bug-guix@gnu.org)
87sfg245gq.fsf@inria.fr
Hello,

On my Guix System machine, the ‘qemu-system-x86_64’ spawned by ‘guix
system vm’ hangs after printing “Booting from ROM...”; it has to be
terminated with SIGKILL, SIGINT is not enough.

Specifically:

$(guix time-machine --commit=66188398c446bdf9ce044fa539536e9b54c28c60 \
-- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Good.

… whereas:

$(guix time-machine --commit=9923100a42ffa80f604c1c13a5e999e6a4c15146 \
-- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Bad!

I thought the culprit might be this commit:

commit 9923100a42ffa80f604c1c13a5e999e6a4c15146
Date: Fri Dec 23 09:42:27 2022 +0200

gnu: sgabios: Fix build on cross-build architectures.

* gnu/packages/firmware.scm (sgabios)[arguments]: When cross-building
add a make-flag to use the correct objcopy.

… but even after reverting it on today’s master, QEMU occasionally hangs
as before, though not always.

‘qemu-minimal’ as used for “make check-system” seems to work fine.

There have been a number of packages unbundled, so I wonder if another
one of these might be causing problems.

What do you think?

Ludo’.
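
The symptom described above (SIGINT has no effect, only SIGKILL stops the process) can be detected from a script. The following is a minimal sketch, assuming ‘timeout’ from GNU coreutils; the helper name ‘probe_hang’ and the one-second grace period are made up for illustration, and the demo uses ‘sleep’ as a stand-in for QEMU.

```shell
# probe_hang SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND, send it SIGINT after SECONDS, and
# fall back to SIGKILL one second later if it is still running.
probe_hang() {
    secs=$1
    shift
    timeout -s INT -k 1 "$secs" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        124) echo "stopped by SIGINT" ;;   # timed out, SIGINT was enough
        137) echo "needed SIGKILL" ;;      # 128+9: SIGKILL had to be sent
        *)   echo "exited with status $status" ;;
    esac
}

# Stand-ins for QEMU: one process that honours SIGINT, one that ignores it.
probe_hang 1 sleep 5
probe_hang 1 sh -c 'trap "" INT; sleep 5'
```

In principle the same helper could wrap the launcher printed by ‘guix system vm’, for example probe_hang 60 "$(guix system vm gnu/system/examples/bare-bones.tmpl)" -m 1024. The 60-second deadline is an arbitrary choice, and a healthy VM would normally still be running at that point, so only the “needed SIGKILL” outcome would point at the hang.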
Ludovic Courtès wrote on 22 Jan 2023 22:57
control message for bug #61011
(address . control@debbugs.gnu.org)
87r0vm44h4.fsf@gnu.org
severity 61011 important
quit
Leo Famulari wrote on 23 Jan 2023 05:19
Re: bug#61011: ‘guix system vm’ spawns QEMU and hangs
(name . Ludovic Courtès)(address . ludo@gnu.org)
Y84K41nX7pbMGcv9@jasmine.lan
On Sun, Jan 22, 2023 at 10:36:21PM +0100, Ludovic Courtès wrote:
> $(guix time-machine --commit=9923100a42ffa80f604c1c13a5e999e6a4c15146 \
> -- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Bad!

I can reproduce.

It seems to work fine without '-m 1024', in which case it only has 512
MB RAM.
Josselin Poiret wrote on 23 Jan 2023 21:59
Re: bug#61011: ‘guix system vm’ spawns QEMU and hangs
87fsc16k7c.fsf@jpoiret.xyz
Hi,
Leo Famulari <leo@famulari.name> writes:

> On Sun, Jan 22, 2023 at 10:36:21PM +0100, Ludovic Courtès wrote:
>> $(guix time-machine --commit=9923100a42ffa80f604c1c13a5e999e6a4c15146 \
>> -- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Bad!
>
> I can reproduce.
>
> It seems to work fine without '-m 1024', in which case it only has 512
> MB RAM.

This is probably due to the following kernel bug [1], which could be
related to the khugepaged hangs I'm getting on my system since 6.1.

[1]
https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@kernel.org/

Best,
--
Josselin Poiret
Ludovic Courtès wrote on 23 Jan 2023 23:21
(name . Josselin Poiret)(address . dev@jpoiret.xyz)
871qnkzybw.fsf@gnu.org
Hello,

Josselin Poiret <dev@jpoiret.xyz> skribis:

> Leo Famulari <leo@famulari.name> writes:
>
>> On Sun, Jan 22, 2023 at 10:36:21PM +0100, Ludovic Courtès wrote:
>>> $(guix time-machine --commit=9923100a42ffa80f604c1c13a5e999e6a4c15146 \
>>> -- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Bad!
>>
>> I can reproduce.
>>
>> It seems to work fine without '-m 1024', in which case it only has 512
>> MB RAM.
>
> This is probably due to the following kernel bug [1], which could be
> related to the khugepaged hangs I'm getting on my system since 6.1.
>
> [1]
> https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@kernel.org/

Ouch. I’m running 6.1 since January 16th, which is about the time I
first experienced the issue.

Ludo’.
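
Since the suspicion falls on the host kernel, a quick check of the running kernel series can help when comparing machines. This is a minimal sketch: the helper name ‘is_6_1_series’ is hypothetical, and the thread only associates the hang with “6.1”, so the sketch makes no claim about which point releases are affected or fixed.

```shell
# is_6_1_series RELEASE
# Hypothetical helper: succeed if RELEASE (as printed by 'uname -r')
# belongs to the 6.1 series, e.g. "6.1", "6.1.7" or "6.1-rc3".
is_6_1_series() {
    case $1 in
        6.1|6.1.*|6.1-*) return 0 ;;
        *)               return 1 ;;
    esac
}

# Report on the kernel of the machine running this script.
if is_6_1_series "$(uname -r)"; then
    echo "host kernel is in the 6.1 series"
else
    echo "host kernel is not in the 6.1 series"
fi
```

The pattern deliberately requires a dot or dash after “6.1”, so that releases such as 6.10 or 6.12 are not mistaken for the 6.1 series.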
Efraim Flashner wrote on 24 Jan 2023 11:24
Re: bug#61011: ‘guix system vm’ spawns QEMU and hangs
(name . Ludovic Courtès)(address . ludo@gnu.org)
Y8+x4bdL/QmKu3YB@3900XT
On Sun, Jan 22, 2023 at 10:36:21PM +0100, Ludovic Courtès wrote:
> Hello,
>
> On my Guix System machine, the ‘qemu-system-x86_64’ spawned by ‘guix
> system vm’ hangs after printing “Booting from ROM...”; it has to be
> terminated with SIGKILL, SIGINT is not enough.
>
> Specifically:
>
> $(guix time-machine --commit=66188398c446bdf9ce044fa539536e9b54c28c60 \
> -- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Good.
>
> … whereas:
>
> $(guix time-machine --commit=9923100a42ffa80f604c1c13a5e999e6a4c15146 \
> -- system vm gnu/system/examples/bare-bones.tmpl) -m 1024 # Bad!
>
> I thought the culprit might be this commit:
>
> commit 9923100a42ffa80f604c1c13a5e999e6a4c15146
> Date: Fri Dec 23 09:42:27 2022 +0200
>
> gnu: sgabios: Fix build on cross-build architectures.
>
> * gnu/packages/firmware.scm (sgabios)[arguments]: When cross-building
> add a make-flag to use the correct objcopy.
>
> … but even after reverting it on today’s master, QEMU occasionally hangs
> as before, though not always.
>
> ‘qemu-minimal’ as used for “make check-system” seems to work fine.
>
> There have been a number of packages unbundled, so I wonder if another
> one of these might be causing problems.
>
> What do you think?

I remember feeling overwhelmed by the build failures after the
unbundling (but I didn't reach out! I should've said something.) and
worked to try and quickly fix the builds.

I looked at reverting it locally, but with or without that patch I got
the same derivation for sgabios when built on x86_64. I tried firing up
diffoscope and I found no differences between the sgabios built on
x86_64, aarch64 or armhf (wow!).

I've run diffoscope against the sgabios.bin that we build and the one
that comes in the qemu release tarball and I've included the output in
the email.
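
A comparison of this kind can be scripted. The following is a minimal sketch, not the exact commands used here: the helper name ‘compare_firmware’ is hypothetical, it does a plain byte comparison with ‘cmp’ first, and it only calls ‘diffoscope’ when the files differ and the tool is installed. The demo uses throwaway files rather than real sgabios.bin images.

```shell
# compare_firmware FILE_A FILE_B
# Hypothetical helper: byte-compare two files; if they differ and
# diffoscope is available, print its report as well.
compare_firmware() {
    cmp -s "$1" "$2"
    case $? in
        0) echo "identical" ;;
        1) echo "different"
           if command -v diffoscope >/dev/null 2>&1; then
               # diffoscope exits non-zero when it finds differences.
               diffoscope "$1" "$2" || true
           fi ;;
        *) echo "cannot compare $1 and $2" >&2
           return 2 ;;
    esac
}

# Demo with throwaway files standing in for two firmware builds.
demo_dir=$(mktemp -d)
printf 'firmware-a' > "$demo_dir/ours.bin"
printf 'firmware-a' > "$demo_dir/rebuilt.bin"
printf 'firmware-b' > "$demo_dir/upstream.bin"
compare_firmware "$demo_dir/ours.bin" "$demo_dir/rebuilt.bin"
compare_firmware "$demo_dir/ours.bin" "$demo_dir/upstream.bin" | head -n 1
rm -rf "$demo_dir"
```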

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Ludovic Courtès wrote on 15 Jun 2023 22:54
control message for bug #61011
(address . control@debbugs.gnu.org)
87legkv43d.fsf@gnu.org
close 61011
quit