‘guix system disk-image’ successfully builds a bad image

  • Done
Details
3 participants
  • Brice Waegeneire
  • Ludovic Courtès
  • Tobias Geerinckx-Rice
Owner
unassigned
Submitted by
Tobias Geerinckx-Rice
Severity
important
Merged with
Tobias Geerinckx-Rice wrote on 1 Feb 2019 16:57
(name . Bug Guix)(address . bug-guix@gnu.org)
877eejfqmb.fsf@nckx
Hullo!

I wanted to install this ‘Guix’ thing that everyone's so hyped up
about.

I have a small forgotten script in my ~/guix.git that runs:

./pre-inst-env guix system disk-image --fallback \
--image-size=1.5G \
gnu/system/install.scm

This was written back when 1.5G was higher than the default.

Now it's much lower and too small to store all the Guix. However,
the command completes ‘successfully’:

copying 422 store items [#########:
In srfi/srfi-1.scm:
466:18 19 (fold #<procedure 1a60440 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
18 (_ #<procedure 1917270 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 17 (loop _ _ #(21 1706421 16749 3 0 0 0 4096 1548869386
?) ?)
In srfi/srfi-1.scm:
466:18 16 (fold #<procedure 1a60160 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
15 (_ #<procedure 1917240 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 14 (loop _ _ #(21 1739151 16749 3 0 0 0 4096 1548869386
?) ?)
In srfi/srfi-1.scm:
466:18 13 (fold #<procedure 1b8f8c0 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
12 (_ #<procedure 1b5bc90 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 11 (loop _ _ #(21 1772091 16749 13 0 0 0 4096 1548869389
?) ?)
In srfi/srfi-1.scm:
466:18 10 (fold #<procedure 1b8f280 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
9 (_ #<procedure 1a56750 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 8 (loop _ _ #(21 2132258 16749 98 0 0 0 4096 1548869432
?) ?)
In srfi/srfi-1.scm:
466:18 7 (fold #<procedure 140dd20 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
6 (_ #<procedure 19ea030 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 5 (loop _ _ #(21 4589344 16749 24 0 0 0 4096 1548869676
?) ?)
In srfi/srfi-1.scm:
466:18 4 (fold #<procedure 1969540 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
3 (_ #<procedure 1725750 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
482:39 2 (loop _ _ #(21 4589402 16749 3 0 0 0 4096 1548869687
?) ?)
In ./guix/build/utils.scm:
312:27 1 (_
"/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a8-linux-?" ?)
In unknown file:
0 (copy-file
"/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a?" ?)

ERROR: In procedure copy-file:
In procedure copy-file: No space left on device
^MESC[Kcopying 422 store items
boot program
'/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader'
terminated, rebooting
[ 1071.512054] Unregister pv shared memory for cpu 0
[ 1071.522414] reboot: Restarting system
[ 1071.542285] reboot: machine restart
successfully built
/gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv
/gnu/store/dby523cy1l4wrqi8wwmk5ln9qr7g5mh8-disk-image

Kind regards,

T G-R

Sent from my GNU Emacs
Ludovic Courtès wrote on 17 Mar 2019 13:09
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 34276@debbugs.gnu.org)
878sxdzoqo.fsf@gnu.org
Hello,

Tobias Geerinckx-Rice <me@tobias.gr> skribis:

> ERROR: In procedure copy-file:
> In procedure copy-file: No space left on device
> ^MESC[Kcopying 422 store items
> boot program
> '/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader'
> terminated, rebooting
> [ 1071.512054] Unregister pv shared memory for cpu 0
> [ 1071.522414] reboot: Restarting system
> [ 1071.542285] reboot: machine restart
> successfully built
> /gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv

I investigated a bit. I managed to get our code to cause a kernel panic
upon failure (patch below). However I fail to turn that guest kernel
panic into a different QEMU exit code.

I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
module in the guest, and “-device pvpanic” on the QEMU command line),
but unfortunately that thing is almost undocumented and I can’t get it
to turn the panic into a non-zero exit code, nor do I know if it’s
possible.

Thoughts anyone?

The other option would be to create a special file in the 9p mount
that’s shared with the host upon success, but that seems a bit hacky.

Thanks,
Ludo’.
diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scm
index 983c6d81c8..cb29a656b9 100644
--- a/gnu/system/linux-initrd.scm
+++ b/gnu/system/linux-initrd.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2016 Mark H Weaver <mhw@netris.org>
;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
@@ -279,6 +279,7 @@ FILE-SYSTEMS."
"isci") ;for SAS controllers like Intel C602
'())
+ "pvpanic"
,@virtio-modules))
(define-syntax %base-initrd-modules
diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index e561285964..b671c74ab8 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -187,8 +187,9 @@ made available under the /xchg CIFS share."
;; When USER-BUILDER succeeds, reboot (indicating a
;; success), otherwise die, which causes a kernel panic
;; ("Attempted to kill init!").
- #~(when (zero? (system* #$user-builder))
- (reboot))))
+ #~(if (zero? (system* #$user-builder))
+ (reboot)
+ (exit 1))))
(let ((initrd (or initrd
(base-initrd file-systems
Ludovic Courtès wrote on 1 May 2019 22:19
control message for bug #34276
(address . control@debbugs.gnu.org)
87ef5iexlp.fsf@gnu.org
severity 34276 important
Ludovic Courtès wrote on 1 Sep 2019 22:37
(address . control@debbugs.gnu.org)
878sr792ht.fsf@gnu.org
merge 34276 37164
quit
Brice Waegeneire wrote on 19 Mar 2020 21:05
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(address . 34276@debbugs.gnu.org)
7a36cb1fc7c68f5d63a324df49170cdc@waegenei.re
Hello Ludovic,

> I investigated a bit. I managed to get our code to cause a kernel panic
> upon failure (patch below). However I fail to turn that guest kernel
> panic into a different QEMU exit code.
>
> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
> module in the guest, and “-device pvpanic” on the QEMU command line),
> but unfortunately that thing is almost undocumented and I can’t get it
> to turn the panic into a non-zero exit code, nor do I know if it’s
> possible.
>
> Thoughts anyone?

I looked into it a little and found out how to use pvpanic.
Unfortunately it's not as straightforward as getting a non-zero exit
code from QEMU. When the pvpanic device is enabled in a VM, as you did
with “-device pvpanic”, it generates events[0] on the QMP interface
when a crash happens, and QEMU either shuts down or, when run with
--no-shutdown[1], pauses.

(gnu build marionette), which uses the “-monitor” interface, could be
adapted to use “-qmp”, a machine interface based on JSON.

Following is the log of a QMP session where the guest panicked[2]:
{
"QMP": {
"version": {
"qemu": {
"micro": 0,
"minor": 2,
"major": 4
},
"package": ""
},
"capabilities": [
"oob"
]
}
}
{ "execute": "qmp_capabilities" }
{
"return": {
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936550
},
"event": "GUEST_PANICKED",
"data": {
"action": "pause"
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936675
},
"event": "GUEST_PANICKED",
"data": {
"action": "poweroff"
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936776
},
"event": "SHUTDOWN",
"data": {
"guest": true,
"reason": "guest-panic"
}
}


Ludovic Courtès wrote on 21 Mar 2020 16:58
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Brice Waegeneire)(address . brice@waegenei.re)(address . 34276@debbugs.gnu.org)
875zex1z05.fsf@gnu.org
Hi Brice,

Brice Waegeneire <brice@waegenei.re> skribis:

>> I investigated a bit. I managed to get our code to cause a kernel panic
>> upon failure (patch below). However I fail to turn that guest kernel
>> panic into a different QEMU exit code.
>>
>> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
>> module in the guest, and “-device pvpanic” on the QEMU command line),
>> but unfortunately that thing is almost undocumented and I can’t get it
>> to turn the panic into a non-zero exit code, nor do I know if it’s
>> possible.
>>
>> Thoughts anyone?
>
> I looked into it a little and found out how to use pvpanic.
> Unfortunately it's not as straightforward as getting a non-zero exit
> code from QEMU. When the pvpanic device is enabled in a VM, as you did
> with “-device pvpanic”, it generates events[0] on the QMP interface
> when a crash happens, and QEMU either shuts down or, when run with
> --no-shutdown[1], pauses.
>
> (gnu build marionette), which uses the “-monitor” interface, could be
> adapted to use “-qmp”, a machine interface based on JSON.
>
> Following is the log of a QMP session where the guest panicked[2]:

Oooh, I see, thanks for digging into this!

Any idea how to implement it? Is QMP a request/reply kind of interface
like the monitor?

Ludo’.
Brice Waegeneire wrote on 21 Mar 2020 17:44
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34276@debbugs.gnu.org)
b99009e2b128d5dab6f4345f434f63ef@waegenei.re
Hello Ludo,

On 2020-03-21 15:58, Ludovic Courtès wrote:
> Any idea how to implement it? Is QMP a request/reply kind of interface
> like the monitor?

Not really, or I would have sent a patch instead.

QMP is similar to the monitor in the sense that you can send a command
and receive a reply, but it gives us access to more features; in our
case, asynchronous events. To get notified by the pvpanic device that a
panic occurred on the guest, the following steps are needed:
1. Connect to the socket
2. Receive the server greeting
3. Respond with the capabilities request
4. Receive the capabilities response
5. Listen for GUEST_PANICKED events

The QMP specifications are available here[0].

[0]:

Brice.
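
The five steps Brice lists can be sketched in a few lines. This is a minimal illustration in Python, not code from Guix. It assumes QEMU was started with a QMP socket (for instance “-qmp unix:PATH,server,nowait”, where PATH is a placeholder) and that each QMP message arrives as one JSON object per line. The names `read_message`, `guest_panicked` and `watch_socket` are made up for this example.

```python
import json
import socket


def read_message(reader):
    """Read one newline-delimited JSON message; return None at end of stream."""
    line = reader.readline()
    return json.loads(line) if line else None


def guest_panicked(reader, writer):
    """Do the QMP handshake, then wait for the guest's fate.

    Returns True on a GUEST_PANICKED event, False if QEMU reports a
    SHUTDOWN or closes the connection without one.
    """
    # Step 2: the server speaks first and sends a greeting.
    greeting = read_message(reader)
    if greeting is None or "QMP" not in greeting:
        raise RuntimeError("peer did not send a QMP greeting")

    # Step 3: leave capabilities negotiation mode.
    writer.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    writer.flush()

    # Steps 4 and 5: the reply and asynchronous events share one stream,
    # so both are handled in the same loop.
    while True:
        message = read_message(reader)
        if message is None:
            return False                      # connection closed
        if "error" in message:
            raise RuntimeError("QMP error: %r" % (message["error"],))
        if "return" in message:
            continue                          # reply to qmp_capabilities
        event = message.get("event")
        if event == "GUEST_PANICKED":
            return True
        if event == "SHUTDOWN":
            return False


def watch_socket(path):
    """Step 1: connect to the UNIX socket at PATH (a hypothetical location)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("r", encoding="utf-8") as reader, \
             sock.makefile("w", encoding="utf-8") as writer:
            return guest_panicked(reader, writer)
```

Fed the session logged earlier in the thread, `guest_panicked` would return at the first GUEST_PANICKED event, before the SHUTDOWN event is read. A caller could then turn a True result into a non-zero exit code of its own.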
Ludovic Courtès wrote on 26 Mar 2020 23:57
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 34276-done@debbugs.gnu.org)
877dz6911q.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> The other option would be to create a special file in the 9p mount
> that’s shared with the host upon success, but that seems a bit hacky.

Turns out that was easily done and better than the status quo.
Done in commit be6520e6a58d7f6ee58f4cab76db9d1245410113!

Ludo’.
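
The approach that was eventually adopted, a marker file written to the shared directory on success, can be illustrated from the host side as follows. This is a hedged sketch in Python rather than the Scheme code of the actual commit. The marker name `.success`, the function `run_and_check`, and the idea of passing the VM command as an argument list are all assumptions made for the example.

```python
import os
import subprocess


def run_and_check(command, shared_dir, marker=".success"):
    """Run COMMAND and require that it left MARKER in SHARED_DIR.

    COMMAND stands in for the QEMU invocation; SHARED_DIR stands in for
    the host directory exported to the guest.  Both are placeholders.
    """
    marker_path = os.path.join(shared_dir, marker)

    # Remove a stale marker so an earlier success cannot mask a failure.
    if os.path.exists(marker_path):
        os.remove(marker_path)

    # A non-zero exit code is still an error, but a zero one proves
    # nothing: in this thread QEMU exits normally even though the
    # guest-side build failed.
    subprocess.run(command, check=True)

    # Only the guest can create the marker, and only after its builder
    # succeeded, so its absence means the build failed.
    if not os.path.exists(marker_path):
        raise RuntimeError("guest did not report success: %s is missing"
                           % marker_path)
    return marker_path
```

The design choice is that success must be asserted explicitly by the guest, so any failure mode, including a panic or an early reboot, is detected by default.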
Closed

This issue is archived.

To comment on this conversation send an email to 34276@debbugs.gnu.org
