‘guix system disk-image’ successfully builds a bad image

Done
Submitted by Tobias Geerinckx-Rice.
Details
3 participants
  • Brice Waegeneire
  • Ludovic Courtès
  • Tobias Geerinckx-Rice
Owner
unassigned
Severity
important
Merged with
Tobias Geerinckx-Rice wrote on 1 Feb 2019 16:57
(name . Bug Guix)(address . bug-guix@gnu.org)
877eejfqmb.fsf@nckx
Hullo!

I wanted to install this ‘Guix’ thing that everyone's so hyped up
about.

I have a small forgotten script in my ~/guix.git that runs:

./pre-inst-env guix system disk-image --fallback \
--image-size=1.5G \
gnu/system/install.scm

This was written back when 1.5G was higher than the default.

Now it's much lower and too small to store all the Guix. However,
the command completes ‘successfully’:

copying 422 store items [#########:
In srfi/srfi-1.scm:
466:18 19 (fold #<procedure 1a60440 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
18 (_ #<procedure 1917270 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 17 (loop _ _ #(21 1706421 16749 3 0 0 0 4096 1548869386
?) ?)
In srfi/srfi-1.scm:
466:18 16 (fold #<procedure 1a60160 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
15 (_ #<procedure 1917240 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 14 (loop _ _ #(21 1739151 16749 3 0 0 0 4096 1548869386
?) ?)
In srfi/srfi-1.scm:
466:18 13 (fold #<procedure 1b8f8c0 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
12 (_ #<procedure 1b5bc90 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 11 (loop _ _ #(21 1772091 16749 13 0 0 0 4096 1548869389
?) ?)
In srfi/srfi-1.scm:
466:18 10 (fold #<procedure 1b8f280 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
9 (_ #<procedure 1a56750 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 8 (loop _ _ #(21 2132258 16749 98 0 0 0 4096 1548869432
?) ?)
In srfi/srfi-1.scm:
466:18 7 (fold #<procedure 140dd20 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
6 (_ #<procedure 19ea030 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
452:32 5 (loop _ _ #(21 4589344 16749 24 0 0 0 4096 1548869676
?) ?)
In srfi/srfi-1.scm:
466:18 4 (fold #<procedure 1969540 at ice-9/ftw.scm:452:38
(sub?> ?)
In unknown file:
3 (_ #<procedure 1725750 at ice-9/ftw.scm:454:44 ()>
#<p?> ?)
In ice-9/ftw.scm:
482:39 2 (loop _ _ #(21 4589402 16749 3 0 0 0 4096 1548869687
?) ?)
In ./guix/build/utils.scm:
312:27 1 (_
"/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a8-linux-?" ?)
In unknown file:
0 (copy-file
"/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a?" ?)

ERROR: In procedure copy-file:
In procedure copy-file: No space left on device
^MESC[Kcopying 422 store items
boot program
'/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader'
terminated, rebooting
[ 1071.512054] Unregister pv shared memory for cpu 0
[ 1071.522414] reboot: Restarting system
[ 1071.542285] reboot: machine restart
successfully built
/gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv
/gnu/store/dby523cy1l4wrqi8wwmk5ln9qr7g5mh8-disk-image

Kind regards,

T G-R

Sent from my GNU Emacs
Ludovic Courtès wrote on 17 Mar 2019 13:09
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 34276@debbugs.gnu.org)
878sxdzoqo.fsf@gnu.org
Hello,

Tobias Geerinckx-Rice <me@tobias.gr> skribis:

> ERROR: In procedure copy-file:
> In procedure copy-file: No space left on device
> ^MESC[Kcopying 422 store items
> boot program
> '/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader'
> terminated, rebooting
> [ 1071.512054] Unregister pv shared memory for cpu 0
> [ 1071.522414] reboot: Restarting system
> [ 1071.542285] reboot: machine restart
> successfully built
> /gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv

I investigated a bit. I managed to get our code to cause a kernel panic
upon failure (patch below). However I fail to turn that guest kernel
panic into a different QEMU exit code.

I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
module in the guest, and “-device pvpanic” on the QEMU command line),
but unfortunately that thing is almost undocumented and I can’t get it
to turn the panic into a non-zero exit code, nor do I know if it’s
possible.

Thoughts anyone?

The other option would be to create a special file in the 9p mount
that’s shared with the host upon success, but that seems a bit hacky.

Thanks,
Ludo’.
diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scm
index 983c6d81c8..cb29a656b9 100644
--- a/gnu/system/linux-initrd.scm
+++ b/gnu/system/linux-initrd.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2016 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
@@ -279,6 +279,7 @@ FILE-SYSTEMS."
             "isci")                      ;for SAS controllers like Intel C602
           '())
 
+    "pvpanic"
     ,@virtio-modules))
 
 (define-syntax %base-initrd-modules
diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index e561285964..b671c74ab8 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -187,8 +187,9 @@ made available under the /xchg CIFS share."
                   ;; When USER-BUILDER succeeds, reboot (indicating a
                   ;; success), otherwise die, which causes a kernel panic
                   ;; ("Attempted to kill init!").
-                  #~(when (zero? (system* #$user-builder))
-                      (reboot))))
+                  #~(if (zero? (system* #$user-builder))
+                        (reboot)
+                        (exit 1))))
 
   (let ((initrd (or initrd
                     (base-initrd file-systems
Ludovic Courtès wrote on 1 May 2019 22:19
control message for bug #34276
(address . control@debbugs.gnu.org)
87ef5iexlp.fsf@gnu.org
severity 34276 important
Ludovic Courtès wrote on 1 Sep 2019 22:37
(address . control@debbugs.gnu.org)
878sr792ht.fsf@gnu.org
merge 34276 37164
quit
Brice Waegeneire wrote on 19 Mar 2020 21:05
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(address . 34276@debbugs.gnu.org)
7a36cb1fc7c68f5d63a324df49170cdc@waegenei.re
Hello Ludovic,

> I investigated a bit. I managed to get our code to cause a kernel
> panic
> upon failure (patch below). However I fail to turn that guest kernel
> panic into a different QEMU exit code.
>
> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
> module in the guest, and “-device pvpanic” on the QEMU command line),
> but unfortunately that thing is almost undocumented and I can’t get it
> to turn the panic into a non-zero exit code, nor do I know if it’s
> possible.
>
> Thoughts anyone?

I looked into it a little and found out how to use pvpanic.
Unfortunately it's not as straightforward as getting a non-zero exit
code from qemu. When the pvpanic device is added to a VM, as you did
with “-device pvpanic”, qemu generates events[0] on the QMP interface
when a crash happens, and then either shuts down or, when
--no-shutdown[1] is used, pauses.

(gnu build marionette), which uses the “-monitor” interface, could be
adapted to use “-qmp”, a machine interface based on JSON.

Following is the log of a QMP session where the guest panicked[2]:
{
"QMP": {
"version": {
"qemu": {
"micro": 0,
"minor": 2,
"major": 4
},
"package": ""
},
"capabilities": [
"oob"
]
}
}
{ "execute": "qmp_capabilities" }
{
"return": {
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936550
},
"event": "GUEST_PANICKED",
"data": {
"action": "pause"
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936675
},
"event": "GUEST_PANICKED",
"data": {
"action": "poweroff"
}
}
{
"timestamp": {
"seconds": 1584645026,
"microseconds": 936776
},
"event": "SHUTDOWN",
"data": {
"guest": true,
"reason": "guest-panic"
}
}


Ludovic Courtès wrote on 21 Mar 2020 16:58
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Brice Waegeneire)(address . brice@waegenei.re)(address . 34276@debbugs.gnu.org)
875zex1z05.fsf@gnu.org
Hi Brice,

Brice Waegeneire <brice@waegenei.re> skribis:

>> I investigated a bit. I managed to get our code to cause a kernel
>> panic
>> upon failure (patch below). However I fail to turn that guest kernel
>> panic into a different QEMU exit code.
>>
>> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
>> module in the guest, and “-device pvpanic” on the QEMU command line),
>> but unfortunately that thing is almost undocumented and I can’t get it
>> to turn the panic into a non-zero exit code, nor do I know if it’s
>> possible.
>>
>> Thoughts anyone?
>
> I looked into it a little and found out how to use pvpanic.
> Unfortunately it's not as straightforward as getting a non-zero exit
> code from qemu. When the pvpanic device is added to a VM, as you did
> with “-device pvpanic”, qemu generates events[0] on the QMP interface
> when a crash happens, and then either shuts down or, when
> --no-shutdown[1] is used, pauses.
>
> (gnu build marionette), which uses the “-monitor” interface, could be
> adapted to use “-qmp”, a machine interface based on JSON.
>
> Following is the log of a QMP session where the guest panicked[2]:

Oooh, I see, thanks for digging into this!

Any idea how to implement it? Is QMP a request/reply kind of interface
like the monitor?

Ludo’.
Brice Waegeneire wrote on 21 Mar 2020 17:44
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34276@debbugs.gnu.org)
b99009e2b128d5dab6f4345f434f63ef@waegenei.re
Hello Ludo,

On 2020-03-21 15:58, Ludovic Courtès wrote:
> Any idea how to implement it? Is QMP a request/reply kind of interface
> like the monitor?

Not really, or I would have sent a patch instead.

QMP is similar to the monitor in the sense that you can send a command
and receive a reply, but it gives us access to more features; in our
case, asynchronous events. To get notified by the pvpanic device that a
panic occurred on the guest, the following steps are needed:
1. Connect to the socket
2. Receive the server greeting
3. Send the capabilities request
4. Receive the capabilities response
5. Listen for GUEST_PANICKED events

The QMP specifications are available here[0].

[0]:

Brice.
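
The five steps listed above could be sketched roughly as follows. This is an illustrative Python sketch only, not code from Guix or from (gnu build marionette). The function names `connect_qmp` and `wait_for_guest_panic` are hypothetical, and the sketch assumes QEMU sends one JSON object per line, unlike the pretty-printed log shown earlier in the thread.

```python
import json
import socket


def connect_qmp(path):
    """Step 1: connect to a QMP Unix socket (PATH is a hypothetical location)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


def wait_for_guest_panic(sock):
    """Run steps 2 to 5 on an already-connected QMP socket.

    Return True if a GUEST_PANICKED event is seen, False if the peer
    closes the connection without sending one.
    """
    # Read through a text wrapper; write with sendall() so that buffered
    # input is never discarded by an interleaved write.
    reader = sock.makefile("r", encoding="utf-8")

    # Step 2: receive the server greeting.
    greeting = json.loads(reader.readline())
    if "QMP" not in greeting:
        raise RuntimeError("unexpected greeting: %r" % (greeting,))

    # Step 3: send the capabilities request.
    request = json.dumps({"execute": "qmp_capabilities"}) + "\n"
    sock.sendall(request.encode("utf-8"))

    # Steps 4 and 5: the reply and any asynchronous events arrive on the
    # same stream, so every message is inspected the same way.
    for line in reader:
        if not line.strip():
            continue
        message = json.loads(line)
        if message.get("event") == "GUEST_PANICKED":
            return True
    return False
```

A caller would treat a True result as a failed build, whatever exit code the QEMU process itself returns.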
Ludovic Courtès wrote on 26 Mar 2020 23:57
Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 34276-done@debbugs.gnu.org)
877dz6911q.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> The other option would be to create a special file in the 9p mount
> that’s shared with the host upon success, but that seems a bit hacky.

Turns out that was easily done and better than the status quo.
Done in commit be6520e6a58d7f6ee58f4cab76db9d1245410113!

Ludo’.
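
For readers who want the gist of the "special file" approach, here is a minimal host-side sketch in Python. It is not the code from the commit mentioned above. The function name `run_and_require_marker` and the marker name `.success` are hypothetical, and the sketch assumes the guest creates that file in the shared directory only when its builder succeeds.

```python
import os
import subprocess


def run_and_require_marker(command, shared_dir, marker=".success"):
    """Run COMMAND (e.g. the VM launcher) and require a success marker.

    The guest is assumed to create MARKER inside SHARED_DIR only when
    its builder succeeded.  Return True on success and raise
    RuntimeError otherwise, even if COMMAND itself exited with 0.
    """
    marker_path = os.path.join(shared_dir, marker)

    # Remove a stale marker so an earlier run cannot mask a failure.
    if os.path.exists(marker_path):
        os.remove(marker_path)

    status = subprocess.call(command)
    if status != 0:
        raise RuntimeError("command exited with status %d" % status)

    # A zero exit status is not enough: the VM may have rebooted after
    # an error, as in the log at the top of this report.
    if not os.path.exists(marker_path):
        raise RuntimeError("command exited with 0 but left no success marker")
    return True
```

The point of the design is that success must be asserted explicitly by the guest, so a VM that merely terminates cleanly after an error no longer counts as a successful build.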
Closed

This issue is archived.

To comment on this conversation send email to 34276@debbugs.gnu.org