/run/booted-system is not protected from GC

Status: Done
Participants: Ludovic Courtès
Owner: unassigned
Submitted by: Ludovic Courtès
Severity: important
Ludovic Courtès wrote on 25 Feb 2021 10:00
/run/booted-system can be removed by ‘guix system delete-generations’
(address . bug-guix@gnu.org)
878s7c1pzh.fsf@inria.fr
/run/booted-system is not protected from GC. Here’s what I observed on
a machine with unattended upgrades (which includes automatic removal of
old system generations):

~$ ls -l /run/booted-system
lrwxrwxrwx 1 root root 33 Nov 2 16:06 /run/booted-system -> /var/guix/profiles/system-68-link
~$ ls -l /var/guix/profiles/system-68-link
ls: cannot access '/var/guix/profiles/system-68-link': No such file or directory
~$ ls -lrt /var/guix/profiles/system-*-link
lrwxrwxrwx 1 root root 50 Nov 29 01:34 /var/guix/profiles/system-74-link -> /gnu/store/ym7bs9pp9lxy0s1pjfrbic0pjjr7svzd-system
lrwxrwxrwx 1 root root 50 Dec 6 01:33 /var/guix/profiles/system-75-link -> /gnu/store/ivxak4d58gqz2xqihkc636nhwhpa1fs4-system
lrwxrwxrwx 1 root root 50 Dec 13 01:35 /var/guix/profiles/system-76-link -> /gnu/store/wqpwlqlfsc4yqm0nypzvan1a8sb9xmcc-system
lrwxrwxrwx 1 root root 50 Dec 27 01:33 /var/guix/profiles/system-77-link -> /gnu/store/y539xw934mbdcqidg6zaxrzq9hy8hm9p-system
lrwxrwxrwx 1 root root 50 Jan 4 09:29 /var/guix/profiles/system-78-link -> /gnu/store/3582jh1v9vn51wasyl1y189ng4vhqiy9-system
lrwxrwxrwx 1 root root 50 Jan 10 01:33 /var/guix/profiles/system-79-link -> /gnu/store/q67avpz4bfhq2zyfhh8ka6q9hpqzc3xj-system
lrwxrwxrwx 1 root root 50 Jan 17 01:33 /var/guix/profiles/system-80-link -> /gnu/store/13bwrvjgsl16sigwpa93yr4r51qnm8zi-system
lrwxrwxrwx 1 root root 50 Jan 19 14:23 /var/guix/profiles/system-81-link -> /gnu/store/v151wf6lj4ivgj3xwysi9fdmva55jzqp-system
lrwxrwxrwx 1 root root 50 Jan 24 01:35 /var/guix/profiles/system-82-link -> /gnu/store/rvarwdsymd94am8bc8b1rx2xdrxcvx6l-system
lrwxrwxrwx 1 root root 50 Jan 31 01:33 /var/guix/profiles/system-83-link -> /gnu/store/0sgd2yb702483zi3hl04wv4r4rn3ibcy-system
lrwxrwxrwx 1 root root 50 Feb 7 01:32 /var/guix/profiles/system-84-link -> /gnu/store/bix4yp9zs2h3vy8zi9ap9mazap727hng-system
lrwxrwxrwx 1 root root 50 Feb 14 01:33 /var/guix/profiles/system-85-link -> /gnu/store/702l59w3gbsc45c7nffsyv89vnaky5zc-system
lrwxrwxrwx 1 root root 50 Feb 21 01:34 /var/guix/profiles/system-86-link -> /gnu/store/qq4rz2fprvnsgqhj24v735hhmp189jl8-system

This is bad but mostly harmless, since the packages actually in use by
running processes are GC-protected anyway (they are the store items
reported by ‘guix gc --list-busy’).
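
The dangling link in the listing above can be reproduced outside of Guix. The following is a minimal sketch, not the actual Guix code: a throwaway directory stands in for /run, /var/guix/profiles and /gnu/store, the store item name ‘aaaa-system’ is made up, and a plain ‘rm’ stands in for what removing an old generation does to its link.

```shell
#!/bin/sh
# Sketch only: a temporary directory mimics /run, /var/guix/profiles and /gnu/store.
set -eu
root=$(mktemp -d)
mkdir -p "$root/run" "$root/profiles" "$root/store/aaaa-system"

# Generation link -> store item, as under /var/guix/profiles.
ln -s "$root/store/aaaa-system" "$root/profiles/system-68-link"
# booted-system -> generation link: the indirection observed in this report.
ln -s "$root/profiles/system-68-link" "$root/run/booted-system"

# Stand-in for deleting the old generation.
rm "$root/profiles/system-68-link"

# -L tests the symlink itself; -e follows it and fails when the target is gone.
if [ -L "$root/run/booted-system" ] && [ ! -e "$root/run/booted-system" ]; then
  status=dangling
else
  status=ok
fi
echo "booted-system is $status"
rm -rf "$root"
```

The store item itself is untouched here; the sketch only shows that the link breaks as soon as the intermediate generation link disappears.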

It breaks things like ‘guix deploy’ though. Specifically, its remote
initrd module check in (gnu machine ssh) looks for
/run/booted-system/kernel/lib/modules/KERNEL-VERSION/modules.alias, via
‘missing-modules’ of (gnu build linux-modules). That code throws
because ‘modules.alias’, which is expected to exist, cannot be found;
this in turn leads ‘guix deploy’ to crash badly:

$ guix time-machine -C channels.scm -- deploy deploy.scm
The following 1 machine will be deployed:
guix-hpc

guix deploy: deploying to guix-hpc...
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
The following derivations will be built:
/gnu/store/nh0n44vb4i445msj6g6i0kwyp3jgs39c-remote-exp.scm.drv
/gnu/store/44271p6fvygx0fa7dvjhmqz7jwnnr9sc-remote-assertion.scm.drv
/gnu/store/5nqzxyal9lhhga7bz9qf6hvml5v862xw-remote-assertion.scm.drv

0.0 MB will be downloaded
downloading from https://ci.guix.gnu.org/nar/lzip/1q9118pw4d18ihj91csfilbg6x2x29am-module-import-compiled ...
module-import-compiled 8KiB 1.7MiB/s 00:00 [##################] 100.0%

building /gnu/store/44271p6fvygx0fa7dvjhmqz7jwnnr9sc-remote-assertion.scm.drv...
building /gnu/store/5nqzxyal9lhhga7bz9qf6hvml5v862xw-remote-assertion.scm.drv...
building /gnu/store/nh0n44vb4i445msj6g6i0kwyp3jgs39c-remote-exp.scm.drv...
guix deploy: sending 8 store items (5 MiB) to 'localhost'...

FORMAT: error with call: (format #f "missing modules for ~a:~{ ~a~}<===~%" #<file-system-label "root"> ===>#f )
expected a list argument
FORMAT: INTERNAL ERROR IN FORMAT-ERROR!
destination: #f
format string: "missing modules for ~a:~{ ~a~}~%"
format args: (#<file-system-label "root"> #f)
error args: (#f "error in format" () #f)
Backtrace:
In guix/store.scm:
1305:8 19 (call-with-build-handler #<procedure 7f0d3cc35240 at guix/ui.scm:1171:2 (continue store things mode)> _)
In guix/scripts/deploy.scm:
170:14 18 (_)
In guix/store.scm:
1346:2 17 (map/accumulate-builds #<store-connection 256.99 7f0d39c6c000> _ _)
In srfi/srfi-1.scm:
586:17 16 (map1 (#<<unresolved> things: (("/gnu/store/b5nnbpgkvgdpzgvj67539ylcaqacj90l-guile-3.0.2.drv" . "out"…>))
In guix/store.scm:
1305:8 15 (call-with-build-handler #<procedure build-accumulator (continue store things mode)> _)
In ice-9/boot-9.scm:
1736:10 14 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/scripts/deploy.scm:
144:6 13 (_)
In guix/store.scm:
2066:24 12 (run-with-store #<store-connection 256.99 7f0d39c6c000> _ #:guile-for-build _ #:system _ #:target _)
In gnu/machine/ssh.scm:
445:2 11 (_ _)
338:4 10 (_ _)
In srfi/srfi-1.scm:
650:11 9 (for-each #<procedure 7f0d3e372d28 at gnu/machine/ssh.scm:338:14 (proc value)> _ _)
In gnu/machine/ssh.scm:
275:26 8 (_ #f)
In ice-9/format.scm:
1546:2 7 (format #f "missing modules for ~a:~{ ~a~}~%" #<file-system-label "root"> #f)
571:24 6 (format:format-work "missing modules for ~a:~{ ~a~}~%" (#<file-system-label "root"> #f))
In ice-9/boot-9.scm:
1736:10 5 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In ice-9/format.scm:
102:10 4 (_)
In ice-9/boot-9.scm:
1669:16 3 (raise-exception _ #:continuable? _)
1669:16 2 (raise-exception _ #:continuable? _)
1669:16 1 (raise-exception _ #:continuable? _)
1669:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1669:16: In procedure raise-exception:
error in format
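
The failing lookup can be illustrated in isolation. This is only a sketch under stated assumptions: the directory layout is a stand-in, the kernel version ‘5.10.0’ and the store item name are hypothetical, and the check is a plain file test rather than the ‘missing-modules’ procedure.

```shell
#!/bin/sh
# Sketch only: shows that a file below booted-system becomes unreachable
# once the generation link that booted-system points to is removed.
set -eu
root=$(mktemp -d)
version=5.10.0   # hypothetical kernel version
moddir="$root/store/aaaa-system/kernel/lib/modules/$version"
mkdir -p "$root/run" "$root/profiles" "$moddir"
: > "$moddir/modules.alias"

ln -s "$root/store/aaaa-system" "$root/profiles/system-68-link"
ln -s "$root/profiles/system-68-link" "$root/run/booted-system"

alias_file="$root/run/booted-system/kernel/lib/modules/$version/modules.alias"

# Lookup while the generation link still exists.
if [ -e "$alias_file" ]; then before=found; else before=missing; fi

# Stand-in for deleting the old generation.
rm "$root/profiles/system-68-link"

# Same lookup, now going through a dangling symlink.
if [ -e "$alias_file" ]; then after=found; else after=missing; fi

echo "before: $before, after: $after"
rm -rf "$root"
```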

Ludo’.
Ludovic Courtès wrote on 25 Feb 2021 10:59
control message for bug #46767
(address . control@debbugs.gnu.org)
871rd41n8w.fsf@gnu.org
severity 46767 important
quit
Ludovic Courtès wrote on 25 Feb 2021 11:00
(address . control@debbugs.gnu.org)
87zgzszcuy.fsf@gnu.org
retitle 46767 /run/booted-system is not protected from GC
quit
Ludovic Courtès wrote on 25 Feb 2021 11:44
Re: bug#46767: /run/booted-system can be removed by ‘guix system delete-generations’
(address . 46767@debbugs.gnu.org)
87r1l4zat9.fsf@gnu.org
Before rebooting, I had:

$ ls -l /run/{current,booted}-system
lrwxrwxrwx 1 root root 33 Nov 2 16:06 /run/booted-system -> /var/guix/profiles/system-68-link
lrwxrwxrwx 1 root root 50 Feb 21 01:34 /run/current-system -> /gnu/store/qq4rz2fprvnsgqhj24v735hhmp189jl8-system

After rebooting:

$ ls -l /run/{current,booted}-system
lrwxrwxrwx 1 root root 33 Feb 25 10:28 /run/booted-system -> /var/guix/profiles/system-86-link
lrwxrwxrwx 1 root root 33 Feb 25 10:28 /run/current-system -> /var/guix/profiles/system-86-link

/run/booted-system is created in ‘shepherd-boot-gexp’ as a symlink to
whatever /run/current-system points to at boot time:

(define (shepherd-boot-gexp config)
  "Return a gexp starting the shepherd service."
  (let ((shepherd (shepherd-configuration-shepherd config))
        (services (shepherd-configuration-services config)))
    #~(begin
        ;; Keep track of the booted system.
        (false-if-exception (delete-file "/run/booted-system"))
        (symlink (readlink "/run/current-system")
                 "/run/booted-system")
        …)))

So the solution is to make sure /run/current-system always points to the
store item rather than to the /var/guix symlink in the first place.

/run/current-system is created from (gnu build activation). When
reconfiguring or deploying, the symlink points to $GUIX_NEW_SYSTEM,
which is set to the store item in (guix scripts system reconfigure).

But when booting, /run/current-system is symlinked to the value of the
‘--system’ kernel command-line argument, which is /var/guix/…. To
address that, we need to add a ‘canonicalize-path’ call so that the
symlink target is resolved to the store item.

Done in 412e4f081e9cdf38db9859e1548ef2362cde678e.
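
The effect of resolving the path first can be sketched in shell. This is an illustration under assumptions, not the code from that commit: ‘readlink -f’ (as provided by GNU coreutils) plays the role of ‘canonicalize-path’, the directory layout and the name ‘aaaa-system’ are stand-ins, and the garbage collector is not modelled.

```shell
#!/bin/sh
# Sketch only: resolve the generation link before creating current-system,
# so that booted-system ends up pointing at the store item itself.
set -eu
# Canonicalize the temporary directory so that path comparisons are stable.
root=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$root/run" "$root/profiles" "$root/store/aaaa-system"
ln -s "$root/store/aaaa-system" "$root/profiles/system-86-link"

# What the kernel command line provides: the generation link, not the store item.
system_arg="$root/profiles/system-86-link"

# Follow all symlinks up front (stand-in for 'canonicalize-path').
resolved=$(readlink -f "$system_arg")
ln -s "$resolved" "$root/run/current-system"

# Same one-level copy of the link target as in shepherd-boot-gexp.
ln -s "$(readlink "$root/run/current-system")" "$root/run/booted-system"

# Removing the generation link no longer affects booted-system.
rm "$root/profiles/system-86-link"
target=$(readlink "$root/run/booted-system")
if [ -e "$root/run/booted-system" ]; then status=ok; else status=dangling; fi

echo "booted-system -> ${target#"$root"} ($status)"
rm -rf "$root"
```

Because the link target is now the store item, /run/booted-system no longer depends on any generation link staying in place.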

Ludo’.
Ludovic Courtès wrote on 25 Feb 2021 11:44
control message for bug #46767
(address . control@debbugs.gnu.org)
87pn0ozat0.fsf@gnu.org
tags 46767 fixed
close 46767
quit

This issue is archived.

To comment on this conversation send an email to 46767@debbugs.gnu.org
