Impossible to remove all offload machines

  • Open
Details
3 participants
  • Ian Eure
  • Maxim Cournoyer
  • Tomas Volf
Owner
unassigned
Submitted by
Ian Eure
Severity
normal
Ian Eure wrote on 17 Aug 18:40 +0200
(address . bug-guix@gnu.org)(name . guix-devel)(address . guix-devel@gnu.org)
87plq75cbc.fsf@meson
Ran into this issue last week. If you:

- Configure some offload build machines in your operating-system
configuration.
- Reconfigure your system.
- Remove all offload build machines.
- Reconfigure your system again.

...then various guix operations will still try to connect to
offload machines, even if you reboot the affected client.

This is caused by a bug in the `guix-activation' procedure:

;; ... and /etc/guix/machines.scm.
#$(if (null? (guix-configuration-build-machines config))
      #~#f
      (guix-machines-files-installation
       #~(list #$@(guix-configuration-build-machines
                   config))))

If there are no build machines defined in the configuration, no
operation is performed (#f is returned), which leaves the previous
generation’s /etc/guix/machines.scm in place.
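
The effect can be reproduced outside Guix with a minimal Guile sketch.
It is only a toy model of the behaviour described above: `activate!',
the temporary file name and the machine name are hypothetical stand-ins,
not the real `guix-activation' code.

```scheme
;; Toy model of the activation step; NOT the real Guix code.
;; `activate!' is a hypothetical stand-in for the gexp quoted above.
(define (activate! file machines)
  "Write MACHINES to FILE, or do nothing when MACHINES is empty."
  (if (null? machines)
      #f                                  ; mirrors the #~#f branch
      (call-with-output-file file
        (lambda (port) (write machines port)))))

(define (stale-after-empty-reconfigure?)
  "Return #t if a file written by one activation survives a later
activation that has no machines."
  (let ((file (string-append "/tmp/machines-demo-"
                             (number->string (getpid)) ".scm")))
    (activate! file '("builder.example.org")) ; first reconfigure
    (activate! file '())                      ; machines removed
    (let ((stale? (file-exists? file)))
      (when stale? (delete-file file))        ; clean up the demo file
      stale?)))

(display (stale-after-empty-reconfigure?))
(newline)
```

Because the empty branch never touches the file, the second call leaves
the first call's output on disk.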

The same issue appears to affect channels:

;; ... and /etc/guix/channels.scm...
#$(and channels (install-channels-file channels))

I’d be happy to take a stab at fixing this, but I’m not certain what
direction to go, or how much to refactor to get there. Should the
channels/machines files be removed (ignoring errors if they don’t
exist)? Should empty files be installed? Should that happen inline
in `guix-activation', or in another procedure? Should the filenames be
extracted to %variables to avoid duplicating them between the two places
they’ll be used?

If someone would like to provide answers, I would contribute a patch.

Thanks,

— Ian
Maxim Cournoyer wrote on 14 Sep 16:55 +0200
(name . Ian Eure)(address . ian@retrospec.tv)
87zfoaqo7p.fsf@gmail.com
Hi Ian,

Ian Eure <ian@retrospec.tv> writes:

> Ran into this issue last week. If you:
>
> - Configure some offload build machines in your operating-system
> configuration.
> - Reconfigure your system.
> - Remove all offload build machines.
> - Reconfigure your system again.
>
> ...then various guix operations will still try to connect to offload
> machines, even if you reboot the affected client.
>
> This is caused by a bug in the `guix-activation' procedure:
>
> ;; ... and /etc/guix/machines.scm.
> #$(if (null? (guix-configuration-build-machines config))
>       #~#f
>       (guix-machines-files-installation
>        #~(list #$@(guix-configuration-build-machines
>                    config))))
>
> If there are no build machines defined in the configuration, no
> operation is performed (#f is returned), which leaves the previous
> generation’s /etc/guix/machines.scm in place.
>
> The same issue appears to affect channels:
>
> ;; ... and /etc/guix/channels.scm...
> #$(and channels (install-channels-file channels))

Interesting!

> I’d be happy to take a stab at fixing this, but I’m not certain what
> direction to go, or how much to refactor to get there. Should the
> channels/machines files be removed (ignoring errors if they don’t
> exist)? Should empty files be installed? Should that happen inline
> in `guix-activation', or in another procedure? Should the filenames be
> extracted to %variables to avoid duplicating between the two places
> they’ll be used?
>
> If someone would like to provide answers, I would contribute a patch.

I guess the simplest would be to attempt to remove the files when there
are no offload machines or channels, in this already existing activation
procedure. Extracting the file names to %variables sounds preferable,
yes, if there's a logical place to store them that is easily shared.

A patch would be dandy!

--
Thanks,
Maxim
Ian Eure wrote on 15 Sep 05:24 +0200
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87plp560ji.fsf@meson
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

[...]

> I guess the simplest would be to attempt to remove the files when there
> are no offload machines or channels, in this already existing activation
> procedure. Extracting the file names to %variables sounds preferable,
> yes, if there's a logical place to store them that is easily shared.

As I was putting together a patch for this, I realized there’s a
problem: if a user is *manually* managing either
/etc/guix/machines.scm or channels.scm, these files would be
deleted, which likely isn’t what they want. The current code lets
users choose to manage these files manually or declaratively, and
there’s no way to know if the files on disk are the result of a
previous system generation or a user’s creation. Since the
channel management is a relatively new feature, I suspect there
are quite a few folks with manually-managed channels that this
would negatively impact. I know there was some disruption just
moving to declarative management of channels (but I can’t find the
thread(s) at the moment).

I don’t see an elegant technical solution to this. I think the
best option is probably to say that those files should *always* be
managed through operating-system, and put a fat warning in the
channel news to update your config if they’re still handled
manually.

The only other option I can see would be to keep the existing
filenames for user configuration, and declaratively manage
different files, like declarative-channels.scm. This comes with
its own set of problems, like needing to update the Guix daemon to
read and combine multiple files; and the inability to know whether
a given `channels.scm' is declaratively or manually managed means
a bumpy upgrade path (e.g., should this preexisting channels.scm
file be left as-is, or renamed to the new name?).

I’m inclined to go with the fat-warning option, but am also
thinking this likely needs some guix-devel discussion.

What do you think?

Thanks,

— Ian
Ian Eure wrote on 15 Sep 05:53 +0200
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87ldzt606c.fsf@meson
Hi Maxim,

Ian Eure <ian@retrospec.tv> writes:

[...]

Disregard this. I continued thinking after sending the email (as
one does) and realized that any managed file will be a link into
the store. So if the system is reconfigured with no
build-machines or channels *and* the corresponding file is a store
link, it should be removed; otherwise, it should remain untouched.
I can work with this.
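
A minimal Guile sketch of that check, under stated assumptions: the
store is taken to live at the default /gnu/store, and
`%assumed-store-prefix', `store-link?' and `remove-if-managed!' are
hypothetical names used for illustration, not existing Guix procedures.

```scheme
;; Assumption: the store lives at the default /gnu/store location.
(define %assumed-store-prefix "/gnu/store/")

(define (store-link? file)
  "Return #t if FILE is a symlink whose target is inside the store."
  (let ((st (false-if-exception (lstat file))))
    (and st
         (eq? 'symlink (stat:type st))
         (string-prefix? %assumed-store-prefix (readlink file)))))

(define (remove-if-managed! file)
  "Delete FILE only when it is a store link; return #t if it was deleted."
  (and (store-link? file)
       (begin (delete-file file) #t)))
```

The empty-configuration branch could then call something like
(remove-if-managed! "/etc/guix/machines.scm"). Note that a check of this
kind treats every store symlink as managed by the service, whatever
created it.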

Thanks,

— Ian
Tomas Volf wrote on 15 Sep 21:06 +0200
(name . Ian Eure)(address . ian@retrospec.tv)
875xqwivoa.fsf@wolfsden.cz
Hello,

Ian Eure <ian@retrospec.tv> writes:

> Disregard this, I continued thinking after sending the email (as one does) and
> realized that any managed file will be a link into the store -- so if the system
> is reconfigured with no build-machines or channels *and* the corresponding file
> is a store link, it should be removed; otherwise, it should remain untouched. I
> can work with this.

Will this correctly handle cases where the user is managing the file
using, for example, extra-special-file?

I wonder whether the fat-warning approach would be better.

Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Ian Eure wrote on 19 Sep 02:35 +0200
(name . Tomas Volf)(address . ~@wolfsden.cz)
874j6c5vk3.fsf@meson
Tomas Volf <~@wolfsden.cz> writes:

> Hello,
>
> Ian Eure <ian@retrospec.tv> writes:
>
>> Disregard this, I continued thinking after sending the email (as one
>> does) and realized that any managed file will be a link into the store
>> -- so if the system is reconfigured with no build-machines or channels
>> *and* the corresponding file is a store link, it should be removed;
>> otherwise, it should remain untouched. I can work with this.
>
> Will this correctly handle cases where the user is managing the file
> using, for example, extra-special-file?

No, it wouldn’t.


> I wonder whether fat-warning approach would not be better.
>

I think I agree.

— Ian
Maxim Cournoyer wrote on 22 Sep 04:26 +0200
(name . Ian Eure)(address . ian@retrospec.tv)
87setsmnj2.fsf@gmail.com
Hi Ian,

Ian Eure <ian@retrospec.tv> writes:

[...]

> The only other option I can see would be to keep the existing
> filenames for user configuration, and declaratively manage different
> files -- like declarative-channels.scm. This comes with its own set
> of problems, like needing to update the Guix daemon to read and
> combine multiple files; and the inability to know whether a given
> `channels.scm' is declaratively or manually managed means a bumpy
> upgrade path (e.g., should this preexisting channels.scm file be left
> as-is, or renamed to the new name?)

I think that would be a great option to pursue, although it needs more
work and more thought. Perhaps it could work along these lines
(brainstorming):

I like the idea of leaving the original, potentially manually written
file in place and complementing it with a declarative counterpart. The
same would also have benefited /etc/guix/acl, which suffers from the
same ambiguity.

--
Thanks,
Maxim
Ian Eure wrote on 1 Dec 20:05 +0100
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87ldwzmdfi.fsf@retrospec.tv
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hi Ian,
>
> Ian Eure <ian@retrospec.tv> writes:
>
> [...]
>
>> The only other option I can see would be to keep the existing
>> filenames for user configuration, and declaratively manage different
>> files -- like declarative-channels.scm. This comes with its own set
>> of problems, like needing to update the Guix daemon to read and
>> combine multiple files; and the inability to know whether a given
>> `channels.scm' is declaratively or manually managed means a bumpy
>> upgrade path (e.g., should this preexisting channels.scm file be left
>> as-is, or renamed to the new name?)
>
> I think that would be a great option to pursue, although it needs more
> work and more thought. Perhaps it could work along these lines
> (brainstorming):
>
> I like the idea of leaving the original, potentially manually written
> file in place and complementing it with a declarative counterpart. The
> same would also have benefited /etc/guix/acl, which suffers from the
> same ambiguity.

Apologies for the silence; life stuff has been eating most of my
free time, but I have a bit of bandwidth to spend on this problem
again.

I took a swing at this, and it wasn’t as difficult as I expected.
While this approach gives a smooth upgrade path for those who’ve
configured channels in a stateful way and are switching to
declarative configuration, it’s possibly bumpy for those already
using a declarative config. If a machine with declarative channels
is reconfigured, the channels will be duplicated from
/etc/guix/channels.scm to /etc/guix/channels-declarative.scm.
Using `delete-duplicates' on the merged channels should avoid
major problems, but I think it still needs a loud entry in news
and manual action (deleting /etc/guix/channels.scm) to upgrade.
Given that both approaches will require manual action, I’m a bit
inclined to go with the simpler one and take over the existing file.
That said, I think the failure mode of the simpler approach
(stomping on channels a user may have configured) is undeniably
worse than potentially duplicating channels or continuing to pull
in old ones unexpectedly. Do either of you have a strong opinion
or more information which would help guide this decision?
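
The merge step mentioned above can be sketched in plain Guile. This is
an illustration only, not the draft patch: channels are modelled as
(NAME URL) lists rather than real Guix channel records, and
`merge-channels' and the example.org URLs are made up for the example.

```scheme
(use-modules (srfi srfi-1))   ; delete-duplicates

;; Channels are modelled as (NAME URL) lists purely for illustration;
;; real Guix channels are records.
(define manual-channels
  '((guix  "https://example.org/guix.git")
    (extra "https://example.org/extra.git")))

(define declarative-channels
  '((guix "https://example.org/guix.git")))

(define (merge-channels manual declarative)
  "Append both lists and drop repeated entries, keeping first occurrences."
  (delete-duplicates (append manual declarative)))

(display (merge-channels manual-channels declarative-channels))
(newline)
```

With the default `equal?' comparison only exact copies are dropped; two
entries with the same name but different URLs would both survive the
merge.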

The root issue behind all these problems is that activation code
only sees the desired target config, rather than both the current
and target configs. Comparing the current and target configs would
allow the code to more precisely compute the change needed to move
from one state to the next. I think that could be a good change to
make, though it’s obviously going to be much more involved, and IMO
will require discussion outside the scope of this specific bug.
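
To make the idea concrete, here is a small Guile sketch of computing a
transition from two states. It assumes each config can be reduced to a
plain list of entries; `plan-transition' is a hypothetical helper, and
nothing like it exists in the activation code today.

```scheme
(use-modules (srfi srfi-1))   ; lset-difference

(define (plan-transition current target)
  "Return two values: entries to remove and entries to add when moving
from the CURRENT list to the TARGET list."
  (values (lset-difference equal? current target)
          (lset-difference equal? target current)))

;; Going from one machine to none yields an explicit removal.
(call-with-values
    (lambda () (plan-transition '("builder.example.org") '()))
  (lambda (to-remove to-add)
    (format #t "remove: ~s  add: ~s~%" to-remove to-add)))
```

With both states available, "no machines configured" becomes an explicit
removal rather than a no-op.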

I have a draft patch series I hope to send up soon, but need to
get Guix System up in a VM to test first. It does separate
declarative channels into their own config, but doesn’t do the
same for build machines. While I think many fewer users configure
build machines than channels, it’s probably a good idea to use the
same approach for both channels and machines.

— Ian
To comment on this conversation send an email to 72686@debbugs.gnu.org
