Services depending on new Shepherd features may fail until reboot

  • Open
  • quality assurance status badge
Details
3 participants
  • Ludovic Courtès
  • Maxim Cournoyer
  • Maxime Devos
Owner
unassigned
Submitted by
Maxim Cournoyer
Severity
normal
M
M
Maxim Cournoyer wrote on 11 Jun 2022 07:53
jami service failing following 'guix deploy' update
(name . bug-guix)(address . bug-guix@gnu.org)
87a6ajg2zv.fsf@gmail.com
Hello Guix!

After having fixed the tests of the jami-service-type and pushed the fix
as 85b4dabd94d53f8179f31a42046cd83fc3a352fc, I was confident it would
work, but my freshly 'guix deploy'ed machine says otherwise:

Toggle snippet (17 lines)
$ sudo herd stop jami
$ sudo herd start jami
Service jami-dbus-session has been started.
herd: exception caught while executing 'start' on service 'jami':
Unbound variable: jami-service-available?

$ guix system describe
Generation 141 Jun 11 2022 01:38:12 (current)
file name: /var/guix/profiles/system-141-link
canonical file name: /gnu/store/vx9sd4vb2yfv0zhycd461m9wfvgzclsp-system
label: GNU with Linux-Libre 5.17.14
bootloader: grub-efi
root device: label: "btrfs-pool-1"
kernel: /gnu/store/zf4062hz23485dp1xr8w6zbk2m8igpsk-linux-libre-5.17.14/bzImage
configuration file: /gnu/store/n2wqba6npybjd8i730cpi9qc61p16jkr-configuration.scm

I don't get it; how can the service runs fine in the instrumented VMs
the system tests use, and fail in my updated machine? Could it be a
fault in 'guix deploy'?

To be investigated...

Thanks,

Maxim
M
M
Maxime Devos wrote on 11 Jun 2022 16:51
5521716772922285f7c6bc381c82613026eebbd9.camel@telenet.be
Maxim Cournoyer schreef op za 11-06-2022 om 01:53 [-0400]:
Toggle quote (4 lines)
> I don't get it; how can the service runs fine in the instrumented VMs
> the system tests use, and fail in my updated machine?  Could it be a
> fault in 'guix deploy'?

Maybe the shepherd has the old (gnu build jami-service) module loaded
and it doesn't know know to reload modules during reconfiguration?

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYqSr2RccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7k8FAP9hlar7xP3IHawtI1wLXjXnxRp0
cbdlFmEs08Ex8uCz3wD/YAiRVRIL6f18vnmp7Er9gvzM/3bm9Uq9iT4YCswNIAs=
=BDRp
-----END PGP SIGNATURE-----


M
M
Maxim Cournoyer wrote on 14 Jun 2022 21:40
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 55898@debbugs.gnu.org)
87bkuvdoe4.fsf@gmail.com
Hello Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

Toggle quote (8 lines)
> Maxim Cournoyer schreef op za 11-06-2022 om 01:53 [-0400]:
>> I don't get it; how can the service runs fine in the instrumented VMs
>> the system tests use, and fail in my updated machine?  Could it be a
>> fault in 'guix deploy'?
>
> Maybe the shepherd has the old (gnu build jami-service) module loaded
> and it doesn't know know to reload modules during reconfiguration?

If true, that would indeed explain it.

Thanks,

Maxim
M
M
Maxim Cournoyer wrote on 24 Jun 2022 19:52
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 55898@debbugs.gnu.org)
87v8sqx83c.fsf@gmail.com
retitle 55898 jami service fails to start following reconfigure
thanks

Hello,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

Toggle quote (12 lines)
> Hello Maxime,
>
> Maxime Devos <maximedevos@telenet.be> writes:
>
>> Maxim Cournoyer schreef op za 11-06-2022 om 01:53 [-0400]:
>>> I don't get it; how can the service runs fine in the instrumented VMs
>>> the system tests use, and fail in my updated machine?  Could it be a
>>> fault in 'guix deploy'?
>>
>> Maybe the shepherd has the old (gnu build jami-service) module loaded
>> and it doesn't know know to reload modules during reconfiguration?

The module seems to be simply missing, according to:

Toggle snippet (17 lines)
$ guix gc -R /gnu/store/sq7krjjpwbkr3z573flbnvkml1574mn5-system | grep jami
/gnu/store/vkgamffkm92l3xdzid42k4lcz6aqfj7i-ffmpeg-jami-4.4.2
/gnu/store/kk3dzx2xsa135d1i5jsjm8i787gbl56i-pjproject-jami-2.11-0.e1f389d
/gnu/store/fyd7rmvzhhqbk1f08c4pl7ahhlfgig40-shepherd-jami.scm
/gnu/store/kqiqnza4l0jawrs0mszymj8diaa2j97m-shepherd-file-system--home-jenkins-jami.scm
/gnu/store/yp9awyfgiym32card9w5mds8id6d6d0l-shepherd-file-system--home-jenkins-jami-workspace.scm
/gnu/store/xib9gc60a8bbff99cffh2x74gqpszf0i-shepherd-jami-dbus-session.scm
/gnu/store/q00v0f7syc1b6phfq4gih8i9irnm862w-dbus-for-jami-1.12.20
/gnu/store/dqfply51lzqc5z697k98avigsv21qm8q-libjami-20211223.2.37be4c3
/gnu/store/5w1zqbwagkhavqs7xjbzb8m7j978dcwj-shepherd-file-system--var-cache-jami.scm
/gnu/store/njrxi4apky4ckb2py9qz0ciz0b92smrd-shepherd-jami.go
/gnu/store/kciz8nady3rc5jd9j67bmlzyn622j5md-shepherd-file-system--home-jenkins-jami.go
/gnu/store/ddxa8yxqh1c3h6iax2x24wj0lfxrx8c6-shepherd-file-system--home-jenkins-jami-workspace.go
/gnu/store/d54hhmd90h7q4qmnb3q6ngsdp9457r80-shepherd-jami-dbus-session.go
/gnu/store/59lizyj4miag5if9ylhk383qr1qbxw1h-shepherd-file-system--var-cache-jami.go

After a 'guix system reconfigure' that successfully changed
/run/current-system.

I was expecting the module should have been pulled in the closure via
the encapsulating:

Toggle snippet (12 lines)
(with-extensions (list guile-packrat ;used by guile-ac-d-bus
guile-ac-d-bus
;; Fibers is needed to provide the non-blocking
;; variant of the 'sleep' procedure.
guile-fibers)
(with-imported-modules (source-module-closure
'((gnu build dbus-service)
(gnu build jami-service)
(gnu build shepherd)
(gnu system file-systems)))

in (gnu services telephony) around line 312.

Thoughts?

Maxim
M
M
Maxim Cournoyer wrote on 24 Jun 2022 20:01
Re: bug#55898: jami service failing following reconfigure
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 55898@debbugs.gnu.org)
87r13ex7nw.fsf_-_@gmail.com
Hi again,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

Toggle quote (12 lines)
> Hello Maxime,
>
> Maxime Devos <maximedevos@telenet.be> writes:
>
>> Maxim Cournoyer schreef op za 11-06-2022 om 01:53 [-0400]:
>>> I don't get it; how can the service runs fine in the instrumented VMs
>>> the system tests use, and fail in my updated machine?  Could it be a
>>> fault in 'guix deploy'?
>>
>> Maybe the shepherd has the old (gnu build jami-service) module loaded
>> and it doesn't know know to reload modules during reconfiguration?

I verified that in the
/gnu/store/fyd7rmvzhhqbk1f08c4pl7ahhlfgig40-shepherd-jami.scm file it
was setting up the load path with everything needed, such as

Toggle snippet (45 lines)
(eval-when
(expand load eval)
(let
((extensions
(quote
("/gnu/store/f6q4237n62lq7n8z3qyh3jx5iinb9myr-guile-packrat-0.1.1" "/gnu/store/l2f9gmd64w56nnhnlb63hg8f5crfvwln-guile-ac-d-bus-1.0.0-beta.0" "/gnu/store/is9f6ki7i2f6qk80ivvz7q1vvlibb96l-guile-fibers-1.0.0")))
(prepend
(lambda
(items lst)
(let loop
((items items)
(lst lst))
(if
(null? items)
lst
(loop
(cdr items)
(cons
(car items)
(delete
(car items)
lst))))))))
(set! %load-path
(prepend
(cons "/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import"
(map
(lambda
(extension)
(string-append extension "/share/guile/site/"
(effective-version)))
extensions))
%load-path))
(set! %load-compiled-path
(prepend
(cons "/gnu/store/zqgpayc87lfmcmncgzbp5v59hav8ww1c-module-import-compiled"
(map
(lambda
(extension)
(string-append extension "/lib/guile/"
(effective-version)
"/site-ccache"))
extensions))
%load-compiled-path))))

The /gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import directory
contains:

Toggle snippet (25 lines)
$ find /gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build/dbus-service.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build/file-systems.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build/jami-service.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build/linux-container.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/build/shepherd.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/system
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/system/file-systems.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/gnu/system/uuid.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/build
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/build/bournish.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/build/syscalls.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/build/utils.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/colors.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/i18n.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/profiling.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/diagnostics.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/memoization.scm
/gnu/store/7kgdg6dmgncqirj3k07n02hq6kjyf4an-module-import/guix/records.scm

and the referenced [...]/gnu/build/jami-service.scm file does contain
the supposedly missing 'jami-service-available?' procedure.

I'm suspecting that given the service makes use of Shepherd 0.9
features, perhaps it fails loading and the error is reported erroneously
that way... a reboot would tell but I'm not in a position to do that at
this moment (remote machine).

Thanks,

Maxim
M
M
Maxim Cournoyer wrote on 6 Jul 2022 23:38
Re: bug#55898: jami service failing following 'guix deploy' update
(name . Maxime Devos)(address . maximedevos@telenet.be)
87mtdl2a7u.fsf_-_@gmail.com
retitle 55898 Services depending on new Shepherd features may fail until
a reboot
thanks

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

[...]

Toggle quote (5 lines)
> I'm suspecting that given the service makes use of Shepherd 0.9
> features, perhaps it fails loading and the error is reported erroneously
> that way... a reboot would tell but I'm not in a position to do that at
> this moment (remote machine).

I confirm that following a reboot, the service runs normally.

Perhaps services should allow specifying the minimum required Shepherd
version, which Shepherd could ensure is met before attempting to restart
a service, printing something like:

'Could not restart service X due to unmet Shepherd version requirement;
the service will continue unchanged until the next reboot'

or something similar.

I've re-titled the bug, as this isn't specific to our jami service.

Thanks!

Maxim
M
M
Maxim Cournoyer wrote on 7 Jul 2022 00:01
(name . Maxime Devos)(address . maximedevos@telenet.be)
87ilo9294m.fsf@gmail.com
Hi again,

[...]

Toggle quote (11 lines)
> Perhaps services should allow specifying the minimum required Shepherd
> version, which Shepherd could ensure is met before attempting to restart
> a service, printing something like:
>
> 'Could not restart service X due to unmet Shepherd version requirement;
> the service will continue unchanged until the next reboot'
>
> or something similar.
>
> I've re-titled the bug, as this isn't specific to our jami service.

I've asked in #systemd about what a similar situation would happen in
systemd-land, and here's what I've learned:

1. service units aren't reloaded automatically after new versions of
them are installed -- this effectively prevent the breakage seen here
(the jami service was reloaded and restarting manually it caused it to
fail).

2. a savvy user can still opt to force the new service to be reloaded
via 'systemctl daemon-reload'. In case the service update depends on
new systemd features, systemd would need to be restarted itself, via
'systemctl daemon-reexec'. The later command is interesting, but its
documented as a debugging tool [0]:

daemon-reexec

Reexecute the systemd manager. This will serialize the manager
state, reexecute the process and deserialize the state again. This
command is of little use except for debugging and package
upgrades. Sometimes, it might be helpful as a heavy-weight
daemon-reload. While the daemon is being reexecuted, all sockets
systemd listening on behalf of user configuration will stay
accessible.


systemd folks told me it is not typically run in systemd package upgrade
hooks, but perhaps some distribution do this (I don't know).

So the situation is not very different in systemd vs shepherd, except
that we more aggressively load the new service definitions, potentially
leading to breakage.

Thoughts?

Maxim
M
M
Maxim Cournoyer wrote on 7 Jul 2022 15:16
control message for bug #55898
(address . control@debbugs.gnu.org)
87h73t12s8.fsf@gmail.com
retitle 55898 Services depending on new Shepherd features may fail until reboot
quit
L
L
Ludovic Courtès wrote on 20 Jul 2022 23:19
Re: bug#55898: Services depending on new Shepherd features may fail until reboot
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87a693pjm7.fsf_-_@gnu.org
Hi!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (9 lines)
>> Perhaps services should allow specifying the minimum required Shepherd
>> version, which Shepherd could ensure is met before attempting to restart
>> a service, printing something like:
>>
>> 'Could not restart service X due to unmet Shepherd version requirement;
>> the service will continue unchanged until the next reboot'
>>
>> or something similar.

Yes. The issue is that we’re more free-style than systemd: we’re
basically loading code live in the running Shepherd. So we have to
write that code such that it works with older Shepherd versions.

This is why we have things like conditions on

(defined? 'make-inetd-constructor)

and the likes, with a fallback.

Toggle quote (35 lines)
>> I've re-titled the bug, as this isn't specific to our jami service.
>
> I've asked in #systemd about what a similar situation would happen in
> systemd-land, and here's what I've learned:
>
> 1. service units aren't reloaded automatically after new versions of
> them are installed -- this effectively prevent the breakage seen here
> (the jami service was reloaded and restarting manually it caused it to
> fail).
>
> 2. a savvy user can still opt to force the new service to be reloaded
> via 'systemctl daemon-reload'. In case the service update depends on
> new systemd features, systemd would need to be restarted itself, via
> 'systemctl daemon-reexec'. The later command is interesting, but its
> documented as a debugging tool [0]:
>
> daemon-reexec
>
> Reexecute the systemd manager. This will serialize the manager
> state, reexecute the process and deserialize the state again. This
> command is of little use except for debugging and package
> upgrades. Sometimes, it might be helpful as a heavy-weight
> daemon-reload. While the daemon is being reexecuted, all sockets
> systemd listening on behalf of user configuration will stay
> accessible.
>
> [0] https://www.freedesktop.org/software/systemd/man/systemctl.html#
>
> systemd folks told me it is not typically run in systemd package upgrade
> hooks, but perhaps some distribution do this (I don't know).
>
> So the situation is not very different in systemd vs shepherd, except
> that we more aggressively load the new service definitions, potentially
> leading to breakage.

I think any upgrade, be it ‘guix system reconfigure’ or ‘apt
dist-upgrade’, is likely to end up loading new service definitions—Guix
System isn’t more aggressive in that respect.

The main difference is that our services are code and that they might
depend on specific Shepherd APIs. It’s a much more direct dependency
compared to .service files I guess.

Thanks,
Ludo’.
M
M
Maxim Cournoyer wrote on 21 Jul 2022 06:10
(name . Ludovic Courtès)(address . ludo@gnu.org)
87r12fqf6j.fsf@gmail.com
Hi Ludovic,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (23 lines)
> Hi!
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>>> Perhaps services should allow specifying the minimum required Shepherd
>>> version, which Shepherd could ensure is met before attempting to restart
>>> a service, printing something like:
>>>
>>> 'Could not restart service X due to unmet Shepherd version requirement;
>>> the service will continue unchanged until the next reboot'
>>>
>>> or something similar.
>
> Yes. The issue is that we’re more free-style than systemd: we’re
> basically loading code live in the running Shepherd. So we have to
> write that code such that it works with older Shepherd versions.
>
> This is why we have things like conditions on
>
> (defined? 'make-inetd-constructor)
>
> and the likes, with a fallback.

I saw that used somewhere, but I still think a minimally required
Shepherd version field could be of use on services, for the following
reasons:

1. Otherwise each services are left implementing ad-hoc solutions.

2. It's more complicated to be compatible with two Shepherd version than
simply mentioning the minimum version required, and prevent the service
from running until it is satisfied (especially on a system like Guix
System where we *know* what is the current version of Shepherd
available).

Thanks,

Maxim
L
L
Ludovic Courtès wrote on 29 Aug 2022 15:43
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87y1v75fo0.fsf@gnu.org
Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (2 lines)
> Ludovic Courtès <ludo@gnu.org> writes:

[...]

Toggle quote (22 lines)
>> Yes. The issue is that we’re more free-style than systemd: we’re
>> basically loading code live in the running Shepherd. So we have to
>> write that code such that it works with older Shepherd versions.
>>
>> This is why we have things like conditions on
>>
>> (defined? 'make-inetd-constructor)
>>
>> and the likes, with a fallback.
>
> I saw that used somewhere, but I still think a minimally required
> Shepherd version field could be of use on services, for the following
> reasons:
>
> 1. Otherwise each services are left implementing ad-hoc solutions.
>
> 2. It's more complicated to be compatible with two Shepherd version than
> simply mentioning the minimum version required, and prevent the service
> from running until it is satisfied (especially on a system like Guix
> System where we *know* what is the current version of Shepherd
> available).

I think it’s a situation similar to “feature tests vs. identity tests”
in build system configuration (checking whether the libc function you
want to use is available rather than checking whether ‘uname’ returns
“Linux”), and for the same reason I tend to prefer feature tests as
shown above.

I won’t pretend it’s pretty :-), but I don’t see an improvement feasible
in the short term.

In the long term, maybe we’d want the service API in the Shepherd to be
more declarative, more like packages in Guix. But that’s more for a 2.0
horizon IMO.

Perhaps we should close this issue unless it becomes actionable?

Thanks,
Ludo’.
M
M
Maxim Cournoyer wrote on 29 Aug 2022 23:06
(name . Ludovic Courtès)(address . ludo@gnu.org)
8735debvz7.fsf@gmail.com
Hi Ludovic,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (36 lines)
> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> Yes. The issue is that we’re more free-style than systemd: we’re
>>> basically loading code live in the running Shepherd. So we have to
>>> write that code such that it works with older Shepherd versions.
>>>
>>> This is why we have things like conditions on
>>>
>>> (defined? 'make-inetd-constructor)
>>>
>>> and the likes, with a fallback.
>>
>> I saw that used somewhere, but I still think a minimally required
>> Shepherd version field could be of use on services, for the following
>> reasons:
>>
>> 1. Otherwise each services are left implementing ad-hoc solutions.
>>
>> 2. It's more complicated to be compatible with two Shepherd version than
>> simply mentioning the minimum version required, and prevent the service
>> from running until it is satisfied (especially on a system like Guix
>> System where we *know* what is the current version of Shepherd
>> available).
>
> I think it’s a situation similar to “feature tests vs. identity tests”
> in build system configuration (checking whether the libc function you
> want to use is available rather than checking whether ‘uname’ returns
> “Linux”), and for the same reason I tend to prefer feature tests as
> shown above.

Agreed, but the context differs wildly: while Autoconf or browsers for
example really are facing a diversity of configuration, the version of
Shepherd used in Guix System is known and controlled. So the only
problems bound to happen are in this context:

1. New Shepherd version introduced in Guix (package upgrade).

2. New Shepherd features used by services.

3. Machine reconfigured using a commit including 1 and 2.

The problems are temporary: upon a reboot the running Shepherd version
will be the latest, and have all the features needed.

Hence my suggestion to use something simple to improve the user
experience of a user faced with 3.

Toggle quote (9 lines)
> I won’t pretend it’s pretty :-), but I don’t see an improvement feasible
> in the short term.
>
> In the long term, maybe we’d want the service API in the Shepherd to be
> more declarative, more like packages in Guix. But that’s more for a 2.0
> horizon IMO.
>
> Perhaps we should close this issue unless it becomes actionable?

It's a relatively narrow use case and it's relatively rare too, but I'd
err on keeping it open until it gets fixed, whether in a definitive
fashion or as a more limited one to help users facing 3. above.

Thanks, and welcome back!

Maxim
L
L
Ludovic Courtès wrote on 30 Aug 2022 09:33
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
878rn6423p.fsf@gnu.org
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (17 lines)
> Agreed, but the context differs wildly: while Autoconf or browsers for
> example really are facing a diversity of configuration, the version of
> Shepherd used in Guix System is known and controlled. So the only
> problems bound to happen are in this context:
>
> 1. New Shepherd version introduced in Guix (package upgrade).
>
> 2. New Shepherd features used by services.
>
> 3. Machine reconfigured using a commit including 1 and 2.
>
> The problems are temporary: upon a reboot the running Shepherd version
> will be the latest, and have all the features needed.
>
> Hence my suggestion to use something simple to improve the user
> experience of a user faced with 3.

So are you suggesting replacing:

(defined? 'make-inetd-constructor)

by something like:

(version<? shepherd-version "0.9.0")

or is it something different that you have in mind?

I’m not sure how this could improve the user experience, unless by
“user” you mean the person writing the service?

Thanks,
Ludo’.
M
M
Maxime Devos wrote on 30 Aug 2022 11:35
(address . 55898@debbugs.gnu.org)
c908b6d8-49ce-0eb4-ebe1-011e3c11699c@telenet.be
On 30-08-2022 09:33, Ludovic Courtès wrote:
Toggle quote (12 lines)
> So are you suggesting replacing:
>
> (defined? 'make-inetd-constructor)
>
> by something like:
>
> (version<? shepherd-version "0.9.0")
>
> or is it something different that you have in mind?
>
> I’m not sure how this could improve the user experience, unless by
> “user” you mean the person writing the service?
(defined? '...) does not work in all contexts -- it works on the
top-level, but not always inside a procedure, as it tests if the thing
is defined in (current-module), and not whether it is defined in the
module that calls (defined? '...).
Perhaps it works fine in the way it is used in Guix currently, but AFAIK
it is not guaranteed anywhere. As such, I cannot recommend defined?
here. The version<? does not have this problem.
A hypothetical defined-in-this-module macro would be ideal, but such a
thing does not exist yet.
Greetings,
Maxime.
Attachment: OpenPGP_signature
L
L
Ludovic Courtès wrote on 30 Aug 2022 15:50
(name . Maxime Devos)(address . maximedevos@telenet.be)
87fshdyh53.fsf@gnu.org
Maxime Devos <maximedevos@telenet.be> skribis:

Toggle quote (20 lines)
> On 30-08-2022 09:33, Ludovic Courtès wrote:
>
>> So are you suggesting replacing:
>>
>> (defined? 'make-inetd-constructor)
>>
>> by something like:
>>
>> (version<? shepherd-version "0.9.0")
>>
>> or is it something different that you have in mind?
>>
>> I’m not sure how this could improve the user experience, unless by
>> “user” you mean the person writing the service?
>
> (defined? '...) does not work in all contexts -- it works on the
> top-level, but not always inside a procedure, as it tests if the thing
> is defined in (current-module), and not whether it is defined in the
> module that calls (defined? '...).

Right, but that’s a bit of a theoretical concern in my view.

Alternatively, one could write:

(module-defined? (resolve-interface '(shepherd service))
'make-inetd-constructor)

Thanks,
Ludo’.
M
M
Maxim Cournoyer wrote on 1 Sep 2022 15:18
(name . Ludovic Courtès)(address . ludo@gnu.org)
87tu5r8c8m.fsf@gmail.com
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (31 lines)
> Hi Maxim,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Agreed, but the context differs wildly: while Autoconf or browsers for
>> example really are facing a diversity of configuration, the version of
>> Shepherd used in Guix System is known and controlled. So the only
>> problems bound to happen are in this context:
>>
>> 1. New Shepherd version introduced in Guix (package upgrade).
>>
>> 2. New Shepherd features used by services.
>>
>> 3. Machine reconfigured using a commit including 1 and 2.
>>
>> The problems are temporary: upon a reboot the running Shepherd version
>> will be the latest, and have all the features needed.
>>
>> Hence my suggestion to use something simple to improve the user
>> experience of a user faced with 3.
>
> So are you suggesting replacing:
>
> (defined? 'make-inetd-constructor)
>
> by something like:
>
> (version<? shepherd-version "0.9.0")
>
> or is it something different that you have in mind?

I had something different on mind; I was thinking of some added field to
our shepherd-service object where the minimal version of Shepherd
required could be specified, e.g. "0.9.1".

The check could be abstracted in the shepherd-service implementation,
avoiding services writers to handle that by themselves in *each* service
requiring so.

The benefit would be an improved user experience, and cleaner service
code. Upon reconfiguring a machine not yet equipped with a new enough
Shepherd, Shepherd could print:

Toggle snippet (4 lines)
The x, y and z services won't be started until the next reboot, as they
require a newer Shepherd version.

Instead of seeing the new services fail to run without (for the end
user) without any obvious reason.

Does that make sense?

Thanks,

Maxim
M
M
Maxime Devos wrote on 1 Sep 2022 15:28
(address . 55898@debbugs.gnu.org)
d06b945a-3c6a-17ab-4179-18c6ba7479b7@telenet.be
On 01-09-2022 15:18, Maxim Cournoyer wrote:
Toggle quote (21 lines)
> I had something different on mind; I was thinking of some added field to
> our shepherd-service object where the minimal version of Shepherd
> required could be specified, e.g. "0.9.1".
>
> The check could be abstracted in the shepherd-service implementation,
> avoiding services writers to handle that by themselves in*each* service
> requiring so.
>
> The benefit would be an improved user experience, and cleaner service
> code. Upon reconfiguring a machine not yet equipped with a new enough
> Shepherd, Shepherd could print:
>
> --8<---------------cut here---------------start------------->8---
> The x, y and z services won't be started until the next reboot, as they
> require a newer Shepherd version.
> --8<---------------cut here---------------end--------------->8---
>
> Instead of seeing the new services fail to run without (for the end
> user) without any obvious reason.
>
> Does that make sense?
I like this system, it's declarative, simple and doesn't have the
defined?-looks-in-(current-module) problem. It also avoids accumulating
compatibility fallbacks that probably won't be well-tested.
If something like (defined? 'whatever) is desired instead of version
checks (though I don't see the value here, see Maxim explanation on
different contexts), a similar system based on feature checks could be
used instead:
(require-exports ; <-- the field
  '(((module name) this that ...)
    [...]))
The (if (defined? ...) preferred compat) pattern could be preserved for
the case where a patch author puts in some extra effort to not require
reboots on systems where an old shepherd is running, but a simple
version check or feature check would be accepted too.
Greetings,
Maxime.
Attachment: OpenPGP_signature
L
L
Ludovic Courtès wrote on 1 Sep 2022 15:51
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87fshb2of0.fsf@gnu.org
Howdy!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (8 lines)
> I had something different on mind; I was thinking of some added field to
> our shepherd-service object where the minimal version of Shepherd
> required could be specified, e.g. "0.9.1".
>
> The check could be abstracted in the shepherd-service implementation,
> avoiding services writers to handle that by themselves in *each* service
> requiring so.

Hmm I see.

Toggle quote (10 lines)
> The benefit would be an improved user experience, and cleaner service
> code. Upon reconfiguring a machine not yet equipped with a new enough
> Shepherd, Shepherd could print:
>
> The x, y and z services won't be started until the next reboot, as they
> require a newer Shepherd version.
>
> Instead of seeing the new services fail to run without (for the end
> user) without any obvious reason.

Currently, new services don’t fail to run: we arrange so that new
services always “work”, whether they’re talking to an old shepherd or a
new one. The user experience (bugs aside) should be good: services are
always reloaded.

IIUC, in the model you propose, we’d sacrifice this, by admitting that
in some cases we just won’t load services live and instead tell users to
reboot; the benefit would be cleaner service code.

It’s a tradeoff; the cost/benefit ratio is not obvious to me.

Thanks for explaining!

Ludo’.
M
M
Maxim Cournoyer wrote on 1 Sep 2022 21:16
(name . Ludovic Courtès)(address . ludo@gnu.org)
87ilm69a7i.fsf@gmail.com
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (29 lines)
> Howdy!
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> I had something different on mind; I was thinking of some added field to
>> our shepherd-service object where the minimal version of Shepherd
>> required could be specified, e.g. "0.9.1".
>>
>> The check could be abstracted in the shepherd-service implementation,
>> avoiding services writers to handle that by themselves in *each* service
>> requiring so.
>
> Hmm I see.
>
>> The benefit would be an improved user experience, and cleaner service
>> code. Upon reconfiguring a machine not yet equipped with a new enough
>> Shepherd, Shepherd could print:
>>
>> The x, y and z services won't be started until the next reboot, as they
>> require a newer Shepherd version.
>>
>> Instead of seeing the new services fail to run without (for the end
>> user) without any obvious reason.
>
> Currently, new services don’t fail to run: we arrange so that new
> services always “work”, whether they’re talking to an old shepherd or a
> new one. The user experience (bugs aside) should be good: services are
> always reloaded.

This depends on how the services are written, and how much care to this
edge case their author put into writing it. I know the Jami service
type won't run without Shepherd >= 0.9.0, and the solution is not
obvious (I'm suspecting sleep* from (guix build dbus-service) should use
regular sleep when shepherd is < 0.9.0.).

Toggle quote (6 lines)
> IIUC, in the model you propose, we’d sacrifice this, by admitting that
> in some cases we just won’t load services live and instead tell users to
> reboot; the benefit would be cleaner service code.
>
> It’s a tradeoff; the cost/benefit ratio is not obvious to me.

Having a simple way to cleanly mark a service as "requiring Shepherd
0.9.X" would provide good value in my opinion, for when adding backward
compatibility is too difficult or not desirable for any reason (such as
causing long 'hangs' while busy-waiting for a process to start in older
shepherds).

Thanks,

Maxim
L
L
Ludovic Courtès wrote on 2 Sep 2022 11:10
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87wnamxht6.fsf@gnu.org
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (11 lines)
>> Currently, new services don’t fail to run: we arrange so that new
>> services always “work”, whether they’re talking to an old shepherd or a
>> new one. The user experience (bugs aside) should be good: services are
>> always reloaded.
>
> This depends on how the services are written, and how much care to this
> edge case their author put into writing it. I know the Jami service
> type won't run without Shepherd >= 0.9.0, and the solution is not
> obvious (I'm suspecting sleep* from (guix build dbus-service) should use
> regular sleep when shepherd is < 0.9.0.).

Ah, that’s an exception then, and this should be avoided IMO. :-)

Usually, the Shepherd’s API is backward compatible, so one can normally
leave services unchanged. It’s only when you want to take advantage of
the new features that you have to resort to conditionals.

But then more complex services like Jami or childhurd were more likely
to hit edge cases with the switch to Fibers.

Toggle quote (12 lines)
>> IIUC, in the model you propose, we’d sacrifice this, by admitting that
>> in some cases we just won’t load services live and instead tell users to
>> reboot; the benefit would be cleaner service code.
>>
>> It’s a tradeoff; the cost/benefit ratio is not obvious to me.
>
> Having a simple way to cleanly mark a service as "requiring Shepherd
> 0.9.X" would provide good value in my opinion, for when adding backward
> compatibility is too difficult or not desirable for any reason (such as
> causing long 'hangs' while busy-waiting for a process to start in older
> shepherds).

Right, I agree it’d be useful in those cases. (Though having to
busy-wait is not a valid reason IMO: it’s a sign we should provide the
proper abstraction to avoid busy waiting.)

I don’t have a clear idea on how to implement it now, but we should keep
that in mind.

Thanks,
Ludo’.
?