[PATCH] gnu: Add ability to restart services on system reconfigure

  • Done
  • quality assurance status badge
Details
3 participants
  • Carlo Zancanaro
  • Clément Lassieur
  • Ludovic Courtès
Owner
unassigned
Submitted by
Carlo Zancanaro
Severity
normal
C
C
Carlo Zancanaro wrote on 26 Nov 2018 12:41
(address . guix-patches@gnu.org)
87efb8m5gy.fsf@zancanaro.id.au
Hey Guix!

A few months ago I mentioned the idea of adding the ability to
have services automatically restarted when running "guix system
reconfigure". These patches are a start on making that happen.
They're incomplete (in particular, documentation is missing), but
I'm offering them up for comment.

The broad idea is to add a new field to our guix shepherd
services: restart-strategy. There are three valid values:

- always: this service is always safe to restart when running
reconfigure
- manual: this service may not be safe to restart when running
reconfigure - a message will be printed telling the user to
restart the service manually, or they can provide the
--restart-services flag to reconfigure to automatically restart
them

- never: this service is never safe to restart when running
reconfigure (eg. udev)

I have added the flag to the guix daemon's shepherd service to
show how it works. I tested this by changing my substitute servers
in config.scm, and after running "reconfigure" I saw my updated
substitute servers in ps without having to run "sudo herd restart
guix-daemon".

If nobody has any feedback in the next few days then I'll update
the manual and send through another patch.

Carlo
From 099a8e2e6e28b38816ed1ba895c407f1d9efe62e Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Mon, 26 Nov 2018 22:38:26 +1100
Subject: [PATCH 3/3] services: Always restart guix daemon

* gnu/services/base.scm (guix-shepherd-service): Add restart-strategy of
'always.
---
gnu/services/base.scm | 1 +
1 file changed, 1 insertion(+)

Toggle diff (14 lines)
diff --git a/gnu/services/base.scm b/gnu/services/base.scm
index 228d3c592..7e0fdcb3e 100644
--- a/gnu/services/base.scm
+++ b/gnu/services/base.scm
@@ -1573,6 +1573,7 @@ failed to register hydra.gnu.org public key: ~a~%" status))))))))
(documentation "Run the Guix daemon.")
(provision '(guix-daemon))
(requirement '(user-processes))
+ (restart-strategy 'always)
(modules '((srfi srfi-1)))
(start
#~(make-forkexec-constructor
--
2.19.1
C
C
Clément Lassieur wrote on 26 Nov 2018 13:42
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
871s78ypr9.fsf@lassieur.org
Hi Carlo,

It might be safer to 'reload' some services, rather than 'restarting'
them. E.g. for nginx and prosody. Do you think it would be possible
add a 'custom' value that would point to a custom Shepherd action?

Thank you for your work on this!
Clément


Carlo Zancanaro <carlo@zancanaro.id.au> writes:

Toggle quote (358 lines)
> Hey Guix!
>
> A few months ago I mentioned the idea of adding the ability to
> have services automatically restarted when running "guix system
> reconfigure". These patches are a start on making that happen.
> They're incomplete (in particular, documentation is missing), but
> I'm offering them up for comment.
>
> The broad idea is to add a new field to our guix shepherd
> services: restart-strategy. There are three valid values:
>
> - always: this service is always safe to restart when running
> reconfigure
>
> - manual: this service may not be safe to restart when running
> reconfigure - a message will be printed telling the user to
> restart the service manually, or they can provide the
> --restart-services flag to reconfigure to automatically restart
> them
>
> - never: this service is never safe to restart when running
> reconfigure (eg. udev)
>
> I have added the flag to the guix daemon's shepherd service to
> show how it works. I tested this by changing my substitute servers
> in config.scm, and after running "reconfigure" I saw my updated
> substitute servers in ps without having to run "sudo herd restart
> guix-daemon".
>
> If nobody has any feedback in the next few days then I'll update
> the manual and send through another patch.
>
> Carlo
>
> From 8b92ebac4fa13a2a89f279b249be152051f31d94 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:08 +1100
> Subject: [PATCH 1/3] gnu: Add ability to restart services on system
> reconfigure
>
> * gnu/services/herd.scm (restart-service): New procedure.
> * gnu/services/shepherd.scm (<shepherd-service>)[restart-strategy]: New
> field.
> (shepherd-service-upgrade): Return lists of services to automatically and
> manually restart.
> * guix/scripts/system.scm (call-with-service-upgrade-info): Pass through
> services to be automatically and manually restarted.
> (upgrade-shepherd-services): Automatically restart services that should be
> automatically restarted, and print a message about manually restarting
> services that should be manually restarted.
> ---
> gnu/services/herd.scm | 5 +++++
> gnu/services/shepherd.scm | 35 ++++++++++++++++++++++--------
> guix/scripts/system.scm | 45 ++++++++++++++++++++++++---------------
> 3 files changed, 59 insertions(+), 26 deletions(-)
>
> diff --git a/gnu/services/herd.scm b/gnu/services/herd.scm
> index 8ff817759..c8d6eb04e 100644
> --- a/gnu/services/herd.scm
> +++ b/gnu/services/herd.scm
> @@ -52,6 +52,7 @@
> load-services
> load-services/safe
> start-service
> + restart-service
> stop-service))
>
> ;;; Commentary:
> @@ -256,6 +257,10 @@ when passed a service with an already-registered name."
> (with-shepherd-action name ('start) result
> result))
>
> +(define (restart-service name)
> + (with-shepherd-action name ('restart) result
> + result))
> +
> (define (stop-service name)
> (with-shepherd-action name ('stop) result
> result))
> diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
> index 49d08cc30..0c80e44f2 100644
> --- a/gnu/services/shepherd.scm
> +++ b/gnu/services/shepherd.scm
> @@ -159,7 +159,9 @@ DEFAULT is given, use it as the service's default value."
> (auto-start? shepherd-service-auto-start? ;Boolean
> (default #t))
> (modules shepherd-service-modules ;list of module names
> - (default %default-modules)))
> + (default %default-modules))
> + (restart-strategy shepherd-service-restart-strategy
> + (default 'manual)))
>
> (define-record-type* <shepherd-action>
> shepherd-action make-shepherd-action
> @@ -344,9 +346,10 @@ symbols provided/required by a service."
> #t))))))
>
> (define (shepherd-service-upgrade live target)
> - "Return two values: the subset of LIVE (a list of <live-service>) that needs
> -to be unloaded, and the subset of TARGET (a list of <shepherd-service>) that
> -need to be restarted to complete their upgrade."
> + "Return three values: (a) the subset of LIVE (a list of <live-service>) that
> +needs to be unloaded, (b) the subset of TARGET (a list of <shepherd-service>)
> +that can be restarted automatically, and (c) the subset of TARGET that must be
> +restarted manually."
> (define (essential? service)
> (memq (first (live-service-provision service))
> '(root shepherd)))
> @@ -373,14 +376,28 @@ need to be restarted to complete their upgrade."
> (#f (every obsolete? (live-service-dependents service)))
> (_ #f)))
>
> - (define to-restart
> - ;; Restart services that are currently running.
> - (filter running? target))
> -
> (define to-unload
> ;; Unload services that are no longer required.
> (remove essential? (filter obsolete? live)))
>
> - (values to-unload to-restart))
> + (define to-automatically-restart
> + ;; Automatically restart services that are currently running and can
> + ;; always be restarted.
> + (filter (lambda (service)
> + (and (running? service)
> + (eq? (shepherd-service-restart-strategy service)
> + 'always)))
> + target))
> +
> + (define to-manually-restart
> + ;; Manually restart services that are currently running and must be
> + ;; manually restarted.
> + (filter (lambda (service)
> + (and (running? service)
> + (eq? (shepherd-service-restart-strategy service)
> + 'manual)))
> + target))
> +
> + (values to-unload to-automatically-restart to-manually-restart))
>
> ;;; shepherd.scm ends here
> diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
> index d92ec7d5a..6f14b1395 100644
> --- a/guix/scripts/system.scm
> +++ b/guix/scripts/system.scm
> @@ -322,11 +322,12 @@ names of services to load (upgrade), and the list of names of services to
> unload."
> (match (current-services)
> ((services ...)
> - (let-values (((to-unload to-restart)
> + (let-values (((to-unload to-automatically-restart to-manually-restart)
> (shepherd-service-upgrade services new-services)))
> - (mproc to-restart
> - (map (compose first live-service-provision)
> - to-unload))))
> + (mproc (map (compose first live-service-provision)
> + to-unload)
> + to-automatically-restart
> + to-manually-restart)))
> (#f
> (with-monad %store-monad
> (warning (G_ "failed to obtain list of shepherd services~%"))
> @@ -347,7 +348,7 @@ bring the system down."
> ;; Arrange to simply emit a warning if the service upgrade fails.
> (with-shepherd-error-handling
> (call-with-service-upgrade-info new-services
> - (lambda (to-restart to-unload)
> + (lambda (to-unload to-automatically-restart to-manually-restart)
> (for-each (lambda (unload)
> (info (G_ "unloading service '~a'...~%") unload)
> (unload-service unload))
> @@ -355,27 +356,37 @@ bring the system down."
>
> (with-monad %store-monad
> (munless (null? new-services)
> - (let ((new-service-names (map shepherd-service-canonical-name new-services))
> - (to-restart-names (map shepherd-service-canonical-name to-restart))
> - (to-start (filter shepherd-service-auto-start? new-services)))
> - (info (G_ "loading new services:~{ ~a~}...~%") new-service-names)
> - (unless (null? to-restart-names)
> - ;; Listing TO-RESTART-NAMES in the message below wouldn't help
> - ;; because many essential services cannot be meaningfully
> - ;; restarted. See <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22039#30>.
> - (format #t (G_ "To complete the upgrade, run 'herd restart SERVICE' to stop,
> -upgrade, and restart each service that was not automatically restarted.\n")))
> + (let ((new-service-names (map shepherd-service-canonical-name new-services))
> + (to-start-names (map shepherd-service-canonical-name (filter shepherd-service-auto-start? new-services)))
> + (to-automatically-restart-names (map shepherd-service-canonical-name to-automatically-restart))
> + (to-manually-restart-names (map shepherd-service-canonical-name to-manually-restart)))
> + (set! to-start-names
> + (remove (lambda (name)
> + (or (member name to-automatically-restart-names)
> + (member name to-manually-restart-names)))
> + to-start-names))
> +
> (mlet %store-monad ((files (mapm %store-monad
> (compose lower-object
> shepherd-service-file)
> new-services)))
> + (for-each restart-service to-automatically-restart-names)
> +
> ;; Here we assume that FILES are exactly those that were computed
> ;; as part of the derivation that built OS, which is normally the
> ;; case.
> + (info (G_ "loading new services:~{ ~a~}~%") new-service-names)
> (load-services/safe (map derivation->output-path files))
> -
> + (info (G_ "starting services:~{ ~a~}~%") to-start-names)
> (for-each start-service
> - (map shepherd-service-canonical-name to-start))
> + to-start-names)
> + (info (G_ "restarting services:~{ ~a~}~%") to-automatically-restart-names)
> + (for-each restart-service
> + to-automatically-restart-names)
> +
> + (unless (null? to-manually-restart-names)
> + (format #t (G_ "To complete the upgrade, the following services need to be manually restarted:~{ ~a~}~%")
> + to-manually-restart-names))
> (return #t)))))))))
>
> (define* (switch-to-system os
> --
> 2.19.1
>
> From 3fdef27c8f11b6a0f013afa9b6e619659ce78dec Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:18 +1100
> Subject: [PATCH 2/3] system: Add --restart-services flag for reconfigure
>
> * guix/scripts/system.scm (upgrade-shepherd-services): Add parameter to
> automatically restart services marked as needing manual restart.
> (switch-to-system): Pass through restart-services? flag.
> (perform-action): Pass through restart-services? flag.
> (%options): Add --restart-services flag.
> (%default-options): Add #f as default value for --restart-services flag.
> (process-action): Pass through restart-services? flag.
> ---
> guix/scripts/system.scm | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
> index 6f14b1395..bf632c534 100644
> --- a/guix/scripts/system.scm
> +++ b/guix/scripts/system.scm
> @@ -333,7 +333,7 @@ unload."
> (warning (G_ "failed to obtain list of shepherd services~%"))
> (return #f)))))
>
> -(define (upgrade-shepherd-services os)
> +(define (upgrade-shepherd-services os restart-services?)
> "Upgrade the Shepherd (PID 1) by unloading obsolete services and loading new
> services specified in OS and not currently running.
>
> @@ -360,6 +360,10 @@ bring the system down."
> (to-start-names (map shepherd-service-canonical-name (filter shepherd-service-auto-start? new-services)))
> (to-automatically-restart-names (map shepherd-service-canonical-name to-automatically-restart))
> (to-manually-restart-names (map shepherd-service-canonical-name to-manually-restart)))
> + (when restart-services?
> + (set! to-automatically-restart-names (append to-automatically-restart-names
> + to-manually-restart-names))
> + (set! to-manually-restart-names '()))
> (set! to-start-names
> (remove (lambda (name)
> (or (member name to-automatically-restart-names)
> @@ -389,7 +393,7 @@ bring the system down."
> to-manually-restart-names))
> (return #t)))))))))
>
> -(define* (switch-to-system os
> +(define* (switch-to-system os restart-services?
> #:optional (profile %system-profile))
> "Make a new generation of PROFILE pointing to the directory of OS, switch to
> it atomically, and then run OS's activation script."
> @@ -417,7 +421,7 @@ it atomically, and then run OS's activation script."
> (primitive-load (derivation->output-path script))))
>
> ;; Finally, try to update system services.
> - (upgrade-shepherd-services os))))
> + (upgrade-shepherd-services os restart-services?))))
>
> (define-syntax-rule (unless-file-not-found exp)
> (catch 'system-error
> @@ -825,7 +829,8 @@ and TARGET arguments."
> use-substitutes? bootloader-target target
> image-size file-system-type full-boot?
> (mappings '())
> - (gc-root #f))
> + (gc-root #f)
> + (restart-services? #f))
> "Perform ACTION for OS. INSTALL-BOOTLOADER? specifies whether to install
> bootloader; BOOTLOADER-TAGET is the target for the bootloader; TARGET is the
> target root directory; IMAGE-SIZE is the size of the image to be built, for
> @@ -907,7 +912,7 @@ static checks."
> (case action
> ((reconfigure)
> (mbegin %store-monad
> - (switch-to-system os)
> + (switch-to-system os restart-services?)
> (mwhen install-bootloader?
> (install-bootloader bootloader-script
> #:bootcfg bootcfg
> @@ -1090,6 +1095,9 @@ Some ACTIONS support additional ARGS.\n"))
> (option '(#\r "root") #t #f
> (lambda (opt name arg result)
> (alist-cons 'gc-root arg result)))
> + (option '("restart-services") #f #f
> + (lambda (opt name arg result)
> + (alist-cons 'restart-services? #t result)))
> %standard-build-options))
>
> (define %default-options
> @@ -1104,7 +1112,8 @@ Some ACTIONS support additional ARGS.\n"))
> (verbosity . 0)
> (file-system-type . "ext4")
> (image-size . guess)
> - (install-bootloader? . #t)))
> + (install-bootloader? . #t)
> + (restart-services? . #f)))
>
>
> ;;;
> @@ -1177,7 +1186,8 @@ resulting from command-line parsing."
> #:install-bootloader? bootloader?
> #:target target
> #:bootloader-target bootloader-target
> - #:gc-root (assoc-ref opts 'gc-root)))))
> + #:gc-root (assoc-ref opts 'gc-root)
> + #:restart-services? (assoc-ref opts 'restart-services?)))))
> #:system system))
> (warn-about-disk-space)))
>
> --
> 2.19.1
>
> From 099a8e2e6e28b38816ed1ba895c407f1d9efe62e Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:26 +1100
> Subject: [PATCH 3/3] services: Always restart guix daemon
>
> * gnu/services/base.scm (guix-shepherd-service): Add restart-strategy of
> 'always.
> ---
> gnu/services/base.scm | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/gnu/services/base.scm b/gnu/services/base.scm
> index 228d3c592..7e0fdcb3e 100644
> --- a/gnu/services/base.scm
> +++ b/gnu/services/base.scm
> @@ -1573,6 +1573,7 @@ failed to register hydra.gnu.org public key: ~a~%" status))))))))
> (documentation "Run the Guix daemon.")
> (provision '(guix-daemon))
> (requirement '(user-processes))
> + (restart-strategy 'always)
> (modules '((srfi srfi-1)))
> (start
> #~(make-forkexec-constructor
C
C
Carlo Zancanaro wrote on 26 Nov 2018 21:11
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 33508@debbugs.gnu.org)
875zwj8uqz.fsf@zancanaro.id.au
Hey Clément!

On Mon, Nov 26 2018, Clément Lassieur wrote:
Toggle quote (5 lines)
> It might be safer to 'reload' some services, rather than
> 'restarting' them. E.g. for nginx and prosody. Do you think it
> would be possible add a 'custom' value that would point to a
> custom Shepherd action?

I can add this, but I don't think this is as useful as it
initially sounds. Most of our services are a specific version of a
service pointing to a specific version of a configuration file
(ie. that's in the store). That means that a "reload" shepherd
action won't be able to know where the new configuration file is
to load it.

We could solve this in one of two ways:

1) by allowing an arbitrary procedure as the value of
restart-strategy, because it can then call a shepherd action with
the appropriate configuration file, but then our action will have
to detect whether the binary has been changed (which would also
detect any dependencies changing). This may also lead to an
inconsistent user experience where a "reconfigure" might lead to a
reload, or might lead to a restart, and it's not obvious which it
will be.

2) by changing our services to create configuration files in a
known location (ie. /etc/nginx/nginx.conf). This would make it so
a simple "reload" action in the service could meaningfully reload
the service, but only if the binary was unchanged (because the old
binary might not be able to read the new configuration format, for
instance). This still leads to the above problem around the
inconsistent user experience, and adds some complexity in terms of
how configuration files are managed.

I lean towards option (1), because it gives us the ability to call
a shepherd action if we want, but also allows us to do arbitrary
other things with the extra knowledge on the Guix side.

In the end, though, I'm unconvinced that this is a useful thing to
add. What do you think?

Carlo
C
C
Clément Lassieur wrote on 26 Nov 2018 22:02
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
87k1kzd02u.fsf@lassieur.org
Hey Carlo!

Carlo Zancanaro <carlo@zancanaro.id.au> writes:

Toggle quote (15 lines)
> I can add this, but I don't think this is as useful as it initially
> sounds. Most of our services are a specific version of a service pointing to a
> specific version of a configuration file (ie. that's in the store). That means
> that a "reload" shepherd action won't be able to know where the new
> configuration file is to load it.
>
> We could solve this in one of two ways:
>
> 1) by allowing an arbitrary procedure as the value of restart-strategy,
> because it can then call a shepherd action with the appropriate configuration
> file

> but then our action will have to detect whether the binary has been
> changed (which would also detect any dependencies changing).

I don't think it needs to detect whether the binary has changed, because
'reload' signals are usually implemented so that they can safely fail.
So, if the configuration file has changed in an incompatible way, the
'reload' action won't work, but the service will keep running.

Toggle quote (4 lines)
> This may also lead to an inconsistent user experience where a
> "reconfigure" might lead to a reload, or might lead to a restart, and
> it's not obvious which it will be.

Your patch also leads to this inconsistency, because it allows a service
to either be restarted or not, in my opinion :-)

Toggle quote (12 lines)
> 2) by changing our services to create configuration files in a known location
> (ie. /etc/nginx/nginx.conf). This would make it so a simple "reload" action in
> the service could meaningfully reload the service, but only if the binary was
> unchanged (because the old binary might not be able to read the new
> configuration format, for instance). This still leads to the above problem
> around the inconsistent user experience, and adds some complexity in terms of
> how configuration files are managed.
>
> I lean towards option (1), because it gives us the ability to call a shepherd
> action if we want, but also allows us to do arbitrary other things with the
> extra knowledge on the Guix side.

I think both (1) and (2) make sense because both kind of services (the
ones pointing to configuration files in the store and the ones using
/etc/some-file.conf) already exist. Ideally, the mechanism should be
generic enough to handle both cases.

Toggle quote (3 lines)
> In the end, though, I'm unconvinced that this is a useful thing to add. What
> do you think?

I don't agree :-). A 'restart' is inherently dangerous because there is
a chance for the restart to fail (say, if the new configuration file is
erroneous), whereas the 'reload' action cannot fail (if it is correctly
implemented).

That being said, I agree that adding support for 'reload' would lead to
more complexity, and I would understand if you don't add it :-), but I
still think it's a very useful feature.

One question though: my understanding is that the default value for
'restart-strategy' is set in the Guix repository, but a user would be
able to customize it in their config.scm. Can you confirm it?

Thank you,
Clément
C
C
Carlo Zancanaro wrote on 26 Nov 2018 22:59
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 33508@debbugs.gnu.org)
87lg5fsdof.fsf@zancanaro.id.au
Hey Clément,

Thanks for your thoughts! I think you're right that the approach
I've implemented isn't flexible enough. I potentially haven't
thought through the failure cases enough. I was more thinking of
reload as providing "zero downtime" upgrades, rather than
providing a safer way to upgrade.

I'll respond more specifically inline.

On Tue, Nov 27 2018, Clément Lassieur wrote:
Toggle quote (6 lines)
> I don't think it needs to detect whether the binary has changed,
> because 'reload' signals are usually implemented so that they
> can safely fail. So, if the configuration file has changed in
> an incompatible way, the 'reload' action won't work, but the
> service will keep running.

We do need to detect whether the binary has changed for the sake
of security updates, or similar. It would be bad if a user
reconfigured their system and it didn't upgrade the version of
nginx (or its dependencies) that they're running.

Broadly speaking, I conceptualise reconfigure as "bring my system
into this state". Now, thus far we haven't been able to do that,
because we have lacked the ability to restart services properly,
but in my mind the ideal situation is that after running "guix
system reconfigure" our system is completely put into the state
specified by the config.scm file used.

Although, now that I type that out, I notice that there is one
obvious way in which that is not true: the kernel. We can't
hot-swap the kernel, so there can always be a difference between
what the configuration file specifies and what the system is
actually running.

At any rate, even if we give services the ability to reload
without restarting, they would need to print out a message to
prompt the user to manually restart them if the binary has
changed. I would also then expect the --restart-services flag to
fully restart those services, rather than just reloading them.

Toggle quote (3 lines)
> Your patch also leads to this inconsistency, because it allows a
> service to either be restarted or not, in my opinion :-)

Yes, that's true, but then there's no middle-ground. Reloading is
"new configuration, old binary", whereas the current options are
"old configuration, old binary" or "new configuration, new binary"
(or, I guess, "not running because of a failed restart").

Toggle quote (5 lines)
> I think both (1) and (2) make sense because both kind of
> services (the ones pointing to configuration files in the store
> and the ones using /etc/some-file.conf) already exist. Ideally,
> the mechanism should be generic enough to handle both cases.

(1) actually subsumes (2), so I think I'll implement that. It
actually ends up being slightly easier, because the restart
strategy can just always be a procedure, with three predefined
procedures: always-restart, manually-restart, and never-restart.

Toggle quote (4 lines)
> That being said, I agree that adding support for 'reload' would
> lead to more complexity, and I would understand if you don't add
> it :-), but I still think it's a very useful feature.

I think you've convinced me that there's value in having more
flexibility around restarts. I'll change the restart-strategy
value to be a procedure rather than a bare symbol. The downside is
that we'll lose the ability to statically analyse how services
will behave when restarting, but the increased flexibility is
useful to us.

Toggle quote (5 lines)
> One question though: my understanding is that the default value
> for 'restart-strategy' is set in the Guix repository, but a user
> would be able to customize it in their config.scm. Can you
> confirm it?

That is not the current implementation. Individual services can
add a field to their configuration objects if it's meaningful for
them to customise their restart behaviour, but there isn't a
general-purpose mechanism for a user to change the restart
behaviour of any service (beyond the ability to write arbitrary
Scheme to do it).

Carlo
C
C
Clément Lassieur wrote on 30 Nov 2018 13:12
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
87bm66higm.fsf@lassieur.org
Hi Carlo,

Carlo Zancanaro <carlo@zancanaro.id.au> writes:

Toggle quote (20 lines)
> Hey Clément,
>
> Thanks for your thoughts! I think you're right that the approach I've
> implemented isn't flexible enough. I potentially haven't thought through the
> failure cases enough. I was more thinking of reload as providing "zero
> downtime" upgrades, rather than providing a safer way to upgrade.
>
> I'll respond more specifically inline.
>
> On Tue, Nov 27 2018, Clément Lassieur wrote:
>> I don't think it needs to detect whether the binary has changed, because
>> 'reload' signals are usually implemented so that they can safely fail. So,
>> if the configuration file has changed in an incompatible way, the 'reload'
>> action won't work, but the service will keep running.
>
> We do need to detect whether the binary has changed for the sake of security
> updates, or similar. It would be bad if a user reconfigured their system and
> it didn't upgrade the version of nginx (or its dependencies) that they're
> running.

If there is a risk for a service to be broken on reconfigure, a user
might want to do a safe reconfigure, and later on deal with each
critical service one after another, so to avoid having several services
down at the same time. I think we should at least allow a user to do a
'safe reconfigure' if they want.

Toggle quote (6 lines)
> Broadly speaking, I conceptualise reconfigure as "bring my system into this
> state". Now, thus far we haven't been able to do that, because we have lacked
> the ability to restart services properly, but in my mind the ideal situation
> is that after running "guix system reconfigure" our system is completely put
> into the state specified by the config.scm file used.

This is ideal, but most services depend on a state (Cuirass, mail
servers...).

Toggle quote (5 lines)
> Although, now that I type that out, I notice that there is one obvious way in
> which that is not true: the kernel. We can't hot-swap the kernel, so there can
> always be a difference between what the configuration file specifies and what
> the system is actually running.

And I don't think you can restart Xorg either...

Toggle quote (6 lines)
> At any rate, even if we give services the ability to reload without
> restarting, they would need to print out a message to prompt the user to
> manually restart them if the binary has changed. I would also then expect the
> --restart-services flag to fully restart those services, rather than just
> reloading them.

Agreed :-)

Toggle quote (28 lines)
>> Your patch also leads to this inconsistency, because it allows a service to
>> either be restarted or not, in my opinion :-)
>
> Yes, that's true, but then there's no middle-ground. Reloading is "new
> configuration, old binary", whereas the current options are "old
> configuration, old binary" or "new configuration, new binary" (or, I guess,
> "not running because of a failed restart").
>
>> I think both (1) and (2) make sense because both kind of services (the ones
>> pointing to configuration files in the store and the ones using
>> /etc/some-file.conf) already exist. Ideally, the mechanism should be
>> generic enough to handle both cases.
>
> (1) actually subsumes (2), so I think I'll implement that. It actually ends up
> being slightly easier, because the restart strategy can just always be a
> procedure, with three predefined procedures: always-restart, manually-restart,
> and never-restart.
>
>> That being said, I agree that adding support for 'reload' would lead to more
>> complexity, and I would understand if you don't add it :-), but I still
>> think it's a very useful feature.
>
> I think you've convinced me that there's value in having more flexibility
> around restarts. I'll change the restart-strategy value to be a procedure
> rather than a bare symbol. The downside is that we'll lose the ability to
> statically analyse how services will behave when restarting, but the increased
> flexibility is useful to us.

It could also detect whether you pass a symbol or a procedure. In most
cases that would be a symbol which would allow static analysis. But one
could still customize it with a procedure.

Toggle quote (10 lines)
>> One question though: my understanding is that the default value for
>> 'restart-strategy' is set in the Guix repository, but a user would be able
>> to customize it in their config.scm. Can you confirm it?
>
> That is not the current implementation. Individual services can add a field to
> their configuration objects if it's meaningful for them to customise their
> restart behaviour, but there isn't a general-purpose mechanism for a user to
> change the restart behaviour of any service (beyond the ability to write
> arbitrary Scheme to do it).

Ok thank you!

Clément
C
C
Carlo Zancanaro wrote on 1 Dec 2018 03:31
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 33508@debbugs.gnu.org)
87va4eugdn.fsf@zancanaro.id.au
Hey Clément,

I'm still working through my thoughts on how all of this should
work. I feel like there are a few different use-cases that change
the trade-offs (eg. servers vs desktops, multi-user vs
single-user) and I don't know what the best defaults are, or the
most useful ways to vary that behaviour.

I've attached my most recent version of my patches. I haven't had
a chance to test them (so they may have really dumb mistakes), but
they should implement the idea of restart-actions as procedures.

On Fri, Nov 30 2018, Clément Lassieur wrote:
Toggle quote (4 lines)
> It could also detect whether you pass a symbol or a procedure.
> In most cases that would be a symbol which would allow static
> analysis. But one could still customize it with a procedure.

I don't like this way of having two different representations for
the same thing. In my current implementation there are three
procedures, {always,manually,never}-restart, which can be used
directly in the place of the old symbols (thus we can check for
those strategies with eq?).

Carlo
From 2077919dca604c94b09cf105c33987daa8c46304 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Mon, 26 Nov 2018 22:38:26 +1100
Subject: [PATCH 3/3] services: Always restart guix daemon

* gnu/services/base.scm (guix-shepherd-service): Add restart-strategy of
always-restart.
---
gnu/services/base.scm | 1 +
1 file changed, 1 insertion(+)

Toggle diff (14 lines)
diff --git a/gnu/services/base.scm b/gnu/services/base.scm
index 228d3c592..37d60720d 100644
--- a/gnu/services/base.scm
+++ b/gnu/services/base.scm
@@ -1573,6 +1573,7 @@ failed to register hydra.gnu.org public key: ~a~%" status))))))))
(documentation "Run the Guix daemon.")
(provision '(guix-daemon))
(requirement '(user-processes))
+ (restart-strategy always-restart)
(modules '((srfi srfi-1)))
(start
#~(make-forkexec-constructor
--
2.19.2
L
L
Ludovic Courtès wrote on 9 Dec 2018 17:59
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
87lg4yws9a.fsf@gnu.org
Hi Carlo,

Sorry for not commenting earlier!

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

Toggle quote (20 lines)
> The broad idea is to add a new field to our guix shepherd services:
> restart-strategy. There are three valid values:
>
> - always: this service is always safe to restart when running
> reconfigure
>
> - manual: this service may not be safe to restart when running
> reconfigure - a message will be printed telling the user to restart
> the service manually, or they can provide the --restart-services flag
> to reconfigure to automatically restart them
>
> - never: this service is never safe to restart when running
> reconfigure (eg. udev)
>
> I have added the flag to the guix daemon's shepherd service to show
> how it works. I tested this by changing my substitute servers in
> config.scm, and after running "reconfigure" I saw my updated
> substitute servers in ps without having to run "sudo herd restart
> guix-daemon".

In what sense is guix-daemon “always safe to restart”? It’s actually a
difficult question for me.

You could argue that its child guix-daemon processes will remain live
when we restart it, meaning that client connections remain active and
valid. I believe this is indeed the case, though it would be worth
double-checking.

Now, if safe-to-restart means that we automatically invoke the “restart”
action on guix-daemon, that means that anything that depends on it
(‘guix-publish’, ‘cuirass’, ‘hpcguix-web’, etc.) would be restarted as
well (even though I *think* we don’t have to in this case.) But these
may not be safe to restart: for example, on may want ‘guix-publish’ to
run uninterrupted.

Furthermore, whether something is “safe to restart” is really user
policy.

So the notion here should probably not be “safe to restart” but rather
“live-upgradable”.

sshd, nginx, and maybe guix-daemon can more or less be live-upgraded,
meaning that (1) existing connections are preserved but future
connections will talk to the new daemon, and as a corollary, (2)
dependent services do not need to be stopped & restarted.

Does that make sense?

Thanks,
Ludo’.
C
C
Clément Lassieur wrote on 13 Dec 2018 15:22
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
87h8fhtsk0.fsf@lassieur.org
Hi Carlo,

Thank you for your modifications. I like your patches better now :-)
But I prefer to let Ludovic comment on them, as I have seen an email
from him about Guix-daemon restart strategy.

Cheers!
Clément

Carlo Zancanaro <carlo@zancanaro.id.au> writes:

Toggle quote (351 lines)
> Hey Clément,
>
> I'm still working through my thoughts on how all of this should
> work. I feel like there are a few different use-cases that change
> the trade-offs (eg. servers vs desktops, multi-user vs
> single-user) and I don't know what the best defaults are, or the
> most useful ways to vary that behaviour.
>
> I've attached my most recent version of my patches. I haven't had
> a chance to test them (so they may have really dumb mistakes), but
> they should implement the idea of restart-actions as procedures.
>
> On Fri, Nov 30 2018, Clément Lassieur wrote:
>> It could also detect whether you pass a symbol or a procedure.
>> In most cases that would be a symbol which would allow static
>> analysis. But one could still customize it with a procedure.
>
> I don't like this way of having two different representations for
> the same thing. In my current implementation there are three
> procedures, {always,manually,never}-restart, which can be used
> directly in the place of the old symbols (thus we can check for
> those strategies with eq?).
>
> Carlo
>
> From 25d631b33b84f1f48bc06a192c46eb3170e29b97 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:08 +1100
> Subject: [PATCH 1/3] gnu: Add ability to restart services on system
> reconfigure
>
> * gnu/services/herd.scm (restart-service): New procedure.
> * gnu/services/shepherd.scm (<shepherd-service>)[restart-strategy]: New
> field.
> (always-restart, manually-restart, never-restart): New procedures.
> * guix/scripts/system.scm (upgrade-shepherd-services): Automatically restart
> services that should be automatically restarted, and print a message about
> manually restarting services that should be manually restarted.
>
> Temporary commit
> ---
> gnu/services/herd.scm | 5 +++++
> gnu/services/shepherd.scm | 25 ++++++++++++++++++++++++-
> guix/scripts/system.scm | 37 +++++++++++++++++++++++++------------
> 3 files changed, 54 insertions(+), 13 deletions(-)
>
> diff --git a/gnu/services/herd.scm b/gnu/services/herd.scm
> index 8ff817759..c8d6eb04e 100644
> --- a/gnu/services/herd.scm
> +++ b/gnu/services/herd.scm
> @@ -52,6 +52,7 @@
> load-services
> load-services/safe
> start-service
> + restart-service
> stop-service))
>
> ;;; Commentary:
> @@ -256,6 +257,10 @@ when passed a service with an already-registered name."
> (with-shepherd-action name ('start) result
> result))
>
> +(define (restart-service name)
> + (with-shepherd-action name ('restart) result
> + result))
> +
> (define (stop-service name)
> (with-shepherd-action name ('stop) result
> result))
> diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
> index 49d08cc30..f7e690fb0 100644
> --- a/gnu/services/shepherd.scm
> +++ b/gnu/services/shepherd.scm
> @@ -44,12 +44,17 @@
> shepherd-service-provision
> shepherd-service-canonical-name
> shepherd-service-requirement
> + shepherd-service-restart-strategy
> shepherd-service-respawn?
> shepherd-service-start
> shepherd-service-stop
> shepherd-service-auto-start?
> shepherd-service-modules
>
> + always-restart
> + manually-restart
> + never-restart
> +
> shepherd-action
> shepherd-action?
> shepherd-action-name
> @@ -141,6 +146,22 @@ DEFAULT is given, use it as the service's default value."
> (guix build utils)
> (guix build syscalls)))
>
> +(define (always-restart service)
> + "Unconditionally restart SERVICE and return #f."
> + (let ((name (shepherd-service-canonical-name service)))
> + (info (G_ "restarting service: ~a~%") name)
> + (restart-service name)
> + #f))
> +
> +(define (manually-restart service)
> + "Do not restart SERVICE, but return #t to indicate that the user should
> +restart it."
> + #t)
> +
> +(define (never-restart service)
> + "Do not restart SERVICE and return #f."
> + #f)
> +
> (define-record-type* <shepherd-service>
> shepherd-service make-shepherd-service
> shepherd-service?
> @@ -159,7 +180,9 @@ DEFAULT is given, use it as the service's default value."
> (auto-start? shepherd-service-auto-start? ;Boolean
> (default #t))
> (modules shepherd-service-modules ;list of module names
> - (default %default-modules)))
> + (default %default-modules))
> + (restart-strategy shepherd-service-restart-strategy ;procedure
> + (default manually-restart)))
>
> (define-record-type* <shepherd-action>
> shepherd-action make-shepherd-action
> diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
> index d92ec7d5a..26e35fe99 100644
> --- a/guix/scripts/system.scm
> +++ b/guix/scripts/system.scm
> @@ -355,16 +355,14 @@ bring the system down."
>
> (with-monad %store-monad
> (munless (null? new-services)
> - (let ((new-service-names (map shepherd-service-canonical-name new-services))
> - (to-restart-names (map shepherd-service-canonical-name to-restart))
> - (to-start (filter shepherd-service-auto-start? new-services)))
> - (info (G_ "loading new services:~{ ~a~}...~%") new-service-names)
> - (unless (null? to-restart-names)
> - ;; Listing TO-RESTART-NAMES in the message below wouldn't help
> - ;; because many essential services cannot be meaningfully
> - ;; restarted. See <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22039#30>.
> - (format #t (G_ "To complete the upgrade, run 'herd restart SERVICE' to stop,
> -upgrade, and restart each service that was not automatically restarted.\n")))
> + (let* ((new-service-names (map shepherd-service-canonical-name new-services))
> + (to-restart-names (map shepherd-service-canonical-name to-restart))
> + (to-start-names (map shepherd-service-canonical-name
> + (filter (lambda (service)
> + (and (shepherd-service-auto-start? service)
> + (not (member service to-restart))))
> + new-services))))
> +
> (mlet %store-monad ((files (mapm %store-monad
> (compose lower-object
> shepherd-service-file)
> @@ -372,10 +370,25 @@ upgrade, and restart each service that was not automatically restarted.\n")))
> ;; Here we assume that FILES are exactly those that were computed
> ;; as part of the derivation that built OS, which is normally the
> ;; case.
> + (info (G_ "loading new services:~{ ~a~}~%") new-service-names)
> (load-services/safe (map derivation->output-path files))
>
> - (for-each start-service
> - (map shepherd-service-canonical-name to-start))
> + (info (G_ "starting services:~{ ~a~}~%") to-start-names)
> + (for-each (lambda (service-name)
> + (info (G_ "starting service: ~a~%") service-name)
> + (start-service service-name))
> + to-start-names)
> +
> + (let* ((to-manually-restart (filter (lambda (service)
> + ((shepherd-service-restart-strategy service)
> + service))
> + to-restart))
> + (to-manually-restart-names (map shepherd-service-canonical-name
> + to-manually-restart)))
> + (unless (null? to-manually-restart-names)
> + (info (G_ "To complete the upgrade, the following services need to be manually restarted:~{ ~a~}~%")
> + to-manually-restart-names)))
> +
> (return #t)))))))))
>
> (define* (switch-to-system os
> --
> 2.19.2
>
> From 270a126c6efd498798bb9342a12c0f671df51b4c Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:18 +1100
> Subject: [PATCH 2/3] system: Add --restart-services flag for reconfigure
>
> * gnu/services/shepherd.scm (always-restart, manually-restart, never-restart):
> Add restart-services? argument.
> * guix/scripts/system.scm (upgrade-shepherd-services): Add parameter to
> automatically restart services marked as needing manual restart.
> (switch-to-system, perform-action, process-action): Pass through
> restart-services? flag.
> (%options): Add --restart-services flag.
> (%default-options): Add #f as default value for --restart-services flag.
> ---
> gnu/services/shepherd.scm | 14 ++++++++------
> guix/scripts/system.scm | 23 +++++++++++++++--------
> 2 files changed, 23 insertions(+), 14 deletions(-)
>
> diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
> index f7e690fb0..638f6440c 100644
> --- a/gnu/services/shepherd.scm
> +++ b/gnu/services/shepherd.scm
> @@ -146,19 +146,21 @@ DEFAULT is given, use it as the service's default value."
> (guix build utils)
> (guix build syscalls)))
>
> -(define (always-restart service)
> +(define (always-restart service restart-services?)
> "Unconditionally restart SERVICE and return #f."
> (let ((name (shepherd-service-canonical-name service)))
> (info (G_ "restarting service: ~a~%") name)
> (restart-service name)
> #f))
>
> -(define (manually-restart service)
> - "Do not restart SERVICE, but return #t to indicate that the user should
> -restart it."
> - #t)
> +(define (manually-restart service restart-services?)
> + "Restart SERVICE and return #f if RESTART-SERVICES? is true, otherwise return #t to
> +indicate that the user should manually restart SERVICE."
> + (if restart-services?
> + (always-restart service #t)
> + #t))
>
> -(define (never-restart service)
> +(define (never-restart service restart-services?)
> "Do not restart SERVICE and return #f."
> #f)
>
> diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm
> index 26e35fe99..7c2699065 100644
> --- a/guix/scripts/system.scm
> +++ b/guix/scripts/system.scm
> @@ -332,7 +332,7 @@ unload."
> (warning (G_ "failed to obtain list of shepherd services~%"))
> (return #f)))))
>
> -(define (upgrade-shepherd-services os)
> +(define (upgrade-shepherd-services os restart-services?)
> "Upgrade the Shepherd (PID 1) by unloading obsolete services and loading new
> services specified in OS and not currently running.
>
> @@ -381,7 +381,8 @@ bring the system down."
>
> (let* ((to-manually-restart (filter (lambda (service)
> ((shepherd-service-restart-strategy service)
> - service))
> + service
> + restart-services?))
> to-restart))
> (to-manually-restart-names (map shepherd-service-canonical-name
> to-manually-restart)))
> @@ -391,7 +392,7 @@ bring the system down."
>
> (return #t)))))))))
>
> -(define* (switch-to-system os
> +(define* (switch-to-system os restart-services?
> #:optional (profile %system-profile))
> "Make a new generation of PROFILE pointing to the directory of OS, switch to
> it atomically, and then run OS's activation script."
> @@ -419,7 +420,7 @@ it atomically, and then run OS's activation script."
> (primitive-load (derivation->output-path script))))
>
> ;; Finally, try to update system services.
> - (upgrade-shepherd-services os))))
> + (upgrade-shepherd-services os restart-services?))))
>
> (define-syntax-rule (unless-file-not-found exp)
> (catch 'system-error
> @@ -827,7 +828,8 @@ and TARGET arguments."
> use-substitutes? bootloader-target target
> image-size file-system-type full-boot?
> (mappings '())
> - (gc-root #f))
> + (gc-root #f)
> + (restart-services? #f))
> "Perform ACTION for OS. INSTALL-BOOTLOADER? specifies whether to install
> bootloader; BOOTLOADER-TAGET is the target for the bootloader; TARGET is the
> target root directory; IMAGE-SIZE is the size of the image to be built, for
> @@ -909,7 +911,7 @@ static checks."
> (case action
> ((reconfigure)
> (mbegin %store-monad
> - (switch-to-system os)
> + (switch-to-system os restart-services?)
> (mwhen install-bootloader?
> (install-bootloader bootloader-script
> #:bootcfg bootcfg
> @@ -1092,6 +1094,9 @@ Some ACTIONS support additional ARGS.\n"))
> (option '(#\r "root") #t #f
> (lambda (opt name arg result)
> (alist-cons 'gc-root arg result)))
> + (option '("restart-services") #f #f
> + (lambda (opt name arg result)
> + (alist-cons 'restart-services? #t result)))
> %standard-build-options))
>
> (define %default-options
> @@ -1106,7 +1111,8 @@ Some ACTIONS support additional ARGS.\n"))
> (verbosity . 0)
> (file-system-type . "ext4")
> (image-size . guess)
> - (install-bootloader? . #t)))
> + (install-bootloader? . #t)
> + (restart-services? . #f)))
>
>
> ;;;
> @@ -1179,7 +1185,8 @@ resulting from command-line parsing."
> #:install-bootloader? bootloader?
> #:target target
> #:bootloader-target bootloader-target
> - #:gc-root (assoc-ref opts 'gc-root)))))
> + #:gc-root (assoc-ref opts 'gc-root)
> + #:restart-services? (assoc-ref opts 'restart-services?)))))
> #:system system))
> (warn-about-disk-space)))
>
> --
> 2.19.2
>
> From 2077919dca604c94b09cf105c33987daa8c46304 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Mon, 26 Nov 2018 22:38:26 +1100
> Subject: [PATCH 3/3] services: Always restart guix daemon
>
> * gnu/services/base.scm (guix-shepherd-service): Add restart-strategy of
> always-restart.
> ---
> gnu/services/base.scm | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/gnu/services/base.scm b/gnu/services/base.scm
> index 228d3c592..37d60720d 100644
> --- a/gnu/services/base.scm
> +++ b/gnu/services/base.scm
> @@ -1573,6 +1573,7 @@ failed to register hydra.gnu.org public key: ~a~%" status))))))))
> (documentation "Run the Guix daemon.")
> (provision '(guix-daemon))
> (requirement '(user-processes))
> + (restart-strategy always-restart)
> (modules '((srfi srfi-1)))
> (start
> #~(make-forkexec-constructor
C
C
Carlo Zancanaro wrote on 1 Jan 2019 12:25
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 33508@debbugs.gnu.org)
87h8eszkk5.fsf@zancanaro.id.au
Hey Ludo’,

Sorry for not responding to this email for so long. I've been
trying to think through some of the issues around this, and I'm
not confident that I have thought through the issues well enough
to actually decide on a good course of action, beyond what I have
already written. I'll respond to a few specific things in your
message, but I don't even know what a good solution would look
like, let alone how to build it.

On Mon, Dec 10 2018, Ludovic Courtès wrote:
Toggle quote (3 lines)
> In what sense is guix-daemon “always safe to restart”? It’s
> actually a difficult question for me.

I agree it's tricky. I had mostly intended that as an example,
because I used guix-daemon for my testing, but ...

Toggle quote (5 lines)
> You could argue that its child guix-daemon processes will remain
> live when we restart it, meaning that client connections remain
> active and valid. I believe this is indeed the case, though it
> would be worth double-checking.

... this is what I was thinking. I'm fairly sure this is the case,
given my observations while I was testing these patches.

Toggle quote (7 lines)
> Now, if safe-to-restart means that we automatically invoke the
> “restart” action on guix-daemon, that means that anything that
> depends on it (‘guix-publish’, ‘cuirass’, ‘hpcguix-web’, etc.)
> would be restarted as well (even though I *think* we don’t have
> to in this case.) But these may not be safe to restart: for
> example, on may want ‘guix-publish’ to run uninterrupted.

At the moment we have no way to capture this, particularly in the
Shepherd. There's no way to restart a service without restarting
dependent services, but I particularly want to pick up on the
"uninterrupted" by talking about nginx below.

Toggle quote (8 lines)
> ...

> sshd, nginx, and maybe guix-daemon can more or less be
> live-upgraded, meaning that (1) existing connections are
> preserved but future connections will talk to the new daemon,
> and as a corollary, (2) dependent services do not need to be
> stopped & restarted.

I did some research into nginx, and it turns out that it is
possible to upgrade nginx with zero-downtime by running two
daemons simultaneously listening on the same port(s), then
shutting down the old daemon after the new one has successfully
started[1]. This allows for an "uninterrupted" upgrade, but I'm
not confident that I would be able to implement it within our
current framework.

In all, I haven't done anything with this in the last month. I've
thought about it a few times, but it just feels a bit
overwhelming.

Carlo

L
L
Ludovic Courtès wrote on 5 Jan 2019 15:00
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 33508@debbugs.gnu.org)
87tvin5hnc.fsf@gnu.org
Hi Carlo,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

Toggle quote (7 lines)
> Sorry for not responding to this email for so long. I've been trying
> to think through some of the issues around this, and I'm not confident
> that I have thought through the issues well enough to actually decide
> on a good course of action, beyond what I have already written. I'll
> respond to a few specific things in your message, but I don't even
> know what a good solution would look like, let alone how to build it.

Sure, we can take more time to think through it. You earlier work in
this area has already greatly improved the situation so I feel less
pressure now.

Toggle quote (7 lines)
> I did some research into nginx, and it turns out that it is possible
> to upgrade nginx with zero-downtime by running two daemons
> simultaneously listening on the same port(s), then shutting down the
> old daemon after the new one has successfully started[1]. This allows
> for an "uninterrupted" upgrade, but I'm not confident that I would be
> able to implement it within our current framework.

Nginx does all this for us. Basically if you run “nginx -s restart”,
IIRC, it automatically does the multi-process dance and you eventually
end up with only upgraded nginx processes. However it relies on being
able to read its new configuration file from the same location as
before, which is something that doesn’t quite work in our setting,
unless we make the file available at a fixed location like
/etc/nginx.conf. Clément looked into this a while back but I cannot
find the reference.

Anyway I think we should probably special-case the ‘restart’ action for
those live-upgradable services. That doesn’t require any change in the
Shepherd.

Thoughts?

Ludo’.
L
L
Ludovic Courtès wrote on 9 Jun 2023 17:26
control message for bug #33508
(address . control@debbugs.gnu.org)
871qikwtah.fsf@gnu.org
close 33508
quit
?