[PATCH shepherd] Allow replacement of services

DoneSubmitted by Carlo Zancanaro.
Details
2 participants
  • Carlo Zancanaro
  • Ludovic Courtès
Owner
unassigned
Severity
normal
C
C
Carlo Zancanaro wrote on 9 Aug 2018 14:42
(address . guix-patches@gnu.org)
87wosz4spx.fsf@zancanaro.id.au
This is a relatively simple patch adding a replacement slot to services in the Shepherd. When stopping a service, the replacement slot is checked and, if it has a value, is used to upgrade the current service.
I've chosen to modify the existing service, rather than creating a new one, but that was mostly because it was easier for me to implement quickly, and I didn't have a huge amount of time.
I'm hopeful that this, or something like it, can be used by GuixSD to allow people to restart services after reconfiguring without rebooting (or remembering to stop, reconfigure, start).
Carlo
From e03290041c91813f1a301c7e9c4dbb9ee768b400 Mon Sep 17 00:00:00 2001From: Carlo Zancanaro <carlo@zancanaro.id.au>Date: Thu, 9 Aug 2018 22:30:38 +1000Subject: [PATCH] service: Add a replacement slot for delayed service replacement.
* modules/shepherd/service.scm (<service>): Add replacement slot(replace-service): New procedure.(stop): Call replace-service after stopping a service.* tests/replacement.sh: Add a test for it.* Makefile.am (TESTS): Add the new test.* doc/shepherd.texi (Slots of services): Document it.--- Makefile.am | 1 + doc/shepherd.texi | 9 +++ modules/shepherd/service.scm | 23 +++++++- tests/replacement.sh | 106 +++++++++++++++++++++++++++++++++++ 4 files changed, 138 insertions(+), 1 deletion(-) create mode 100644 tests/replacement.sh
Toggle diff (194 lines)diff --git a/Makefile.am b/Makefile.amindex 8dad006..4322d7f 100644--- a/Makefile.am+++ b/Makefile.am@@ -184,6 +184,7 @@ SUFFIXES = .go TESTS = \ tests/basic.sh \+ tests/replacement.sh \ tests/respawn.sh \ tests/respawn-throttling.sh \ tests/misbehaved-client.sh \diff --git a/doc/shepherd.texi b/doc/shepherd.texiindex 7946f8b..1de6d80 100644--- a/doc/shepherd.texi+++ b/doc/shepherd.texi@@ -708,6 +708,15 @@ handler will not start it again. otherwise @code{#f}. +@item+@vindex replacement (slot of <service>)+@code{replacement} specifies a service to be used to replace this one+when it is stopped. This service will continue to function normally+until the @code{stop} action is invoked. After the service has been+successfully stopped, its definition will be replaced by the value of+this slot, which must itself be a service. This slot is ignored if+its value is @code{#f}.+ @end itemize @c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scmindex 5653388..4f62dc1 100644--- a/modules/shepherd/service.scm+++ b/modules/shepherd/service.scm@@ -205,7 +205,10 @@ respawned, shows that it has been respawned more than TIMES in SECONDS." (stop-delay? #:init-keyword #:stop-delay? #:init-value #f) ;; The times of the last respawns, most recent first.- (last-respawns #:init-form '()))+ (last-respawns #:init-form '())+ ;; A replacement for when this service is stopped.+ (replacement #:init-keyword #:replacement+ #:init-value #f)) (define (service? obj) "Return true if OBJ is a service."@@ -341,6 +344,21 @@ wire." (canonical-name obj))))) (slot-ref obj 'running)) +(define (replace-service service)+ (let ((replacement (slot-ref service 'replacement)))+ (define (copy-slot! slot)+ (slot-set! service slot (slot-ref replacement slot)))+ (when replacement+ (copy-slot! 'provides)+ (copy-slot! 'requires)+ (copy-slot! 'respawn?)+ (copy-slot! 'start)+ (copy-slot! 'stop)+ (copy-slot! 'actions)+ (copy-slot! 'running)+ (copy-slot! 'docstring))+ service))+ ;; Stop the service, including services that depend on it. If the ;; latter fails, continue anyway. Return `#f' if it could be stopped. (define-method (stop (obj <service>) . args)@@ -385,6 +403,9 @@ wire." ;; Reset the list of respawns. (slot-set! obj 'last-respawns '()) + ;; Replace the service with its replacement, if it has one+ (replace-service obj)+ ;; Status message. (let ((name (canonical-name obj))) (if (running? obj)diff --git a/tests/replacement.sh b/tests/replacement.shnew file mode 100644index 0000000..585ab5a--- /dev/null+++ b/tests/replacement.sh@@ -0,0 +1,106 @@+# GNU Shepherd --- Ensure replacing services works properly+# Copyright © 2014, 2016 Ludovic Courtès <ludo@gnu.org>+# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>+#+# This file is part of the GNU Shepherd.+#+# The GNU Shepherd is free software; you can redistribute it and/or modify it+# under the terms of the GNU General Public License as published by+# the Free Software Foundation; either version 3 of the License, or (at+# your option) any later version.+#+# The GNU Shepherd is distributed in the hope that it will be useful, but+# WITHOUT ANY WARRANTY; without even the implied warranty of+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the+# GNU General Public License for more details.+#+# You should have received a copy of the GNU General Public License+# along with the GNU Shepherd. If not, see <http://www.gnu.org/licenses/>.++shepherd --version+herd --version++socket="t-socket-$$"+conf="t-conf-$$"+rconf="t-rconf-$$"+log="t-log-$$"+stamp="t-stamp-$$"+pid="t-pid-$$"++herd="herd -s $socket"++trap "rm -f $socket $conf $rconf $stamp $log;+ test -f $pid && kill \`cat $pid\` || true; rm -f $pid" EXIT++cat > "$conf"<<EOF+(use-modules (srfi srfi-26))+(register-services+ (make <service>+ #:provides '(test)+ #:start (const #t)+ #:actions (make-actions+ (say-hello (lambda _+ (call-with-output-file "$stamp"+ (lambda (port)+ (display "Hello" port))))))+ #:respawn? #f))+EOF++rm -f "$pid" "$stamp" "$socket"+shepherd -I -s "$socket" -c "$conf" --pid="$pid" --log="$log" &++while ! test -f "$pid"; do sleep 0.5 ; done++$herd start test++if ! $herd say-hello test; then+ echo "say-hello failed"+ exit 1+fi++cat - > "$rconf"<<EOF+(let ((service (lookup-running 'test)))+ (slot-set! service 'replacement+ (make <service>+ #:provides '(test)+ #:start (const #t)+ #:actions (make-actions+ (say-goodbye (lambda _+ (call-with-output-file "$stamp"+ (lambda (port)+ (display "Goodbye" port))))))+ #:respawn? #f)))+EOF++$herd load root "$rconf"++if ! $herd say-hello test; then+ echo "say-hello failed after setting replacement"+ exit 1+fi++if test `cat $stamp` != "Hello"; then+ echo "Output file had the wrong contents! Was:"+ cat $stamp+ exit 1+fi++$herd stop test++$herd start test++if $herd say-hello test; then+ echo "say-hello should have failed after stop/start"+ exit 1+fi++if ! $herd say-goodbye test; then+ echo "say-goodbye should have failed"+ exit 1+fi++if test `cat $stamp` != "Goodbye"; then+ echo "Output file had the wrong contents! Was:"+ cat $stamp+ exit 1+fi-- 2.18.0
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEE1lpncq7JnOkt+LaeqdyPv9awIbwFAltsNpoACgkQqdyPv9awIbzJ3A//TD+Z+9GuR+bSoRxFTS5wrMfQ0hHANj22sEPQQk8xaYSVKuPMHkPe4+RBz+ECmxYKtQRUB5b4pJ/cg8G8ynvOW9ZAbKL1pGpIMfuyyHmCv8XdbugzSrphSLFDPq34HfTPWKPN9JuSeXZ445h5vVJ4i2XP0bwHU8l3by5cvOJesupBge1lvGYwnBhcHIvfB0r+Cm7g/3m5jHF8k/mImHZpYANzKRHL9pRSFJz4ewk/+8i5wpIVrpm5KPFXyX2c3wvxa4BRsyKPGVhJ0v34ltLT1BGeUHQhHDju3Iilkt3r/b3+DmWMiSVZ+D8vGn3CjPH1bhmTx6Ls3Kj7kZMxyhdam5S3Zl+TERQsYS0Kp5jZNvqY8fbk6LEcS9tXo6B5WIo4oKnqA7PmOwMg5GzX+9BkLVY6CGnMkIAhdKWANjSYMBizZ94DA3ZmrImA6/nrHsNCzcjhg2wyhS7cOtMeis+xWMP7CMO0IASjikpq6e3YWJU3vSqGJKZwInyZxZGBp45+v8wiiLDcc9q1+k3rSyLbAPc4UE4h8CQ0FBeA93Z12pCiZ2W2l+BkK9L+S1kaZ3iLBu1qFPyzKT6nRvREawdgbwD3BeZoo2jqX2PF4HlstAH9noM/n2GfiL0vm0QKotRUXcf97aCBhQHs2yRVMt2oJz936Q44zJ+yQqBeu7CyCb0==2Nj0-----END PGP SIGNATURE-----
L
L
Ludovic Courtès wrote on 20 Aug 2018 22:33
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 32408@debbugs.gnu.org)
87h8jobwwp.fsf@gnu.org
Hi Carlo,
Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
Toggle quote (13 lines)> This is a relatively simple patch adding a replacement slot to> services in the Shepherd. When stopping a service, the replacement> slot is checked and, if it has a value, is used to upgrade the current> service.>> I've chosen to modify the existing service, rather than creating a new> one, but that was mostly because it was easier for me to implement> quickly, and I didn't have a huge amount of time.>> I'm hopeful that this, or something like it, can be used by GuixSD to> allow people to restart services after reconfiguring without rebooting> (or remembering to stop, reconfigure, start).
Awesome! This is exactly what we need to address the problem.
We’ll still want to be able to special-case things like nginx that canbe “hot-replaced”, though. So perhaps, in addition to this patch on theShepherd side, we’ll need extra stuff in (gnu services shepherd).
For instance, the ‘actions’ field of <shepherd-service> could, bydefault, include an “upgrade” action that simply sets the ‘replacement’slot. For nginx, we’d provide a custom “upgrade” action that does“nginx -s restart” or whatever it is that needs to be done.
‘guix system reconfigure’ would automatically invoke the ‘upgrade’action for each new service.
WDYT?
Toggle quote (13 lines)> From e03290041c91813f1a301c7e9c4dbb9ee768b400 Mon Sep 17 00:00:00 2001> From: Carlo Zancanaro <carlo@zancanaro.id.au>> Date: Thu, 9 Aug 2018 22:30:38 +1000> Subject: [PATCH] service: Add a replacement slot for delayed service> replacement.>> * modules/shepherd/service.scm (<service>): Add replacement slot> (replace-service): New procedure.> (stop): Call replace-service after stopping a service.> * tests/replacement.sh: Add a test for it.> * Makefile.am (TESTS): Add the new test.> * doc/shepherd.texi (Slots of services): Document it.
Overall LGTM. Some comments:
Toggle quote (2 lines)> +(define (replace-service service)
Please add a docstring.
Toggle quote (14 lines)> + (let ((replacement (slot-ref service 'replacement)))> + (define (copy-slot! slot)> + (slot-set! service slot (slot-ref replacement slot)))> + (when replacement> + (copy-slot! 'provides)> + (copy-slot! 'requires)> + (copy-slot! 'respawn?)> + (copy-slot! 'start)> + (copy-slot! 'stop)> + (copy-slot! 'actions)> + (copy-slot! 'running)> + (copy-slot! 'docstring))> + service))
Having a hardcoded list of slots sounds error-prone—surely we’ll forgetto update it down the road. I wonder what else could be done.
One option would be to grab the block asyncs and atomically replace theservice in the ‘%services’ hash table. Then we only need to copy the‘last-respawns’ slot to the new service, I believe. (This changes theobject identity of the service but I think its OK.)
Another option would be to use GOOPS tricks to iterate over the list ofslots and have a list of slots *not* to copy. I’m not a big fan of thisoption, though.
Toggle quote (1 lines)> +cat - > "$rconf"<<EOF
^You can remove the dash.
Toggle quote (4 lines)> +(let ((service (lookup-running 'test)))> + (slot-set! service 'replacement> + (make <service>
I wonder if we should let users fiddle with ‘replacement’ directly, orif we should provide a higher-level construct.
For instance, ‘register-services’ could transparently set the‘replacement’ field for services already registered instead of doing:
(assert (null? (lookup-services (canonical-name new))))
Not sure if there are cases where this behavior would be undesirable,though.
Thoughts?
Toggle quote (2 lines)> +if test `cat $stamp` != "Goodbye"; then
Nitpick: "`cat $stamp`".
Thanks!
Ludo’.
C
C
Carlo Zancanaro wrote on 20 Aug 2018 23:16
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 32408@debbugs.gnu.org)
87k1okzqkf.fsf@zancanaro.id.au
Hey Ludo,
On Tue, Aug 21 2018, Ludovic Courtès wrote:
Toggle quote (5 lines)> We’ll still want to be able to special-case things like nginx > that can be “hot-replaced”, though. So perhaps, in addition to > this patch on the Shepherd side, we’ll need extra stuff in (gnu > services shepherd).
Yeah, if we expose the replacement field directly, then we'll need some supporting code in (gnu services shepherd), even if it's just to detect whether the service is stopped or not before doing a replacement. Although ideally our interface wouldn't introduce race conditions like that. (See below for more thoughts on this.)
Toggle quote (11 lines)> For instance, the ‘actions’ field of <shepherd-service> could, > by default, include an “upgrade” action that simply sets the > ‘replacement’ slot. For nginx, we’d provide a custom “upgrade” > action that does “nginx -s restart” or whatever it is that needs > to be done.>> ‘guix system reconfigure’ would automatically invoke the > ‘upgrade’ action for each new service.>> WDYT?
How many services can we meaningfully upgrade like this? My understanding is that most of our system services a fed an immutable configuration file, and thus restarting/reloading won't actually upgrade them. In order to make an upgrade action work the service would have to mutate itself into a new correct configuration, as well as restarting/reloading the underlying daemon. It's even trickier if the daemon itself has been upgraded, because then the process will have to be restarted anyway.
At any rate, I think the replacement mechanism (this patch) is just one way that a service can be reloaded. It would probably be a good idea to create a higher-level abstraction over it. I think other mechanisms (like a upgrade/reload action) should be handled on the Guix side of things.
Toggle quote (28 lines)>> + (let ((replacement (slot-ref service 'replacement)))>> + (define (copy-slot! slot)>> + (slot-set! service slot (slot-ref replacement slot)))>> + (when replacement>> + (copy-slot! 'provides)>> + (copy-slot! 'requires)>> + (copy-slot! 'respawn?)>> + (copy-slot! 'start)>> + (copy-slot! 'stop)>> + (copy-slot! 'actions)>> + (copy-slot! 'running)>> + (copy-slot! 'docstring))>> + service))>> Having a hardcoded list of slots sounds error-prone—surely we’ll > forget to update it down the road. I wonder what else could be > done.>> One option would be to grab the block asyncs and atomically > replace the service in the ‘%services’ hash table. Then we only > need to copy the ‘last-respawns’ slot to the new service, I > believe. (This changes the object identity of the service but I > think its OK.)>> Another option would be to use GOOPS tricks to iterate over the > list of slots and have a list of slots *not* to copy. I’m not a > big fan of this option, though.
My favourite option for this would be to separate the <service> object into an immutable <service> and a mutable <service-state>. The <service-state> object would have a reference to a <service> object in order to invoke actions on it, and it could also hold a second <service> object as a replacement. Then the swap would be much more straightforward. I haven't done any real work towards this, though.
In the short term, I'd rather replace it in the %services hash table. I did it by copying slots because I wasn't sure I would get the details of the swap right and didn't have time to properly work out how to do it. I'll give it a go!
Toggle quote (18 lines)>> +(let ((service (lookup-running 'test)))>> + (slot-set! service 'replacement>> + (make <service>>> I wonder if we should let users fiddle with ‘replacement’ > directly, or if we should provide a higher-level construct.>> For instance, ‘register-services’ could transparently set the > ‘replacement’ field for services already registered instead of > doing:>> (assert (null? (lookup-services (canonical-name new))))>> Not sure if there are cases where this behavior would be > undesirable, though.>> Thoughts?
With this current patch the replacement field is only checked at the point when the service is stopped, so the field could only be set when the service is actually running. I think it makes the most sense to just replace the service directly if it's not stopped.
I can't think of any undesirable cases, but having a higher-level interface is a good idea. At the very least we need to control the inherent race condition involved in (if running? do-x do-y) for if the service is stopped after the running? check. At the moment I think the only thing we have to worry about there is signals, but if we're going to move to have more parallelism through fibers then we might need to be even more careful.
I'll try to send through an updated patch later this week.
Carlo
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEE1lpncq7JnOkt+LaeqdyPv9awIbwFAlt7L8AACgkQqdyPv9awIbyKtw//X2xkrVY9HAtGQ7ROk1k7PMGAEFpCJyxEFbcMh3BkbmJxbhdX5Gov1+3MU67UX/t2D239Q07W28FPw5psm2PPowSZMx+QIcl+CA2H3efPuRmgdXlAPuMzn8gnCZxLcDoS9lhB1vFGHIG6AkHVzGFj6OB2HhDNG/id/KyumUOPMRrqrNDTtt9+60pBX/At8EpNp1QGtgq4ZOVXvmEufDX4TWzSdAFYh82AiiTuZZFeDhqZ7zDEk8jB/ng9MPtyt8RdWfnqmxgzwHWf/Dniiow8nHw9Z9mYX+P4g7/6ZRWvDTLeGFKn5fjG9fVzNtg71O7DDyuM7bm24R2/JW7cN1Qv9gP4AYOAfrYxJgzNntNmnOVniZyFRS+dtm0pz6iF/LqsDkit9eRnWW5K8MuEu9AwtV8HTANNmGgOnU4Oc7soVzBJnbvJYuxTPEm7rfliFke/HNGBnvSfoXa5JzGPYJz8wVlnMdUc7CvATVobUqty4sJ4NbSLQDxNDlSE9uTZH5irxE5E2RY1aPgsl+4uU1KbMw+rah5d40pe0ddVu1WI0llSmij0IpJTaFm9+IzMoaiJhrk0ceocs8aEcmx9gjl09ELcUiEIvoorz/Z9NMdLLJ04eJHlWNXgTgu7QA37arySGg6zxnQOiExJC+/B6l+7hw341+pG+YDrA6gBFIpp9ys==5vP+-----END PGP SIGNATURE-----
L
L
Ludovic Courtès wrote on 21 Aug 2018 12:27
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 32408@debbugs.gnu.org)
87in44ovzm.fsf@gnu.org
Hey Carlo,
Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
Toggle quote (14 lines)> On Tue, Aug 21 2018, Ludovic Courtès wrote:>> For instance, the ‘actions’ field of <shepherd-service> could, by>> default, include an “upgrade” action that simply sets the>> ‘replacement’ slot. For nginx, we’d provide a custom “upgrade”>> action that does “nginx -s restart” or whatever it is that needs to>> be done.>>>> ‘guix system reconfigure’ would automatically invoke the ‘upgrade’>> action for each new service.>>>> WDYT?>> How many services can we meaningfully upgrade like this?
Probably very few.
Toggle quote (8 lines)> My understanding is that most of our system services a fed an immutable> configuration file, and thus restarting/reloading won't actually> upgrade them. In order to make an upgrade action work the service> would have to mutate itself into a new correct configuration, as well> as restarting/reloading the underlying daemon. It's even trickier if> the daemon itself has been upgraded, because then the process will> have to be restarted anyway.
Yeah.
Toggle quote (6 lines)> At any rate, I think the replacement mechanism (this patch) is just> one way that a service can be reloaded. It would probably be a good> idea to create a higher-level abstraction over it. I think other> mechanisms (like a upgrade/reload action) should be handled on the> Guix side of things.
Sounds good.
Toggle quote (34 lines)>>> + (let ((replacement (slot-ref service 'replacement)))>>> + (define (copy-slot! slot)>>> + (slot-set! service slot (slot-ref replacement slot)))>>> + (when replacement>>> + (copy-slot! 'provides)>>> + (copy-slot! 'requires)>>> + (copy-slot! 'respawn?)>>> + (copy-slot! 'start)>>> + (copy-slot! 'stop)>>> + (copy-slot! 'actions)>>> + (copy-slot! 'running)>>> + (copy-slot! 'docstring))>>> + service))>>>> Having a hardcoded list of slots sounds error-prone—surely we’ll>> forget to update it down the road. I wonder what else could be>> done.>>>> One option would be to grab the block asyncs and atomically replace>> the service in the ‘%services’ hash table. Then we only need to>> copy the ‘last-respawns’ slot to the new service, I believe. (This>> changes the object identity of the service but I think its OK.)>>>> Another option would be to use GOOPS tricks to iterate over the list>> of slots and have a list of slots *not* to copy. I’m not a big fan>> of this option, though.>> My favourite option for this would be to separate the <service> object> into an immutable <service> and a mutable <service-state>. The> <service-state> object would have a reference to a <service> object in> order to invoke actions on it, and it could also hold a second> <service> object as a replacement. Then the swap would be much more> straightforward. I haven't done any real work towards this, though.
I agree that separating state is the right approach longer-term (it’sbeyond the scope of this patch.)
Toggle quote (5 lines)> In the short term, I'd rather replace it in the %services hash> table. I did it by copying slots because I wasn't sure I would get the> details of the swap right and didn't have time to properly work out> how to do it. I'll give it a go!
Alright!
Toggle quote (26 lines)>>> +(let ((service (lookup-running 'test)))>>> + (slot-set! service 'replacement>>> + (make <service>>>>> I wonder if we should let users fiddle with ‘replacement’ directly,>> or if we should provide a higher-level construct.>>>> For instance, ‘register-services’ could transparently set the>> ‘replacement’ field for services already registered instead of>> doing:>>>> (assert (null? (lookup-services (canonical-name new))))>>>> Not sure if there are cases where this behavior would be>> undesirable, though.>>>> Thoughts?>> With this current patch the replacement field is only checked at the> point when the service is stopped, so the field could only be set when> the service is actually running. I think it makes the most sense to> just replace the service directly if it's not stopped.>> I can't think of any undesirable cases, but having a higher-level> interface is a good idea.
Right. Currently the Guix side of things does the equivalent of:
herd eval root '(register-services (load "a.scm") (load "b.scm"))'
for services currently stopped, which is okay but not great. IWBN if itcould directly use a higher-level action.
Toggle quote (7 lines)> At the very least we need to control the inherent race condition> involved in (if running? do-x do-y) for if the service is stopped> after the running? check. At the moment I think the only thing we have> to worry about there is signals, but if we're going to move to have> more parallelism through fibers then we might need to be even more> careful.
Indeed.
Thank you,Ludo’.
C
C
Carlo Zancanaro wrote on 23 Aug 2018 15:45
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 32408@debbugs.gnu.org)
87o9dtqjry.fsf@zancanaro.id.au
Hey Ludo’,
I've attached an updated patch. I couldn't think of any unwanted consequences, so I took your idea of making register-services handle most of the details of replacement. With my patch, something like
Toggle quote (2 lines)> herd eval root '(register-services (load "a.scm") (load > "b.scm"))'
will deal with a conflict by either replacing the old service (if it's not running), arranging for the old service to be replaced when it's stopped, or raising an error. This seems like a sensible way for things to function.
Toggle quote (5 lines)>> At the very least we need to control the inherent race >> condition [...]>> Indeed.
Despite my desire to deal with the race condition, I haven't done anything about it in this patch. The modification of %services that was done in register-services was already racy, and I don't think this patch will make it worse. If it hasn't been a problem up until now, then I don't think this will make it a problem.
Carlo
From 9ec5c0000e9a45441417a6ee4138cdcbf1b1f2b2 Mon Sep 17 00:00:00 2001From: Carlo Zancanaro <carlo@zancanaro.id.au>Date: Thu, 9 Aug 2018 22:30:38 +1000Subject: [PATCH] service: Add a replacement slot for delayed service replacement.
* modules/shepherd/service.scm (<service>): Add replacement slot(replace-service): New procedure.(stop): Call replace-service after stopping a service.(register-services): Replace existing services where possible, setting the newreplacement slot if they are currently running.* tests/replacement.sh: Add a test for it.* Makefile.am (TESTS): Add the new test.* doc/shepherd.texi (Slots of services): Document it.--- Makefile.am | 1 + doc/shepherd.texi | 9 +++ modules/shepherd/service.scm | 68 ++++++++++++++++++----- tests/replacement.sh | 105 +++++++++++++++++++++++++++++++++++ 4 files changed, 168 insertions(+), 15 deletions(-) create mode 100644 tests/replacement.sh
Toggle diff (250 lines)diff --git a/Makefile.am b/Makefile.amindex 8dad006..4322d7f 100644--- a/Makefile.am+++ b/Makefile.am@@ -184,6 +184,7 @@ SUFFIXES = .go TESTS = \ tests/basic.sh \+ tests/replacement.sh \ tests/respawn.sh \ tests/respawn-throttling.sh \ tests/misbehaved-client.sh \diff --git a/doc/shepherd.texi b/doc/shepherd.texiindex 7946f8b..1de6d80 100644--- a/doc/shepherd.texi+++ b/doc/shepherd.texi@@ -708,6 +708,15 @@ handler will not start it again. otherwise @code{#f}. +@item+@vindex replacement (slot of <service>)+@code{replacement} specifies a service to be used to replace this one+when it is stopped. This service will continue to function normally+until the @code{stop} action is invoked. After the service has been+successfully stopped, its definition will be replaced by the value of+this slot, which must itself be a service. This slot is ignored if+its value is @code{#f}.+ @end itemize @c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scmindex 5653388..006309c 100644--- a/modules/shepherd/service.scm+++ b/modules/shepherd/service.scm@@ -205,7 +205,10 @@ respawned, shows that it has been respawned more than TIMES in SECONDS." (stop-delay? #:init-keyword #:stop-delay? #:init-value #f) ;; The times of the last respawns, most recent first.- (last-respawns #:init-form '()))+ (last-respawns #:init-form '())+ ;; A replacement for when this service is stopped.+ (replacement #:init-keyword #:replacement+ #:init-value #f)) (define (service? obj) "Return true if OBJ is a service."@@ -341,6 +344,20 @@ wire." (canonical-name obj))))) (slot-ref obj 'running)) +(define (replace-service old-service new-service)+ "Replace OLD-SERVICE with NEW-SERVICE in the services registry. This+completely removes all references to OLD-SERVICE before registering+NEW-SERVICE."+ (define (remove-service name)+ (let* ((old (hashq-ref %services name))+ (new (delete old-service old)))+ (if (null? new)+ (hashq-remove! %services name)+ (hashq-set! %services name new))))+ (when new-service+ (for-each remove-service (provided-by old-service))+ (register-services new-service)))+ ;; Stop the service, including services that depend on it. If the ;; latter fails, continue anyway. Return `#f' if it could be stopped. (define-method (stop (obj <service>) . args)@@ -385,6 +402,11 @@ wire." ;; Reset the list of respawns. (slot-set! obj 'last-respawns '()) + ;; Replace the service with its replacement, if it has one+ (let ((replacement (slot-ref obj 'replacement)))+ (when replacement+ (replace-service obj replacement)))+ ;; Status message. (let ((name (canonical-name obj))) (if (running? obj)@@ -1038,25 +1060,41 @@ then disable it." ;; Add NEW-SERVICES to the list of known services. (define (register-services . new-services)+ "Add NEW-SERVICES to the list of known services. If a service has already+been registered, arrange to have it replaced when it is next stopped. If it+is currently stopped, replace it immediately." (define (register-single-service new) ;; Sanity-checks first. (assert (list-of-symbols? (provided-by new))) (assert (list-of-symbols? (required-by new))) (assert (boolean? (respawn? new)))- ;; Canonical name actually must be canonical. (FIXME: This test- ;; is incomplete, since we may add a service later that makes it- ;; non-cannonical.)- (assert (null? (lookup-services (canonical-name new))))- ;; FIXME: Verify consistency: Check that there are no circular- ;; dependencies, check for bogus conflicts/dependencies, whatever- ;; else makes sense.-- ;; Insert into the hash table.- (for-each (lambda (name)- (let ((old (lookup-services name)))- ;; Actually add the new service now.- (hashq-set! %services name (cons new old))))- (provided-by new)))++ ;; FIXME: Just because we have a unique canonical name now doesn't mean it+ ;; will remain unique as other services are added. Whenever a service is+ ;; added it should check that it's not conflicting with any already+ ;; registered canonical names.+ (match (lookup-services (canonical-name new))+ (() ;; empty, so we can safely add ourselves+ (for-each (lambda (name)+ (let ((old (lookup-services name)))+ (hashq-set! %services name (cons new old))))+ (provided-by new)))+ ((old) ;; one service registered, so it may be an old version of us+ (cond+ ((not (eq? (canonical-name new) (canonical-name old)))+ (local-output+ "Cannot register service ~a: canonical name is not unique."+ (canonical-name new))+ (throw 'non-canonical-name))+ ((running? old)+ (slot-set! old 'replacement new))+ (#:else+ (replace-service old new))))+ (_ ;; in any other case, there are too many services to register+ (local-output+ "Cannot register service ~a: canonical name is not unique."+ (canonical-name new))+ (throw 'non-canonical-name)))) (for-each register-single-service new-services)) diff --git a/tests/replacement.sh b/tests/replacement.shnew file mode 100644index 0000000..e06cb93--- /dev/null+++ b/tests/replacement.sh@@ -0,0 +1,105 @@+# GNU Shepherd --- Ensure replacing services works properly+# Copyright © 2014, 2016 Ludovic Courtès <ludo@gnu.org>+# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>+#+# This file is part of the GNU Shepherd.+#+# The GNU Shepherd is free software; you can redistribute it and/or modify it+# under the terms of the GNU General Public License as published by+# the Free Software Foundation; either version 3 of the License, or (at+# your option) any later version.+#+# The GNU Shepherd is distributed in the hope that it will be useful, but+# WITHOUT ANY WARRANTY; without even the implied warranty of+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the+# GNU General Public License for more details.+#+# You should have received a copy of the GNU General Public License+# along with the GNU Shepherd. If not, see <http://www.gnu.org/licenses/>.++shepherd --version+herd --version++socket="t-socket-$$"+conf="t-conf-$$"+rconf="t-rconf-$$"+log="t-log-$$"+stamp="t-stamp-$$"+pid="t-pid-$$"++herd="herd -s $socket"++trap "rm -f $socket $conf $rconf $stamp $log;+ test -f $pid && kill \`cat $pid\` || true; rm -f $pid" EXIT++cat > "$conf"<<EOF+(use-modules (srfi srfi-26))+(register-services+ (make <service>+ #:provides '(test)+ #:start (const #t)+ #:actions (make-actions+ (say-hello (lambda _+ (call-with-output-file "$stamp"+ (lambda (port)+ (display "Hello" port))))))+ #:respawn? #f))+EOF++rm -f "$pid" "$stamp" "$socket"+shepherd -I -s "$socket" -c "$conf" --pid="$pid" --log="$log" &++while ! test -f "$pid"; do sleep 0.5 ; done++$herd start test++if ! $herd say-hello test; then+ echo "say-hello failed"+ exit 1+fi++cat > "$rconf"<<EOF+(register-services+ (make <service>+ #:provides '(test)+ #:start (const #t)+ #:actions (make-actions+ (say-goodbye (lambda _+ (call-with-output-file "$stamp"+ (lambda (port)+ (display "Goodbye" port))))))+ #:respawn? #f))+EOF++$herd load root "$rconf"++if ! $herd say-hello test; then+ echo "say-hello failed after setting replacement"+ exit 1+fi++if test "`cat $stamp`" != "Hello"; then+ echo "Output file had the wrong contents! Was:"+ cat $stamp+ exit 1+fi++$herd stop test++$herd start test++if $herd say-hello test; then+ echo "say-hello should have failed after stop/start"+ exit 1+fi++if ! $herd say-goodbye test; then+ echo "say-goodbye failed after replacement"+ exit 1+fi++if test "`cat $stamp`" != "Goodbye"; then+ echo "Output file had the wrong contents! Was:"+ cat $stamp+ exit 1+fi-- 2.18.0
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEE1lpncq7JnOkt+LaeqdyPv9awIbwFAlt+umEACgkQqdyPv9awIbz9VQ/8CcWPyF3dAAxTQVS20gJs/b3NpTgQdTf4Vj+p9PTDrzY9NJgh2HMyvZJ4OvoEZEzS7LfRV3InERRwJbEH8npN1PUrJz196EAD5AVwAynDHrwe/KwZDqk38ScPBIEpcOr34Jzn/CZ2c+CnXyYZsXf7omJtcP6LHLdq2yBW/KkKB5UagAxNjNzf/wPZqzoZ1zb4XJ+tMzCaupUm+J4OyVgnEjqCvWy96L47QSD3TP9dl4701oJaZ7Vvzsm908+8ikDJ71ZKWhQ9PniKLj9pWrKibcBMMZ9PTY6zw69NJ2DHOH1AONWPzCmRKiy3c776j3vkFXhQ2SUyprphzF7CfZ7OEgH2xwgy2TAphwNPZtHtamngQcR6MLECOQIPfKlGzxYdQ44u8TLCKkeR/ngMrJLMy5FuCymKuMqO3NUOHopp8qnK5qiUnp8RcaFVLFYOaWcugGqt0eugKTaGnHap5UtMVMXukmoQcLAzv6fBYYiyFOnFvXxCKOi0VXw06b+UDmjs2EuC5lmx/mMf3HlS3FxvyDfo41PdPoeKPXQBfDRzVepapBb+4nw6lqJjWLu+h6YsYrwWJxJx+9vP9mmGCTLDCXIiEfCmry+pkZsMndC+d0Tt0X/xYd6p3yHhbOW7NbNUG0/C3aYVjV62nIYTWSdpVQ3X14hd4OVYuAtYbrV7Vuk==yrpC-----END PGP SIGNATURE-----
L
L
Ludovic Courtès wrote on 25 Aug 2018 16:20
(name . Carlo Zancanaro)(address . carlo@zancanaro.id.au)(address . 32408@debbugs.gnu.org)
877ekecyu8.fsf@gnu.org
Hello!
Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
Toggle quote (10 lines)> I've attached an updated patch. I couldn't think of any unwanted> consequences, so I took your idea of making register-services handle> most of the details of replacement. With my patch, something like>> herd eval root '(register-services (load "a.scm") (load>> "b.scm"))'> will deal with a conflict by either replacing the old service (if it's> not running), arranging for the old service to be replaced when it's> stopped, or raising an error. This seems like a sensible way for> things to function.
Awesome. Thus, the only thing we need to change in ‘guix systemreconfigure’ is to make the ‘register-services’ call unconditional(currently it’s limited to services that are stopped.)
Toggle quote (6 lines)> Despite my desire to deal with the race condition, I haven't done> anything about it in this patch. The modification of %services that> was done in register-services was already racy, and I don't think this> patch will make it worse. If it hasn't been a problem up until now,> then I don't think this will make it a problem.
Yeah, sounds reasonable.
Toggle quote (15 lines)> From 9ec5c0000e9a45441417a6ee4138cdcbf1b1f2b2 Mon Sep 17 00:00:00 2001> From: Carlo Zancanaro <carlo@zancanaro.id.au>> Date: Thu, 9 Aug 2018 22:30:38 +1000> Subject: [PATCH] service: Add a replacement slot for delayed service> replacement.>> * modules/shepherd/service.scm (<service>): Add replacement slot> (replace-service): New procedure.> (stop): Call replace-service after stopping a service.> (register-services): Replace existing services where possible, setting the new> replacement slot if they are currently running.> * tests/replacement.sh: Add a test for it.> * Makefile.am (TESTS): Add the new test.> * doc/shepherd.texi (Slots of services): Document it.
Awesome, please push! And sorry about the delay.
Thank you,Ludo’.
L
L
Ludovic Courtès wrote on 2 Oct 2018 11:50
control message for bug #32408
(address . control@debbugs.gnu.org)
87r2h8r82r.fsf@gnu.org
tags 32408 fixedclose 32408
?
Your comment

This issue is archived.

To comment on this conversation send email to 32408@debbugs.gnu.org