Shepherd service is not getting respawned.

  • Done
Details
2 participants
  • Ludovic Courtès
  • Tomas Volf
Owner
unassigned
Submitted by
Tomas Volf
Severity
normal
Tomas Volf wrote on 9 Nov 15:58 +0100
(address . bug-guix@gnu.org)
87bjyo8net.fsf@wolfsden.cz
Hi,

I wrote a shepherd service to function as a check for networking being
actually up, but it does not get respawned when it fails and I do not
understand why.

This is the service in my operating-system:

(simple-service
 'network-online
 shepherd-root-service-type
 (list (shepherd-service
        (requirement '(networking))
        (provision '(network-online))
        (documentation "Wait for the network to come up.")
        (start #~(lambda _
                   (let* ((cmd "/run/privileged/bin/ping -qc1 -W1 1.1.1.1")
                          (status (system cmd)))
                     (= 0 (status:exit-val status)))))
        (one-shot? #t)
        ;; Try every second.
        (respawn-delay 1)
        ;; Retry forever. Double-quoting is intentional.
        (respawn-limit ''(5 . 5)))))

Now, when I reboot the machine, I see in the log that the service did
start:

Nov 7 00:18:20 localhost shepherd[1]: Starting service network-online...
[..]
Nov 7 00:18:20 localhost shepherd[1]: [sh] PING 192.168.0.110 (192.168.0.110): 56 data bytes
Nov 7 00:18:20 localhost shepherd[1]: [sh] /run/privileged/bin/ping: sending packet: Network is unreachable
Nov 7 00:18:20 localhost shepherd[1]: Service network-online could not be started.
Nov 7 00:18:20 localhost shepherd[1]: Service network-online failed to start.

The failure on the first run is expected; the problem is that the service
is started exactly once. I do not see any attempts to respawn it in
/var/log/messages, but based on the documentation the service *should*
get respawned, since it failed. What am I doing wrong? Would anyone
have any suggestions, either what is wrong with the code above or how to
approach it in another way?

Have a nice day,
Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Ludovic Courtès wrote on 10 Nov 12:32 +0100
(address . 74279@debbugs.gnu.org)
87zfm7wcgv.fsf@gnu.org
Hi Tomas,

Tomas Volf <~@wolfsden.cz> writes:

>         (start #~(lambda _
>                    (let* ((cmd "/run/privileged/bin/ping -qc1 -W1 1.1.1.1")
>                           (status (system cmd)))
>                      (= 0 (status:exit-val status)))))
>         (one-shot? #t)
>         ;; Try every second.
>         (respawn-delay 1)
>         ;; Retry forever. Double-quoting is intentional.
>         (respawn-limit ''(5 . 5)))))

[...]

> Nov 7 00:18:20 localhost shepherd[1]: Starting service network-online...
> [..]
> Nov 7 00:18:20 localhost shepherd[1]: [sh] PING 192.168.0.110 (192.168.0.110): 56 data bytes
> Nov 7 00:18:20 localhost shepherd[1]: [sh] /run/privileged/bin/ping: sending packet: Network is unreachable
> Nov 7 00:18:20 localhost shepherd[1]: Service network-online could not be started.
> Nov 7 00:18:20 localhost shepherd[1]: Service network-online failed to start.

I think there’s a misunderstanding here: ‘respawn?’ is about respawning
a service that, once it is running, terminates prematurely.

In your case, the service does not start (its ‘start’ method returns
#f).
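
For contrast, here is a minimal sketch of the kind of service that
‘respawn?’ does apply to: a long-running process that can die after it
has started successfully. The daemon path, its ‘--foreground’ flag, and
the ‘example-daemon’ name are hypothetical placeholders, not something
from this report.

```scheme
(shepherd-service
 (provision '(example-daemon))
 (documentation "Run a hypothetical long-running daemon.")
 ;; 'start' succeeds as soon as the process has been spawned.
 ;; The command line below is a placeholder.
 (start #~(make-forkexec-constructor
           (list "/path/to/example-daemon" "--foreground")))
 (stop #~(make-kill-destructor))
 ;; Respawning covers the case where the running process exits
 ;; unexpectedly; it does not cover a 'start' that returns #f.
 (respawn? #t))
```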

Now, it would probably make sense to have a mechanism to retry starting
services.

In the specific case of ‘network-online’ though, you could use a
different approach: the ‘start’ method could itself retry pinging
the network several times and fail only if it failed to reach the
network after, say, 10s. (Remember that ‘start’ and ‘stop’ must
complete in a timely fashion.)
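
A rough, untested sketch of that approach, reusing the service from the
original report. The attempt count (5) and the one-second pause between
attempts are assumptions chosen so that ‘start’ gives up after about
10s in the worst case; adjust them as needed.

```scheme
(simple-service
 'network-online
 shepherd-root-service-type
 (list (shepherd-service
        (requirement '(networking))
        (provision '(network-online))
        (documentation "Wait for the network to come up.")
        (start #~(lambda _
                   ;; Assumption: 5 attempts.  Each ping waits at most
                   ;; 1 second for a reply (-W1), and we pause 1 second
                   ;; after each failure, so this returns within ~10s.
                   (let loop ((attempts 5))
                     (cond ((zero? attempts) #f)   ;give up: start fails
                           ((eqv? 0 (status:exit-val
                                     (system "/run/privileged/bin/ping -qc1 -W1 1.1.1.1")))
                            #t)                    ;network is reachable
                           (else
                            (sleep 1)
                            (loop (- attempts 1)))))))
        (one-shot? #t))))
```

The respawn fields are left out here because the retrying happens
inside ‘start’ itself; ‘eqv?’ is used instead of ‘=’ so that a ping
killed by a signal (where ‘status:exit-val’ returns #f) counts as a
failed attempt rather than raising an error.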

HTH,
Ludo’.
Ludovic Courtès wrote on 20 Nov 22:48 +0100
control message for bug #74279
(address . control@debbugs.gnu.org)
87zfltvaoe.fsf@gnu.org
tags 74279 notabug
close 74279
quit

This issue is archived.
