shepherd exits for no good reason

Open
Submitted by Ludovic Courtès.
Details
2 participants
  • Ludovic Courtès
  • Mathieu Othacehe
Owner
unassigned
Severity
important
Ludovic Courtès wrote on 7 May 12:32 +0200
(address . bug-guix@gnu.org)
87bln0doh2.fsf@inria.fr
Hello,
I witnessed a case with Shepherd 0.8.0 on ‘core-updates’ (7b07852ddb334c92bcef69666f21c599f1f0fa79) where shepherd exited all by itself, all of a sudden. Here’s what /var/log/messages shows:
May 7 09:36:23 localhost vmunix: [ 20.316829] shepherd[1]: Service user-homes has been started.
May 7 09:36:23 localhost vmunix: [ 21.319625] shepherd[1]: Service nscd has been started.
May 7 09:36:23 localhost vmunix: [ 21.321029] shepherd[1]: Service guix-daemon has been started.
[…]
May 7 09:36:52 localhost shepherd[1]: Exiting shepherd...
May 7 09:36:52 localhost shepherd[1]: Service xorg-server has been stopped.
May 7 09:36:52 localhost shepherd[1]: Service console-font-tty2 has been stopped.
May 7 09:36:52 localhost shepherd[1]: Service term-tty2 has been stopped.
May 7 09:36:52 localhost shepherd[1]: Service upower-daemon has been stopped.
May 7 09:36:52 localhost shepherd[1]: Service elogind has been stopped.
May 7 09:36:52 localhost ntpd[482]: ntpd exiting on signal 15 (Terminated)
May 7 09:36:52 localhost syslogd: exiting on signal 15
The end result was a kernel panic with exitcode=0x100 (meaning exited with 1).
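For readers unfamiliar with that reading of exitcode=0x100, here is a minimal sketch of the decoding. It assumes the kernel's exitcode= field uses the usual Linux wait(2) status layout (low 7 bits hold the terminating signal, the next byte holds the exit status). The function name decode_wait_status is made up for illustration and is not part of the Shepherd or the kernel.

```python
def decode_wait_status(status: int) -> tuple[str, int]:
    """Decode a Linux wait(2)-style status word (assumed layout).

    Only normal exits and signal deaths are handled; stopped and
    continued states are deliberately ignored in this sketch.
    """
    signal = status & 0x7F          # low 7 bits: terminating signal, 0 if none
    if signal == 0:
        # No signal: bits 8-15 carry the value passed to exit().
        return ("exited", (status >> 8) & 0xFF)
    return ("signaled", signal)


print(decode_wait_status(0x100))    # ('exited', 1)
```

Under that assumption, 0x100 carries no signal number and an exit status of 1, which fits PID 1 calling exit rather than being killed.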
It looks as though one had run ‘herd stop root’.
Ludo’.
Mathieu Othacehe wrote on 7 May 18:55 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41123@debbugs.gnu.org)
87imh7elch.fsf@gmail.com
Hey Ludo,
> May 7 09:36:52 localhost shepherd[1]: Exiting shepherd...
> May 7 09:36:52 localhost shepherd[1]: Service xorg-server has been stopped.
> May 7 09:36:52 localhost shepherd[1]: Service console-font-tty2 has been stopped.
> May 7 09:36:52 localhost shepherd[1]: Service term-tty2 has been stopped.
> May 7 09:36:52 localhost shepherd[1]: Service upower-daemon has been stopped.
> May 7 09:36:52 localhost shepherd[1]: Service elogind has been stopped.
> May 7 09:36:52 localhost ntpd[482]: ntpd exiting on signal 15 (Terminated)
> May 7 09:36:52 localhost syslogd: exiting on signal 15
>
> The end result was a kernel panic with exitcode=0x100 (meaning exited
> with 1).
>
> It looks as though one had run ‘herd stop root’.
It could be related to this bug[1]. The problem is that on 0.8.0 a process restart can cause a root-service stop.
On your log, I can't see a process being restarted, so it might also be unrelated.
Mathieu
[1]: https://lists.gnu.org/archive/html/bug-guix/2020-05/msg00085.html
Ludovic Courtès wrote on 10 May 12:38 +0200
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 41123@debbugs.gnu.org)
87lfm09isl.fsf@gnu.org
Hi,
Mathieu Othacehe <m.othacehe@gmail.com> skribis:
>> May 7 09:36:52 localhost shepherd[1]: Exiting shepherd...
>> May 7 09:36:52 localhost shepherd[1]: Service xorg-server has been stopped.
>> May 7 09:36:52 localhost shepherd[1]: Service console-font-tty2 has been stopped.
>> May 7 09:36:52 localhost shepherd[1]: Service term-tty2 has been stopped.
>> May 7 09:36:52 localhost shepherd[1]: Service upower-daemon has been stopped.
>> May 7 09:36:52 localhost shepherd[1]: Service elogind has been stopped.
>> May 7 09:36:52 localhost ntpd[482]: ntpd exiting on signal 15 (Terminated)
>> May 7 09:36:52 localhost syslogd: exiting on signal 15
>>
>> The end result was a kernel panic with exitcode=0x100 (meaning exited
>> with 1).
>>
>> It looks as though one had run ‘herd stop root’.
>
> It could be related to this bug[1]. The problem is that on 0.8.0 a
> process restart can cause a root-service stop.
>
> On your log, I can't see a process being restarted, so it might also be
> unrelated.
It looks very much the same though: it’s stopping itself, which most likely happens as a result of killing itself. I’ve merged them, we’ll see!
Ludo’.
Ludovic Courtès wrote on 14 May 14:19 +0200
control message for bug #41123
(address . control@debbugs.gnu.org)
87zhaavhds.fsf@gnu.org
severity 41123 important
quit