(address . bug-guix@gnu.org)
I've been running into an issue with Shepherd on one of my machines. Every so often, under conditions I haven't been able to pin down, both of my Shepherd instances (home and PID 1) become unresponsive. I thought I had tracked it down to a misbehaving home service I had configured, but it has just happened again without that service running.
'herd status' hangs indefinitely:
jfred@terracard ~$ sudo herd status
Password:
<never returns>
...on both instances:
jfred@terracard ~$ herd status
<never returns>
The PID 1 shepherd instance isn't reaping defunct processes:
jfred@terracard ~$ ps aux | grep -i lock
jfred 541 0.0 0.0 3700 2304 ? S 18:30 0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
jfred 3111 0.0 0.0 0 0 ? Z 18:53 0:00 [swaylock] <defunct>
jfred 3112 0.0 0.0 0 0 ? Zs 18:53 0:00 [swaylock] <defunct>
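To confirm that defunct processes like these really are children of PID 1 (and not of some other parent that is failing to wait on them), something like the following could be used. This is only a sketch: `zombie_children` is a made-up helper name, and it assumes a procps-style `ps` that accepts `-eo pid,ppid,stat,comm`.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eo pid,ppid,stat,comm` output on stdin and
# print "PID COMMAND" for each zombie (STAT starting with Z) whose parent
# is the PID given as the first argument. The header line is skipped.
zombie_children() {
    awk -v parent="$1" 'NR > 1 && $2 == parent && $3 ~ /^Z/ { print $1, $4 }'
}

# Example use (not run here): list zombies parented by PID 1.
#   ps -eo pid,ppid,stat,comm | zombie_children 1
```

If the zombies show up with a PPID other than 1, the missing reaping would point at that parent rather than at shepherd.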
Some further troubleshooting: strace shows that one thread of the PID 1 shepherd is blocked in a read() on fd 9, while the other threads are waiting on a futex:
jfred@terracard ~ [env]$ sudo strace -fp 1
Password:
strace: Process 1 attached with 5 threads
[pid 144] read(9, <unfinished ...>
[pid 142] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 141] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 140] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY^
...and fd 9 seems to be the read end of a pipe whose write end (fd 11) is also held by shepherd itself:
jfred@terracard ~ [env]$ sudo ls -l /proc/1/fd/9
lr-x------ 1 root root 64 Jul 17 20:39 /proc/1/fd/9 -> 'pipe:[4015]'
jfred@terracard ~ [env]$ sudo lsof -n | grep 4015
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
Output information may be incomplete.
shepherd 1 root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 140 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 140 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 141 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 141 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 142 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 142 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 144 shepherd root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 144 shepherd root 11w FIFO 0,15 0t0 4015 pipe
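For anyone who wants to repeat this check without lsof, the same information can be read straight from /proc. The following is a minimal sketch, assuming a Linux /proc where each fd link target has the form `pipe:[INODE]` as in the output above; `pipe_inode` and `pipe_holders` are hypothetical helper names, not part of any existing tool.

```shell
#!/bin/sh
# Print the inode number from an fd link target such as 'pipe:[4015]'.
# Prints nothing and returns 1 if the target is not a pipe.
pipe_inode() {
    case $1 in
        'pipe:['*']')
            num=${1#'pipe:['}
            printf '%s\n' "${num%']'}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Print "PID FD" for every open fd on the system that refers to the pipe
# with the given inode. Needs root to see other users' processes.
pipe_holders() {
    [ -n "$1" ] || return 2
    for link in /proc/[0-9]*/fd/*; do
        target=$(readlink "$link" 2>/dev/null) || continue
        [ "$(pipe_inode "$target")" = "$1" ] || continue
        rest=${link#/proc/}
        printf '%s %s\n' "${rest%%/*}" "${link##*/}"
    done
}

# Example use (not run here): who holds either end of pipe 4015?
#   pipe_holders 4015
```

If the only holder reported is PID 1 itself, on both fd 9 and fd 11, then no other process can write to that pipe and the blocked read() can only be woken by shepherd.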
My system configuration for this machine can be found here: https://github.com/jfrederickson/dotfiles/blob/master/guix/guix/system/machines/terracard/config.scm
I last ran 'guix pull' on June 21.
Has anyone else run into this?