‘static-networking’ fails to start

  • Open
Details
4 participants
  • Leo Nikkilä
  • Ludovic Courtès
  • Ludovic Courtès
  • Matt Wette
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
important
Ludovic Courtès wrote on 15 Jul 22:04 +0200
(address . bug-guix@gnu.org)
87pm4tuej8.fsf@inria.fr
Hi!

On the machine that exhibited https://issues.guix.gnu.org/63516, I’m
now seeing this, with the fix from commit
26602f4063a6e0c626e8deb3423166bcd0abeb90:

[ 121.017522] shepherd[1]: Starting service user-homes...
[ 121.049038] tg3 0000:05:00.0 eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address b8:cb:29:b5:1c:3a
[ 121.049042] tg3 0000:05:00.0 eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[ 121.049044] tg3 0000:05:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
[ 121.049045] tg3 0000:05:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]
[ 121.084342] tg3 0000:05:00.1 eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address b8:cb:29:b5:1c:3b
[ 121.084355] tg3 0000:05:00.1 eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[ 121.084363] tg3 0000:05:00.1 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
[ 121.084370] tg3 0000:05:00.1 eth1: dma_rwctrl[00000001] dma_mask[64-bit]
[ 121.102367] iTCO_vendor_support: vendor-support=0
[ 121.103831] Error: Driver 'pcspkr' is already registered, aborting...
[ 121.108617] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.4)
[ 121.113037] tg3 0000:05:00.1 eno2: renamed from eth1

[...]

[ 121.281600] shepherd[1]: Service user-homes has been started.
[ 121.282538] shepherd[1]: Service user-homes started.
[ 121.368316] ipmi_si IPI0001:00: Using irq 10
[ 121.405790] ipmi_si IPI0001:00: IPMI message handler: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[ 121.419871] shepherd[1]: Exception caught while starting #<<service> 7f19889012a0>: (wrong-type-arg "port-filename" "Wrong type argument in position ~A: ~S" (1 #<closed: file 7f1981887000>) (#<closed: file 7f1981887000>))
[ 121.420074] shepherd[1]: Service user-homes running with value #t.
[ 121.420218] shepherd[1]: Service networking failed to start.

The failure seems to happen after the whole static networking config has
been set up though (‘ip a’ shows that everything’s in place).

Problem is that at this point ‘networking’ cannot be started unless you
manually tear down everything with ‘ip’:

$ sudo herd start networking
herd: error: exception caught while executing 'start' on service 'networking':
Throw to key `%exception' with args `("#<&netlink-response-error errno: 17>")'.

(17 = EEXIST)

This makes me think we should make the set up phase idempotent or,
alternatively, add special actions to force a change.
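
To illustrate what "idempotent" could mean here, the following is a minimal shell sketch, not the way Guix sets up the network (which talks to netlink from Guile). The `ensure` wrapper is a hypothetical helper: it runs a command and treats an "already exists" failure as success, assuming the command reports EEXIST with the C-locale text "File exists".

```shell
#!/bin/sh
# Sketch only: 'ensure' is a hypothetical wrapper, not part of Guix or iproute2.
# Run a command; treat an "already exists" failure (EEXIST, errno 17) as success.
ensure() {
    # Force the C locale so the error text is not translated.
    if output=$(LC_ALL=C "$@" 2>&1); then
        return 0
    fi
    case $output in
        *"File exists"*) return 0 ;;   # the requested state is already in place
    esac
    # Any other failure is reported and propagated.
    printf '%s\n' "$output" >&2
    return 1
}

# Intended use, as root (the gateway address is a placeholder, not from the report):
#   ensure ip route add default via 192.0.2.1
```

With such a wrapper, running the set-up phase a second time on an already configured machine would be a no-op instead of an error. The exact error text depends on the tool and kernel version, so matching on it is an assumption of this sketch.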

Thoughts?

Ludo’.
Matt Wette wrote on 17 Sep 18:42 +0200
stopping ntp and dnsmasq
(address . 64653@debbugs.gnu.org)
a67e6fa6-31c3-4d00-add1-c3629d632a8a@gmail.com
Are there any workarounds for this? I've been digging for anything that
might help.
I'm dead in the water trying to get ntpd and tftpd (dnsmasq) working.
They require this.
Or, is there a way to get dnsmasq working by itself?

Matt
Matt Wette wrote on 17 Sep 19:09 +0200
(address . 64653@debbugs.gnu.org)
af7eb6e7-faed-4f82-77f1-5a5708c0a571@gmail.com
On 9/17/23 9:42 AM, Matt Wette wrote:
> Are there any workarounds for this? I've been digging for anything
> that might help.
> I'm dead in the water trying to get ntpd and tftpd (dnsmasq) working.
> They require this.
> Or, is there a way to get dnsmasq working by itself?

I see there is atftp, so I'll try that. Still no working ntpd.
Ludovic Courtès wrote on 2 Oct 12:24 +0200
control message for bug #64653
(address . control@debbugs.gnu.org)
871qedl3tz.fsf@gnu.org
severity 64653 important
quit
Ludovic Courtès wrote on 2 Oct 13:59 +0200
Re: bug#64653: ‘static-networking’ fails to start
(address . 64653@debbugs.gnu.org)
87msx1jkvi.fsf@gnu.org
Ludovic Courtès <ludovic.courtes@inria.fr> writes:

> [ 121.281600] shepherd[1]: Service user-homes has been started.
> [ 121.282538] shepherd[1]: Service user-homes started.
> [ 121.368316] ipmi_si IPI0001:00: Using irq 10
> [ 121.405790] ipmi_si IPI0001:00: IPMI message handler: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
> [ 121.419871] shepherd[1]: Exception caught while starting #<<service> 7f19889012a0>: (wrong-type-arg "port-filename" "Wrong type argument in position ~A: ~S" (1 #<closed: file 7f1981887000>) (#<closed: file 7f1981887000>))
> [ 121.420074] shepherd[1]: Service user-homes running with value #t.
> [ 121.420218] shepherd[1]: Service networking failed to start.
>
>
> The failure seems to happen after the whole static networking config has
> been set up though (‘ip a’ shows that everything’s in place).
>
> Problem is that at this point ‘networking’ cannot be started unless you
> manually tear down everything with ‘ip’:
>
> $ sudo herd start networking
> herd: error: exception caught while executing 'start' on service 'networking':
> Throw to key `%exception' with args `("#<&netlink-response-error errno: 17>")'.

Quick workaround if you encounter this bug:

1. Find the “tear-down” script of your system with:

guix gc -R /run/current-system | grep tear-down-network

2. In a ‘screen’ session, run this as root:

while true ; do herd enable networking; herd start networking; sleep 3; done

3. Run:

sudo guile --no-auto-compile TEAR_DOWN_SCRIPT_FROM_STEP_1

Beautiful, isn’t it?

(We’ll actually work on fixing the bug, too…)
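
For convenience, the three steps above can be sketched as two small shell helpers. This is only a sketch: `find_tear_down_script` and `retry_until_ok` are hypothetical names, and unlike the endless loop in step 2 the retry helper stops once the command succeeds, which assumes that `herd start networking` exits with a non-zero status on failure.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical helpers, not part of Guix.

# Step 1: read store paths on stdin and print the first one that names
# the tear-down script.
find_tear_down_script() {
    grep 'tear-down-network' | head -n 1
}

# Step 2: re-run a command every DELAY seconds until it succeeds.
retry_until_ok() {
    delay=$1
    shift
    until "$@"; do
        sleep "$delay"
    done
}

# Intended use, as root and inside 'screen' (not executed here):
#   script=$(guix gc -R /run/current-system | find_tear_down_script)
#   retry_until_ok 3 sh -c 'herd enable networking; herd start networking' &
#   guile --no-auto-compile "$script"
#   wait
```

Running this inside 'screen' matters on a remote machine, since the tear-down step presumably drops the connection until 'networking' comes back up.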

Ludo’.
Leo Nikkilä wrote on 11 Nov 17:25 +0100
Re: bug#64653: ‘static-networking’ fails to start
(address . 64653@debbugs.gnu.org)
e5c80dd5-21dc-407c-a3c0-5d8746f8fbf1@betaapp.fastmail.com
I'm also seeing this issue on a headless RockPro64 system. Do you know of anything I could change in the configuration to work around this during boot, e.g. by patching out a specific commit?

Happy to provide further details or test things on my system.