Avahi substitute discovery keeps trying to ping unaccessible servers

  • Open
Details
4 participants
  • Pierre Neidhardt
  • Mathieu Othacehe
  • Maxim Cournoyer
  • Mathieu Othacehe
Owner
unassigned
Submitted by
Pierre Neidhardt
Severity
normal
Merged with
Pierre Neidhardt wrote on 17 Dec 2020 18:40
(address . bug-guix@gnu.org)
87lfdw72lq.fsf@ambrevar.xyz
I've set up my desktop and laptop to use the new substitute discovery
feature, it's awesome!

However, when I put my desktop to sleep and run a Guix command on my
laptop that requires access to a substitute server, I see this:

$ guix build ncdu
substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host
substitute: updating substitutes from 'http://ci.guix.gnu.org'... 100.0%
0.0 MB will be downloaded:
/gnu/store/p70r4maqgh6ghl25h5a99w7sf1jidap8-ncdu-1.15.1
substituting /gnu/store/p70r4maqgh6ghl25h5a99w7sf1jidap8-ncdu-1.15.1...

The warning

substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host

pops up on every download, which adds some 2s delay each time. This
makes the whole process much slower.

For your information, Avahi does not find my desktop:

$ sudo avahi-browse -al
Password:
+ wlp2s0 IPv6 FOO___s MacBook Pro _companion-link._tcp local
+ wlp2s0 IPv4 FOO___s MacBook Pro _companion-link._tcp local
C-c C-cGot SIGINT, quitting.

--
Pierre Neidhardt

Pierre Neidhardt wrote on 17 Dec 2020 19:33
Re: bug#45302: Acknowledgement (Avahi substitute discovery keeps trying to ping unaccessible servers)
(address . 45302@debbugs.gnu.org)
877dpgi8pu.fsf@ambrevar.xyz
I still witness this issue after restarting my laptop.
Looks like Avahi remembers the discovery across restarts.

--
Pierre Neidhardt

Maxim Cournoyer wrote on 22 Dec 2020 04:25
Re: bug#45302: Avahi substitute discovery keeps trying to ping unaccessible servers
(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 45302@debbugs.gnu.org)
87k0tasesw.fsf@gmail.com
Hello,

Pierre Neidhardt <mail@ambrevar.xyz> writes:

> I've set up my desktop and laptop to use the new substitute discovery
> feature, it's awesome!
>
> However, when I put my desktop to sleep and run a Guix command on my
> laptop that requires access to a substitute server, I see this:
>
> $ guix build ncdu
> substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host
> substitute: updating substitutes from 'http://ci.guix.gnu.org'... 100.0%
> 0.0 MB will be downloaded:
> /gnu/store/p70r4maqgh6ghl25h5a99w7sf1jidap8-ncdu-1.15.1
> substituting /gnu/store/p70r4maqgh6ghl25h5a99w7sf1jidap8-ncdu-1.15.1...
>
>
> The warning
>
> substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host
>
>
> pops up on every download, which adds some 2s delay each time. This
> makes the whole process much slower.
>
> For your information, Avahi does not find my desktop:
>
> $ sudo avahi-browse -al
> Password:
> + wlp2s0 IPv6 FOO___s MacBook Pro _companion-link._tcp local
> + wlp2s0 IPv4 FOO___s MacBook Pro _companion-link._tcp local
> C-c C-cGot SIGINT, quitting.

This reminds me of https://issues.guix.gnu.org/30290. Perhaps if we can
fix that one it'd make this one go away too? I was thinking of having
a simple means to reduce the request attempts on servers that have been
down for a long while, such as entering dead periods (breaks): "I've
tried X times, it doesn't respond, I give up for the next Y minutes";
allowing for X and Y to be configured via the
<guix-publish-configuration> record.

Do you think this would help?

Maxim
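
The "dead period" idea suggested above could be sketched roughly as follows. This is only an illustration in Python, not Guix code: the class name `DeadPeriodTracker` and the parameters `max_failures` (the X above) and `dead_period` (the Y above, in seconds) are hypothetical names chosen for this sketch.

```python
import time


class DeadPeriodTracker:
    """Skip a server for a while after repeated connection failures.

    Hypothetical sketch: after `max_failures` consecutive failures, the
    server is ignored for `dead_period` seconds.
    """

    def __init__(self, max_failures=3, dead_period=600.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.dead_period = dead_period
        self.clock = clock          # injectable so the logic can be tested
        self._failures = {}         # server -> consecutive failure count
        self._dead_until = {}       # server -> time at which to retry

    def should_try(self, server):
        """Return True unless the server is inside a dead period."""
        deadline = self._dead_until.get(server)
        return deadline is None or self.clock() >= deadline

    def record_failure(self, server):
        """Count a failure; start a dead period once the limit is reached."""
        count = self._failures.get(server, 0) + 1
        if count >= self.max_failures:
            self._dead_until[server] = self.clock() + self.dead_period
            count = 0
        self._failures[server] = count

    def record_success(self, server):
        """A successful request clears the failure history."""
        self._failures.pop(server, None)
        self._dead_until.pop(server, None)
```

A caller would check `should_try(server)` before each request and report the outcome with `record_failure` or `record_success`, so that an unreachable server such as 10.0.0.5 costs a few timeouts at most instead of one per download.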
Pierre Neidhardt wrote on 22 Dec 2020 10:45
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 45302@debbugs.gnu.org)
87eejikwd8.fsf@ambrevar.xyz
Hi Maxim!

Thanks for the suggestion!
I'm a bit ignorant here, but my guess is that yes, it would help!

Cheers!

--
Pierre Neidhardt

Mathieu Othacehe wrote on 11 Jan 2021 14:24
(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 45302@debbugs.gnu.org)
87mtxf61z9.fsf@gnu.org
Hello Pierre,

> I've set up my desktop and laptop to use the new substitute discovery
> feature, it's awesome!

Glad you like it, and sorry for the late answer.

> substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host

Once your laptop goes to sleep, Avahi should detect that the publish
service is gone and remove it from the cache at
/var/guix/discovery/publish.

I'm having trouble reproducing it at home. Could you please run
"avahi-browse -a" on your desktop, put your laptop to sleep, and check
if the publish server disappears this way:

- enp7s0f0 IPv6 guix-publish-cervin _guix_publish._tcp local
- enp7s0f0 IPv4 guix-publish-cervin _guix_publish._tcp local

Regarding the reboot issue, that's because the cache wasn't cleaned up;
it should be fixed by commit ee94cd265e03d12eeeccf58cbaf74b90008fcd14.

Thanks,

Mathieu
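
While a stale entry is still listed, one way to limit the per-download delay would be to probe each discovered server with a short timeout before using it. The following Python sketch is an assumption-laden illustration, not what Guix does: the helper names `host_and_port`, `is_reachable` and `prune_unreachable` are hypothetical, and the input is simply taken to be a list of `http://HOST:PORT` URLs.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url):
    """Extract (host, port) from a URL, defaulting to 80 or 443."""
    parts = urlsplit(url)
    default = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default


def is_reachable(url, timeout=0.5):
    """Return True if a TCP connection to the URL's host succeeds in time."""
    try:
        with socket.create_connection(host_and_port(url), timeout=timeout):
            return True
    except OSError:
        # Covers "No route to host", connection refused, and timeouts.
        return False


def prune_unreachable(urls, probe=is_reachable):
    """Keep only the URLs whose server answers the probe, preserving order."""
    return [url for url in urls if probe(url)]
```

For instance, `prune_unreachable(["http://10.0.0.5:80", "http://ci.guix.gnu.org"])` would drop the first URL while the desktop is asleep, at the cost of one short probe rather than a failed connection per download.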
Pierre Neidhardt wrote on 18 Jan 2021 12:14
(name . Mathieu Othacehe)(address . othacehe@gnu.org)(address . 45302@debbugs.gnu.org)
871reized9.fsf@ambrevar.xyz
Hi Mathieu,

sorry, didn't have much time to play with Guix recently.
All I can say is that I haven't experienced the issue lately.
I'm on commit c03875b0361f114634caeb54935fe37a9b7b05af.

I'll try the commands you suggested later. Thanks for your time!

Pierre Neidhardt wrote on 26 Jan 2021 11:53
(name . Mathieu Othacehe)(address . othacehe@gnu.org)(address . 45302@debbugs.gnu.org)
875z3k6kbc.fsf@ambrevar.xyz
Hi Mathieu,

> Glad you like it, and sorry for the late answer.

No worries! :)

>> substitute: guix substitute: warning: 10.0.0.5: connection failed: No route to host
>
> Once your laptop goes to sleep, Avahi should detect that the publish
> service is gone and remove it from the cache at
> /var/guix/discovery/publish.
>
> I'm having trouble reproducing it at home. Could you please run
> "avahi-browse -a" on your desktop, put your laptop to sleep, and check
> if the publish server disappears this way:
>
> --8<---------------cut here---------------start------------->8---
> - enp7s0f0 IPv6 guix-publish-cervin _guix_publish._tcp local
> - enp7s0f0 IPv4 guix-publish-cervin _guix_publish._tcp local
> --8<---------------cut here---------------end--------------->8---

To be sure we are on the same page, my desktop is the publisher, my
laptop is the client.

From my desktop:

$ avahi-browse -a
+ enp7s0 IPv6 guix-publish-DESKTOP _guix_publish._tcp local
+ enp7s0 IPv4 guix-publish-DESKTOP _guix_publish._tcp local
+ lo IPv4 guix-publish-DESKTOP _guix_publish._tcp local
+ enp7s0 IPv6 LAPTOP _ssh._tcp local
+ enp7s0 IPv6 DESKTOP _ssh._tcp local
+ enp7s0 IPv4 LAPTOP _ssh._tcp local
+ enp7s0 IPv4 DESKTOP _ssh._tcp local
+ lo IPv4 DESKTOP _ssh._tcp local
+ enp7s0 IPv6 LAPTOP [2a:81:b6:9b:6b:88] _workstation._tcp local
+ enp7s0 IPv6 DESKTOP [40:b0:76:0c:8d:47] _workstation._tcp local
+ enp7s0 IPv4 LAPTOP [2a:81:b6:9b:6b:88] _workstation._tcp local
+ enp7s0 IPv4 DESKTOP [40:b0:76:0c:8d:47] _workstation._tcp local
+ lo IPv4 DESKTOP [00:00:00:00:00:00] _workstation._tcp local
+ enp7s0 IPv6 LAPTOP _sftp-ssh._tcp local
+ enp7s0 IPv6 DESKTOP _sftp-ssh._tcp local
+ enp7s0 IPv4 LAPTOP _sftp-ssh._tcp local
+ enp7s0 IPv4 DESKTOP _sftp-ssh._tcp local
+ lo IPv4 DESKTOP _sftp-ssh._tcp local

The output never changes, regardless of my laptop going to sleep or not.

You said "check if the publish server disappears" but why would it since it's
running on my desktop, which is not put to sleep?

Misunderstanding? Forgive my ignorance about Avahi! :)

Cheers!

--
Pierre Neidhardt

Mathieu Othacehe wrote on 4 Jun 2021 10:00
control message for bug #48808
(address . control@debbugs.gnu.org)
875yyuf49q.fsf@meije.i-did-not-set--mail-host-address--so-tickle-me
merge 48808 45302
quit
Maxim Cournoyer wrote on 11 Jan 2022 04:43
Re: bug#51472: substitute servers should be preferred according to their coverage rate
(name . Ludovic Courtès)(address . ludo@gnu.org)
87pmozndi2.fsf@gmail.com
merge 48808 51472
thanks

Hello Ludovic,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> When using substitute servers discovery, I've noticed that if one of the
>> substitute servers doesn't have any substitutes available, it'll keep
>> getting tried instead of others, leading to a slide-show of substitutes
>> updates such as:
>>
>> normalized load on machine '127.0.0.1' is 0.04
>> building /gnu/store/ajd0hx104702jpz2ycdwgrnyrv8jsp6d-xorg-server-21.1.0.tar.xz.drv...
>> process 9195 acquired build slot '/var/guix/offload/127.0.0.1:6666/1'
>> normalized load on machine '127.0.0.1' is 0.04
>> building /gnu/store/49rqi3wpvdm5pv6in9pamzdvg0wscrl8-xorgproto-2021.5.drv...
>> substitute: updating substitutes from 'http://192.168.10.102:80'... 0.0%
>> substitute: updating substitutes from 'http://192.168.10.102:80'... 0.0%
>> substitute: updating substitutes from 'http://192.168.10.102:80'... 0.0%
>> substitute: updating substitutes from 'http://192.168.10.102:80'... 0.0%
>> substitute: updating substitutes from 'http://192.168.10.102:80'... 0.0%
>
> We’d need to check why this particular server is checked repeatedly.
> The fact that it displays “0.0%” doesn’t mean that the server lacks
> substitutes, but that it does not reply to ‘GET /xyz.narinfo’ requests,
> for example because it’s off-line (see
> <https://issues.guix.gnu.org/48808>.)
>
>> We should implement some scheme to prefer querying high-substitute
>> servers first, instead of wasting time on servers whose queries always
>> fail; this would greatly improve performance, for example when
>> substitute discovery is combined with low coverage.
>
> There are several problems with that. First one is that you can’t tell
> what substitute coverage is until you’ve actually made those GET
> requests. Second one is that substitute coverage varies and it’s not an
> absolute measure; for example, if a server provides substitutes for only
> 0.1% of all the packages, but that’s precisely the 0.1% you care about,
> it’s more valuable than the one that has 99% of the packages but lacks
> those you want.
>
> There are other issues such as the fact that current semantics is to
> respect the order of substitute URLs, which is presumably chosen by the
> user according to their own criteria: download speed, bandwidth usage,
> etc.
>
> I hope this makes sense!

It does! I agree that it'd be tricky to get this right; makes me
realize that my problem is probably due to #48808, and fixing that one
would probably have avoided that bug report :-).

I'm merging this one with 48808.

Thank you!

Maxim
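
The point quoted above, that coverage is not an absolute measure, can be illustrated with a small worked example. The Python sketch below uses made-up data: the `coverage` helper, the package names, and the two server sets are all hypothetical.

```python
def coverage(available, wanted):
    """Fraction of the wanted items that a server can provide."""
    wanted = set(wanted)
    if not wanted:
        return 0.0
    return len(wanted & set(available)) / len(wanted)


# A made-up catalog of 1000 packages.
catalog = [f"pkg-{i}" for i in range(999)] + ["ncdu"]

small_server = {"ncdu"}           # 1 of 1000 packages: 0.1% overall
big_server = set(catalog[:990])   # 990 of 1000 packages: 99% overall, no ncdu

wanted = {"ncdu"}

overall_small = coverage(small_server, catalog)
overall_big = coverage(big_server, catalog)
useful_small = coverage(small_server, wanted)
useful_big = coverage(big_server, wanted)
```

Here the server with 0.1% overall coverage satisfies the whole request, while the one with 99% satisfies none of it, so ranking servers by overall coverage alone would pick the wrong one for this user.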