guix-daemon slows to a crawl when a substitute server is offline

Open. Submitted by Maxim Cournoyer.
Details
4 participants
  • Efraim Flashner
  • Ludovic Courtès
  • Maxim Cournoyer
  • zimoun
Owner
unassigned
Severity
normal
Maxim Cournoyer wrote on 30 Jan 2018 04:07
(name . bug-guix)(address . bug-guix@gnu.org)
87fu6o2ge2.fsf@gmail.com
When a substitute server used by guix-daemon is offline, the daemon will keep attempting to connect to it, even when it shouldn't need any data (I ran 'sudo guix system reconfigure my-config.scm' multiple times in a row).
With the disconnected server (bayfront in my case), that command would take close to 8 minutes, with many system calls like:

  connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS

which wasted 5 seconds each time.
After removing this server from my substitute servers list, the same operation (system reconfigure) is 8 times faster (1 minute).
Suggestion: the daemon should stop trying to use the offline substitute server after trying for X times, and print a warning about it.
Maxim
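
The suggestion above (give up on a server after X failed attempts and print a warning) can be sketched as follows. This is a minimal illustration in Python, not guix-daemon code: the class name `OfflineServerGuard`, the `connect` callable, and the `max_failures` parameter are all hypothetical names chosen for the example.

```python
import warnings
from collections import defaultdict


class OfflineServerGuard:
    """Stop contacting a server after `max_failures` consecutive failures.

    Hypothetical helper for illustration; not part of Guix.
    """

    def __init__(self, connect, max_failures=3):
        self.connect = connect            # callable(url); raises OSError on failure
        self.max_failures = max_failures  # the "X" from the suggestion
        self.failures = defaultdict(int)  # url -> consecutive failure count

    def is_disabled(self, url):
        return self.failures[url] >= self.max_failures

    def try_connect(self, url):
        """Return a connection, or None if the server is skipped or fails."""
        if self.is_disabled(url):
            return None  # skip immediately, so no connection timeout is paid
        try:
            conn = self.connect(url)
        except OSError:
            self.failures[url] += 1
            if self.is_disabled(url):
                # Warn once, at the moment the server gets disabled.
                warnings.warn(f"substitute server {url} looks offline; skipping it")
            return None
        self.failures[url] = 0  # a success resets the counter
        return conn
```

With such a guard, the cost of an offline server is bounded by `max_failures` timeouts per process instead of one timeout per attempted connection.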
zimoun wrote on 3 Dec 2020 01:20
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 30290@debbugs.gnu.org)
86mtyvzqo8.fsf@gmail.com
Hi Maxim,
On Mon, 29 Jan 2018 at 22:07, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
> When a substitute server used by guix-daemon is offline, the daemon will
> keep attempting to connect to it, even when it shouldn't need any data
> (ran 'sudo guix system reconfigure my-config.scm' multiple times in a
> row.
>
> With the disconnected server (bayfront in my case), that command would
> take close to 8 minutes, with many system calls like:
>
> connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS
>
> which wasted 5 seconds each time.
>
> After removing this server from my substitute servers list, the same
> operation (system reconfigure) is 8 times faster (1 minute).
>
> Suggestion: the daemon should stop trying to use the offline substitute
> server after trying for X times, and print a warning about it.
This looks like a wishlist item, right? Does it make sense to include such a feature in the recent discussions about the revamp of offloading, Cuirass, publish, etc.?

All the best,
simon
Maxim Cournoyer wrote on 19 Dec 2020 04:04
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 30290@debbugs.gnu.org)
87h7oiebtn.fsf@gmail.com
Hi!
zimoun <zimon.toutoune@gmail.com> writes:
> Hi Maxim,
>
> On Mon, 29 Jan 2018 at 22:07, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>> When a substitute server used by guix-daemon is offline, the daemon will
>> keep attempting to connect to it, even when it shouldn't need any data
>> (ran 'sudo guix system reconfigure my-config.scm' multiple times in a
>> row.
>>
>> With the disconnected server (bayfront in my case), that command would
>> take close to 8 minutes, with many system calls like:
>>
>> connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS
>>
>> which wasted 5 seconds each time.
>>
>> After removing this server from my substitute servers list, the same
>> operation (system reconfigure) is 8 times faster (1 minute).
>>
>> Suggestion: the daemon should stop trying to use the offline substitute
>> server after trying for X times, and print a warning about it.
>
> This looks like as a wishlist, right? Do it make sense to include such
> feature to the recent discussions about the revamp of offloading,
> Cuirass, publish, etc.
To me it's an issue more than a feature request, especially in a build farm setting; having a substitute machine down shouldn't cause a slowdown for as long as it's down!
I'm not sure if the recent offloading work that Mathieu did touched that topic. I'd need to test the scenario. Perhaps a system test would be useful.
Maxim
Ludovic Courtès wrote on 22 Dec 2020 16:16
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 30290@debbugs.gnu.org)
87r1nhzxaf.fsf@gnu.org
Hi,
Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
> When a substitute server used by guix-daemon is offline, the daemon will
> keep attempting to connect to it, even when it shouldn't need any data
> (ran 'sudo guix system reconfigure my-config.scm' multiple times in a
> row.
>
> With the disconnected server (bayfront in my case), that command would
> take close to 8 minutes, with many system calls like:
>
> connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS
>
> which wasted 5 seconds each time.
Is it still a problem? Commit 4f5234be0378368e6af25925db46612838d25e58 (Nov. 2019) added a table of unreachable hosts. That way, a ‘guix substitute --query’ process won’t retry connections to an unreachable host.
Ludo’.
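
The idea of a per-process table of unreachable hosts can be illustrated with a short sketch. It is written in Python for brevity and is not the code from the commit mentioned above; `UnreachableHosts`, `fetch_all`, and the `fetch` callable are hypothetical names, and the assumption here is that one failed connection is enough to mark a host as unreachable for the rest of the process.

```python
from urllib.parse import urlsplit


class UnreachableHosts:
    """Per-process record of hosts that already failed to connect.

    Hypothetical helper for illustration; not part of Guix.
    """

    def __init__(self):
        self._hosts = set()

    def mark(self, url):
        self._hosts.add(urlsplit(url).hostname)

    def __contains__(self, url):
        return urlsplit(url).hostname in self._hosts


def fetch_all(urls, fetch, unreachable):
    """Fetch each URL, skipping any whose host is already known to be down."""
    results = {}
    for url in urls:
        if url in unreachable:
            continue  # no new connection attempt, hence no extra timeout
        try:
            results[url] = fetch(url)
        except OSError:
            unreachable.mark(url)  # remember the host, not just this URL
    return results
```

Because the table is keyed by host rather than by full URL, many queries against the same offline server cost a single failed connection in this sketch.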
Efraim Flashner wrote on 28 Dec 2020 13:19
(name . Ludovic Courtès)(address . ludo@gnu.org)
X+nNNn8n3orFiPR1@3900XT
On Tue, Dec 22, 2020 at 04:16:08PM +0100, Ludovic Courtès wrote:
> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
> > When a substitute server used by guix-daemon is offline, the daemon will
> > keep attempting to connect to it, even when it shouldn't need any data
> > (ran 'sudo guix system reconfigure my-config.scm' multiple times in a
> > row.
> >
> > With the disconnected server (bayfront in my case), that command would
> > take close to 8 minutes, with many system calls like:
> >
> > connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS
> >
> > which wasted 5 seconds each time.
>
> Is it still a problem? Commit 4f5234be0378368e6af25925db46612838d25e58
> (Nov. 2019) added a table of unreachable hosts. That way, a ‘guix
> substitute --query’ process won’t retry connections to an unreachable
> host.
>
> Ludo’.
>
Occasionally my internet drops itself, and I find I'm left forever waiting for a timeout to see what sources I have cached locally.
--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
