guix-daemon slows to a crawl when a substitute server is offline

Status: Open. Submitted by Maxim Cournoyer.
Details
4 participants
  • Efraim Flashner
  • Ludovic Courtès
  • Maxim Cournoyer
  • zimoun
Owner: unassigned
Severity: normal
Maxim Cournoyer wrote on 30 Jan 2018 04:07
(name . bug-guix)(address . bug-guix@gnu.org)
87fu6o2ge2.fsf@gmail.com
When a substitute server used by guix-daemon is offline, the daemon will keep attempting to connect to it, even when it shouldn't need any data (I ran 'sudo guix system reconfigure my-config.scm' multiple times in a row).
With the disconnected server (bayfront in my case), that command would take close to 8 minutes, with many system calls like:
  connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("141.255.128.56")}, 16) = -1 EINPROGRESS
each of which wasted 5 seconds.
After removing this server from my substitute servers list, the same operation (system reconfigure) is 8 times faster (1 minute).
Suggestion: the daemon should stop trying to use the offline substitute server after X failed attempts, and print a warning about it.
Maxim
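
Editorial sketch: the suggestion above (give up on a server after X failed attempts and warn once) can be illustrated with a small failure counter. This is only a Python sketch under stated assumptions, not how guix-daemon or 'guix substitute' is implemented; ServerHealth, query_all and the max_failures parameter are hypothetical names introduced for illustration. For scale, the reported figures (about 7 extra minutes at 5 seconds per attempt) would correspond to roughly 84 failed connects.

```python
import logging

logger = logging.getLogger("substitute-sketch")


class ServerHealth:
    """Track consecutive connection failures per server (hypothetical helper)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures   # the "X" from the suggestion
        self.failures = {}                 # url -> consecutive failure count
        self.disabled = set()              # urls we have stopped trying

    def usable(self, url):
        return url not in self.disabled

    def record_failure(self, url):
        count = self.failures.get(url, 0) + 1
        self.failures[url] = count
        if count >= self.max_failures and url not in self.disabled:
            self.disabled.add(url)
            # Warn once, at the moment the server is given up on.
            logger.warning("giving up on %s after %d failed attempts", url, count)

    def record_success(self, url):
        self.failures.pop(url, None)
        self.disabled.discard(url)


def query_all(urls, connect, health):
    """Try each usable server once; `connect(url)` raises OSError on failure."""
    reached = []
    for url in urls:
        if not health.usable(url):
            continue                       # skip servers already given up on
        try:
            connect(url)
        except OSError:
            health.record_failure(url)
        else:
            health.record_success(url)
            reached.append(url)
    return reached
```

With max_failures=3, an offline server costs at most three connection attempts over the whole run, however many times query_all is called.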
zimoun wrote on 3 Dec 2020 01:20
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 30290@debbugs.gnu.org)
86mtyvzqo8.fsf@gmail.com
Hi Maxim,
This looks like a wishlist item, right? Does it make sense to include such a feature in the recent discussions about the revamp of offloading, Cuirass, publish, etc.?

All the best,
simon
Maxim Cournoyer wrote on 19 Dec 2020 04:04
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 30290@debbugs.gnu.org)
87h7oiebtn.fsf@gmail.com
Hi!
To me it's an issue more than a feature request, especially in a build farm setting; having a substitute machine down shouldn't cause a slowdown for as long as it's down!
I'm not sure if the recent offloading work that Mathieu did touched that topic. I'd need to test the scenario. Perhaps a system test would be useful.
Maxim
Ludovic Courtès wrote on 22 Dec 2020 16:16
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 30290@debbugs.gnu.org)
87r1nhzxaf.fsf@gnu.org
Hi,
Is it still a problem? Commit 4f5234be0378368e6af25925db46612838d25e58 (Nov. 2019) added a table of unreachable hosts. That way, a ‘guix substitute --query’ process won’t retry connections to an unreachable host.
Ludo’.
Efraim Flashner wrote on 28 Dec 2020 13:19
(name . Ludovic Courtès)(address . ludo@gnu.org)
X+nNNn8n3orFiPR1@3900XT
Occasionally my internet drops itself, and I find I'm left forever waiting for a timeout to see what sources I have cached locally.
-- 
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
zimoun wrote on 24 Mar 22:55 +0100
(address . 30290@debbugs.gnu.org)
868s6cb4ji.fsf@gmail.com
Hi,
What is the status of this bug? Especially with the recent additions in Cuirass?
Is it still an issue? Is some timeout still happening?
Well, in summary, the 3 relevant messages are:

-------------------- Start of forwarded message --------------------
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Fri, 18 Dec 2020 22:04:04 -0500

I'm not sure if the recent offloading work that Mathieu did touched that topic. I'd need to test the scenario. Perhaps a system test would be useful.
-------------------- End of forwarded message --------------------

-------------------- Start of forwarded message --------------------
From: Ludovic Courtès <ludo@gnu.org>
Date: Tue, 22 Dec 2020 16:16:08 +0100

Is it still a problem? Commit 4f5234be0378368e6af25925db46612838d25e58 (Nov. 2019) added a table of unreachable hosts. That way, a ‘guix substitute --query’ process won’t retry connections to an unreachable host.
-------------------- End of forwarded message --------------------

-------------------- Start of forwarded message --------------------
Date: Mon, 28 Dec 2020 14:19:02 +0200
From: Efraim Flashner <efraim@flashner.co.il>

Occasionally my internet drops itself, and I find I'm left forever waiting for a timeout to see what sources I have cached locally.
-------------------- End of forwarded message --------------------

Cheers,
simon
zimoun wrote on 9 Jun 23:34 +0200
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
86y2bilo26.fsf@gmail.com
Hi,

What is the current status of this bug? Is it still happening with the recent improvements to Cuirass?
Cheers,
simon
zimoun wrote on 13 Jul 10:49 +0200
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
86wnpuoaxc.fsf@gmail.com
Hi,
What is the status of this old bug#30290 [1]?
1: http://issues.guix.gnu.org/issue/30290
After reading all this, I think this bug can be closed. WDYT?
Cheers,
simon
zimoun wrote on 18 Aug 13:19 +0200
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
86k0kjknkd.fsf@gmail.com
Hi Maxim,
Reading the discussion…
…it appears to me that this bug could be closed. WDYT?
Cheers,
simon
Maxim Cournoyer wrote on 18 Aug 15:18 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)
87r1eqx569.fsf@gmail.com
Hi,
And sorry for failing to produce a reply earlier :-).
Were you able to replay a scenario in which a substitute server is made unreachable? That's the information that I'd like to have/see before closing. I don't come across unreachable substitute servers often, and can't think of a way to easily test this.
I could make it hang by dropping the input/output connections with iptables to a remote guix publish server, but then SSH also hangs, so perhaps that's expected.
I'll try to configure a couple local machines to act as publish servers, and disconnect them from the network to see what happens.
Thanks,
Maxim
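
Editorial sketch: one way to approximate a server that goes silent, without touching iptables or unplugging hardware, is a local listener that accepts a connection and then never sends anything. The Python sketch below is an assumption-laden illustration, not a Guix test: start_silent_server and read_with_timeout are hypothetical names, and a silent peer on localhost is not the same failure as "No route to host". It does show the difference between a read that blocks indefinitely and one that gives up after a timeout.

```python
import socket
import threading


def start_silent_server():
    """Listen on localhost, accept one connection, and never send a byte."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    server.listen(1)
    release = threading.Event()

    def serve():
        conn, _ = server.accept()
        release.wait()                 # hold the connection open, silently
        conn.close()
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[1], release


def read_with_timeout(port, timeout):
    """Return 'timed out' if the peer stays silent for `timeout` seconds."""
    # The timeout given to create_connection also applies to later recv calls.
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
        try:
            data = client.recv(4096)
        except socket.timeout:
            return "timed out"
        return "data" if data else "closed"
```

Calling start_silent_server() and then read_with_timeout(port, 0.2) returns after the timeout instead of blocking; set the returned event afterwards to let the server thread exit.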
Maxim Cournoyer wrote on 19 Aug 03:54 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)
87mtpew655.fsf@gmail.com
Hello,
I managed to get some problematic behavior from guix substitute.
My test protocol was roughly like this:
1. Set up a 2nd machine (machine B) to act as a substitute server, and guix pull to the same commit as that of my main machine (machine A).
2. Run guix build -m manifest.scm on machine B (IP: 192.168.10.172).
3. On machine A, run the command below, explicitly listing machine B as a substitute URL, along with ci.guix.gnu.org. During a download from B, break the connection (I pulled the wifi USB dongle out):
$ guix build -m ~/stow/guix/manifest.scm --substitute-urls='http://192.168.10.172 https://ci.guix.gnu.org' --no-offload
substitute: updating substitutes from 'http://192.168.10.172:80'... 100.0%
substitute: updating substitutes from 'http://192.168.10.172'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
The following derivations will be built:
   /gnu/store/lxm7brkbrkkv58c4kzlw1lh3wc0bm8wz-gimp-2.10.24.drv
   /gnu/store/ddv8jyzwk92nsg1dkv9n3scf6f7w83g5-keepassxc-2.6.6.drv
   /gnu/store/xky1y32mccplxsb448ziq68by2mvkdaz-ruby-asciidoctor-2.0.10.drv
   /gnu/store/0ph0sjib0d13n2fsl8w9prnky8g5fkvf-ruby-haml-5.0.4.drv
   /gnu/store/4dfwfj9qinw4vs6290gdy5qbnqbczm2v-ruby-temple-0.8.2.drv
   /gnu/store/b12krypa196yg6gzk2bvrh35i1fg5c7x-ruby-tilt-2.0.10.drv
   /gnu/store/131d8193hi1485ylnb9w8jm3jnlv3iyx-ruby-slim-4.1.0.drv
   /gnu/store/489nq0jqjby92kv3c6nwrdfqg45l40nw-ruby-sinatra-2.0.8.1.drv
   /gnu/store/yay3sa8nnq4j0ixwhp3bxfg5vfisfmf1-sflvault-client-0.9.2-1.8de3902.drv
   /gnu/store/2n1xyy0y3nnkrp3mpdifn8r7wf6pzpb0-sflvault-0.9.2-1.8de3902-checkout.drv
   /gnu/store/jsyhy4vxzr9yyg66kzk7w28xffyx050c-python-keyring-1.6.1.drv
   /gnu/store/kiwn3x2la23f1pa3a5ypsihhc6ja19y5-python-keyring-1.6.1-checkout.drv
The following files will be downloaded:
   /gnu/store/2qphwngpvawl6f06d33b2jr18vk1hyc9-module-import-compiled
   /gnu/store/r7vsb0vl4y66jbq7b56zmrm60q2507zl-wireshark-3.4.7
   /gnu/store/wnzx9anjdkmbnkcg5qdd3j77q1w2j1bd-yelp-3.32.2
   /gnu/store/vcxwcwlwhvhxj15ma8ik8lghmz8sb2vq-vinagre-3.22.0
   /gnu/store/yg8r6kz95p8v03gz0rglpwzrj21npzzw-spacefm-1.0.6
   /gnu/store/bn35x60w72ad59a5pd7gmvxgjwgkqvag-youtube-dl-2021.06.06
   /gnu/store/xkn540dzpz75hr9cx19xgd3b1r7vgswi-mpv-0.33.1
   /gnu/store/6abwn23grk710qvzvvg1384bs3kc2f8i-linphone-desktop-4.2.5-debug
   /gnu/store/4h8ixlh5by2l09vv3rvknmlxv2gm9d6s-linphone-desktop-4.2.5
   /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2
   /gnu/store/zczjaxs118155n3mx8w91c24izhx0h0f-ruby-asciimath-2.0.1
   /gnu/store/zll4p79a29hw95d2gsh4vjdvd856ry4s-ruby-cucumber-html-formatter-7.0.0
[...]
   /gnu/store/sa6hvh9bnw73mpplasbjb3idlv71rvcb-gnome-boxes-3.36.6
   /gnu/store/6gy957mhm07zaa001avzkv2d8zhjdl5h-poppler-data-0.4.10
   /gnu/store/7kwgmhlsy6qal56h3z19anxmw4c7pf35-diffoscope-177
   /gnu/store/hxvlcb4wgw0fpyi9ssc4x6f8w3hlng55-gst-plugins-good-1.18.2
   /gnu/store/7bqpzvzanmvb4g1g6gqb1jmrw2j8gv3d-gst-plugins-bad-1.18.2
   /gnu/store/f8hzmmnp8cm4yqq0y9cf7rgxl05hf423-cheese-3.38.0
substituting /gnu/store/7kwgmhlsy6qal56h3z19anxmw4c7pf35-diffoscope-177...
substituting /gnu/store/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1...
substituting /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2...
substituting /gnu/store/4h8ixlh5by2l09vv3rvknmlxv2gm9d6s-linphone-desktop-4.2.5...
downloading from http://192.168.10.172/nar/zstd/7kwgmhlsy6qal56h3z19anxmw4c7pf35-diffoscope-177...
 diffoscope-177 10.5MiB/s 00:00 | 128KiB transferred
downloading from http://192.168.10.172/nar/zstd/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1...
downloading from http://192.168.10.172/nar/zstd/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2...
downloading from http://192.168.10.172/nar/zstd/4h8ixlh5by2l09vv3rvknmlxv2gm9d6s-linphone-desktop-4.2.5...

substitution of /gnu/store/4h8ixlh5by2l09vv3rvknmlxv2gm9d6s-linphone-desktop-4.2.5 complete
substituting /gnu/store/7bqpzvzanmvb4g1g6gqb1jmrw2j8gv3d-gst-plugins-bad-1.18.2...
downloading from http://192.168.10.172/nar/zstd/7bqpzvzanmvb4g1g6gqb1jmrw2j8gv3d-gst-plugins-bad-1.18.2...

substitution of /gnu/store/7kwgmhlsy6qal56h3z19anxmw4c7pf35-diffoscope-177 complete
substituting /gnu/store/hxvlcb4wgw0fpyi9ssc4x6f8w3hlng55-gst-plugins-good-1.18.2...
downloading from http://192.168.10.172/nar/zstd/hxvlcb4wgw0fpyi9ssc4x6f8w3hlng55-gst-plugins-good-1.18.2...
^ It hung up there, waiting indefinitely.
What I would have expected instead is for it to detect the network failure and retry from the other available substitute URL, or else build locally.
At that time, all the 'substitute' processes are blocked on a read(2) call, as is one of the guix-daemon processes, while 2 others are blocked on select.
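
Editorial sketch: the expected behavior described above (notice the failure, try the next substitute URL, otherwise build locally) can be written as a short fallback loop. This is a Python illustration under assumptions, not Guix code: fetch_with_fallback, fetch and build_locally are hypothetical names, and fetch is assumed to enforce its own connect and read timeouts so that a dead peer raises an error rather than blocking in read(2).

```python
import sys


def fetch_with_fallback(item, urls, fetch, build_locally):
    """Try each substitute URL in order; build locally if every one fails.

    `fetch(url, item)` is assumed to raise OSError (which includes
    TimeoutError and ConnectionError) when the server cannot be used.
    """
    for url in urls:
        try:
            return ("substitute", url, fetch(url, item))
        except OSError as err:
            # Report the failure and move on to the next server.
            print(f"warning: {url}: {err}", file=sys.stderr)
    return ("built", None, build_locally(item))
```

The important design point is that the timeout lives inside fetch: without one, the loop never gets the chance to fall back, which matches the hang observed above.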
That's not the same as the original report though; let's try to reproduce that one by running the same command again, while the substitute server B is still disconnected:
$ time guix build -m ~/stow/guix/manifest.scm --substitute-urls='http://192.168.10.172 https://ci.guix.gnu.org' --no-offload
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
guix substitute: warning: 192.168.10.172: connection failed: No route to host
substitute: substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
The following derivations will be built:
   /gnu/store/lxm7brkbrkkv58c4kzlw1lh3wc0bm8wz-gimp-2.10.24.drv
   /gnu/store/ddv8jyzwk92nsg1dkv9n3scf6f7w83g5-keepassxc-2.6.6.drv
   /gnu/store/xky1y32mccplxsb448ziq68by2mvkdaz-ruby-asciidoctor-2.0.10.drv
   /gnu/store/0ph0sjib0d13n2fsl8w9prnky8g5fkvf-ruby-haml-5.0.4.drv
   /gnu/store/4dfwfj9qinw4vs6290gdy5qbnqbczm2v-ruby-temple-0.8.2.drv
   /gnu/store/b12krypa196yg6gzk2bvrh35i1fg5c7x-ruby-tilt-2.0.10.drv
   /gnu/store/131d8193hi1485ylnb9w8jm3jnlv3iyx-ruby-slim-4.1.0.drv
   /gnu/store/489nq0jqjby92kv3c6nwrdfqg45l40nw-ruby-sinatra-2.0.8.1.drv
   /gnu/store/yay3sa8nnq4j0ixwhp3bxfg5vfisfmf1-sflvault-client-0.9.2-1.8de3902.drv
   /gnu/store/2n1xyy0y3nnkrp3mpdifn8r7wf6pzpb0-sflvault-0.9.2-1.8de3902-checkout.drv
   /gnu/store/jsyhy4vxzr9yyg66kzk7w28xffyx050c-python-keyring-1.6.1.drv
   /gnu/store/kiwn3x2la23f1pa3a5ypsihhc6ja19y5-python-keyring-1.6.1-checkout.drv
The following files will be downloaded:
   /gnu/store/2qphwngpvawl6f06d33b2jr18vk1hyc9-module-import-compiled
   /gnu/store/r7vsb0vl4y66jbq7b56zmrm60q2507zl-wireshark-3.4.7
   /gnu/store/wnzx9anjdkmbnkcg5qdd3j77q1w2j1bd-yelp-3.32.2
   /gnu/store/vcxwcwlwhvhxj15ma8ik8lghmz8sb2vq-vinagre-3.22.0
   /gnu/store/yg8r6kz95p8v03gz0rglpwzrj21npzzw-spacefm-1.0.6
[...]
   /gnu/store/zvnnafb7hmiklj8wpvn9qdc85w8rdprl-gnucash-4.2-doc
   /gnu/store/rp2ai59zvx5m0k6db0cnkx6nn9n41qjd-gnucash-4.2
   /gnu/store/hmy026sjdl489sy3i25r2kz9f70h3awm-gnucash-4.2-python
   /gnu/store/1bspzx0103mr17mxhgw0d9zdlgca2psq-spice-gtk-0.37
   /gnu/store/bribnmf6djvh1d3rjr2vs5y97141ad97-osinfo-db-20201218
   /gnu/store/r1a25sizf07nmh388ri4qybshzlcxbqd-libosinfo-1.7.1
   /gnu/store/2z7p7ynamiarxkx4hnk8dk377xqgm3zl-tracker-2.3.5
   /gnu/store/458bw9h0f0ybjdqwg4zm5gjjsmfxbalx-webkitgtk-2.32.3
   /gnu/store/sa6hvh9bnw73mpplasbjb3idlv71rvcb-gnome-boxes-3.36.6
   /gnu/store/6gy957mhm07zaa001avzkv2d8zhjdl5h-poppler-data-0.4.10
   /gnu/store/hxvlcb4wgw0fpyi9ssc4x6f8w3hlng55-gst-plugins-good-1.18.2
   /gnu/store/7bqpzvzanmvb4g1g6gqb1jmrw2j8gv3d-gst-plugins-bad-1.18.2
   /gnu/store/f8hzmmnp8cm4yqq0y9cf7rgxl05hf423-cheese-3.38.0
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substituting /gnu/store/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1...
substituting /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2...
substituting /gnu/store/6abwn23grk710qvzvvg1384bs3kc2f8i-linphone-desktop-4.2.5-debug...
substituting /gnu/store/bribnmf6djvh1d3rjr2vs5y97141ad97-osinfo-db-20201218...
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
guix substitute: error: connect*: No route to host
guix substitute: error: connect*: No route to host
guix substitute: error: connect*: No route to host
guix substitute: warning: 192.168.10.172: connection failed: No route to host
downloading from https://ci.guix.gnu.org/nar/lzip/bribnmf6djvh1d3rjr2vs5y97141ad97-osinfo-db-20201218 ...
 osinfo-db-20201218 88KiB 5.9MiB/s 00:00 [############# ] 73.1%
substitution of /gnu/store/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1 failed
substitution of /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2 failed
substitution of /gnu/store/6abwn23grk710qvzvvg1384bs3kc2f8i-linphone-desktop-4.2.5-debug failed
guix build: error: corrupt input while restoring archive from #<closed: file 7f16de01c230>

real 1m13.549s
user 0m25.348s
sys 0m0.721s
Hmm.
Let's try again,
$ time guix build -m ~/stow/guix/manifest.scm --substitute-urls='http://192.168.10.172 https://ci.guix.gnu.org' --no-offload
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
guix substitute: warning: 192.168.10.172: connection failed: No route to host
substitute: substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
The following derivations will be built:
   /gnu/store/lxm7brkbrkkv58c4kzlw1lh3wc0bm8wz-gimp-2.10.24.drv
   /gnu/store/ddv8jyzwk92nsg1dkv9n3scf6f7w83g5-keepassxc-2.6.6.drv
   /gnu/store/xky1y32mccplxsb448ziq68by2mvkdaz-ruby-asciidoctor-2.0.10.drv
   /gnu/store/0ph0sjib0d13n2fsl8w9prnky8g5fkvf-ruby-haml-5.0.4.drv
   /gnu/store/4dfwfj9qinw4vs6290gdy5qbnqbczm2v-ruby-temple-0.8.2.drv
   /gnu/store/b12krypa196yg6gzk2bvrh35i1fg5c7x-ruby-tilt-2.0.10.drv
   /gnu/store/131d8193hi1485ylnb9w8jm3jnlv3iyx-ruby-slim-4.1.0.drv
   /gnu/store/489nq0jqjby92kv3c6nwrdfqg45l40nw-ruby-sinatra-2.0.8.1.drv
   /gnu/store/yay3sa8nnq4j0ixwhp3bxfg5vfisfmf1-sflvault-client-0.9.2-1.8de3902.drv
   /gnu/store/2n1xyy0y3nnkrp3mpdifn8r7wf6pzpb0-sflvault-0.9.2-1.8de3902-checkout.drv
   /gnu/store/jsyhy4vxzr9yyg66kzk7w28xffyx050c-python-keyring-1.6.1.drv
   /gnu/store/kiwn3x2la23f1pa3a5ypsihhc6ja19y5-python-keyring-1.6.1-checkout.drv
The following files will be downloaded:
   /gnu/store/2qphwngpvawl6f06d33b2jr18vk1hyc9-module-import-compiled
   /gnu/store/r7vsb0vl4y66jbq7b56zmrm60q2507zl-wireshark-3.4.7
   /gnu/store/wnzx9anjdkmbnkcg5qdd3j77q1w2j1bd-yelp-3.32.2
[...]
   /gnu/store/sa6hvh9bnw73mpplasbjb3idlv71rvcb-gnome-boxes-3.36.6
   /gnu/store/6gy957mhm07zaa001avzkv2d8zhjdl5h-poppler-data-0.4.10
   /gnu/store/hxvlcb4wgw0fpyi9ssc4x6f8w3hlng55-gst-plugins-good-1.18.2
   /gnu/store/7bqpzvzanmvb4g1g6gqb1jmrw2j8gv3d-gst-plugins-bad-1.18.2
   /gnu/store/f8hzmmnp8cm4yqq0y9cf7rgxl05hf423-cheese-3.38.0
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
substituting /gnu/store/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1...
substituting /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2...
substituting /gnu/store/6abwn23grk710qvzvvg1384bs3kc2f8i-linphone-desktop-4.2.5-debug...
substituting /gnu/store/bribnmf6djvh1d3rjr2vs5y97141ad97-osinfo-db-20201218...
substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
substitute: updating substitutes from 'http://192.168.10.172'... 0.0%
guix substitute: error: connect*: No route to host
guix substitute: error: connect*: No route to host
guix substitute: error: connect*: No route to host
guix substitute: warning: 192.168.10.172: connection failed: No route to host
downloading from https://ci.guix.gnu.org/nar/lzip/bribnmf6djvh1d3rjr2vs5y97141ad97-osinfo-db-20201218 ...
 osinfo-db-20201218 88KiB 6.0MiB/s 00:00 [############# ] 73.1%
substitution of /gnu/store/ns4n01xgbk6ccvd2z127v71d806rnr6f-inkscape-1.1 failed
substitution of /gnu/store/f10an83xvya46ndh61y59qaw5vvs5f7n-libreoffice-7.1.4.2 failed
substitution of /gnu/store/6abwn23grk710qvzvvg1384bs3kc2f8i-linphone-desktop-4.2.5-debug failed
guix build: error: corrupt input while restoring archive from #<closed: file 7f1471840230>

real 1m15.216s
user 0m24.963s
sys 0m0.702s

Same thing, the daemon is still trying really hard to get something from that dead substitute server, slowing things down.
That corrupted archive failure is curious, I wonder if it may be related.
We'll have to keep this bug open I'm afraid :-/.
Thanks,
Maxim
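
Editorial sketch: to put a number on how often a dead server is retried in transcripts like the ones above, the "updating substitutes" lines can be counted per URL. The Python helper below is an illustration only; count_update_attempts is a hypothetical name, and the regular expression assumes the exact line format shown in this thread, which may differ in other Guix versions.

```python
import re
from collections import Counter

# Matches lines such as:
#   substitute: updating substitutes from 'http://192.168.10.172:80'... 0.0%
UPDATE_LINE = re.compile(
    r"substitute: updating substitutes from '([^']+)'\.\.\. (\d+(?:\.\d+)?)%"
)


def count_update_attempts(log_text):
    """Count update lines per URL, split into stalled (<100%) and completed."""
    stalled, completed = Counter(), Counter()
    for url, percent in UPDATE_LINE.findall(log_text):
        target = completed if float(percent) >= 100.0 else stalled
        target[url] += 1
    return stalled, completed
```

Feeding a saved transcript to count_update_attempts gives, per URL, how many update attempts never progressed past 0.0%, which makes runs easier to compare before and after a fix.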
Maxim Cournoyer wrote on 19 Aug 04:25 +0200
(name . zimoun)(address . zimon.toutoune@gmail.com)
87fsv6w4qf.fsf@gmail.com
Extra note: the problems reported earlier (hang or backtrace instead of graceful fallback to other substitute servers) also affect the scenario where substitutes are fetched from uDNS discovered substitute servers (I just tried).
Thanks,
Maxim

To comment on this conversation send email to 30290@debbugs.gnu.org