Downloading substitutes is too slow upon nginx cache misses

Done
Submitted by dian_cecht.
Details
7 participants
  • dian_cecht
  • Ludovic Courtès
  • Maxim Cournoyer
  • Tobias Geerinckx-Rice
  • Mark H Weaver
  • Florian Pelz
  • Ricardo Wurmus
Owner
unassigned
Severity
important
dian_cecht wrote on 21 Mar 2017 02:44
No notification of cache misses when downloading substitutes
(name . GuixSD)(address . bug-guix@gnu.org)
20170320184449.5ac06051@khaalida
Just ran guix pull and guix package -u, and found some of the programs download VERY slowly (<100kb/s, usually around 95). I asked on #guix and lfam mentioned it was probably a cache miss.
It would be nice if there was some notification that a cache miss happened and the download will likely be slow, otherwise a user might wonder what problem there is with their connection.
Tobias Geerinckx-Rice wrote on 21 Mar 2017 03:46
(address . dian_cecht@zoho.com)(address . 26201@debbugs.gnu.org)
144e9ba8-af93-fb18-d2b9-f198ae7c11e9@tobias.gr
Hullo,
On 21/03/17 02:44, dian_cecht@zoho.com wrote:
> Just ran guix pull and guix package -u, and found some of the programs
> download VERY slowly (<100kb/s, usually around 95). I asked on #guix
> and lfam mentioned it was probably a cache miss.
Do you mean that *substitutes* existed, but were not yet on mirror.hydra.gnu.org and so were silently proxied from the much slower hydra.gnu.org?
Or did Guix fall back to downloading *source* tarballs from some slow upstream to build locally?
(I've no access to IRC at the mo'.)
Kind regards,
T G-R
dian_cecht wrote on 21 Mar 2017 03:52
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
20170320195247.05f72fc9@khaalida
On Tue, 21 Mar 2017 03:46:29 +0100 Tobias Geerinckx-Rice <me@tobias.gr> wrote:
> Hullo,
>
> On 21/03/17 02:44, dian_cecht@zoho.com wrote:
> > Just ran guix pull and guix package -u, and found some of the
> > programs download VERY slowly (<100kb/s, usually around 95). I
> > asked on #guix and lfam mentioned it was probably a cache miss.
>
> Do you mean that *substitutes* existed, but were not yet on
> mirror.hydra.gnu.org and so were silently proxied from the much slower
> hydra.gnu.org?
The URL displayed during the download was mirror.hydra.gnu.org.
> Or did Guix fall back to downloading *source* tarballs from some slow
> upstream to build locally?
It was a binary download, not source. At least, I don't recall anything about compiles at any point (and I'm sure it didn't take long enough to do that; one package was icecat, which I'm sure wouldn't have downloaded at 90k/s then compiled in less than 15 minutes). Fwiw, according to my build logs firefox takes about 2 hours to build, so unless icecat is magically orders of magnitude faster to build, I'm sure it was just a download + install, and not download + compile + install.
Tobias Geerinckx-Rice wrote on 21 Mar 2017 04:57
(address . dian_cecht@zoho.com)(address . 26201@debbugs.gnu.org)
8e7e07d1-563f-666f-2c32-2a772757c86f@tobias.gr
Ahoy,
On 21/03/17 03:52, dian_cecht@zoho.com wrote:
> The URL displayed during the download was mirror.hydra.gnu.org.
> [...] It was a binary download, not source.
Oh, OK. I'm not an expert on how Hydra's set up these days, but will assume it's not too different from my own (a fast nginx proxy_cache, mirror.hydra.gnu.org, in front of a slower build farm, hydra.gnu.org).
Whenever you're the first to request a substitute, mirror.hydra.gnu.org transparently forwards the request to hydra.gnu.org.
The latter has to compress the response on the fly, leading to much slower transfer speeds. It slowly sends it back to the mirror, which slowly sends it on to you while also saving it on disc so all subsequent downloads will be fast (by Hydra standards) and not involve hydra.gnu.org.
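(Editorial illustration: for readers unfamiliar with this kind of setup, the arrangement described above can be sketched as an nginx caching reverse proxy. This is only a rough sketch and not the actual mirror.hydra.gnu.org configuration; the cache path, the zone name "substitutes", the sizes and the lifetimes are all assumptions.)

```nginx
# Hypothetical sketch of a caching mirror in front of a slower build farm.
# Path, zone name, sizes and durations are illustrative assumptions.
# (proxy_cache_path belongs in the http { } context.)
proxy_cache_path /var/cache/nginx/substitutes
                 levels=1:2
                 keys_zone=substitutes:10m
                 max_size=50g
                 inactive=30d;

server {
  listen 80;
  server_name mirror.hydra.gnu.org;

  location / {
    # On a cache miss the request is forwarded to the origin; the response
    # is streamed to the client and stored on disk for later requests.
    proxy_pass http://hydra.gnu.org;
    proxy_cache substitutes;
    proxy_cache_valid 200 30d;
  }
}
```

With a layout like this, only the first request for a given URL pays the cost of the slow origin; later requests for the same URL are answered from the mirror's disk.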
Maybe you knew all this, but it's also the reason that...
> On 21/03/17 02:44, dian_cecht@zoho.com wrote:
> It would be nice if there was some notification that a cache miss
> happened and the download will likely be slow, otherwise a user might
> wonder what problem there is with their connection.
...I'm afraid this makes no sense from guix's point of view.
The term ‘cache miss’ here is an implementation detail of our current Hydra set-up, not something guix can or IMO should care about. There are hundreds of reasons why your connection might be slow at any given time. Guix should just tell you so (it does), not guess why. Or worse: know.
(But if others disagree, we'll have to extend the Hydra API to somehow relay this information to the client, in the spirit of the modern Web.)
HTTP 200½: OK, fine, but it's Going to Suck.
T G-R
dian_cecht wrote on 21 Mar 2017 05:48
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
20170320214809.466dc5fe@khaalida
On Tue, 21 Mar 2017 04:57:09 +0100 Tobias Geerinckx-Rice <me@tobias.gr> wrote:
> Ahoy,
>
> On 21/03/17 03:52, dian_cecht@zoho.com wrote:
> > The URL displayed during the download was mirror.hydra.gnu.org.
> > [...] It was a binary download, not source.
>
> Oh, OK. I'm not an expert on how Hydra's set up these days, but will
> assume it's not too different from my own (a fast nginx proxy_cache,
> mirror.hydra.gnu.org, in front of a slower build farm, hydra.gnu.org).
>
> Whenever you're the first to request a substitute,
> mirror.hydra.gnu.org transparently forwards the request to
> hydra.gnu.org.
>
> The latter has to compress the response on the fly, leading to much
> slower transfer speeds. It slowly sends it back to the mirror, which
> slowly sends it on to you while also saving it on disc so all
> subsequent downloads will be fast (by Hydra standards) and not
> involve hydra.gnu.org.
>
> Maybe you knew all this, but it's also the reason that...
I'm not familiar with the implementation details, nor how hydra is currently set up.
> > On 21/03/17 02:44, dian_cecht@zoho.com wrote:
> > It would be nice if there was some notification that a cache miss
> > happened and the download will likely be slow, otherwise a user
> > might wonder what problem there is with their connection.
>
> ...I'm afraid this makes no sense from guix's point of view.
>
> The term ‘cache miss’ here is an implementation detail of our current
> Hydra set-up, not something guix can or IMO should care about. There
> are hundreds of reasons why your connection might be slow at any
> given time. Guix should just tell you so (it does), not guess why. Or
> worse: know.
I'm not suggesting having Guix tell me why my network is slow, only if the download might be slow because it's having to pull from hydra.gnu.org. Having Guix automagically troubleshoot networking problems is well beyond the scope of a package manager, even one that goes as far beyond simple package management as Guix does.
> (But if others disagree, we'll have to extend the Hydra API to somehow
> relay this information to the client, in the spirit of the modern
> Web.)
AFAIK, Guix devs are working on a replacement for the current build system, so the sane option wouldn't be extending the current hydra system to handle a new API call, but to try and work this type of feature into the next system. Unless, of course, something like this could be done in hydra reasonably easily, in which case why not.
Another option would be to have the mirrors automatically cache the files as soon as they are available to try. I'd hope this would be how things are handled already, but one never knows.
Tobias Geerinckx-Rice wrote on 21 Mar 2017 07:21
(address . dian_cecht@zoho.com)(address . 26201@debbugs.gnu.org)
d8962205-0e0f-59ef-c957-923ba9bc01d4@tobias.gr
Mornin',
On 21/03/17 05:48, dian_cecht@zoho.com wrote:
> I'm not suggesting having Guix tell me why my network is slow,
I never mentioned your network. Your proxied connection to a substitute server, yes. And, well, this very bug report is for Guix to tell you why that's slow...
> only if the download might be slow because it's having to pull from
> hydra.gnu.org.
(Side note: ‘it’ here is mirror.hydra.gnu.org, never a well-configured Guix client.)
So to implement this, the client would need to display a ‘warning’ message or flag sent by the substitute server, to notify the user that their download might be slower... sometimes... by an unknown amount... possibly?
But see, that wouldn't be true at all on my system (and surely others), despite being set up nearly identically to Hydra. On the other hand, my home download speed fluctuates wildly, even between simultaneous connections to the same server. Whether or not a file is cached makes no difference. To be told would be noise at best, misleading at worst.
I'd be against this only for those reasons, but I promise I'm not.
It's just all a bit vague, 's all, and my personal opinion is that once the vagueness is resolved, not much will remain. But who knows.
> AFAIK, Guix devs are working on a replacement for the current build
> system, so the sane option wouldn't be extending the current hydra
> system to handle a new API call, but to try and work this type of
> feature into the next system.
My point is that it wouldn't be sane, and would be an ugly hack in either system. Cuirass isn't really different from Hydra in this regard.
Me shut up now :-) I'm more interested in what others have to say.
Kind regards,
T G-R
dian_cecht wrote on 21 Mar 2017 07:49
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
20170320234912.46680062@khaalida
On Tue, 21 Mar 2017 07:21:54 +0100 Tobias Geerinckx-Rice <me@tobias.gr> wrote:
> > only if the download might be slow because [mirror.hydra] is having
> > to pull from hydra.gnu.org.
>
> So to implement this, the client would need to display a ‘warning’
> message or flag sent by the substitute server, to notify the user that
> their download might be slower... sometimes... by an unknown amount...
> possibly?
Simply a notification that mirror.hydra doesn't currently have a cached version of the file and the download might be slower than normal would be fine. As-is, looking up and seeing download speeds that amount to less than 10% of one's normal bandwidth is a bit concerning since it would seem like there is a problem. In this case, Guix would be giving the user some notification that something /is/ out of the ordinary, and possibly save the user some effort trying to determine the cause of the slowdown.
> But see, that wouldn't be true at all on my system (and surely
> others), despite being set up nearly identically to Hydra. On the
> other hand, my home download speed fluctuates wildly, even between
> simultaneous connections to the same server.
I'm not sure how any of this matters. If you are running a local Hydra instance or whatever, then I'd assume you'd be aware of what, if any, problems that could arise. In this case, I'd hope hydra would allow you to disable this feature.
> Whether or not a file is cached makes no difference. To be told would
> be noise at best, misleading at worst.
Had I been notified that mirror.hydra was currently pulling from hydra, it would have saved me the time of jumping on IRC and asking what was up, which only worked because someone was in #guix and had an idea of what was going on; had that not been the case, I would have started looking for the cause for the slowdown and wasted several minutes (at least) trying to figure out what was wrong, and since it was on mirror.hydra's end, I'd have no way to know the slowdown was on their end and not mine, nor my ISP's problem.
> > AFAIK, Guix devs are working on a replacement for the current build
> > system, so the sane option wouldn't be extending the current hydra
> > system to handle a new API call, but to try and work this type of
> > feature into the next system.
>
> My point is that it wouldn't be sane, and would be an ugly hack in
> either system.
I don't see how this would have to be "an ugly hack". It's simply a query and response. The simplest way I can see for this to work would be for mirror.hydra to either just send the requested file, or a response that the file isn't cached then start to trickle the file on to the client.
Florian Pelz wrote on 21 Mar 2017 13:59
(address . 26201@debbugs.gnu.org)
be6b7b69-5ab9-3d4e-68fe-4d582699b2cc@pelzflorian.de
On Mon, 2017-03-20 at 21:48 -0700, dian_cecht@zoho.com wrote:
> Another option would be to have the mirrors automatically cache the
> files as soon as they are available to try. I'd hope this would be how
> things are handled already, but one never knows.
If it cached everything, it wouldn’t be a cache?
Tobias Geerinckx-Rice wrote on 21 Mar 2017 15:55
(address . dian_cecht@zoho.com)(address . 26201@debbugs.gnu.org)
1bbd8ee3-1745-3642-27ed-f095c732dc11@tobias.gr
Hullo!
On 21/03/17 07:49, dian_cecht@zoho.com wrote:
> I'm not sure how any of this matters. If you are running a local
> Hydra instance or whatever, then I'd assume you'd be aware of what,
> if any, problems that could arise.
It matters for the reasons mentioned. It's not a ‘local Hydra’ & I have no idea what problems you're talking about.
My problem is that every invocation of Guix already fills several screens with Guile cache misses. Adding another warning (‘warning! the system is working exactly as designed!’) will only serve to make those other warnings look less silly, and I think that would be a shame.
To clarify:
- Warnings should be scary because warnings should be actionable.
  There's nothing the user can or needs to do about a cache miss.
- It would be randomly shown to everyone, since this happens constantly.
- The behaviour warned about is not incorrect or abnormal.
- As already noted, it's how caching works.
> I don't see how this would have to be "an ugly hack". It's simply a
> query and response. The simplest way I can see for this to work would
> be for mirror.hydra to either just send the requested file, or a
> response that the file isn't cached then start to trickle the file on
> to the client.
Well, yeah... That's the ugly hack. :-)
It's not that your suggestion's hard to implement. In fact, it's just one line for nginx (which it turns out I already had):
add_header X-Cache-Status $upstream_cache_status;
and 6 lines of lightly-tested Guile (attached)¹. And presto. This thing.
Doesn't mean we should.
Kind regards,
T G-R
¹: Why? Practice. Irony. Light masochism.
From 6d459a442d73628a0628385283c7cf04dff1b797 Mon Sep 17 00:00:00 2001
From: Tobias Geerinckx-Rice <me@tobias.gr>
Date: Tue, 21 Mar 2017 15:31:56 +0100
Subject: [PATCH] http-client: Warn on proxy cache misses.

Still not a good idea.

* guix/http-client.scm (http-fetch): Add #:peek-behind-proxy parameter
to expose caching proxy implementation details as a scary warning.
* guix/scripts/substitute.scm (fetch): Use it.
---
 guix/http-client.scm        | 10 +++++++++-
 guix/scripts/substitute.scm |  3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/guix/http-client.scm b/guix/http-client.scm
index 6874c51..2366f5e 100644
--- a/guix/http-client.scm
+++ b/guix/http-client.scm
@@ -2,6 +2,7 @@
 ;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2015 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2012, 2015 Free Software Foundation, Inc.
+;;; Copyright © 2017 Tobias Geerinckx-Rice <me@tobias.gr>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -222,7 +223,8 @@ or if EOF is reached."
 
 (define* (http-fetch uri #:key port (text? #f) (buffered? #t)
                      keep-alive? (verify-certificate? #t)
-                     (headers '((user-agent . "GNU Guile"))))
+                     (headers '((user-agent . "GNU Guile")))
+                     (peek-behind-cache? #f))
   "Return an input port containing the data at URI, and the expected number of
 bytes available or #f. If TEXT? is true, the data at URI is considered to be
 textual. Follow any HTTP redirection. When BUFFERED? is #f, return an
@@ -253,8 +255,14 @@ Raise an '&http-get-error' condition if downloading fails."
                       (http-get uri #:streaming? #t #:port port
                                 #:keep-alive? #t
                                 #:headers headers))
+                     ((headers)
+                      (response-headers resp))
                      ((code)
                       (response-code resp)))
+          (when (and peek-behind-cache?
+                     (equal? (assoc-ref headers 'x-cache-status) "MISS"))
+            (warning (_ "the caching proxy is working properly!~%"))
+            (warning (_ "and there's nothing you can do about it.~%")))
           (case code
             ((200)
              (values data (response-content-length resp)))
diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index faeb019..4a4f115 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -216,7 +216,8 @@ provide."
            (unless (or buffered? (not (file-port? port)))
              (setvbuf port _IONBF)))
          (http-fetch uri #:text? #f #:port port
-                     #:verify-certificate? #f))))))
+                     #:verify-certificate? #f
+                     #:peek-behind-cache? #t))))))
       (else
        (leave (_ "unsupported substitute URI scheme: ~a~%")
               (uri->string uri)))))
-- 
2.9.3
dian_cecht wrote on 21 Mar 2017 16:32
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
20170321083239.3cbf1e8d@khaalida
On Tue, 21 Mar 2017 15:55:05 +0100 Tobias Geerinckx-Rice <me@tobias.gr> wrote:
> To clarify:
>
> - Warnings should be scary because warnings should be actionable.
There are warnings and there are errors. Warnings don't have to be scary; I get them every time I update emacs because of duplicate icons stored in two different directories in the store. Is that actionable? Not as far as I am concerned, unless I want to hand delete something from the store, which, as far as I understand it, shouldn't be done.
> There's nothing the user can or needs to do about a cache miss.
Please reread the 2nd part of my response in Message #23 in this bug report for why this is needed.
> - It would be randomly shown to everyone, since this happens
> constantly.
Unless mirror.hydra randomly loses data in its cache from hydra, it won't be random in the least.
> - The behaviour warned about is not incorrect or abnormal.
No, but the behavior would inform the user that the unusual and random slowdown isn't another problem and is because mirror.hydra is having to update its cache, which, as I explained before, is useful information.
> [...]
Quite frankly I'd like someone else to take a look at this bug, if for no other reason than I'm not sure if we're communicating clearly with each other here. Most of what you are saying makes no sense whatsoever and seems to miss the point I have attempted to make.
While I will thank you for actually writing a patch, saying "the caching proxy is working properly! and there's nothing you can do about it." seems rather cynical and clearly misses the point of what I'm requesting here.
dian_cecht wrote on 21 Mar 2017 16:35
(name . Florian Pelz)(address . pelzflorian@pelzflorian.de)(address . 26201@debbugs.gnu.org)
20170321083536.639716a9@khaalida
On Tue, 21 Mar 2017 13:59:27 +0100 Florian Pelz <pelzflorian@pelzflorian.de> wrote:
> On Mon, 2017-03-20 at 21:48 -0700, dian_cecht@zoho.com wrote:
> > Another option would be to have the mirrors automatically cache the
> > files as soon as they are available to try. I'd hope this would be
> > how things are handled already, but one never knows.
> >
>
> If it cached everything, it wouldn’t be a cache?
If the point is to reduce the load on hydra, then at some point it could have everything. If it doesn't, then why have a mirror when it's just pulling right from the source all the time anyways?
Tobias Geerinckx-Rice wrote on 21 Mar 2017 17:07
(address . dian_cecht@zoho.com)(address . 26201@debbugs.gnu.org)
553699c2-fb50-5cf4-a80d-8ee0a70c039d@tobias.gr
On 21/03/17 16:32, dian_cecht@zoho.com wrote:
> Unless mirror.hydra randomly loses data in its cache from hydra, it
> won't be random in the least.
It will. Whether one is first to download from the cache after the substitute is built is essentially random.
> Quite frankly I'd like someone else to take a look at this bug,
Glad you agree.
> if for no other reason than I'm not sure if we're communicating clearly
> with each other here. Most of what you are saying makes no sense
> whatsoever and seems to miss the point I have attempted to make.
I assure you it does not.
Kind regards,
T G-R
Ludovic Courtès wrote on 21 Mar 2017 17:43
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
8760j2wpfy.fsf@gnu.org
Hello!
Tobias Geerinckx-Rice <me@tobias.gr> skribis:
> Oh, OK. I'm not an expert on how Hydra's set up these days, but will
> assume it's not too different from my own (a fast nginx proxy_cache,
> mirror.hydra.gnu.org, in front of a slower build farm, hydra.gnu.org).
I think there’s room for improvement in our nginx config at <https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/mirror.conf>.
For instance, I just discovered ‘proxy_cache_lock’ while looking at <http://nginx.org/en/docs/http/ngx_http_proxy_module.html>; looks useful in reducing load on hydra.gnu.org. Surely there are other ways to tweak caching.
Besides, I’d like to use ‘guix publish’ on hydra.gnu.org. I suspect it’s going to be faster than Starman (the HTTP server behind Hydra), and also it uses an in-process gzip by default, as opposed to bzip2 which is what Hydra uses (better compression ratio, but super CPU-intensive).
At any rate, clients should not paper over server-side performance issues IMO.
Thanks,
Ludo’.
Tobias Geerinckx-Rice wrote on 21 Mar 2017 18:08
(address . ludo@gnu.org)(address . 26201@debbugs.gnu.org)
9889a4b5-c300-cd03-1095-1115428067fb@tobias.gr
Ludo',
On 21/03/17 17:43, Ludovic Courtès wrote:
> I think there’s room for improvement in our nginx config at
> <https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/mirror.conf>.
>
> For instance, I just discovered ‘proxy_cache_lock’ while looking at
> <http://nginx.org/en/docs/http/ngx_http_proxy_module.html>; looks useful
> in reducing load on hydra.gnu.org. Surely there are other ways to tweak
> caching.
Indeed! For reference, here's my cache configuration.
That's right. Now you can all¹ steal some criminally overpriced Belgian bandwidth!
server {
  server_name substitutes.tobias.gr;
  listen [::]:443 ssl http2;
  listen 443 ssl http2;

  # FIXME move to main LE cert
  ssl_certificate substitutes.pem;
  ssl_certificate_key substitutes.key;

  # "" means ‘inherit from upstream’ here.
  add_header Cache-Control "";
  # So does ‘off’. This is all a bit hacky.
  expires off;
  proxy_hide_header Set-Cookie;
  proxy_ignore_headers Set-Cookie;

  # Almost all traffic is already compressed.
  gzip off;

  ...

  location / {
    limit_except GET { deny all; }
    proxy_pass SUPER_SEKRIT_BACKEND;

    # https://www.nginx.com/blog/nginx-caching-guide
    add_header X-Cache-Status $upstream_cache_status;

    proxy_cache default;
    # We allow only GET requests, so don't waste key space:
    proxy_cache_key "$request_uri";
    proxy_cache_lock on;
    proxy_cache_lock_timeout 3h; #yolo
    proxy_cache_use_stale error timeout
                          http_500 http_502 http_503 http_504;
  }
  ...
}
I'm sure it's hardly optimal (or, erm, ‘good’) either but it works.
> Besides, I’d like to use ‘guix publish’ on hydra.gnu.org. I suspect
> it’s going to be faster than Starman (the HTTP server behind Hydra), and
> also it uses an in-process gzip by default, as opposed to bzip2 which is
> what Hydra uses (better compression ratio, but super CPU-intensive).
Back when I used Hydra-the-software I did so briefly and I think it worked. But no hard tests.
> At any rate, clients should not paper over server-side performance
> issues IMO.
Entirely off-topic, but this 'tude is a part of what drew me to Guix in the first place. So, like, thanks, in general :-)
Kind regards,
T G-R
¹: Just put it *after* mirror.hydra.gnu.org, OK?
Ludovic Courtès wrote on 22 Mar 2017 23:06
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
87fui50xws.fsf@gnu.org
Hey Tobias,
Tobias Geerinckx-Rice <me@tobias.gr> skribis:
> On 21/03/17 17:43, Ludovic Courtès wrote:
>> I think there’s room for improvement in our nginx config at
>> <https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/mirror.conf>.
>>
>> For instance, I just discovered ‘proxy_cache_lock’ while looking at
>> <http://nginx.org/en/docs/http/ngx_http_proxy_module.html>; looks useful
>> in reducing load on hydra.gnu.org. Surely there are other ways to tweak
>> caching.
>
> Indeed! For reference, here's my cache configuration.
>
> That's right. Now you can all¹ steal some criminally overpriced Belgian
> bandwidth!
Heheh. :-)
> limit_except GET { deny all; }
> proxy_pass SUPER_SEKRIT_BACKEND;
>
> # https://www.nginx.com/blog/nginx-caching-guide
> add_header X-Cache-Status $upstream_cache_status;
>
> proxy_cache default;
> # We allow only GET requests, so don't waste key space:
> proxy_cache_key "$request_uri";
> proxy_cache_lock on;
> proxy_cache_lock_timeout 3h; #yolo
> proxy_cache_use_stale error timeout
> http_500 http_502 http_503 http_504;
I didn’t fully understand the docs for the last 3 directives here. For instance, what happens when 10 clients do GET /nar/xyz-texlive? Do the 9 unlucky clients wait for 3 hours and then get 404?
Anyway, thanks for sharing your tips. :-)
> Entirely off-topic, but this 'tude is a part of what drew me to Guix in
> the first place. So, like, thanks, in general :-)
:-)
Ludo’.
Ludovic Courtès wrote on 22 Mar 2017 23:22
hydra.gnu.org uses ‘guix publish’ for nars and narinfos
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87r31pyms2.fsf_-_@gnu.org
Hi again!
Until now hydra.gnu.org was using Hydra (the software) to serve not only the Web interface but also all the .narinfo and /nar URLs (substitute meta-data and substitutes).
Starting from now, hydra.gnu.org directs all .narinfo and corresponding nar requests to ‘guix publish’ instead of Hydra.
‘guix publish’ should be faster and less resource-hungry than Hydra. It uses in-process gzip for nar compression instead of bzip2 (I chose level 7, which seems to provide compression ratios close to what bzip2 provides with its default compression level, while being 3 times faster). Unlike Hydra it never forks so for instance, 404 responses for .narinfo URLs should be quicker. Hopefully, that will improve the worst-case (cache miss) throughput.
I configured nginx in such a way that the former Hydra-provided /nar URLs (which are cached in nginx instances, in our /var/guix/substitute/cache directories, etc.) are still available. ‘guix publish’ uses the /guix/nar URLs while Hydra uses /nar, so the nginx config redirects to either Hydra or ‘guix publish’ depending on the URL:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/hydra.gnu.org-locations.conf#n29
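(Editorial illustration: the URL-based split described above could look roughly like the following. This is a sketch only, not a copy of the linked file; the backend addresses and ports are assumptions, as is the premise that ‘guix publish’ is set up to answer under /guix/nar/.)

```nginx
# Hypothetical sketch; backend addresses and ports are assumptions.

# New-style nar URLs are handled by 'guix publish'.
location /guix/nar/ {
  proxy_pass http://127.0.0.1:8080;
}

# Narinfo lookups are handled by 'guix publish' as well.
location ~ \.narinfo$ {
  proxy_pass http://127.0.0.1:8080;
}

# Old Hydra-provided nar URLs remain reachable, so narinfos that are
# still cached by mirrors and clients keep pointing at something valid.
location /nar/ {
  proxy_pass http://127.0.0.1:3000;
}
```

Keeping both prefixes lets already-cached narinfos finish their lifetime without breaking the downloads they refer to.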
Hydra-provided .narinfos are still cached here and there; they’ll progressively expire and be replaced by ‘guix publish’-provided .narinfos.
Let me know if you notice anything fishy!
Ludo’.
Ricardo Wurmus wrote on 23 Mar 2017 11:29
(name . Ludovic Courtès)(address . ludo@gnu.org)
87mvccs2uu.fsf@elephly.net
Ludovic Courtès <ludo@gnu.org> writes:
> Until now hydra.gnu.org was using Hydra (the software) to serve not only
> the Web interface but also all the .narinfo and /nar URLs (substitute
> meta-data and substitutes).
>
> Starting from now, hydra.gnu.org directs all .narinfo and corresponding
> nar requests to ‘guix publish’ instead of Hydra.
That’s very cool! I’m happy to see more of Hydra replaced.
-- Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
Mark H Weaver wrote on 23 Mar 2017 19:36
Re: bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos
(name . Ludovic Courtès)(address . ludo@gnu.org)
87inmzrgbf.fsf@netris.org
ludo@gnu.org (Ludovic Courtès) writes:
> Hi again!
>
> Until now hydra.gnu.org was using Hydra (the software) to serve not only
> the Web interface but also all the .narinfo and /nar URLs (substitute
> meta-data and substitutes).
>
> Starting from now, hydra.gnu.org directs all .narinfo and corresponding
> nar requests to ‘guix publish’ instead of Hydra.
>
> ‘guix publish’ should be faster and less resource-hungry than Hydra. It
> uses in-process gzip for nar compression instead of bzip2 (I chose level
> 7, which seems to provide compression ratios close to what bzip2
> provides with its default compression level, while being 3 times
> faster). Unlike Hydra it never forks so for instance, 404 responses for
> .narinfo URLs should be quicker. Hopefully, that will improve the
> worst-case (cache miss) throughput.
Excellent! Any improvement in 404 response time will be very helpful. I've noticed that spikes of narinfo requests resulting in 404 have been a major source of overloading on Hydra, because these requests cannot be cached for very long. The reason: if we cache those failures for N minutes, this effectively delays the appearance of new nars by N minutes (if it was requested before that). This forces us to choose a small N for negative cache entries, which means the cache is not much help here.
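(Editorial illustration: in nginx terms, the N discussed above corresponds to the lifetime given to cached 404 responses. The sketch below uses made-up values, an assumed backend address and an assumed cache zone named "substitutes"; it is not taken from the real Hydra configuration.)

```nginx
# Hypothetical values illustrating the negative-caching trade-off.
location ~ \.narinfo$ {
  proxy_pass http://127.0.0.1:8080;   # assumed backend address
  proxy_cache substitutes;            # assumed cache zone name

  # Successful lookups can be kept for a long time.
  proxy_cache_valid 200 30d;

  # Negative entries: a narinfo that is missing now may exist as soon as
  # the build finishes, so a cached 404 can hide a new substitute for up
  # to this long. A small value limits that delay but sends more 404
  # lookups to the backend.
  proxy_cache_valid 404 1m;
}
```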
One question: what will happen in the case of multiple concurrent requests for the same nar? Will multiple nar-pack-and-bzip2 processes be run on-demand? Recall that the nginx proxy will pass all of those requests through, and not create the cache entry until it has received a complete response. This has caused us severe problems with huge nars such as texinfo-texmf, to the point that we had to crudely block those nar requests. Unfortunately, it is not obvious how to block the associated narinfo requests due to the lack of job name in the URL, so this results in failures on the client side that must be manually worked around.
Thanks, Mark
Tobias Geerinckx-Rice wrote on 23 Mar 2017 19:52
(address . mhw@netris.org)
25b2472a-c705-53fe-f94f-04de9a2d484e@tobias.gr
Mark,
On 23/03/17 19:36, Mark H Weaver wrote:
> One question: what will happen in the case of multiple concurrent
> requests for the same nar? Will multiple nar-pack-and-bzip2 processes
> be run on-demand?
I think this used to be the case with the previous nginx configuration, but the recent changes pushed by Ludo' were aimed in part at preventing that.
> Recall that the nginx proxy will pass all of those requests through,
Are you sure? I was under the impression¹ that this is exactly what ‘proxy_cache_lock on;’ prevents. I'm no nginx guru, obviously, so please (anyone!) correct me if I'm misguided.
Kind regards,
T G-R
¹: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock
Tobias Geerinckx-Rice wrote on 23 Mar 2017 20:25
(address . ludo@gnu.org)(address . 26201@debbugs.gnu.org)
a1f7cae6-0d37-6d6b-8ed9-8fd124fc037c@tobias.gr
Ludo',
On 22/03/17 23:06, Ludovic Courtès wrote:
> Tobias Geerinckx-Rice <me@tobias.gr> skribis:
>> proxy_cache_lock on;
>> proxy_cache_lock_timeout 3h; #yolo
>> proxy_cache_use_stale error timeout
>> http_500 http_502 http_503 http_504;
> I didn’t fully understand the docs for the last 3 directives here. For
> instance, what happens when 10 clients do GET /nar/xyz-texlive? Do the
> 9 unlucky clients wait for 3 hours and then get 404?
From ‘proxy_cache_lock’ [1]:
“When enabled, only one request at a time will be allowed to populate a new cache element identified according to the proxy_cache_key directive by passing a request to a proxied server. Other requests of the same cache element will either wait for a response to appear in the cache or the cache lock for this element to be released, up to the time set by the proxy_cache_lock_timeout directive.”
Hmm. Good point: ‘to appear in the cache’, when we don't cache 404s or even 410s.
I don't actually know.
Kind regards,
T G-R
[1]: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock
Maxim Cournoyer wrote on 24 Mar 2017 03:15
Re: bug#26201: No notification of cache misses when downloading substitutes
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87efxnzagb.fsf@gmail.com
Hi!
Tobias Geerinckx-Rice <me@tobias.gr> writes:
> On 21/03/17 16:32, dian_cecht@zoho.com wrote:
>> Unless mirror.hydra randomly loses data in it's cache from hydra, it
>> won't be random in the least.
>
> It will. Whether one is first to download from the cache after the
> substitute is built is essentially random.
>
>> Quite frankly I'd like someone else to take a look at this bug,
>
> Glad you agree.
>
>> if for no other reason than I'm not sure if we're communicating clearly
>> with each other here. Most of what you are saying makes no sense
>> whatsoever and seems to miss the point I have attempted to make.
>
> I assure you it does not.
>
> Kind regards,
>
> T G-R
Please allow me to jump in and voice my opinion here. To me it doesn't make sense to concern the Guix client with implementation details of how the caching of substitutes happens and what its impact is.
This situation is bound to change in the future or become irrelevant (say, if a new build farm were able to sustain higher transfer speeds to the cache mirror, or if the caching implementation changes).
If the current cache building implementation is slow to the point of being a problem, it should be fixed (or documented).
Cheers,
Maxim
Mark H Weaver wrote on 24 Mar 2017 09:12
Re: bug#26201: hydra.gnu.org uses ‘guix publish ’ for nars and narinfos
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87y3vvozy5.fsf@netris.org
Hi,
Tobias Geerinckx-Rice <me@tobias.gr> writes:
> On 23/03/17 19:36, Mark H Weaver wrote:
>> One question: what will happen in the case of multiple concurrent
>> requests for the same nar? Will multiple nar-pack-and-bzip2 processes
>> be run on-demand?
>
> I think this used to be the case with the previous nginx configuration,
> but the recent changes pushed by Ludo' were aimed in part at preventing
> that.
>
>> Recall that the nginx proxy will pass all of those requests through,
>
> Are you sure? I was under the impression¹ that this is exactly what
> ‘proxy_cache_lock on;’ prevents. I'm no nginx guru, obviously, so please
> — anyone! — correct me if I'm misguided.
I agree that "proxy_cache_lock on" should prevent multiple concurrent requests for the same URL, but unfortunately its behavior is quite undesirable, and arguably worse than leaving it off in our case. See:
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock
Specifically:
Other requests of the same cache element will either wait for a response to appear in the cache or the cache lock for this element to be released, up to the time set by the proxy_cache_lock_timeout directive.
In our problem case, it takes more than an hour for Hydra to finish sending a response for the 'texlive-texmf' nar. During that time, the nar will be slowly sent to the first client while it's being packed and bzipped on-demand.
IIUC, with "proxy_cache_lock on", we have two choices of how other client requests will be treated:
(1) If we increase "proxy_cache_lock_timeout" to a huge value, then there will be *no* data sent to the other clients until the first client has received the entire nar, which means they wait over an hour before receiving the first byte. I guess this will result in timeouts on the client side.
(2) If "proxy_cache_lock_timeout" is *not* huge, then all other clients will get failure responses until the first client has received the entire nar.
Either way, this would cause users to see the same download failures (requiring user work-arounds like --fallback) that this fix is intended to prevent for 'texlive-texmf', but instead of happening only for that one nar, it will now happen for *all* large nars.
Or at least that's what I'd expect based on my reading of the nginx docs linked above. I haven't tried it.
IMO, the best solution is to *never* generate nars on Hydra in response to client requests, but rather to have the build slaves pack and compress the nars, copy them to Hydra, and then serve them as static files using nginx.
A far inferior solution, but possibly acceptable and closer to the current approach, would be to arrange for all concurrent responses for the same nar to be sent incrementally from a single nar-packing process. More concretely, while packing and sending a nar response to the first client, the data would also be written to a file. Subsequent requests for the same nar would be serviced using the equivalent of:
tail --bytes=+0 --follow FILENAME
This way, no one would have to wait an hour to receive the first byte.
What do you think?
Mark
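A minimal shell sketch of the single-producer idea described in the message above, assuming GNU coreutils. A slow writer stands in for the nar-packing process (the file names and the three-chunk payload are invented for illustration), and a second reader attaches while the file is still growing yet still receives every byte from the start. The `--pid` option is only a convenient way to make `tail` stop once the writer exits; a real server would need its own end-of-stream signal.

```shell
#!/bin/sh
# Hypothetical stand-in for on-demand nar packing: emit the data slowly.
workdir=$(mktemp -d)
nar="$workdir/example.nar"        # made-up file name
: > "$nar"

(
  for chunk in one two three; do
    printf '%s\n' "$chunk"
    sleep 0.2
  done
) >> "$nar" &
writer=$!

# A late client: start from the first byte and keep following the file
# until the writer process is gone (GNU tail's --pid option).
sleep 0.3
tail --bytes=+0 --follow --pid="$writer" --sleep-interval=0.1 "$nar" \
  > "$workdir/client2.out"

wait "$writer"
cmp "$nar" "$workdir/client2.out" && echo "late client got the full stream"
```

The late client starts receiving data as soon as it connects, instead of waiting for the whole file to be complete.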
Ludovic Courtès wrote on 24 Mar 2017 10:25
(name . Mark H Weaver)(address . mhw@netris.org)
87d1d710xc.fsf@gnu.org
Hi!
Mark H Weaver <mhw@netris.org> skribis:
> Tobias Geerinckx-Rice <me@tobias.gr> writes:
[...]
>> Are you sure? I was under the impression¹ that this is exactly what
>> ‘proxy_cache_lock on;’ prevents. I'm no nginx guru, obviously, so please
>> — anyone! — correct me if I'm misguided.
>
> I agree that "proxy_cache_lock on" should prevent multiple concurrent
> requests for the same URL, but unfortunately its behavior is quite
> undesirable, and arguably worse than leaving it off in our case. See:
>
> https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock
>
> Specifically:
>
>   Other requests of the same cache element will either wait for a
>   response to appear in the cache or the cache lock for this element to
>   be released, up to the time set by the proxy_cache_lock_timeout
>   directive.
>
> In our problem case, it takes more than an hour for Hydra to finish
> sending a response for the 'texlive-texmf' nar. During that time, the
> nar will be slowly sent to the first client while it's being packed and
> bzipped on-demand.
>
> IIUC, with "proxy_cache_lock on", we have two choices of how other
> client requests will be treated:
>
> (1) If we increase "proxy_cache_lock_timeout" to a huge value, then
>     there will *no* data sent to the other clients until the first
>     client has received the entire nar, which means they wait over an
>     hour before receiving the first byte. I guess this will result in
>     timeouts on the client side.
>
> (2) If "proxy_cache_lock_timeout" is *not* huge, then all other clients
>     will get failure responses until the first client has received the
>     entire nar.
>
> Either way, this would cause users to see the same download failures
> (requiring user work-arounds like --fallback) that this fix is intended
> to prevent for 'texlive-texmf', but instead of happening only for that
> one nar, it will now happen for *all* large nars.
My understanding is that proxy_cache_lock allows us to avoid spawning concurrent compression threads of the same item at the same time, while also avoiding starvation (proxy_cache_lock_timeout should ensure that nobody ends up waiting until the nar-compression process is done.)
IOW, it should help reduce load in most cases, while introducing small delays in some cases (if you’re downloading a nar that’s already being downloaded.)
> IMO, the best solution is to *never* generate nars on Hydra in response
> to client requests, but rather to have the build slaves pack and
> compress the nars, copy them to Hydra, and then serve them as static
> files using nginx.
The problem is that we want nars to be signed by the master node. Or, if we don’t require that, we need a PKI that allows us to express the fact that hydra.gnu.org delegates to the build machines.
> A far inferior solution, but possibly acceptable and closer to the
> current approach, would be to arrange for all concurrent responses for
> the same nar to be sent incrementally from a single nar-packing process.
> More concretely, while packing and sending a nar response to the first
> client, the data would also be written to a file. Subsequent requests
> for the same nar would be serviced using the equivalent of:
>
>   tail --bytes=+0 --follow FILENAME
>
> This way, no one would have to wait an hour to receive the first byte.
Yes. I would think that NGINX does something like that for its caching, but I don’t know exactly when/how.
Other solutions I’ve thought about:
  1. Produce narinfos and nars periodically rather than on-demand and
     serve them as static files.

     pros: better HTTP latency and bandwidth
     pros: allows us to add a Content-Length for nars
     cons: doesn’t reduce load on hydra.gnu.org
     cons: introduces arbitrary delays in delivering nars
     cons: difficult/expensive to know what new store items are available

  2. Produce a narinfo and corresponding nar the first time they are
     requested. So, the first time we receive “GET foo.narinfo”, return
     404 and spawn a thread to compute foo.narinfo and foo.nar. Return
     200 only when both are ready.

     The precomputed nar{,info}s would be kept in a cache and we could
     make sure a narinfo and its nar have the same lifetime, which
     addresses one of the problems we have.

     pros: better HTTP latency and bandwidth
     pros: allows us to add a Content-Length for nars
     pros: helps keep narinfo/nar lifetime in sync
     cons: doesn’t reduce load on hydra.gnu.org
     cons: exposes inconsistency between the store contents and the HTTP
           response (you may get 404 even if the thing is actually in
           store), but maybe that’s not a problem
Thoughts?
Ludo’.
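The second option in the message above ("404 now, 200 once baked") can be sketched in shell. This is only an illustration of the control flow, not how ‘guix publish’ is implemented: the cache directory layout, the `bake` and `get_narinfo` function names, and the lock-directory trick are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of "bake on first request"; names and layout are hypothetical.
cache=$(mktemp -d)

# Stand-in for computing a narinfo/nar pair. A real server would pack and
# compress the store item here.
bake() {
  item=$1
  sleep 0.2
  printf 'nar for %s\n' "$item" > "$cache/$item.nar.tmp"
  printf 'narinfo for %s\n' "$item" > "$cache/$item.narinfo.tmp"
  # Publish the nar first, then the narinfo, so that a visible narinfo
  # always has its nar next to it.
  mv "$cache/$item.nar.tmp" "$cache/$item.nar"
  mv "$cache/$item.narinfo.tmp" "$cache/$item.narinfo"
}

# Answer "GET ITEM.narinfo" with a status code, baking in the background
# on a cache miss.
get_narinfo() {
  item=$1
  if [ -f "$cache/$item.narinfo" ]; then
    echo 200
  else
    # mkdir is atomic, so only the first miss starts a worker.
    if mkdir "$cache/$item.lock" 2>/dev/null; then
      bake "$item" > /dev/null 2>&1 &
    fi
    echo 404
  fi
}

first=$(get_narinfo foo)     # cache miss: a worker is started
second=$(get_narinfo foo)    # asked again while the worker is running
sleep 1
third=$(get_narinfo foo)     # asked again after the worker has finished
echo "$first $second $third"
```

Because the narinfo only becomes visible after its nar is in place, a client that sees a 200 for the narinfo should also find the nar, which is the lifetime-synchronisation property listed among the pros.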
Tobias Geerinckx-Rice wrote on 26 Mar 2017 19:35
(address . mhw@netris.org)
1988d01c-1e67-bf47-2b43-cf3551d0651b@tobias.gr
Mark,
On 24/03/17 09:12, Mark H Weaver wrote:
> IIUC, with "proxy_cache_lock on", we have two choices of how other
> client requests will be treated:
>
> [badly, ed.]
Eh. You're probably (and disappointingly) right.
When configuring my little cache, I had a clear idea of how such a cache should work (basically, your last scenario below), then looked at the nginx documentation to find what I had in mind. ‘proxy_cache_lock’ matched.
I should have been more pessimistic and done more testing. Shame on me, &c. Too many other things on my mind. :-/
> Or at least that's what I'd expect based on my reading of the nginx docs
> linked above. I haven't tried it.
I can try to do some simple tests tomorrow.
> IMO, the best solution is to *never* generate nars on Hydra in response
> to client requests, but rather to have the build slaves pack and
> compress the nars, copy them to Hydra, and then serve them as static
> files using nginx.
A true mirror at last! Do we have the disc space for that?
And could Hydra actually handle compressing *everything*, without an infinitely growing back-log? I don't have access to any statistics, but I'm guessing that a fair number of package+versions are never actually requested, and hence never compressed. This would change that.
> A far inferior solution, but possibly acceptable and closer to the
> current approach, would be to arrange for all concurrent responses for
> the same nar to be sent incrementally from a single nar-packing process.
> More concretely, while packing and sending a nar response to the first
> client, the data would also be written to a file. Subsequent requests
> for the same nar would be serviced using the equivalent of:
>
>   tail --bytes=+0 --follow FILENAME
>
> This way, no one would have to wait an hour to receive the first byte.
^ This is so obviously the right solution, that it would be disappointing if nginx really couldn't be made to do it. It already buffers proxy responses to a temporary file anyway...
Kind regards,
T G-R
Ludovic Courtès wrote on 27 Mar 2017 13:20
Bandwidth when retrieving substitutes
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
8760ivm0dx.fsf_-_@gnu.org
Hi there!
ludo@gnu.org (Ludovic Courtès) skribis:
> ‘guix publish’ should be faster and less resource-hungry than Hydra. It
> uses in-process gzip for nar compression instead of bzip2 (I chose level
> 7, which seems to provide compression ratios close to what bzip2
> provides with its default compression level, while being 3 times
> faster). Unlike Hydra it never forks so for instance, 404 responses for
> .narinfo URLs should be quicker. Hopefully, that will improve the
> worst-case (cache miss) throughput.
Another interesting data point on the client side this time:
$ wget -O- https://mirror.hydra.gnu.org/nar/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17 |bunzip2 >/dev/null
--2017-03-27 13:12:50-- https://mirror.hydra.gnu.org/nar/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17
Resolving mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
Connecting to mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-nix-archive]
Saving to: ‘STDOUT’

- [ <=> ] 53.01M 9.29MB/s in 5.5s

2017-03-27 13:12:55 (9.57 MB/s) - written to stdout [55582050]

$ wget -O- https://mirror.hydra.gnu.org/guix/nar/gzip/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17 |gunzip >/dev/null
--2017-03-27 13:13:00-- https://mirror.hydra.gnu.org/guix/nar/gzip/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17
Resolving mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
Connecting to mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-nix-archive]
Saving to: ‘STDOUT’

- [ <=> ] 59.19M 40.8MB/s in 1.4s

2017-03-27 13:13:02 (40.8 MB/s) - written to stdout [62068901]

$ wget -O- https://mirror.hydra.gnu.org/guix/nar/gzip/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17 >/dev/null
--2017-03-27 13:15:58-- https://mirror.hydra.gnu.org/guix/nar/gzip/v6rq6j9wdx8ixsks05dxhxr26jgmr6z3-mysql-5.7.17
Resolving mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
Connecting to mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-nix-archive]
Saving to: ‘STDOUT’

- [ <=> ] 59.19M 42.5MB/s in 1.4s

2017-03-27 13:16:00 (42.5 MB/s) - written to stdout [62068901]
40 MB/s vs. 10 MB/s! (Both items were cached on mirror.hydra.gnu.org.)
IOW, bunzip2 was the bottleneck when retrieving substitutes (and that’s on an i7.) With ‘perf timechart’ we see that bunzip2 is indeed busy all the time right from the start.
Ludo’.
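To take the network out of the picture, the decompression cost alone can be compared locally. The sketch below is an assumption-laden illustration rather than a reproduction of the measurement above: the payload is synthetic text generated with `seq` (not a real nar), timings rely on GNU date's `%N`, and the bzip2 half is skipped when bzip2 is not installed. Absolute numbers will differ from machine to machine.

```shell
#!/bin/sh
# Compare decompression time of gzip and bzip2 on the same local payload.
workdir=$(mktemp -d)
payload="$workdir/payload"

# Synthetic, compressible data standing in for a nar (made-up content).
seq 1 2000000 > "$payload"

# Run a command with stdout discarded and print the elapsed milliseconds
# (relies on GNU date's %N for nanoseconds).
elapsed_ms() {
  start=$(date +%s%N)
  "$@" > /dev/null
  end=$(date +%s%N)
  echo $(( ($end - $start) / 1000000 ))
}

gzip -7 -c "$payload" > "$payload.gz"      # level 7, as mentioned above
gz_ms=$(elapsed_ms gunzip -c "$payload.gz")
echo "gunzip:  ${gz_ms} ms"

if command -v bzip2 > /dev/null 2>&1; then
  bzip2 -c "$payload" > "$payload.bz2"     # bzip2's default level
  bz_ms=$(elapsed_ms bunzip2 -c "$payload.bz2")
  echo "bunzip2: ${bz_ms} ms"
else
  echo "bzip2 not installed; skipping the comparison"
fi
```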
Tobias Geerinckx-Rice wrote on 27 Mar 2017 20:47
Re: bug#26201: hydra.gnu.org uses ‘guix publish ’ for nars and narinfos
(address . 26201@debbugs.gnu.org)(address . ludo@gnu.org)
bad0ed66-6c44-7147-fc3d-01622cf6c62f@tobias.gr
Guix,
On 26/03/17 19:35, Tobias Geerinckx-Rice wrote:
> I can try to do some simple tests tomorrow.
Two observations:
- ‘proxy_cache_lock_timeout’ alone won't suffice to serialise requests; ‘proxy_cache_lock_age’ must also be set to an equally ridiculously long span. Otherwise, multiple requests will still be sent to ‘guix publish’ if they are more than 5s apart. Bleh.
(The problem then becomes that clients will stall while the file is being cached, as explained by Mark. curl patiently waited.)
- Say client A requests a nar from ‘guix publish’ (no nginx involved). If another client requests the same nar while A's still downloading, ‘guix publish’ will... silently drop A's connection? I was not expecting this.
Kind regards,
T G-R
Ludovic Courtès wrote on 28 Mar 2017 16:47
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 26201@debbugs.gnu.org)
87wpb931cd.fsf@gnu.org
Hey!
Tobias Geerinckx-Rice <me@tobias.gr> skribis:
> On 26/03/17 19:35, Tobias Geerinckx-Rice wrote:
>> I can try to do some simple tests tomorrow.
>
> Two observations:
>
> - ‘proxy_cache_lock_timeout’ alone won't suffice to serialise requests;
>   ‘proxy_cache_lock_age’ must also be set to an equally ridiculously
>   long span. Otherwise, multiple requests will still be sent to ‘guix
>   publish’ if they are more than 5s apart. Bleh.
>
>   (The problem then becomes that clients will stall while the file is
>   being cached, as explained by Mark. curl patiently waited.)
Setting ‘proxy_cache_lock_timeout’ to 5s is reasonable I think: if you’re unlucky, you wait for 5 seconds, and then we get ‘guix publish’ threads serving the same request in parallel; in the most common case, there’s only ever one instance of a given request being served at a given time.
> - Say client A requests a nar from ‘guix publish’ (no nginx involved).
>   If another client requests the same nar while A's still downloading,
>   ‘guix publish’ will... silently drop A's connection?
>   I was not expecting this.
That would be a bug. Do you have an easy way to reproduce?
Thanks,
Ludo’.
Ludovic Courtès wrote on 8 Apr 2017 23:17
control message for bug #26201
(address . control@debbugs.gnu.org)
87wpauoayn.fsf@gnu.org
retitle 26201 Downloading substitutes is too slow upon nginx cache misses
Ludovic Courtès wrote on 8 Apr 2017 23:18
(address . control@debbugs.gnu.org)
87vaqeoayc.fsf@gnu.org
severity 26201 important
Ludovic Courtès wrote on 17 Apr 2017 23:36
Re: bug#26201: hydra.gnu.org uses ‘guix publish ’ for nars and narinfos
(name . Mark H Weaver)(address . mhw@netris.org)
87inm2ogxl.fsf@gnu.org
Hello,
ludo@gnu.org (Ludovic Courtès) skribis:
> Other solutions I’ve thought about:
>
>   1. Produce narinfos and nars periodically rather than on-demand and
>      serve them as static files.
>
>      pros: better HTTP latency and bandwidth
>      pros: allows us to add a Content-Length for nars
>      cons: doesn’t reduce load on hydra.gnu.org
>      cons: introduces arbitrary delays in delivering nars
>      cons: difficult/expensive to know what new store items are available
>
>   2. Produce a narinfo and corresponding nar the first time they are
>      requested. So, the first time we receive “GET foo.narinfo”, return
>      404 and spawn a thread to compute foo.narinfo and foo.nar. Return
>      200 only when both are ready.
>
>      The precomputed nar{,info}s would be kept in a cache and we could
>      make sure a narinfo and its nar have the same lifetime, which
>      addresses one of the problems we have.
>
>      pros: better HTTP latency and bandwidth
>      pros: allows us to add a Content-Length for nars
>      pros: helps keep narinfo/nar lifetime in sync
>      cons: doesn’t reduce load on hydra.gnu.org
>      cons: exposes inconsistency between the store contents and the HTTP
>            response (you may get 404 even if the thing is actually in
>            store), but maybe that’s not a problem
The ‘wip-publish-baking’ branch implements #2 as a new option to ‘guix publish’. It gives some control over the upper bound on CPU usage, since we can specify how many worker threads are used.
I’ll finish it soon so we can experiment with it.
Thanks,
Ludo’.
Ludovic Courtès wrote on 18 Apr 2017 23:27
(name . Mark H Weaver)(address . mhw@netris.org)
87o9vts8xb.fsf@gnu.org
ludo@gnu.org (Ludovic Courtès) skribis:
>   2. Produce a narinfo and corresponding nar the first time they are
>      requested. So, the first time we receive “GET foo.narinfo”, return
>      404 and spawn a thread to compute foo.narinfo and foo.nar. Return
>      200 only when both are ready.
>
>      The precomputed nar{,info}s would be kept in a cache and we could
>      make sure a narinfo and its nar have the same lifetime, which
>      addresses one of the problems we have.
>
>      pros: better HTTP latency and bandwidth
>      pros: allows us to add a Content-Length for nars
>      pros: helps keep narinfo/nar lifetime in sync
>      cons: doesn’t reduce load on hydra.gnu.org
>      cons: exposes inconsistency between the store contents and the HTTP
>            response (you may get 404 even if the thing is actually in
>            store), but maybe that’s not a problem
Implemented in commit 00753f7038234a0f5a79be3ec9ab949840a18743.
I’ll set up a test instance shortly.
Ludo’.
Ludovic Courtès wrote on 19 Apr 2017 16:24
Heads-up: hydra.gnu.org uses ‘guix publish --cache’
(name . Mark H Weaver)(address . mhw@netris.org)
87vaq0o4pd.fsf_-_@gnu.org
ludo@gnu.org (Ludovic Courtès) skribis:
> ludo@gnu.org (Ludovic Courtès) skribis:
>
>>   2. Produce a narinfo and corresponding nar the first time they are
>>      requested. So, the first time we receive “GET foo.narinfo”, return
>>      404 and spawn a thread to compute foo.narinfo and foo.nar. Return
>>      200 only when both are ready.
>>
>>      The precomputed nar{,info}s would be kept in a cache and we could
>>      make sure a narinfo and its nar have the same lifetime, which
>>      addresses one of the problems we have.
>>
>>      pros: better HTTP latency and bandwidth
>>      pros: allows us to add a Content-Length for nars
>>      pros: helps keep narinfo/nar lifetime in sync
>>      cons: doesn’t reduce load on hydra.gnu.org
>>      cons: exposes inconsistency between the store contents and the HTTP
>>            response (you may get 404 even if the thing is actually in
>>            store), but maybe that’s not a problem
>
> Implemented in commit 00753f7038234a0f5a79be3ec9ab949840a18743.
>
> I’ll set up a test instance shortly.
I ended up deploying it on hydra.gnu.org directly. :-)
Progressively, the cached nars/narinfos at {,mirror.}hydra.gnu.org will be replaced with the new ones. Now that the /guix/nar URLs have a ‘Content-Length’ header, you should see a progress bar when downloading one of these:
$ ./pre-inst-env guix build vim
The following file will be downloaded:
   /gnu/store/ax5cm9gr1741pcq17w7bhgss5nvq5470-vim-8.0.0566
@ substituter-started /gnu/store/ax5cm9gr1741pcq17w7bhgss5nvq5470-vim-8.0.0566 /gnu/store/rnpz1svz4aw75kibb5qb02hhccy2m4y0-guix-0.12.0-7.aabe/libexec/guix/substitute
Downloading https://mirror.hydra.gnu.org/guix/nar/gzip/ax5cm9gr1741pcq17w7bhgss5nvq5470-vim-8.0.0566 (23.4MiB installed)...
 vim-8.0.0566 7.8MiB 385KiB/s 00:21 [####################] 100.0%

@ substituter-succeeded /gnu/store/ax5cm9gr1741pcq17w7bhgss5nvq5470-vim-8.0.0566
/gnu/store/ax5cm9gr1741pcq17w7bhgss5nvq5470-vim-8.0.0566
This new caching scheme should put an end to caching of truncated nars in nginx, which has been too frequent lately.
It should also mostly avoid the problem where we have a narinfo for something but not the corresponding nar, which leads to user frustration (‘guix’ reports that the thing will be downloaded and eventually fails with 410 “Gone” while trying to download it), because ‘guix publish’ caches narinfo/nar pairs together. I say “mostly” because nginx caching in front of ‘guix publish’ makes things more complicated.
The bandwidth issue reported at the beginning of this thread should be mostly fixed: serving a narinfo or nar URL is now just sendfile(2), which is the best we can do; 404s on narinfo should be immediate.
Of course, when the machine is overloaded, we’ll still experience increased latency and lower bandwidth, but that should be less acute than with the previous setting.
Please report any problems you may have!
Ludo’.
Ludovic Courtès wrote on 25 Apr 2017 12:11
control message for bug #26201
(address . control@debbugs.gnu.org)
87h91cdcfv.fsf@gnu.org
tags 26201 fixed
close 26201
Mark H Weaver wrote on 3 May 2017 10:11
Re: bug#26201: hydra.gnu.org uses ‘guix publish ’ for nars and narinfos
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
877f1yjr64.fsf@netris.org
Reviving an old thread...
Tobias Geerinckx-Rice <me@tobias.gr> writes:
>> IMO, the best solution is to *never* generate nars on Hydra in response
>> to client requests, but rather to have the build slaves pack and
>> compress the nars, copy them to Hydra, and then serve them as static
>> files using nginx.
>
> A true mirror at last! Do we have the disc space for that?
>
> And could Hydra actually handle compressing *everything*, without an
> infinitely growing back-log? I don't have access to any statistics, but
> I'm guessing that a fair number of package+versions are never actually
> requested, and hence never compressed. This would change that.
Actually, IIUC, the build slaves are _already_ compressing everything, and they always have. They compress the build outputs for transmission back to the master machine. In the current framework, the master machine immediately decompresses them upon receipt, and this compression and decompression is considered an internal detail of the network transport.
Currently, the master machine stores all build outputs uncompressed in /gnu/store, and then later recompresses them for transmission to users and other build slaves. The needless decompression and recompression is a tremendous amount of wasted work on our master machine. That it's all stored uncompressed is also a significant waste of disk space, which leads to significant additional costs during garbage collection.
Essentially, my proposal is for the build slaves to be modified to prepare the compressed NARs in a form suitable for delivery to end users (and other build slaves) with minimal processing by our master node. The master node would be significantly modified to receive, store, and forward NARs explicitly, without ever decompressing them. As far as I can tell, this would mean strictly less work to do and less data to store for every machine and in every case.
Ludovic has pointed out that we cannot do this because Hydra must add its digital signature, and that this digital signature is stored within the compressed NAR. Therefore, we cannot avoid having the master machine decompress and recompress every NAR that is delivered to users.
In my opinion, we should change the way we sign NARs. Signatures should be external to the NARs, not internal. Not only would this allow us to decentralize production of our NARs, but more importantly, it would enable a community of independent builders to add their signatures to a common pool of NARs. Having a common pool of NARs enables us to store these NARs in a shared distribution network without duplication. We cannot even have a common pool of NARs if they contain build-farm-specific data such as signatures.
Thoughts?
Mark
Ludovic Courtès wrote on 3 May 2017 11:25
(name . Mark H Weaver)(address . mhw@netris.org)
87k25ywaul.fsf@gnu.org
Hello,
Mark H Weaver <mhw@netris.org> skribis:
> Actually, IIUC, the build slaves are _already_ compressing everything,
> and they always have. They compress the build outputs for transmission
> back to the master machine. In the current framework, the master
> machine immediately decompresses them upon receipt, and this compression
> and decompression is considered an internal detail of the network
> transport.
>
> Currently, the master machine stores all build outputs uncompressed in
> /gnu/store, and then later recompresses them for transmission to users
> and other build slaves. The needless decompression and recompression is
> a tremendous amount of wasted work on our master machine. That it's all
> stored uncompressed is also a significant waste of disk space, which
> leads to significant additional costs during garbage collection.
>
> Essentially, my proposal is for the build slaves to be modified to
> prepare the compressed NARs in a form suitable for delivery to end users
> (and other build slaves) with minimal processing by our master node.
> The master node would be significantly modified to receive, store, and
> forward NARs explicitly, without ever decompressing them. As far as I
> can tell, this would mean strictly less work to do and less data to
> store for every machine and in every case.
I agree that the redundant compression/decompression is terrible. Yet I’m not sure how to architect a solution where compression is performed by build machines. The main issue is that offloading and publication are two independent mechanisms, as things are.
Maybe, for the build-farm use case, we could have a “semi-offloading” mechanism whereby the master spawns a remote build on a build machine without retrieving its result, something akin to:
  GUIX_DAEMON_SOCKET=ssh://build-machine.example.org \
    guix build /gnu/store/…-foo.drv
In addition, the build machine would publish its result via ‘guix publish’, which the master could then simply mirror and cache with nginx.
There’s the issue of signatures, but perhaps we could have a more sophisticated PKI and have the master delegate to build machines…
Then there are other issues such as that of synchronizing the TTL of a narinfo and its corresponding nar, which --cache addresses.
Tricky!
> Ludovic has pointed out that we cannot do this because Hydra must add
> its digital signature, and that this digital signature is stored within
> the compressed NAR. Therefore, we cannot avoid having the master
> machine decompress and recompress every NAR that is delivered to users.
>
> In my opinion, we should change the way we sign NARs. Signatures should
> be external to the NARs, not internal. Not only would this allow us to
> decentralize production of our NARs, but more importantly, it would
> enable a community of independent builders to add their signatures to a
> common pool of NARs. Having a common pool of NARs enables us to store
> these NARs in a shared distribution network without duplication. We
> cannot even have a common pool of NARs if they contain
> build-farm-specific data such as signatures.
Currently the signature is in the narinfos, not in nars proper¹. So we can already add signatures on an externally provided nar, for instance.
There’s a silly limitation currently, which is that the signature is computed over all the fields of the narinfo. That’s silly because it means that if you change, say, the compression format or the URL of the nar, then the signature becomes invalid. We should fix that at some point.
Ludo’.
¹ For ‘guix publish’. ‘guix archive --export’ appends a signature to the nar set.
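The limitation described in the message above can be illustrated with a small shell sketch. It is not the real signing code: `sha256sum` merely stands in for a signature, the narinfo values (store path, hash, size) are invented, and the choice of which fields count as "content" is an assumption made for the example.

```shell
#!/bin/sh
# Illustration only: sha256sum stands in for a real signature, and the
# field values below are invented.
narinfo() {
  # $1 = URL field, $2 = Compression field
  printf 'StorePath: /gnu/store/example-foo-1.0\n'
  printf 'URL: %s\n' "$1"
  printf 'Compression: %s\n' "$2"
  printf 'NarHash: sha256:0000\n'
  printf 'NarSize: 1234\n'
}

digest() { sha256sum | cut -d' ' -f1; }

# Keep only the fields that describe the store item itself
# (a hypothetical selection).
content_fields() { grep -E '^(StorePath|NarHash|NarSize):'; }

# Digest over every field: changes when only the transport details change.
all_a=$(narinfo nar/example-foo-1.0 bzip2 | digest)
all_b=$(narinfo nar/gzip/example-foo-1.0 gzip | digest)

# Digest over the content fields only: unaffected by URL or compression.
sub_a=$(narinfo nar/example-foo-1.0 bzip2 | content_fields | digest)
sub_b=$(narinfo nar/gzip/example-foo-1.0 gzip | content_fields | digest)

[ "$all_a" != "$all_b" ] && echo "all-fields digest changed"
[ "$sub_a" = "$sub_b" ] && echo "content-only digest unchanged"
```

Signing only the fields that identify the store item would let the same signature survive a change of compression format or nar URL.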
This issue is archived.

To comment on this conversation send email to 26201@debbugs.gnu.org