[PATCH 0/1] 'guix publish --cache' can publish items not yet cached

Status: Done
Participants: Ludovic Courtès, Miguel Ángel Arruga Vivas
Owner: unassigned
Submitted by: Ludovic Courtès
Severity: normal

Ludovic Courtès wrote on 24 Oct 2020 16:49
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201024144929.4529-1-ludo@gnu.org
Hello!

The ‘--cache’ mode of ‘guix publish’ is nice for many reasons. Until
now though, it would only return 200 to narinfo and nar requests
corresponding to items already in cache—already “baked”. Thus, the
first narinfo request for an item would always return 404; one would
have to wait until the item is baked to get 200 and download the
substitute.
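
Concretely, the exchange with the server used to go something like this
(host name and hash part below are made up):

  $ curl -s -o /dev/null -w '%{http_code}\n' \
       https://substitutes.example.org/<hash-part>.narinfo
  404

  # … later, once the archive has been baked …

  $ curl -s -o /dev/null -w '%{http_code}\n' \
       https://substitutes.example.org/<hash-part>.narinfo
  200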

If you’re fetching substitutes for popular packages on a popular
instance, the extra delay is most likely invisible. However, if you’re
using an instance with few users, or if you’re interested in
substitutes for a package that’s not popular, the behavior described
above is a showstopper. (Many people here have criticized it in the
past. :-))

This patch changes the behavior of ‘--cache’: if a store item is not
yet in cache, and if it’s “small enough”, then narinfo/nar requests for
it immediately return 200, as in the no-cache mode. In exchange for
possibly increased server resource usage, you get a reduced publication delay.

To put an upper bound on the extra resource usage, the new
‘--cache-bypass-threshold’ option lets users control the maximum size
of a store item that can receive this treatment.
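
For example, one might run something along these lines (port, cache
directory, and threshold are arbitrary):

  guix publish --port=8080 -C gzip \
       --cache=/var/cache/guix/publish \
       --cache-bypass-threshold=50M

With that, narinfo and nar requests for store items whose nar is below
50 MiB get an immediate 200 even when the item is not yet cached, while
bigger items keep the current 404-until-baked behavior.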


An interesting use case that can benefit from this change is .drv
substitution: .drv files themselves can be substituted, so you can run:

guix build /gnu/store/….drv

and the daemon will fetch the .drv and its closure and start
building it (since commit 9c9982dc0c8c38ce3821b154b7e92509c1564317).
It’s not something we use much so far, but maybe we could put it
to good use in the future (I know Chris has been playing with it
in the context of the Data Service).

Thoughts?

Ludo’.

Ludovic Courtès (1):
publish: Add '--cache-bypass-threshold'.

doc/guix.texi | 24 +++++++++++-
guix/scripts/publish.scm | 85 ++++++++++++++++++++++++++++++++--------
tests/publish.scm | 43 ++++++++++++++++++--
3 files changed, 130 insertions(+), 22 deletions(-)

--
2.28.0
Ludovic Courtès wrote on 24 Oct 2020 16:54
[PATCH 1/1] publish: Add '--cache-bypass-threshold'.
(address . 44193@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20201024145416.4691-1-ludo@gnu.org
* guix/scripts/publish.scm (show-help, %options): Add
'--cache-bypass-threshold'.
(low-compression): New procedure.
(cache-bypass-threshold): New parameter.
(bypass-cache?): New procedure.
(render-narinfo/cached): Call 'render-narinfo' when 'bypass-cache?'
returns true.
(render-nar/cached): Call 'render-nar' when 'bypass-cache?' returns
true.
(guix-publish): Parameterize 'cache-bypass-threshold'.
* tests/publish.scm ("with cache", "with cache, lzip + gzip")
("with cache, uncompressed"): Pass '--cache-bypass-threshold=0'.
("with cache, vanishing item"): Expect 200 for RESPONSE.
("with cache, cache bypass"): New test.
---
doc/guix.texi | 24 +++++++++++-
guix/scripts/publish.scm | 85 ++++++++++++++++++++++++++++++++--------
tests/publish.scm | 43 ++++++++++++++++++--
3 files changed, 130 insertions(+), 22 deletions(-)

diff --git a/doc/guix.texi b/doc/guix.texi
index b5061877e2..633c974562 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -11977,13 +11977,20 @@ in advance, so @command{guix publish} does not add a
prevents clients from knowing the amount of data being downloaded.
Conversely, when @option{--cache} is used, the first request for a store
-item (@i{via} a @code{.narinfo} URL) returns 404 and triggers a
+item (@i{via} a @code{.narinfo} URL) triggers a
background process to @dfn{bake} the archive---computing its
@code{.narinfo} and compressing the archive, if needed. Once the
archive is cached in @var{directory}, subsequent requests succeed and
are served directly from the cache, which guarantees that clients get
the best possible bandwidth.
+That first @code{.narinfo} request nonetheless returns 200, provided the
+requested store item is ``small enough'', below the cache bypass
+threshold---see @option{--cache-bypass-threshold} below. That way,
+clients do not have to wait until the archive is baked. For larger
+store items, the first @code{.narinfo} request returns 404, meaning that
+clients have to wait until the archive is baked.
+
The ``baking'' process is performed by worker threads. By default, one
thread per CPU core is created, but this can be customized. See
@option{--workers} below.
@@ -12009,6 +12016,21 @@ Additionally, when @option{--cache} is used, cached entries that have
not been accessed for @var{ttl} and that no longer have a corresponding
item in the store, may be deleted.
+@item --cache-bypass-threshold=@var{size}
+When used in conjunction with @option{--cache}, store items smaller than
+@var{size} are immediately available, even when they are not yet in
+cache. @var{size} is a size in bytes, or it can be prefixed by @code{M}
+for megabytes and so on. The default is @code{10M}.
+
+``Cache bypass'' allows you to reduce the publication delay for clients
+at the expense of possibly additional I/O and CPU use on the server
+side: depending on the client access patterns, those store items can end
+up being baked several times until a copy is available in cache.
+
+Increasing the threshold may be useful for sites that have few users, or
+to guarantee that users get substitutes even for store items that are
+not popular.
+
@item --nar-path=@var{path}
Use @var{path} as the prefix for the URLs of ``nar'' files
(@pxref{Invoking guix archive, normalized archives}).
diff --git a/guix/scripts/publish.scm b/guix/scripts/publish.scm
index 4eaf961ab2..c0150c74da 100644
--- a/guix/scripts/publish.scm
+++ b/guix/scripts/publish.scm
@@ -81,6 +81,9 @@ Publish ~a over HTTP.\n") %store-directory)
compress archives with METHOD at LEVEL"))
(display (G_ "
-c, --cache=DIRECTORY cache published items to DIRECTORY"))
+ (display (G_ "
+ --cache-bypass-threshold=SIZE
+ serve store items below SIZE even when not cached"))
(display (G_ "
--workers=N use N workers to bake items"))
(display (G_ "
@@ -134,6 +137,12 @@ if ITEM is already compressed."
(list %no-compression)
requested))
+(define (low-compression c)
+ "Return <compression> of the same type as C, but optimized for low CPU
+usage."
+ (compression (compression-type c)
+ (min (compression-level c) 2)))
+
(define %options
(list (option '(#\h "help") #f #f
(lambda _
@@ -184,6 +193,10 @@ if ITEM is already compressed."
(option '(#\c "cache") #t #f
(lambda (opt name arg result)
(alist-cons 'cache arg result)))
+ (option '("cache-bypass-threshold") #t #f
+ (lambda (opt name arg result)
+ (alist-cons 'cache-bypass-threshold (size->number arg)
+ result)))
(option '("workers") #t #f
(lambda (opt name arg result)
(alist-cons 'workers (string->number* arg)
@@ -434,7 +447,7 @@ items. Failing that, we could eventually have to recompute them and return
(expiration-time file))))))
(define (hash-part->path* store hash cache)
- "Like 'hash-part->path' but cached results under CACHE. This ensures we can
+ "Like 'hash-part->path' but cache results under CACHE. This ensures we can
still map HASH to the corresponding store file name, even if said store item
vanished from the store in the meantime."
(let ((cached (hash-part-mapping-cache-file cache hash)))
@@ -454,6 +467,18 @@ vanished from the store in the meantime."
result))
(apply throw args))))))
+(define cache-bypass-threshold
+ ;; Maximum size of a store item that may be served by the '/cached' handlers
+ ;; below even when not in cache.
+ (make-parameter (* 10 (expt 2 20))))
+
+(define (bypass-cache? store item)
+ "Return true if we allow ITEM to be downloaded before it is cached. ITEM is
+interpreted as the basename of a store item."
+ (guard (c ((store-error? c) #f))
+ (< (path-info-nar-size (query-path-info store item))
+ (cache-bypass-threshold))))
+
(define* (render-narinfo/cached store request hash
#:key ttl (compressions (list %no-compression))
(nar-path "nar")
@@ -513,9 +538,20 @@ requested using POOL."
(nar-expiration-time ttl)
#:delete-entry delete-entry
#:cleanup-period ttl))))
- (not-found request
- #:phrase "We're baking it"
- #:ttl 300)) ;should be available within 5m
+
+ ;; If ITEM passes 'bypass-cache?', render a temporary narinfo right
+ ;; away, with a short TTL. The narinfo is temporary because it
+ ;; lacks 'FileSize', for instance, which the cached narinfo will
+ ;; have. Chances are that the nar will be baked by the time the
+ ;; client asks for it.
+ (if (bypass-cache? store item)
+ (render-narinfo store request hash
+ #:ttl 300 ;temporary
+ #:nar-path nar-path
+ #:compressions compressions)
+ (not-found request
+ #:phrase "We're baking it"
+ #:ttl 300))) ;should be available within 5m
(else
(not-found request #:phrase "")))))
@@ -627,19 +663,31 @@ return it; otherwise, return 404. When TTL is true, use it as the
'Cache-Control' expiration time."
(let ((cached (nar-cache-file cache store-item
#:compression compression)))
- (if (file-exists? cached)
- (values `((content-type . (application/octet-stream
- (charset . "ISO-8859-1")))
- ,@(if ttl
- `((cache-control (max-age . ,ttl)))
- '())
+ (cond ((file-exists? cached)
+ (values `((content-type . (application/octet-stream
+ (charset . "ISO-8859-1")))
+ ,@(if ttl
+ `((cache-control (max-age . ,ttl)))
+ '())
- ;; XXX: We're not returning the actual contents, deferring
- ;; instead to 'http-write'. This is a hack to work around
- ;; <http://bugs.gnu.org/21093>.
- (x-raw-file . ,cached))
- #f)
- (not-found request))))
+ ;; XXX: We're not returning the actual contents, deferring
+ ;; instead to 'http-write'. This is a hack to work around
+ ;; <http://bugs.gnu.org/21093>.
+ (x-raw-file . ,cached))
+ #f))
+ ((let* ((hash (and=> (string-index store-item #\-)
+ (cut string-take store-item <>)))
+ (item (and hash
+ (guard (c ((store-error? c) #f))
+ (hash-part->path store hash)))))
+ (and item (bypass-cache? store item)))
+ ;; Render STORE-ITEM live. We reach this because STORE-ITEM is
+ ;; being baked but clients are already asking for it. Thus, we're
+ ;; duplicating work, but doing so allows us to reduce delays.
+ (render-nar store request store-item
+ #:compression (low-compression compression)))
+ (else
+ (not-found request)))))
(define (render-content-addressed-file store request
name algo hash)
@@ -1061,7 +1109,10 @@ methods, return the applicable compression."
consider using the '--user' option!~%")))
(parameterize ((%public-key public-key)
- (%private-key private-key))
+ (%private-key private-key)
+ (cache-bypass-threshold
+ (or (assoc-ref opts 'cache-bypass-threshold)
+ (cache-bypass-threshold))))
(info (G_ "publishing ~a on ~a, port ~d~%")
%store-directory
(inet-ntop (sockaddr:fam address) (sockaddr:addr address))
diff --git a/tests/publish.scm b/tests/publish.scm
index 1c3b2785fb..f081d016d3 100644
--- a/tests/publish.scm
+++ b/tests/publish.scm
@@ -412,7 +412,8 @@ References: ~%"
(call-with-new-thread
(lambda ()
(guix-publish "--port=6797" "-C2"
- (string-append "--cache=" cache)))))))
+ (string-append "--cache=" cache)
+ "--cache-bypass-threshold=0"))))))
(wait-until-ready 6797)
(let* ((base "http://localhost:6797/")
(part (store-path-hash-part %item))
@@ -461,7 +462,8 @@ References: ~%"
(call-with-new-thread
(lambda ()
(guix-publish "--port=6794" "-Cgzip:2" "-Clzip:2"
- (string-append "--cache=" cache)))))))
+ (string-append "--cache=" cache)
+ "--cache-bypass-threshold=0"))))))
(wait-until-ready 6794)
(let* ((base "http://localhost:6794/")
(part (store-path-hash-part %item))
@@ -516,7 +518,8 @@ References: ~%"
(call-with-new-thread
(lambda ()
(guix-publish "--port=6796" "-C2" "--ttl=42h"
- (string-append "--cache=" cache)))))))
+ (string-append "--cache=" cache)
+ "--cache-bypass-threshold=0"))))))
(wait-until-ready 6796)
(let* ((base "http://localhost:6796/")
(part (store-path-hash-part item))
@@ -580,12 +583,44 @@ References: ~%"
(basename item)
".narinfo"))
(response (http-get url)))
- (and (= 404 (response-code response))
+ (and (= 200 (response-code response)) ;we're below the threshold
(wait-for-file cached)
(begin
(delete-paths %store (list item))
(response-code (pk 'response (http-get url))))))))))
+(test-equal "with cache, cache bypass"
+ 200
+ (call-with-temporary-directory
+ (lambda (cache)
+ (let ((thread (with-separate-output-ports
+ (call-with-new-thread
+ (lambda ()
+ (guix-publish "--port=6788" "-C" "gzip"
+ (string-append "--cache=" cache)))))))
+ (wait-until-ready 6788)
+
+ (let* ((base "http://localhost:6788/")
+ (item (add-text-to-store %store "random" (random-text)))
+ (part (store-path-hash-part item))
+ (narinfo (string-append base part ".narinfo"))
+ (nar (string-append base "nar/gzip/" (basename item)))
+ (cached (string-append cache "/gzip/" (basename item)
+ ".narinfo")))
+ ;; We're below the default cache bypass threshold, so NAR and NARINFO
+ ;; should immediately return 200. The NARINFO request should trigger
+ ;; caching, and the next request to NAR should return 200 as well.
+ (and (let ((response (pk 'r1 (http-get nar))))
+ (and (= 200 (response-code response))
+ (not (response-content-length response)))) ;not known
+ (= 200 (response-code (http-get narinfo)))
+ (begin
+ (wait-for-file cached)
+ (let ((response (pk 'r2 (http-get nar))))
+ (and (> (response-content-length response)
+ (stat:size (stat item)))
+ (response-code response))))))))))
+
(test-equal "/log/NAME"
`(200 #t application/x-bzip2)
(let ((drv (run-with-store %store
--
2.28.0
Miguel Ángel Arruga Vivas wrote on 25 Oct 2020 14:11
Re: [bug#44193] [PATCH 0/1] 'guix publish --cache' can publish items not yet cached
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44193@debbugs.gnu.org)
87v9ey5u2e.fsf@gmail.com
Hi!

Just one general comment about this issue:

Ludovic Courtès <ludo@gnu.org> writes:
> Thus, the first narinfo request for an item would always return 404;
> one would have to wait until the item is baked to get 200 and download
> the substitute.

I'd argue that returning the 404 unconditionally is a problem. If the
nar is being baked, I guess that a 202[1] would be the appropriate
answer, and I'd leave the 404 for invalid store paths[2]. This way the
client could implement more policies: the classic timeout, but also, for
example, it might check other servers before asking this one again if
nobody else has it, or simply wait until a 404 is returned. WDYT?
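
To make it concrete, such a client-side policy could be sketched along
these lines (purely illustrative; 'narinfo-code' is a made-up helper
that performs the narinfo request and returns its HTTP status code):

-------------------------------8<-----------------------------
(use-modules (ice-9 match))

;; Sketch only: return the first server that already has the item,
;; 'retry-later if at least one server is still baking it, #f otherwise.
(define (pick-substitute-server urls hash-part)
  (let loop ((urls urls) (baking? #f))
    (match urls
      (()
       (if baking? 'retry-later #f))
      ((url . rest)
       (match (narinfo-code url hash-part)  ;hypothetical helper
         (200 url)                          ;substitute available here
         (202 (loop rest #t))               ;still being baked
         (_   (loop rest baking?)))))))     ;404 or error: skip
------------------------------->8-----------------------------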

Happy hacking!
Miguel

[1] Section 10.2.3 from https://www.ietf.org/rfc/rfc2616.txt
[2] I understand that this isn't a bad use of the 404 at all, since the
    RFC explicitly says that the condition may be temporary. On the
    other hand, I don't see how a rogue client could use that extra
    information in any way worse than what it can already do, as the
    server process performs more or less the same computation in both
    cases.
Ludovic Courtès wrote on 25 Oct 2020 17:49
(name . Miguel Ángel Arruga Vivas)(address . rosen644835@gmail.com)(address . 44193@debbugs.gnu.org)
87pn56dzdp.fsf@gnu.org
Hi!

Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>> Thus, the first narinfo request for an item would always return 404;
>> one would have to wait until the item is baked to get 200 and download
>> the substitute.
>
> I'd argue that returning the 404 unconditionally is a problem. If the
> nar is being baked, I guess that a 202[1] would be the appropriate
> answer, and I'd leave the 404 for invalid store paths[2]. This way the
> client could implement more policies: the classic timeout, but also, for
> example, it might check other servers before asking this one again if
> nobody else has it, or simply wait until a 404 is returned. WDYT?

Indeed, 202 seems more appropriate (and it’s precisely half of 404, that
tells something!).

Unfortunately (guix scripts substitute) currently explicitly checks for
404 and 200 and considers anything else to be a transient error with a
default TTL (in ‘handle-narinfo-response’). So we would need to adapt
that first and then wait until some time has passed before ‘guix
publish’ can return 202. :-/

I guess we can change (guix scripts substitute) with that in mind
already. WDYT?

Ludo’.
Miguel Ángel Arruga Vivas wrote on 25 Oct 2020 18:30
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44193@debbugs.gnu.org)
874kmiqkla.fsf@gmail.com
Hi, Ludo!

Ludovic Courtès <ludo@gnu.org> writes:
> Hi!
>
> Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>> Thus, the first narinfo request for an item would always return 404;
>>> one would have to wait until the item is baked to get 200 and download
>>> the substitute.
>>
>> I'd argue that returning the 404 unconditionally is a problem. If the
>> nar is being baked, I guess that a 202[1] would be the appropriate
>> answer, and I'd leave the 404 for invalid store paths[2]. This way the
>> client could implement more policies: the classic timeout, but also, for
>> example, it might check other servers before asking this one again if
>> nobody else has it, or simply wait until a 404 is returned. WDYT?
>
> Indeed, 202 seems more appropriate (and it’s precisely half of 404, that
> tells something!).

:-)

> Unfortunately (guix scripts substitute) currently explicitly checks for
> 404 and 200 and considers anything else to be a transient error with a
> default TTL (in ‘handle-narinfo-response’). So we would need to adapt
> that first and then wait until some time has passed before ‘guix
> publish’ can return 202. :-/

I see: it uses 'max-age from the HTTP response only when it's a 404.
Nonetheless, correct me if I'm wrong, but the difference is 5 vs. 10
minutes, so I don't think we should wait too long to upgrade both sides. :-)

> I guess we can change (guix scripts substitute) with that in mind
> already. WDYT?

I fully agree with that. Adding 202 together with 404 would be enough
as a start, wouldn't it?
-------------------------------8<-----------------------------
(cache-narinfo! url (hash-part->path hash-part) #f
(if (or (= 404 code) (= 202 code))
ttl
%narinfo-transient-error-ttl))
------------------------------->8-----------------------------

Happy hacking!
Miguel
Ludovic Courtès wrote on 26 Oct 2020 11:50
(name . Miguel Ángel Arruga Vivas)(address . rosen644835@gmail.com)(address . 44193@debbugs.gnu.org)
87r1plb6sd.fsf@gnu.org
Hi,

Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:

> I fully agree with that. Adding 202 together with 404 would be enough
> as a start, wouldn't it?
> -------------------------------8<-----------------------------
> (cache-narinfo! url (hash-part->path hash-part) #f
> (if (or (= 404 code) (= 202 code))
> ttl
> %narinfo-transient-error-ttl))
> ------------------------------->8-----------------------------

Yes, exactly!

Ludo’.
Miguel Ángel Arruga Vivas wrote on 27 Oct 2020 20:19
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 44193@debbugs.gnu.org)
87eeljo4sq.fsf@gmail.com
Hi,

Ludovic Courtès <ludo@gnu.org> writes:
> Hi,
>
> Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:
>
>> I fully agree with that. Adding 202 together with 404 would be enough
>> as a start, wouldn't it?
>> -------------------------------8<-----------------------------
>> (cache-narinfo! url (hash-part->path hash-part) #f
>> (if (or (= 404 code) (= 202 code))
>> ttl
>> %narinfo-transient-error-ttl))
>> ------------------------------->8-----------------------------
>
> Yes, exactly!

Should I, or you, push this before the release? It's probably worth
having it already for 1.2.

The optimization could be cool too: IIUC, only the other 'if' branch
would need to return the 202 once it's widely accepted. Perhaps I
should have pointed that out explicitly earlier instead of steering the
conversation so much toward the return code; sorry for that. :-(

Happy hacking!
Miguel
Ludovic Courtès wrote on 28 Oct 2020 10:39
(name . Miguel Ángel Arruga Vivas)(address . rosen644835@gmail.com)(address . 44193@debbugs.gnu.org)
87lffq665t.fsf@gnu.org
Hi,

Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>> Hi,
>>
>> Miguel Ángel Arruga Vivas <rosen644835@gmail.com> skribis:
>>
>>> I fully agree with that. Adding 202 together with 404 would be enough
>>> as a start, wouldn't it?
>>> -------------------------------8<-----------------------------
>>> (cache-narinfo! url (hash-part->path hash-part) #f
>>> (if (or (= 404 code) (= 202 code))
>>> ttl
>>> %narinfo-transient-error-ttl))
>>> ------------------------------->8-----------------------------
>>
>> Yes, exactly!
>
> Should I, or you, push this before the release? It's probably worth
> having it already for 1.2.

Agreed, you can go ahead and push this change.

> The optimization could be cool too: IIUC, only the other 'if' branch
> would need to return the 202 once it's widely accepted. Perhaps I
> should have pointed that out explicitly earlier instead of steering the
> conversation so much toward the return code; sorry for that. :-(

No problem!

Ludo’.
Ludovic Courtès wrote on 28 Oct 2020 16:26
Re: [bug#44193] [PATCH 1/1] publish: Add '--cache-bypass-threshold'.
(address . 44193-done@debbugs.gnu.org)
87v9euz807.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

> * guix/scripts/publish.scm (show-help, %options): Add
> '--cache-bypass-threshold'.
> (low-compression): New procedure.
> (cache-bypass-threshold): New parameter.
> (bypass-cache?): New procedure.
> (render-narinfo/cached): Call 'render-narinfo' when 'bypass-cache?'
> returns true.
> (render-nar/cached): Call 'render-nar' when 'bypass-cache?' returns
> true.
> (guix-publish): Parameterize 'cache-bypass-threshold'.
> * tests/publish.scm ("with cache", "with cache, lzip + gzip")
> ("with cache, uncompressed"): Pass '--cache-bypass-threshold=0'.
> ("with cache, vanishing item"): Expect 200 for RESPONSE.
> ("with cache, cache bypass"): New test.
> ---
> doc/guix.texi | 24 +++++++++++-
> guix/scripts/publish.scm | 85 ++++++++++++++++++++++++++++++++--------
> tests/publish.scm | 43 ++++++++++++++++++--
> 3 files changed, 130 insertions(+), 22 deletions(-)

Pushed as ecaa102a58ad3ab0b42e04a3d10d7c761c05ec98.

Ludo’.
Closed