Cuirass: Trigger 'guix publish' baking

  • Done
Details
3 participants
  • Clément Lassieur
  • Ludovic Courtès
  • Tobias Geerinckx-Rice
Owner
unassigned
Submitted by
Clément Lassieur
Severity
normal
Clément Lassieur wrote on 14 Nov 2018 00:48
guix publish: at least one user will have to build a given substitute
(name . Bug Guix)(address . bug-guix@gnu.org)
87ftw4wnc7.fsf@lassieur.org
Hi,

I've noticed that narinfo baking is triggered by user requests when the
'--cache' option of 'guix publish' is used. It means that the first
user who will want it will get the 404 response and will have to build
it manually. (See guix/scripts/publish.scm, make-request-handler.)

I was reluctant to send this email to bug-guix@gnu.org because it's
fairly well documented, but I don't like this behaviour... As a matter
of fact I'm often the first user downloading substitutes on my 'guix
publish' server.

Would it be possible to trigger the baking right after the build is
done? So that every user can be sure that they will get the substitute
once they know that Cuirass has built it.

If 'guix publish' has no way to get the notification that a build is
done, maybe Cuirass could trigger the baking? (But that would be
hackish in my opinion.)

Cheers,
Clément

‘--cache=DIRECTORY’
‘-c DIRECTORY’
Cache archives and meta-data (‘.narinfo’ URLs) to DIRECTORY and
only serve archives that are in cache.

When this option is omitted, archives and meta-data are created
on-the-fly. This can reduce the available bandwidth, especially
when compression is enabled, since this may become CPU-bound.
Another drawback of the default mode is that the length of archives
is not known in advance, so ‘guix publish’ does not add a
‘Content-Length’ HTTP header to its responses, which in turn
prevents clients from knowing the amount of data being downloaded.

Conversely, when ‘--cache’ is used, the first request for a store
item (via a ‘.narinfo’ URL) returns 404 and triggers a background
process to “bake” the archive—computing its ‘.narinfo’ and
compressing the archive, if needed. Once the archive is cached in
DIRECTORY, subsequent requests succeed and are served directly from
the cache, which guarantees that clients get the best possible
bandwidth.

The “baking” process is performed by worker threads. By default,
one thread per CPU core is created, but this can be customized.
See ‘--workers’ below.

When ‘--ttl’ is used, cached entries are automatically deleted when
they have expired.
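
The ‘--cache’ behaviour described in the manual excerpt above can be illustrated with a toy model. This is only a sketch in Python, not how ‘guix publish’ is implemented (the real code is Scheme, in guix/scripts/publish.scm); the class name BakingCache, its methods, and the fake narinfo body are all hypothetical.

```python
import queue
import threading


class BakingCache:
    """Toy model: the first request for an item gets 404 and queues baking."""

    def __init__(self, workers=2):
        self._cache = {}        # item -> baked narinfo (stand-in string)
        self._pending = set()   # items already queued for baking
        self._lock = threading.Lock()
        self._tasks = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # Each worker picks baking tasks from the shared queue.
        while True:
            item = self._tasks.get()
            narinfo = f"StorePath: {item}"  # stand-in for the real baking work
            with self._lock:
                self._cache[item] = narinfo
                self._pending.discard(item)
            self._tasks.task_done()

    def request(self, item):
        """Return (status, body) for a narinfo request."""
        with self._lock:
            if item in self._cache:
                return 200, self._cache[item]
            if item not in self._pending:
                # Cache miss: schedule baking in the background, once.
                self._pending.add(item)
                self._tasks.put(item)
        return 404, None

    def wait_until_idle(self):
        # Block until every queued baking task has completed.
        self._tasks.join()
```

In this model the first call to request() for an item always returns 404, and a later call returns 200 once a worker has processed the queue, which is the behaviour the report is about.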
Ludovic Courtès wrote on 14 Nov 2018 11:09
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 33370@debbugs.gnu.org)
87in10km16.fsf@gnu.org
Hello,

Clément Lassieur <clement@lassieur.org> skribis:

> I've noticed that narinfo baking is triggered by user requests when the
> '--cache' option of 'guix publish' is used. It means that the first
> user who will want it will get the 404 response and will have to build
> it manually. (See guix/scripts/publish.scm, make-request-handler.)

Note that the first request (404) returns with an expiry of 5mn instead
of the default (much longer) expiry for “normal” 404s.

We discussed this behavior at length back then and that seemed to me
like a reasonable behavior for a service with many users: the first one
gets 404 (or has to wait for 5 more minutes), but when there are enough
users, it doesn’t matter much.

For a single-user setup, I recommend not using ‘--cache’.

> Would it be possible to trigger the baking right after the build is
> done? So that every user can be sure that they will get the substitute
> once they know that Cuirass has built it.
>
> If 'guix publish' has no way to get the notification that a build is
> done, maybe Cuirass could trigger the baking? (But that would be
> hackish in my opinion.)

I had that in mind: adding a build completion hook on Cuirass, which
could trigger baking (I don’t think it’s particularly hackish: Cuirass
is the only place that can send a notification.) Basically we’d run:

cuirass --build-completion-hook=/some/program …

and that program could do a GET on the right narinfo URL(s).
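
A minimal sketch of what such a hook program might look like, in Python. The ‘--build-completion-hook’ option is only a proposal in this message, so the way the hook receives its arguments (a base URL followed by store paths on the command line) is an assumption, as are the function names. The sketch relies on the narinfo URL of a store item being the 32-character hash part of its file name followed by ‘.narinfo’.

```python
import sys
import urllib.error
import urllib.request
from pathlib import PurePosixPath


def narinfo_url(base_url: str, store_path: str) -> str:
    """Map /gnu/store/<hash>-<name> to <base_url>/<hash>.narinfo."""
    name = PurePosixPath(store_path).name
    hash_part, sep, _ = name.partition("-")
    if not sep or len(hash_part) != 32:
        raise ValueError(f"not a store item: {store_path!r}")
    return f"{base_url.rstrip('/')}/{hash_part}.narinfo"


def trigger_baking(base_url: str, store_path: str, timeout: float = 10.0) -> int:
    """GET the narinfo URL and return the HTTP status code.

    With '--cache', a 404 on the first request is expected: that request is
    what makes 'guix publish' start baking the item.
    """
    try:
        with urllib.request.urlopen(narinfo_url(base_url, store_path),
                                    timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def main(argv):
    # Assumed calling convention: BASE-URL STORE-PATH...
    base_url, *store_paths = argv
    for path in store_paths:
        print(path, trigger_baking(base_url, path))
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

The hook deliberately ignores the response body; only the side effect of the request matters here.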

This would be useful in reducing latency; the downside is that we’d bake
lots of things, even possibly things that nobody ever needs.

Thoughts?

Ludo’.
Clément Lassieur wrote on 14 Nov 2018 11:18
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 33370@debbugs.gnu.org)
87muqcezci.fsf@lassieur.org
Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> Hello,
>
> Clément Lassieur <clement@lassieur.org> skribis:
>
>> I've noticed that narinfo baking is triggered by user requests when the
>> '--cache' option of 'guix publish' is used. It means that the first
>> user who will want it will get the 404 response and will have to build
>> it manually. (See guix/scripts/publish.scm, make-request-handler.)
>
> Note that the first request (404) returns with an expiry of 5mn instead
> of the default (much longer) expiry for “normal” 404s.
>
> We discussed this behavior at length back then and that seemed to me
> like a reasonable behavior for a service with many users: the first one
> gets 404 (or has to wait for 5 more minutes), but when there are enough
> users, it doesn’t matter much.

But at least one user will complain, and if it's a small laptop building
Icecat...

> For a single-user setup, I recommend not using ‘--cache’.

Yes, that's what I did.

>> Would it be possible to trigger the baking right after the build is
>> done? So that every user can be sure that they will get the substitute
>> once they know that Cuirass has built it.
>>
>> If 'guix publish' has no way to get the notification that a build is
>> done, maybe Cuirass could trigger the baking? (But that would be
>> hackish in my opinion.)
>
> I had that in mind: adding a build completion hook on Cuirass, which
> could trigger baking (I don’t think it’s particularly hackish: Cuirass
> is the only place that can send a notification.) Basically we’d run:
>
> cuirass --build-completion-hook=/some/program …
>
> and that program could do a GET on the right narinfo URL(s).

Yeah I agree it's not that hackish.

> This would be useful in reducing latency; the downside is that we’d bake
> lots of things, even possibly things that nobody ever needs.
>
> Thoughts?

What about getting the first user to block until the baking is done? It
will take more time for them but at least they won't have to build it
locally.

And things nobody uses won't have to be baked.

Clément
Ludovic Courtès wrote on 14 Nov 2018 15:49
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 33370@debbugs.gnu.org)
877ehfg1ed.fsf@gnu.org
Hi,

Clément Lassieur <clement@lassieur.org> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hello,
>>
>> Clément Lassieur <clement@lassieur.org> skribis:
>>
>>> I've noticed that narinfo baking is triggered by user requests when the
>>> '--cache' option of 'guix publish' is used. It means that the first
>>> user who will want it will get the 404 response and will have to build
>>> it manually. (See guix/scripts/publish.scm, make-request-handler.)
>>
>> Note that the first request (404) returns with an expiry of 5mn instead
>> of the default (much longer) expiry for “normal” 404s.
>>
>> We discussed this behavior at length back then and that seemed to me
>> like a reasonable behavior for a service with many users: the first one
>> gets 404 (or has to wait for 5 more minutes), but when there are enough
>> users, it doesn’t matter much.
>
> But at least one user will complain, and if it's a small laptop building
> Icecat...

The way we’re doing things, there’s necessarily a delay (the build time
plus some additional latency) between the moment a commit is pushed
and the moment the corresponding package is built. Baking only adds a
very small latency.

>> This would be useful in reducing latency; the downside is that we’d bake
>> lots of things, even possibly things that nobody ever needs.
>>
>> Thoughts?
>
> What about getting the first user to block until the baking is done?

That’s generally not possible because HTTP is supposedly synchronous.
Also, ‘guix publish’ has a bunch of worker threads that pick baking
tasks from a queue. When the queue is empty and you are asking for a
substitute of sed, it will take seconds to bake it; but when the queue
is already large and you’re asking for LibreOffice, it could take a few
minutes.

For the intended use case, which is a build farm with many users,
optimizing for the first user makes little sense IMO.

Thanks,
Ludo’.
Clément Lassieur wrote on 14 Nov 2018 16:34
bug#33370: Cuirass: Trigger 'guix publish' baking
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 33370@debbugs.gnu.org)
87a7mbad1q.fsf@lassieur.org
Ludovic Courtès <ludo@gnu.org> writes:

>>> This would be useful in reducing latency; the downside is that we’d bake
>>> lots of things, even possibly things that nobody ever needs.
>>>
>>> Thoughts?
>>
>> What about getting the first user to block until the baking is done?
>
> That’s generally not possible because HTTP is supposedly synchronous.
> Also, ‘guix publish’ has a bunch of worker threads that pick baking
> tasks from a queue. When the queue is empty and you are asking for a
> substitute of sed, it will take seconds to bake it; but when the queue
> is already large and you’re asking for LibreOffice, it could take a few
> minutes.
>
> For the intended use case, which is a build farm with many users,
> optimizing for the first user makes little sense IMO.

I don't agree, because I find it stressful when you build something and
you're not 100% sure you'll get the substitute. If someone is the only
user of several Guix packages (and I think it's the case for many of our
users), they'll have to re-build them locally every time one of their
dependencies is updated.

So if I understand correctly, the Cuirass solution seems the best... I'll
leave the bug open but change its title :-)

Thank you,
Clément
Clément Lassieur wrote on 14 Nov 2018 16:35
control message for bug #33370
(address . control@debbugs.gnu.org)
878t1vacz8.fsf@lassieur.org
retitle 33370 Cuirass: Trigger 'guix publish' baking
Tobias Geerinckx-Rice wrote on 30 Nov 2020 23:15
Cuirass: Trigger 'guix publish' baking
(address . 33370-done@debbugs.gnu.org)
87ft4qwkyi.fsf@nckx
This was (‘mostly’ --Ludo') addressed by adding
‘--cache-bypass-threshold’.

Closing,

T G-R

Closed