nginx serving files from the store returns Last-Modified = Epoch

  • Open
Details
7 participants
  • Gábor Boskovits
  • Danny Milosavljevic
  • anadon via web
  • Ludovic Courtès
  • Christopher Baines
  • Tobias Geerinckx-Rice
  • Vincent Legoll
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
Merged with
39051
Ludovic Courtès wrote on 28 Aug 2019 11:52
guix.gnu.org returns Last-Modified = Epoch
(address . bug-Guix@gnu.org)
875zmhliqj.fsf@gnu.org
Hello Guix,

Since the use of the ‘static-web-site’ service, which puts web site
files in the store, nginx returns a ‘Last-Modified’ header that can
trick clients into caching things forever:

$ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep Last
Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT

We should tell nginx not to emit ‘Last-Modified’, or to take the
timestamp from the /srv/guix.gnu.org symlink, if possible.

Ludo’.
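
To illustrate why a constant ‘Last-Modified’ is harmful, here is a minimal Python sketch of how a server chooses between 200 and 304 for a conditional request. It assumes the exact-match comparison that nginx applies by default (`if_modified_since exact`); the function names and the `STORE_MTIME` constant are hypothetical and only mirror the header value shown above.

```python
from email.utils import formatdate

# Assumption for illustration: every file in the store carries mtime = 1,
# which is what the Last-Modified header above reports.
STORE_MTIME = 1


def http_date(timestamp):
    """Format a Unix timestamp the way HTTP date headers are written."""
    return formatdate(timestamp, usegmt=True)


def status_for(if_modified_since, file_mtime):
    """Return 304 if the client's date matches the file's date exactly, else 200."""
    if if_modified_since is not None and if_modified_since == http_date(file_mtime):
        return 304
    return 200


# First visit: the client stores the Last-Modified value it received.
cached_date = http_date(STORE_MTIME)
print(cached_date)  # Thu, 01 Jan 1970 00:00:01 GMT

# After a redeploy the content differs, but the mtime is still 1,
# so the revalidation request is answered with "not modified".
print(status_for(cached_date, STORE_MTIME))  # 304
```

Because the timestamp never changes between deployments, a client that revalidates with `If-Modified-Since` keeps its stale copy indefinitely.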
Gábor Boskovits wrote on 28 Aug 2019 12:40
guix.gnu.org Last Modified at epoch
(address . 37207@debbugs.gnu.org)
CAE4v=piZWbeVx2qMRcg-kgbq+paBKC0L+Z6f9ADLUhpZaA=37A@mail.gmail.com
Hello,

Suppressing the Last-Modified header is just an
add_header Last-Modified "";
away.

Getting the info from the symlink seems to be much trickier; I would do it with
either embedded Perl or embedded Lua. I am not sure if we should bother
with it, though. Wdyt?
Tobias Geerinckx-Rice wrote on 28 Aug 2019 16:37
(address . bug-guix@gnu.org)(address . 37207@debbugs.gnu.org)
877e6xqrtw.fsf@nckx
Gábor, Ludo',

Gábor Boskovits wrote:
> Supressing the last modified header is just an
> add_header Last-Modified "";
> away.

You'll also need:

# Don't honour client If-Modified-Since constraints.
if_modified_since off;
# Nginx's etags are hashes of file timestamp & file length.
etag off;

Turning these off will of course prevent all caching. I don't
know if that would add measurable load to guix.gnu.org (it would
be more problematic if we used a CDN, but it might still make a
difference).

Nix does something both interesting and icky — as always: patch[0]
nginx to look up the realpath() instead, so clients can still
cache using If-None-Match.

Kind regards,

T G-R

[0]: https://github.com/NixOS/nixpkgs/pull/48337

Danny Milosavljevic wrote on 28 Aug 2019 17:05
(name . Gábor Boskovits)(address . boskovits@gmail.com)(address . 37207@debbugs.gnu.org)
20190828170530.3a3d638e@scratchpost.org
Hi Gabor,

On Wed, 28 Aug 2019 12:40:37 +0200
Gábor Boskovits <boskovits@gmail.com> wrote:

> Supressing the last modified header is just an
> add_header Last-Modified "";
> away.
>
> To get the info from the symlink seems to be much trickier, i would do with
> either embedded perl or embedded lua. I am not sure if we should bother
> with it, though. Wdyt?

Since we already emit ETag, I don't think we need to bother with Last-Modified.

Why is the ETag so short, though?

$ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep -i etag
ETag: "1-2f38b1"
Gábor Boskovits wrote on 28 Aug 2019 20:59
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 37207@debbugs.gnu.org)
CAE4v=pj08reuetW6JC2+PA3TTZkyy17szVCYH=0THsytDnsdfg@mail.gmail.com
Hello Danny,

Danny Milosavljevic <dannym@scratchpost.org> wrote (Wed, 28 Aug 2019, 17:05):

> Hi Gabor,
>
> On Wed, 28 Aug 2019 12:40:37 +0200
> Gábor Boskovits <boskovits@gmail.com> wrote:
>
> > Supressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
> >
> > To get the info from the symlink seems to be much trickier, i would do
> with
> > either embedded perl or embedded lua. I am not sure if we should bother
> > with it, though. Wdyt?
>
> Since we already emit ETag, I don't think we need to bother with
> Last-Modified.
>
> Why is the ETag so short, though?
>
>
The ETag we emit is also bad. Nginx calculates it from the mtime and the
content length, and since the mtime is constant in our case, it effectively
reflects only the content length.


> > $ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep -i etag
> > ETag: "1-2f38b1"
>
Best regards,
g_bor

--
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21
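
The short ETag quoted above is consistent with that description. The sketch below assumes the value is the hexadecimal mtime, a dash, and the hexadecimal content length, wrapped in double quotes; the helper names are hypothetical and this is not nginx code.

```python
def nginx_style_etag(mtime, content_length):
    """Build an ETag from mtime and length, both rendered in hexadecimal."""
    return '"{:x}-{:x}"'.format(mtime, content_length)


def decode_etag(etag):
    """Split such an ETag back into (mtime, content_length) as integers."""
    mtime_hex, length_hex = etag.strip('"').split("-")
    return int(mtime_hex, 16), int(length_hex, 16)


# The value reported for packages.json in this thread:
print(decode_etag('"1-2f38b1"'))  # (1, 3094705)
```

Read this way, the `1` is the store mtime and `2f38b1` is a file size of 3094705 bytes, so two versions of a file with the same length would share an ETag.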
Gábor Boskovits wrote on 28 Aug 2019 21:42
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 37207@debbugs.gnu.org)
CAE4v=pgZzr6Cf3MCJj5Mhp4fr101ym94bLcHsLgTZ1z1sR4QSQ@mail.gmail.com
Hello Tobias,

Tobias Geerinckx-Rice via Bug reports for GNU Guix <bug-guix@gnu.org> wrote
(Wed, 28 Aug 2019, 16:38):

> Gábor, Ludo',
>
> Gábor Boskovits wrote:
> > Supressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
>
> You'll also need:
>
> # Don't honour client If-Modified-Since constraints.
> if_modified_since off;
> # Nginx's etags are hashes of file timestamp & file length.
> etag off;
>
>
You really have a point here.

Based on my research, I came up with the following:

We need
etag off;

We should create a file with the last Git modification time of the files,
updated when there is a new commit in the repo => Last-Modified.
We should create a file with some hash of the files, updated when there is
a new commit in the repo => ETag.
We could restrict these operations to the files modified since the last
checkout.

Retrieve these with embedded Perl.
Wdyt?


> Turning these off will of course prevent all caching. I don't
> know if that would add measurable load to guix.gnu.org (it would
> be more problematic if we used a CDN, but it might still make a
> difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still
> cache using If-None-Match.
>
> Kind regards,
>
> T G-R
>
> [0]: https://github.com/NixOS/nixpkgs/pull/48337
>

Best regards,
g_bor

--
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21
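
The "hash of the files => ETag" idea can be sketched in Python as follows. This is only an illustration under assumptions: the function name, the choice of SHA-256, and the truncation to 32 hex digits are hypothetical, and the thread does not say which hash would be used or how nginx would be fed the result.

```python
import hashlib


def content_etag(path, chunk_size=65536):
    """Return a quoted ETag derived from the file's bytes, not its timestamp."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files such as packages.json need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return '"{}"'.format(digest.hexdigest()[:32])
```

A validator computed this way changes only when the content changes, so files left untouched by a new commit would keep their ETag and stay cached.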
Ludovic Courtès wrote on 28 Aug 2019 22:32
(name . Gábor Boskovits)(address . boskovits@gmail.com)
87o909f2tx.fsf@gnu.org
Hello,

Gábor Boskovits <boskovits@gmail.com> wrote:

> we should create a file with the git last modification time of the files,
> updated when there is a new commit in the repo => last-modified
> we should create a file with some hash of the files, updated when there is
> a new commit in the repo => etag
> we could restrict these operations to the files modified since the last
> checkout.
>
> Retrieve these with embededd perl.
> Wdyt?

What would the config look like? AFAICS our ‘nginx’ package doesn’t
embed Perl, and I think it’s better this way. :-) Can we do that with
pure nginx directives?

We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹. If
we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.

Ludo’.

¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212

Gábor Boskovits wrote on 29 Aug 2019 08:11
(name . Ludovic Courtès)(address . ludo@gnu.org)
CAE4v=pjOh=ZP6WN41aVCZ5ORA7AnCX7sNJi3p2M_ZsC8gmYVBQ@mail.gmail.com
Hello Ludo,

Ludovic Courtès <ludo@gnu.org> wrote (Wed, 28 Aug 2019, 22:32):

> Hello,
>
> Gábor Boskovits <boskovits@gmail.com> skribis:
>
> > we should create a file with the git last modification time of the files,
> > updated when there is a new commit in the repo => last-modified
> > we should create a file with some hash of the files, updated when there
> is
> > a new commit in the repo => etag
> > we could restrict these operations to the files modified since the last
> > checkout.
> >
> > Retrieve these with embededd perl.
> > Wdyt?
>
> What would the config look like? AFAICS our ‘nginx’ package doesn’t
> embed Perl, and I think it’s better this way. :-) Can we do that with
> pure nginx directives?
>
> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹. If
> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>
>
I was thinking about this. Yes, we can solve that with pure nginx. There is
an issue, however:
it invalidates all cached entries on every update, so unmodified files will
also need to be downloaded again.

The easiest way to do that would be to simply generate an nginx config
snippet at a configurable location,
setting up the mtime and etag variables, and include that from the main
config.

If this would be OK, then I will have a look at implementing this.

> Ludo’.
>
> ¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212

Best regards,
g_bor

--
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21
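
A minimal Python sketch of the "generate a config snippet" idea, assuming the snippet only needs to carry the symlink's timestamp. The function name and the symlink path are hypothetical, and whether `add_header Last-Modified "<date>";` overrides the built-in value as intended should be checked against the nginx version in use.

```python
import os
from email.utils import formatdate


def render_last_modified_snippet(symlink_path):
    """Return an nginx directive carrying the symlink's own mtime as an HTTP date."""
    # lstat() reads the timestamp of the link itself; stat() would follow it
    # into the store, where every file has the same fixed mtime.
    mtime = os.lstat(symlink_path).st_mtime
    return 'add_header Last-Modified "{}";\n'.format(formatdate(mtime, usegmt=True))


# Hypothetical usage at deployment time, with the result written to a file
# that the main nginx configuration pulls in with an `include` directive:
#   snippet = render_last_modified_snippet("/srv/guix.gnu.org")
```

As noted above, a single site-wide timestamp means every cached page is invalidated at each deployment, including pages whose content did not change.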
Ludovic Courtès wrote on 29 Aug 2019 14:40
(name . Gábor Boskovits)(address . boskovits@gmail.com)
87ftlki1qr.fsf@gnu.org
Hi Gábor,

Gábor Boskovits <boskovits@gmail.com> wrote:

> Ludovic Courtès <ludo@gnu.org> wrote (Wed, 28 Aug 2019, 22:32):
>
>> Hello,
>>
>> Gábor Boskovits <boskovits@gmail.com> skribis:
>>
>> > we should create a file with the git last modification time of the files,
>> > updated when there is a new commit in the repo => last-modified
>> > we should create a file with some hash of the files, updated when there
>> is
>> > a new commit in the repo => etag
>> > we could restrict these operations to the files modified since the last
>> > checkout.
>> >
>> > Retrieve these with embededd perl.
>> > Wdyt?
>>
>> What would the config look like? AFAICS our ‘nginx’ package doesn’t
>> embed Perl, and I think it’s better this way. :-) Can we do that with
>> pure nginx directives?
>>
>> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹. If
>> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>>
>>
> I was thinking about this. Yes, we can solve that with pure nginx. There is
> an issue however.
> It invalidates all cached entries on update, so files not modified will
> also need to be downloaded again.
>
> The easiest way to do that would be to simply generate an nginx config
> snippet at a configurable location,
> setting up the mtime and etags variable, and include that from the main
> config.
>
> If this would be ok, then I will have a look at implementing this.

I’m not sure I fully understand, but yes, if you could send a prototype
as a diff against maintenance.git, that’d be great!

Thank you,
Ludo’.
Ludovic Courtès wrote on 5 Sep 2019 22:47
(name . Gábor Boskovits)(address . boskovits@gmail.com)
875zm6eahj.fsf@gnu.org
Hello!

Did one of you have a chance to come up with a trick to emit the right
‘Last-Modified’? We seemed to be close to having something. :-)

Thanks,
Ludo’.
Ludovic Courtès wrote on 26 Sep 2019 10:39
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)(address . 37207@debbugs.gnu.org)
87wodvpi26.fsf@gnu.org
Hi Tobias,

Tobias Geerinckx-Rice <me@tobias.gr> wrote:

> Turning these off will of course prevent all caching. I don't know if
> that would add measurable load to guix.gnu.org (it would be more
> problematic if we used a CDN, but it might still make a difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still cache
> using If-None-Match.

> [0]: https://github.com/NixOS/nixpkgs/pull/48337

(See

I had overlooked this patch but it looks like the right approach
overall. Calling ‘realpath’ each time seems a bit expensive as it
creates an ‘lstat’ storm, but I can’t think of a better solution.

I also found this post whose main interest is in showing how to write a
plugin to generate custom etags:


Thoughts?

Ludo’.
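
The realpath-based idea can be sketched in Python as below. This follows the description in this thread rather than the patch itself, and it assumes store items are named `<hash>-<name>` directly under the store directory; the function name and the `store_dir` parameter are hypothetical.

```python
import os


def store_path_etag(path, store_dir="/gnu/store"):
    """Return a quoted ETag taken from the store item a path resolves into.

    Returns None when the resolved path is not inside the store.
    """
    # realpath() walks every path component and resolves each symlink,
    # which is the per-request cost mentioned above.
    real = os.path.realpath(path)
    prefix = os.path.realpath(store_dir) + os.sep
    if not real.startswith(prefix):
        return None
    item = real[len(prefix):].split(os.sep, 1)[0]  # "<hash>-<name>"
    return '"{}"'.format(item.split("-", 1)[0])
```

Since the store item name changes whenever the site is rebuilt with different content, a validator derived from it lets clients revalidate with `If-None-Match` even though the file timestamps are constant.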
Ludovic Courtès wrote on 11 Jan 2020 22:23
control message for bug #37207
(address . control@debbugs.gnu.org)
87y2udu1pk.fsf@gnu.org
merge 37207 39051
quit
Ludovic Courtès wrote on 11 Jan 2020 22:25
(address . control@debbugs.gnu.org)
87sgklu1m4.fsf@gnu.org
retitle 37207 nginx serving files from the store returns Last-Modified = Epoch
quit
Vincent Legoll wrote on 27 Mar 2020 00:06
Re: nginx serving files from the store returns Last-Modified = Epoch
(address . 37207@debbugs.gnu.org)
CAEwRq=qCz5-eseg7na7AjbvzJi9rwx18TUVuP=6KSXtdLM2Fbw@mail.gmail.com
This bug prevents Repology [1] from showing
the latest versions of packages in Guix,
as it relies on 'Last-Modified' for:
https://guix.gnu.org/packages.json
changing in a meaningful way...


--
Vincent Legoll
Vincent Legoll wrote on 27 Mar 2020 00:30
Repology
(address . 37207@debbugs.gnu.org)
CAEwRq=qGYquzw-Z88sX-9ga4Rr8mu2qE5cRV=6_X_7tLarAr4Q@mail.gmail.com
It also paints us as a fairly outdated distro,
despite our efforts to keep pace and
update to the latest versions of packages.

We may even get into the top ten, which
may give us a bit of attention and attract
some distrohoppers^Wusers.

--
Vincent Legoll
Gábor Boskovits wrote on 29 Mar 2020 11:50
Re: bug#37207: nginx serving files from the store returns Last-Modified = Epoch
(name . Vincent Legoll)(address . vincent.legoll@gmail.com)(address . 37207@debbugs.gnu.org)
CAE4v=pjF--P0A-cCuk-jYDzneaKqF09yVNCEANw_vKR-vEYfVA@mail.gmail.com
Hello Vincent,

Vincent Legoll <vincent.legoll@gmail.com> wrote (Fri, 27 Mar 2020, 0:07):
>
> This bug prevents repology [1] to show
> the latest versions of packages in guix,
> as it relies on 'Last-Modified' for:
> https://guix.gnu.org/packages.json
> changing in a meaningful way...
>

Does it also use ETags, or just Last-Modified?

I ask because we already have a bug similar to this, and it would
be interesting to know whether
it would be enough to generate meaningful ETags, or whether we still have to
deal with Last-Modified.


Best regards,
g_bor
--
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21
Vincent Legoll wrote on 30 Mar 2020 13:53
(name . Gábor Boskovits)(address . boskovits@gmail.com)(address . 37207@debbugs.gnu.org)
CAEwRq=puFS3YJ+G_JSUJxudtrgHQTtczWh8oxe3qY0MCJ=c99g@mail.gmail.com
Hello,

On Sun, Mar 29, 2020 at 11:50 AM Gábor Boskovits <boskovits@gmail.com> wrote:
> Does it also use etags, or just last-modified?

From the email exchange I had with the maintainer of the site,
I think it only uses Last-Modified.

> I ask this because we already have bug similar to this, and it would
> be interesting to know if
> it would be enough to have a meaningful etags generation, or we
> still have to deal with last-modified.

Are ETags easier for us to handle?

--
Vincent Legoll
Christopher Baines wrote on 10 May 2020 00:07
Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
(address . 37207@debbugs.gnu.org)(address . ludo@gnu.org)
87o8qwg3te.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Since the use of the ‘static-web-site’ service, which puts web site
> files in the store, nginx returns a ‘Last-Modified’ header that can
> trick clients into caching things forever:
>
> --8<---------------cut here---------------start------------->8---
> $ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep Last
> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
> --8<---------------cut here---------------end--------------->8---
>
> We should tell nginx to do not emit ‘Last-Modified’, or to take the
> state from the /srv/guix.gnu.org symlink, if possible.

I ended up looking at this again in relation to Repology [1].

1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704

Going back to that comment, given that the Last-Modified header (and the
ETag) is wrong, it's probably sensible to remove them. That might even
fix the issue with Repology fetching the packages.json file.

Alternatively (or in addition), we could run a really simple Guile web
server that just serves the packages.json file with the right
Last-Modified value, and have nginx proxy requests to that server. This
would be pretty easy to set up, I believe, and would allow providing a
correct value.

Thoughts?

Chris
Ludovic Courtès wrote on 10 May 2020 12:11
(name . Christopher Baines)(address . mail@cbaines.net)(address . 37207@debbugs.gnu.org)
874ksoaym3.fsf@gnu.org
Howdy!

Christopher Baines <mail@cbaines.net> wrote:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Since the use of the ‘static-web-site’ service, which puts web site
>> files in the store, nginx returns a ‘Last-Modified’ header that can
>> trick clients into caching things forever:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep Last
>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>> --8<---------------cut here---------------end--------------->8---
>>
>> We should tell nginx to do not emit ‘Last-Modified’, or to take the
>> state from the /srv/guix.gnu.org symlink, if possible.
>
> I ended up looking at this again in relation to Repology [1].
>
> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>
> Going back to that comment, given that the Last-Modified header (and the
> ETag) is wrong, it's probably sensible to remove them. That might even
> fix the issue with Repology fetching the packages.json file.
>
> Alternatively (or in addition), we could run a really simple Guile web
> server that just serves the packages.json file with the right
> Last-Modified value, and have NGinx proxy requests to that server. This
> would be pretty easy to setup I believe, and would allow providing a
> correct value.
>
> Thoughts?

I think it wouldn’t really help because the Last-Modified issue is
pervasive. It shows for instance when accessing the web site: one often
has to force the browser to reload pages to get the latest version.

So I’m all for one of the solutions that were proposed earlier.

WDYT?

Ludo’.
Christopher Baines wrote on 11 May 2020 12:32
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 37207@debbugs.gnu.org)
87imh2pxsc.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Howdy!
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Since the use of the ‘static-web-site’ service, which puts web site
>>> files in the store, nginx returns a ‘Last-Modified’ header that can
>>> trick clients into caching things forever:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> $ wget --debug -O /dev/null https://guix.gnu.org/packages.json 2>&1 | grep Last
>>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>>> --8<---------------cut here---------------end--------------->8---
>>>
>>> We should tell nginx to do not emit ‘Last-Modified’, or to take the
>>> state from the /srv/guix.gnu.org symlink, if possible.
>>
>> I ended up looking at this again in relation to Repology [1].
>>
>> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>>
>> Going back to that comment, given that the Last-Modified header (and the
>> ETag) is wrong, it's probably sensible to remove them. That might even
>> fix the issue with Repology fetching the packages.json file.
>>
>> Alternatively (or in addition), we could run a really simple Guile web
>> server that just serves the packages.json file with the right
>> Last-Modified value, and have NGinx proxy requests to that server. This
>> would be pretty easy to setup I believe, and would allow providing a
>> correct value.
>>
>> Thoughts?
>
> I think it wouldn’t really help because the Last-Modified issue is
> pervasive. It shows for instance when accessing the web site: one often
> has to force the browser to reload pages to get the latest version.
>
> So I’m all for one of the solutions that were proposed earlier.
>
> WDYT?

So I think removing the Last-Modified header from the responses will fix
the issue with the Repology fetcher: it will stop thinking it has
already fetched the file (since the file was last modified in 1970) and
will instead always process the file.

Removing the Last-Modified header, and maybe the ETag as well, from
responses should avoid the issue with web browsers using a cached
version of the page when they probably shouldn't.

I realise that what I described, using a Guile web server to serve the
packages.json file, wouldn't help with other pages (unless they're served
that way as well, which is a possibility), but that was just an optimisation
over removing the header entirely: a Last-Modified header with a
correct value would help the Repology fetcher cache the file.

Does that make sense? It still seems to me that a small change to the
nginx config (I think these lines somewhere in the config would do it
[1]) would help with both the Repology fetcher issue and the issue you
describe with web browsers.

1:

add_header Last-Modified "";
if_modified_since off;
etag off;
Ludovic Courtès wrote on 11 May 2020 14:47
(name . Christopher Baines)(address . mail@cbaines.net)(address . 37207@debbugs.gnu.org)
873686iqop.fsf@gnu.org
Hi,

Christopher Baines <mail@cbaines.net> wrote:

> So I think removing the Last-Modified header from the responses will fix
> the issue with the Repology fetcher (as it will stop thinking it's
> already fetch the file, since it was last modified in 1970), instead it
> will just always process the file.
>
> Removing the Last-Modified header, and maybe the ETag as well from
> responses should avoid the issue with web browsers using a cached
> version of the page when they probably shouldn't.

It would prevent client-side caching altogether. So perhaps we can do
that as a stopgap (and it has the advantage of requiring only a tiny
config change).

Eventually, it’d be nice to have something better, like one of the ETag
patches discussed upthread.

Thanks,
Ludo’.
anadon via web wrote on 15 May 2020 23:12
nginx serving files from the store returns Last-Modified = Epoch
(address . 37207@debbugs.gnu.org)
7f41efd4a1c0.74767349eab4d05@guile.gnu.org
Any movement on this?
Christopher Baines wrote on 25 May 2020 10:20
[PATCH] nginx: berlin: Work around Last-Modified issues for guix.gnu.org.
(address . 37207@debbugs.gnu.org)
20200525082047.3943-1-mail@cbaines.net
* hydra/nginx/berlin.scm (%berlin-servers): Add some config to the
nginx-server-configurations for guix.gnu.org.
---
hydra/nginx/berlin.scm | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/hydra/nginx/berlin.scm b/hydra/nginx/berlin.scm
index 303fd35..8c90eb1 100644
--- a/hydra/nginx/berlin.scm
+++ b/hydra/nginx/berlin.scm
@@ -514,6 +514,13 @@ PUBLISH-URL."
(locations guix.gnu.org-locations)
(raw-content
(list
+ ;; TODO This works around NGinx using the epoch for the
+ ;; Last-Modified date, as well as the etag.
+ ;; See http://issues.guix.info/issue/37207
+ "add_header Last-Modified \"\";"
+ "if_modified_since off;"
+ "etag off;"
+
"access_log /var/log/nginx/guix-info.access.log;")))
(nginx-server-configuration
@@ -634,6 +641,13 @@ PUBLISH-URL."
(append
%tls-settings
(list
+ ;; TODO This works around NGinx using the epoch for the
+ ;; Last-Modified date, as well as the etag.
+ ;; See http://issues.guix.info/issue/37207
+ "add_header Last-Modified \"\";"
+ "if_modified_since off;"
+ "etag off;"
+
"access_log /var/log/nginx/guix-gnu-org.https.access.log;"))))
(nginx-server-configuration
--
2.26.2
Christopher Baines wrote on 25 May 2020 10:24
Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 37207@debbugs.gnu.org)
87h7w4l8v8.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> So I think removing the Last-Modified header from the responses will fix
>> the issue with the Repology fetcher (as it will stop thinking it's
>> already fetch the file, since it was last modified in 1970), instead it
>> will just always process the file.
>>
>> Removing the Last-Modified header, and maybe the ETag as well from
>> responses should avoid the issue with web browsers using a cached
>> version of the page when they probably shouldn't.
>
> It would prevent client-side caching altogether. So perhaps we can do
> that as a stopgap (and it has the advantage of requiring only a tiny
> config change).

Great, I've finally got around to sending a patch for this now.