Simon Tournier <zimon.toutoune@gmail.com> writes:
> Hi,
>
> On Fri, 08 Sep 2023 at 19:09, Ludovic Courtès <ludo@gnu.org> wrote:
>
>>>> It would also be pretty bad for closure size:
>>>>
>>>> --8<---------------cut here---------------start------------->8---
>>>> $ guix size guile-git | tail -1
>>>> total: 106.6 MiB
>>>> $ guix size guile-git git-minimal | tail -1
>>>> total: 169.8 MiB
>>>> --8<---------------cut here---------------end--------------->8---
>>>>
>>>> It’s also not clear concretely how we’d add that dependency. Try
>>>> invoking ‘git’ from $PATH and print a warning if it doesn’t work?
>>>> But then, what about applications like Cuirass and hpcguix-web?
>>>
>>> I think we can rely on something like,
>>>
>>> guix shell -C git-minimal -- git gc
>>
>> We’re talking about the implementation of a cache (meant to speed up
>> operations), that would actually fill said cache plus do a whole bunch
>> of expensive operations? Nah. :-)
>
> I do not think so. If I understand correctly, we need to run “git gc” at
> some point, therefore git-minimal needs to be around. The question is
> how and when.
>
> Well, maybe I am missing what the bug is about. For me, it is about
> running ‘git gc’ for cleaning the Git checkout cache, no?
>
>
> Solution #1. Add git-minimal as an input. It increases the closure, and
> the extra load (on average) depends on the ratio between the rate of
> “guix pull” and the rate of git-minimal changes.
>
> Assume that people run “guix pull” once per week and that, say, “git
> gc” is run after 50 pulls. (Both numbers are totally arbitrary and
> based on my personal estimate.)
>
> Data Service [1] tells:
>
> 2023-07-07 15:45:22 2023-09-08 21:22:08
> 2023-05-11 16:10:48 2023-07-07 14:21:45
> 2023-05-01 16:40:08 2023-05-11 14:36:16
> 2023-04-25 13:34:54 2023-05-01 15:19:55
> 2023-04-25 13:34:54 2023-09-08 21:22:08
> 2023-03-06 17:22:28 2023-04-25 12:27:33
> 2023-01-17 23:49:19 2023-03-06 16:48:43
> 2022-11-08 13:06:42 2023-01-17 15:11:47
> 2022-10-08 05:14:46 2022-11-08 09:56:31
> 2022-09-06 15:00:08 2022-10-08 04:15:43
> 2022-08-13 22:02:31 2022-09-06 12:58:52
> …
>
> It means that a user will download git-minimal ~10 times for nothing.
>
>
> Solution #2. The one I am proposing. :-) Download git-minimal only
> when Guix needs it for running “git gc”. Yeah, there is probably a
> small overhead with some operations. But I bet this overhead is much
> smaller than that of solution #1.
>
> Well, it depends on the number of times people are updating the cache vs
> the rate of change of git-minimal.
>
> For sure, if one updates the cache 100 times per week, having
> git-minimal as an input is far better. But I do not think that is the
> regular usage on average. :-)
>
> That’s why I am proposing to have an option for turning off this “git
> gc” operation.
>
> Well, we have lived for years without running ‘git gc’, so running it
> once per year on average is probably enough to keep the cache size
> reasonable. And git-minimal is changing every month.
>
>
> Maybe, there is some solution #3. ;-)
>
> Cheers,
> simon
>
>
> 1: https://data.guix.gnu.org/repository/1/branch/master/package/git-minimal/output-history
Please don't create another situation like with guix system roll-back,
where a crucial sysadmin operation doesn't work without network access.
Or at least make it configurable, so things that are likely to be needed
for future operations are pre-fetched.
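
For reference, the "~10 times" estimate quoted above can be roughly
checked against the Data Service listing. The sketch below is only a
back-of-the-envelope check: the function name and its parameters are
made up for illustration, the timestamps are the "first seen" column of
the quoted listing (which is truncated), and it assumes every new
git-minimal output seen during the window gets downloaded, which is an
upper bound for someone pulling weekly.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# "First seen" timestamps copied from the quoted Data Service listing.
# The listing ends with "…", so older entries are simply not known here.
FIRST_SEEN = [
    "2023-07-07 15:45:22",
    "2023-05-11 16:10:48",
    "2023-05-01 16:40:08",
    "2023-04-25 13:34:54",
    "2023-04-25 13:34:54",  # appears twice in the listing; deduplicated below
    "2023-03-06 17:22:28",
    "2023-01-17 23:49:19",
    "2022-11-08 13:06:42",
    "2022-10-08 05:14:46",
    "2022-09-06 15:00:08",
    "2022-08-13 22:02:31",
]


def versions_in_window(first_seen, window_end, pulls, days_between_pulls):
    """Count distinct git-minimal outputs current at some point in the window.

    Hypothetical helper: the window covers `pulls` pulls spaced
    `days_between_pulls` days apart and ends at `window_end`.
    """
    window_start = window_end - timedelta(days=pulls * days_between_pulls)
    starts = sorted({datetime.strptime(s, FMT) for s in first_seen})
    # Outputs that first appeared inside the window.
    inside = [s for s in starts if window_start < s <= window_end]
    # Plus the output that was already current when the window opened, if known.
    already_current = any(s <= window_start for s in starts)
    return len(inside) + (1 if already_current else 0)


if __name__ == "__main__":
    end = datetime(2023, 9, 8, 21, 22, 8)  # latest timestamp in the listing
    # 50 weekly pulls between two "git gc" runs, as assumed in the quoted message.
    print(versions_in_window(FIRST_SEEN, end, pulls=50, days_between_pulls=7))
```

With the quoted numbers (50 pulls, one per week) this gives 9 for the
listing, which is in line with the "~10" figure. It says nothing about
the offline case, which is the concern here.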