Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Hi,
>
> Christopher Baines <mail@cbaines.net> writes:
>
>> Christopher Baines <mail@cbaines.net> writes:
>>
>>> I'm also really confused by what commits appear to be on the branch,
>>> take 12b15585a75062f3fba09d82861c6fae9a7743b2 which appears to be on
>>> core-updates, but it's a duplicate of
>>> e2a7c227dea5b361e2ebdbba24b923d1922a79d0 which was pushed to
>>> master. Same with this commit 28d14130953d868d4848540d9de8e1ae4a01a467,
>>> which is different to f29f80c194d0c534a92354b2bc19022a9b70ecf8 on
>>> master.
>>
>> I've worked out at least when these two weird commits turned up on
>> core-updates.
>>
>> 12b15585a7 is mentioned here:
>> https://lists.gnu.org/archive/html/guix-commits/2023-09/msg00955.html
>>
>> and 28d1413095 is mentioned here:
>> https://lists.gnu.org/archive/html/guix-commits/2024-03/msg00381.html
>>
>>
>> With the changes last month in March, I was going to suggest deleting
>> the branch and then re-creating from f205179ed2 and trying to re-apply
>> the changes that should be on core-updates, while avoiding any
>> "duplicate" commits. However, I'm not even sure where to begin with the
>> ~5000 commits pushed in September, at least one of them is a duplicate
>> of a commit on master, but I'm not sure how many of the other ~5000 are.
>>
>> For comparison, I did a merge of master in to core-updates today, and
>> this is what it shows up like on guix-commits:
>>
>> https://lists.gnu.org/archive/html/guix-commits/2024-04/msg01209.html
>>
>> There are only two new revisions, the ed update I pushed, and the merge
>> commit, which is what a merge should look like as far as I'm aware.
>
> I think probably what happened is that in the middle of a merge of
> master -> core-updates (which sometimes entails painful conflict
> resolution), a new commit was pushed to core-updates, and to be able to
> push, the resulting local branch (including the thousands of commits
> from the merge commit) got rebased on the remote core-updates.
>
> Perhaps another merge commit appeared on the remote around the same
> time, which would explain the duplicates.
>
> While I agree it's messy to have 5000 duplicated commits, I'm not
> sure attempting to rewrite the branch, which has seen a lot of original
> commits, is a good idea (it'd be easy to have some good commits fall
> through the cracks, leading to lost work).

I think it's important to weigh up the cost and risks associated with
either merging these commits, or somehow avoiding doing so. I think the
potential impact is more than just a bit of messy Git history.

Assuming we merge core-updates without doing anything about these
duplicate commits, and taking the cwltool package as a semi-random
example, if you do:

  git log -p gnu/packages/bioinformatics.scm

You're going to see two commits for the update to 3.1.20240112164112.
That's maybe confusing, but not a big issue I guess, since they look the
same and just have different hashes.
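
To get a rough count of such pairs, here is a minimal Python sketch. It
is an illustration only, not something run against the Guix repository.
It assumes its input is the output of "git patch-id --stable", which
prints one "<patch-id> <commit-id>" pair per line; the function name
group_by_patch_id is made up for this example. It only catches commits
whose diffs are identical, so a duplicate that was altered during
conflict resolution would not be found.

```python
from collections import defaultdict


def group_by_patch_id(patch_id_output: str) -> dict[str, list[str]]:
    """Return patch IDs shared by more than one commit.

    `patch_id_output` is assumed to hold lines of the form
    "<patch-id> <commit-id>", as printed by `git patch-id`.
    """
    groups = defaultdict(list)
    for line in patch_id_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        patch_id, commit = fields
        groups[patch_id].append(commit)
    # Keep only the patch IDs that more than one commit maps to.
    return {pid: commits for pid, commits in groups.items() if len(commits) > 1}
```

The input could come from something like
"git log -p master core-updates -- gnu/packages/bioinformatics.scm | git patch-id --stable",
in which case each returned list holds commits that apply the same
change to that file under different hashes.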

But say you're looking at the Git history because you want that specific
version of cwltool and you're going to use guix time-machine or an
inferior looking at that revision. Well, it's a lucky dip. If you pick
the original master commit, you're in luck, you'll probably get
substitutes for cwltool. But if you pick the other seemingly identical
commit, you're effectively checking out core-updates as it was last
month, and substitutes are much less likely to be available. I also
can't really think how you'd work out which commit is best to use once
core-updates is merged. The easiest way would probably be to check the
signature, but that will only work most of the time.
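
One other possible heuristic, sketched here in Python and untested
against the real branches, is to check whether a candidate commit sits
on master's first-parent chain. This assumes core-updates ends up merged
into master with a merge commit whose first parent is the old master, so
that commits originally pushed to master are on that chain and the
duplicates that came in via core-updates are not. The function name
split_by_first_parent is hypothetical, and being on the chain still
doesn't guarantee substitutes.

```python
def split_by_first_parent(candidates, first_parent_commits):
    """Partition candidate commits by membership of a first-parent chain.

    `first_parent_commits` is assumed to be the full commit IDs printed by
    `git rev-list --first-parent master`; `candidates` must use the same
    full-length IDs. Returns (on_chain, off_chain), preserving input order.
    """
    chain = set(first_parent_commits)
    on_chain = [c for c in candidates if c in chain]
    off_chain = [c for c in candidates if c not in chain]
    return on_chain, off_chain
```

Given two commits that look identical in the log, the one returned in
on_chain would be the more likely of the two to have been built as part
of master.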

This isn't a new issue, it's already problematic for substitute
availability to use intermediate commits (commits that weren't directly
pointed to by master). But there are over 1000 packages whose versions
are being changed on core-updates currently, or at least it looks like
this because of the duplicate commits. If I'm correct about how people
are using the Git history to find commits for specific versions of
packages, then having these duplicates in the Git history for master
forever is going to catch people out for as long as those versions
remain relevant.