[PATCH] union: Resolve collisions by stable-sort'ing them.

OpenSubmitted by Attila Lendvai.
Details
4 participants
  • Attila Lendvai
  • Liliana Marie Prikler
  • Ludovic Courtès
  • Maxime Devos
Owner
unassigned
Severity
normal
A
A
Attila Lendvai wrote on 28 Sep 2021 23:40
(address . guix-patches@gnu.org)(name . Attila Lendvai)(address . attila@lendvai.name)
20210928214044.437-1-attila@lendvai.name
* guix/build/union.scm (resolve-collision/alphanumeric-last): New function.
(warn-about-collision): Renamed to default-collision-resolver.
---

this should work, but i cannot test it, because srfi-43 seems not to be
available on the build side:

unpacking bootstrap Guile to '/home/alendvai/workspace/guix/guix/test-tmp/store/qky0jf68rr7pnsvmhj0ay42rzh4qk6r9-guile-bootstrap-2.0'...
[...] output without sfri-43.go

and then unsurprisingly: "no code for module (srfi srfi-43)"

is tis only a peculiarity of the test environment?

can you please advise how to proceed?

guix/build/union.scm | 26 ++++++++++++++++++++------
guix/gexp.scm | 2 +-
tests/union.scm | 9 +++++++++
3 files changed, 30 insertions(+), 7 deletions(-)

Toggle diff (88 lines)
diff --git a/guix/build/union.scm b/guix/build/union.scm
index 961ac3298b..747902ec6c 100644
--- a/guix/build/union.scm
+++ b/guix/build/union.scm
@@ -23,11 +23,12 @@
   #:use-module (ice-9 format)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-43)
   #:use-module (rnrs bytevectors)
   #:use-module (rnrs io ports)
   #:export (union-build
 
-            warn-about-collision
+            default-collision-resolver
 
             relative-file-name
             symlink-relative))
@@ -102,10 +103,23 @@ identical, #f otherwise."
   ;; applications via 'glib-or-gtk-build-system'.
   '("icon-theme.cache" "gschemas.compiled"))
 
-(define (warn-about-collision files)
-  "Handle the collision among FILES by emitting a warning and choosing the
-first one of THEM."
-  (let ((file (first files)))
+(define (resolve-collision/alphanumeric-last files)
+  ;; Let's do a stable-sort at least, so that multiple foo-1.2.3/bin/foo
+  ;; variants will predictably resolve to the highest versioned one.
+  (let* ((original-files (list->vector files))
+         (count (vector-length original-files))
+         (stripped-files (vector-map (lambda (_ el)
+                                       (strip-store-file-name el))
+                                     original-files))
+         (indices (vector-unfold values count)))
+    (stable-sort! indices
+                  (lambda (a b)
+                    (string> (vector-ref stripped-files a)
+                             (vector-ref stripped-files b))))
+    (vector-ref original-files (vector-ref indices 0))))
+
+(define (default-collision-resolver files)
+  (let ((file (resolve-collision/alphanumeric-last files)))
     (unless (member (basename file) %harmless-collisions)
       (format (current-error-port)
               "~%warning: collision encountered:~%~{  ~a~%~}"
@@ -117,7 +131,7 @@ first one of THEM."
                       #:key (log-port (current-error-port))
                       (create-all-directories? #f)
                       (symlink symlink)
-                      (resolve-collision warn-about-collision))
+                      (resolve-collision default-collision-resolver))
   "Build in the OUTPUT directory a symlink tree that is the union of all the
 INPUTS, using SYMLINK to create symlinks.  As a special case, if
 CREATE-ALL-DIRECTORIES?, creates the subdirectories in the output directory to
diff --git a/guix/gexp.scm b/guix/gexp.scm
index f3d278b3e6..32e8748443 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -1983,7 +1983,7 @@ This yields an 'etc' directory containing these two files."
 
 (define* (directory-union name things
                           #:key (copy? #f) (quiet? #f)
-                          (resolve-collision 'warn-about-collision))
+                          (resolve-collision 'default-collision-resolver))
   "Return a directory that is the union of THINGS, where THINGS is a list of
 file-like objects denoting directories.  For example:
 
diff --git a/tests/union.scm b/tests/union.scm
index a8387edf42..cbf8840793 100644
--- a/tests/union.scm
+++ b/tests/union.scm
@@ -204,4 +204,13 @@
    ("/a/b" "/a/b/c/d"   => "c/d")
    ("/a/b/c" "/a/d/e/f" => "../../d/e/f")))
 
+(test-assert "resolve-collision/alphanumeric-last sorts alphanumerically"
+  (string=
+   ((@@ (guix build union) resolve-collision/alphanumeric-last)
+     (list "/gnu/store/c0000000000000000000000000000000-idris-0.0.0/bin/idris"
+           "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"
+           "/gnu/store/z0000000000000000000000000000000-idris-1.3.5/bin/idris"
+           "/gnu/store/00000000000000000000000000000000-idris-1.3.3/bin/idris"))
+   "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"))
+
 (test-end)
-- 
2.33.0
M
M
Maxime Devos wrote on 29 Sep 2021 15:48
530e2ad39a865b57f4a0049e862af0d23ac4c592.camel@telenet.be
Attila Lendvai schreef op di 28-09-2021 om 23:40 [+0200]:
Toggle quote (12 lines)
> * guix/build/union.scm (resolve-collision/alphanumeric-last): New function.
> (warn-about-collision): Renamed to default-collision-resolver.
> ---
>
> this should work, but i cannot test it, because srfi-43 seems not to be
> available on the build side:
>
> unpacking bootstrap Guile to '/home/alendvai/workspace/guix/guix/test-tmp/store/qky0jf68rr7pnsvmhj0ay42rzh4qk6r9-guile-bootstrap-2.0'...
> [...] output without sfri-43.go
>
> and then unsurprisingly: "no code for module (srfi srfi-43)"

SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
The bootstrap guile is older:

$(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
guile (GNU Guile) 2.0.9
[...]

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYVRukRccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7poCAQDxE3F4TznHt8S4PHyqKy7V51Zu
gQtLUeR/EZB7Z2gb5QEApzobTsrjrUXE+4/c/ottb9IKTu9xLda0X1gSctJNfQ8=
=xEFr
-----END PGP SIGNATURE-----


A
A
Attila Lendvai wrote on 29 Sep 2021 18:03
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 50878@debbugs.gnu.org)
9gdtvXBJqCfaLCQswTjsN95-pmgfMzqlELlgJEH0o2n_WyWE8LTncnQCIMTEnjFiKO1gKSBMukitcmbQv5FNWRVUc6PLbLBIdQDype7SDLk=@lendvai.name
Toggle quote (7 lines)
> SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
> The bootstrap guile is older:
>
> $(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
>
> guile (GNU Guile) 2.0.9

thank you for the analysis!

is it easy and desirable to upgrade it to 2.0.10 or newer?

shall i try to do it, or advocate for it if it's not trivial, by
e.g. opening an issue?

- attila
PGP: 5D5F 45C7 DFCD 0A39
L
L
Liliana Marie Prikler wrote on 29 Sep 2021 19:42
Re: [PATCH] union: Resolve collisions by stable-sort'ing them.
57f1435fd83da8c0e0acaef64d5f08e4ca7b3404.camel@gmail.com
Hi,

Am Dienstag, den 28.09.2021, 23:40 +0200 schrieb Attila Lendvai:
Toggle quote (42 lines)
> [...]
> index 961ac3298b..747902ec6c 100644
> --- a/guix/build/union.scm
> +++ b/guix/build/union.scm
> @@ -23,11 +23,12 @@
> #:use-module (ice-9 format)
> #:use-module (srfi srfi-1)
> #:use-module (srfi srfi-26)
> + #:use-module (srfi srfi-43)
> #:use-module (rnrs bytevectors)
> #:use-module (rnrs io ports)
> #:export (union-build
>
> - warn-about-collision
> + default-collision-resolver
>
> relative-file-name
> symlink-relative))
> @@ -102,10 +103,23 @@ identical, #f otherwise."
> ;; applications via 'glib-or-gtk-build-system'.
> '("icon-theme.cache" "gschemas.compiled"))
>
> -(define (warn-about-collision files)
> - "Handle the collision among FILES by emitting a warning and
> choosing the
> -first one of THEM."
> - (let ((file (first files)))
> +(define (resolve-collision/alphanumeric-last files)
> + ;; Let's do a stable-sort at least, so that multiple foo-
> 1.2.3/bin/foo
> + ;; variants will predictably resolve to the highest versioned one.
> + (let* ((original-files (list->vector files))
> + (count (vector-length original-files))
> + (stripped-files (vector-map (lambda (_ el)
> + (strip-store-file-name el))
> + original-files))
> + (indices (vector-unfold values count)))
> + (stable-sort! indices
> + (lambda (a b)
> + (string> (vector-ref stripped-files a)
> + (vector-ref stripped-files b))))
> + (vector-ref original-files (vector-ref indices 0))))
Instead of stable-sort!-ing the indices of a vector, what about stable-
sort!-ing (map strip-store-file-name original-files) in more or less
one go?

Toggle quote (17 lines)
> +(define (default-collision-resolver files)
> + (let ((file (resolve-collision/alphanumeric-last files)))
> (unless (member (basename file) %harmless-collisions)
> (format (current-error-port)
> "~%warning: collision encountered:~%~{ ~a~%~}"
> @@ -117,7 +131,7 @@ first one of THEM."
> #:key (log-port (current-error-port))
> (create-all-directories? #f)
> (symlink symlink)
> - (resolve-collision warn-about-collision))
> + (resolve-collision default-collision-
> resolver))
> "Build in the OUTPUT directory a symlink tree that is the union of
> all the
> INPUTS, using SYMLINK to create symlinks. As a special case, if
> CREATE-ALL-DIRECTORIES?, creates the subdirectories in the output
> directory to
I don't think the default collision resolver ought to sort the files.
The rationale behind ignoring certain collisions, e.g. icon caches
relies on the fact that Guix will use the correct files because they
are put first in the manifest. The hooks themselves have no special
names that could put them "always first" and profiles are themselves
union-built.

I do however support the addition of sorting methods as collision
resolvers in general and would welcome a way of doing so for profiles
pre-hook.

Regards,
Liliana
M
M
Maxime Devos wrote on 29 Sep 2021 23:00
Re: [bug#50878] [PATCH] union: Resolve collisions by stable-sort'ing them.
(name . Attila Lendvai)(address . attila@lendvai.name)(address . 50878@debbugs.gnu.org)
297b9f7f0924f14fa52dcd3689b233edd5d45275.camel@telenet.be
Attila Lendvai schreef op wo 29-09-2021 om 16:03 [+0000]:
Toggle quote (11 lines)
> > SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
> > The bootstrap guile is older:
> >
> > $(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
> >
> > guile (GNU Guile) 2.0.9
>
> thank you for the analysis!
>
> is it easy and desirable to upgrade it to 2.0.10 or newer?

It's possible by modifying 'bootstrap-guile-url-path' and
'bootstrap-guile-hash'. Apparently, some architectures already
use a newer guile. E.g., aarch64 has 2.0.14. However, this
probably would entail a world-rebuild I think, so this probably
needs to be done on core-updates.

Why limit to guile 2.0.10, why not go for guile 3.0.7 instead?
I don't kow how one would go about updating the bootstrap binaries though.

Anyway, there have been quite a few bug fixes and new features since
2.0.9, and updating to guile 3.0.? would allow dropping some compatibility
code in various guix/build/, so I wouldn't be opposed to such a change.

Greetings,
Maxime
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYVTT1RccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7pDQAQCiV7aq9RbNZ050FH3VqRfGp2WO
PBE/9H9es+hsEGXYmAD+Kz7uoug5EwS84emjdNlrbiS1ePi7/engCydav8gNEg4=
=S5ee
-----END PGP SIGNATURE-----


A
A
Attila Lendvai wrote on 30 Sep 2021 10:10
Re: [PATCH] union: Resolve collisions by stable-sort'ing them.
(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)(address . 50878@debbugs.gnu.org)
x6K1iTIy5vDsT_3R2yNaMF5r9W8CbBNsIhom3XkIG8ew8gRzVb97qmLkUdRa20lo9uQq_iGIN5jscojKa1KZPMFLTG952268MSVgdtqYA8U=@lendvai.name
Toggle quote (18 lines)
> > - (let* ((original-files (list->vector files))
> > - (count (vector-length original-files))
> > - (stripped-files (vector-map (lambda (_ el)
> > - (strip-store-file-name el))
> > - original-files))
> > - (indices (vector-unfold values count)))
> >
> > - (stable-sort! indices
> > - (lambda (a b)
> > - (string> (vector-ref stripped-files a)
> > - (vector-ref stripped-files b))))
> > - (vector-ref original-files (vector-ref indices 0))))
>
> Instead of stable-sort!-ing the indices of a vector, what about stable-
> sort!-ing (map strip-store-file-name original-files) in more or less
> one go?


the hash also needs to be dropped from the path for sorting to be
useful, but the return value must be the full path, hence the
complexity with sorting the indices, pointing both to the full paths
and the cut parts.


Toggle quote (13 lines)
> I don't think the default collision resolver ought to sort the files.
>
> The rationale behind ignoring certain collisions, e.g. icon caches
> relies on the fact that Guix will use the correct files because they
> are put first in the manifest. The hooks themselves have no special
> names that could put them "always first" and profiles are themselves
> union-built.
>
> I do however support the addition of sorting methods as collision
> resolvers in general and would welcome a way of doing so for profiles
> pre-hook.


please note that i almost completely lack the knowledge of the
relevant internals.

with that in mind, do i read you right? this should add a new exported
resolver funtion, and leave the default one as it was?

and use the new resolver somewhere (where?) that will affect only the
union of profiles, i.e. user visible effecs when installing software?
(because DIRECTORY-UNION is called in other contexts also?)

thanks for the insights! i'd appreciate a bit more higher level
guidance, and then i'll resend the patch accordingly.

- attila
PGP: 5D5F 45C7 DFCD 0A39
M
M
Maxime Devos wrote on 30 Sep 2021 10:42
Re: [bug#50878] [PATCH] union: Resolve collisions by stable-sort'ing them.
(address . 50878@debbugs.gnu.org)
9ab183637e8cd645267a37ba0f05319f8b3c72ff.camel@telenet.be
Attila Lendvai schreef op do 30-09-2021 om 08:10 [+0000]:
Toggle quote (22 lines)
> > > - (let* ((original-files (list->vector files))
> > > - (count (vector-length original-files))
> > > - (stripped-files (vector-map (lambda (_ el)
> > > - (strip-store-file-name el))
> > > - original-files))
> > > - (indices (vector-unfold values count)))
> > >
> > > - (stable-sort! indices
> > > - (lambda (a b)
> > > - (string> (vector-ref stripped-files a)
> > > - (vector-ref stripped-files b))))
> > > - (vector-ref original-files (vector-ref indices 0))))
> >
> > Instead of stable-sort!-ing the indices of a vector, what about stable-
> > sort!-ing (map strip-store-file-name original-files) in more or less
> > one go?
>
> the hash also needs to be dropped from the path for sorting to be
> useful, but the return value must be the full path, hence the
> complexity with sorting the indices, pointing both to the full paths
> and the cut parts.

You can replace the 'less' argument of 'stable-sort'.
Example sorting by the second character of a string:

(sort '("za" "yb" "xc") (lambda (x y)
(char>? (string-ref x 1)
(string-ref y 1)))))

IIUC, you would need to replace char>? by string> and string-ref
by strip-store-file-name.

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYVV4ZRccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7vNkAQDAgk7tcyMnnpquDZroTaTWfQMD
JhFccN39qFLYapM/UAD+MMzwqIvnbNtf/Qq88FMnBVURdP/RP3pIvMBpx77xmwE=
=YJvu
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 30 Sep 2021 16:00
Re: bug#50878: [PATCH] union: Resolve collisions by stable-sort'ing them.
(name . Maxime Devos)(address . maximedevos@telenet.be)
87czoq40zj.fsf_-_@gnu.org
Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

Toggle quote (33 lines)
> Attila Lendvai schreef op do 30-09-2021 om 08:10 [+0000]:
>> > > - (let* ((original-files (list->vector files))
>> > > - (count (vector-length original-files))
>> > > - (stripped-files (vector-map (lambda (_ el)
>> > > - (strip-store-file-name el))
>> > > - original-files))
>> > > - (indices (vector-unfold values count)))
>> > >
>> > > - (stable-sort! indices
>> > > - (lambda (a b)
>> > > - (string> (vector-ref stripped-files a)
>> > > - (vector-ref stripped-files b))))
>> > > - (vector-ref original-files (vector-ref indices 0))))
>> >
>> > Instead of stable-sort!-ing the indices of a vector, what about stable-
>> > sort!-ing (map strip-store-file-name original-files) in more or less
>> > one go?
>>
>> the hash also needs to be dropped from the path for sorting to be
>> useful, but the return value must be the full path, hence the
>> complexity with sorting the indices, pointing both to the full paths
>> and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.
> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
> (char>? (string-ref x 1)
> (string-ref y 1)))))
>
> IIUC, you would need to replace char>? by string> and string-ref
> by strip-store-file-name.

Agreed. I’d advice using this strategy rather than resorting to
SRFI-43; it should have the desired effect.

BTW, because TeX Live packages rely on ‘union-build’, this patch
triggers a lot of rebuilds, but we can try to squeeze it in the upcoming
‘core-updates-frozen’ rebuild.

Thanks,
Ludo’.
A
A
Attila Lendvai wrote on 30 Sep 2021 16:12
Re: [bug#50878] [PATCH] union: Resolve collisions by stable-sort'ing them.
(name . Maxime Devos)(address . maximedevos@telenet.be)
RG7heiLAiSf93kpl3Y4Iml-x5Ons2p1WfXXD5ZkTkaD_wpkBOmzNnS21fmUIT_MBCZxRilDPOUEE58qHChZKE_1_HmtMZFKHKzxN7-CrJmw=@lendvai.name
Toggle quote (13 lines)
> > the hash also needs to be dropped from the path for sorting to be
> > useful, but the return value must be the full path, hence the
> > complexity with sorting the indices, pointing both to the full paths
> > and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.

> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
> (char>? (string-ref x 1)
> (string-ref y 1)))))

i don't know about the expected size of the collision list here, but
that would cons much more, because that would cons up two substrings
at each comparison, i.e. O(n^2) vs O(n) at least in GC load, probably
in time also.

i think it's not worth it, but let me know, and then i can simplify
the code somewhat at the cost of more consing.

sorting lists is probably also much slower than sorting vectors, but
there may be some tricks i don't know about.

random note: civodul said on IRC that there's a certain reluctancy to
update the opaque binary blobs, and the bootstrap guile will be
replaced by mes anyway.

so, if we want to have this merged, then we will need to give up on
testing it until mes becomes the bootstrap scheme (and assuming it
will have srfi vectors).

in the lights of the above, i think i'll stop pursuing this patch, but
i'd be happy to implement something if you can give me highlevel
guidance.

with the above pointed out, are you still happy with the list sorting?

i'd love to have idris packaged so that it has bin/idris symlinked to
bin/idris-1.2.3, and installing multiple versions of it would pick the
newest one for bin/idris.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Have the courage to take your own thoughts seriously, for they will shape you.”
— Albert Einstein (1879–1955)
M
M
Maxime Devos wrote on 30 Sep 2021 17:18
(name . Attila Lendvai)(address . attila@lendvai.name)
57b644a0b8accceed5d390cd82b2f69d427c464d.camel@telenet.be
Attila Lendvai schreef op do 30-09-2021 om 14:12 [+0000]:
Toggle quote (17 lines)
> > > the hash also needs to be dropped from the path for sorting to be
> > > useful, but the return value must be the full path, hence the
> > > complexity with sorting the indices, pointing both to the full paths
> > > and the cut parts.
> >
> > You can replace the 'less' argument of 'stable-sort'.
> > Example sorting by the second character of a string:
> >
> > (sort '("za" "yb" "xc") (lambda (x y)
> > (char>? (string-ref x 1)
> > (string-ref y 1)))))
>
> i don't know about the expected size of the collision list here, but
> that would cons much more, because that would cons up two substrings
> at each comparison, i.e. O(n^2) vs O(n) at least in GC load, probably
> in time also.

I don't see any consing here, or any increase in complexity?

Toggle quote (6 lines)
> i think it's not worth it, but let me know, and then i can simplify
> the code somewhat at the cost of more consing.
>
> sorting lists is probably also much slower than sorting vectors, but
> there may be some tricks i don't know about.

I was suggesting replacing the 'less' procedure with a procedure
like (lambda (x y) (char>? (string-ref x 1) (string-ref y 1)))
instead of doing the sorting in two steps.

I wasn't suggesting using lists. The '("za" "yb" "xc") was only
for demonstration, a vector should work as well.

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYVXVKRccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7rNpAPkBGWGqP2ymoSNiKlKJoulyD6Xg
Q7oOtq5wD8Vl66WSBgD+Nv17xzGCSTdKpY0nRg+mxye2gatFPRikpgIwMdAZUg0=
=8wo9
-----END PGP SIGNATURE-----


L
L
Liliana Marie Prikler wrote on 30 Sep 2021 20:13
(address . 50878@debbugs.gnu.org)
ca072b80416c063619301d35283b47aba05be170.camel@gmail.com
Am Donnerstag, den 30.09.2021, 10:42 +0200 schrieb Maxime Devos:
Toggle quote (37 lines)
> Attila Lendvai schreef op do 30-09-2021 om 08:10 [+0000]:
> > > > - (let* ((original-files (list->vector files))
> > > > - (count (vector-length original-files))
> > > > - (stripped-files (vector-map (lambda (_ el)
> > > > - (strip-store-file-
> > > > name el))
> > > > - original-files))
> > > > - (indices (vector-unfold values count)))
> > > >
> > > > - (stable-sort! indices
> > > > - (lambda (a b)
> > > > - (string> (vector-ref stripped-files a)
> > > > - (vector-ref stripped-files
> > > > b))))
> > > > - (vector-ref original-files (vector-ref indices 0))))
> > >
> > > Instead of stable-sort!-ing the indices of a vector, what about
> > > stable-
> > > sort!-ing (map strip-store-file-name original-files) in more or
> > > less
> > > one go?
> >
> > the hash also needs to be dropped from the path for sorting to be
> > useful, but the return value must be the full path, hence the
> > complexity with sorting the indices, pointing both to the full
> > paths
> > and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.
> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
> (char>? (string-ref x 1)
> (string-ref y 1)))))
>
> IIUC, you would need to replace char>? by string> and string-ref
> by strip-store-file-name.
You could also store a mapping of long file name to stripped file name
in a hash table for fast lookup, so as to not compute (strip-store-
file-name) over and over. That way you can stable-sort! the long
names, but don't forget to list-copy them at least for functional
purity.

Greetings,
Liliana
L
L
Liliana Marie Prikler wrote on 30 Sep 2021 20:52
Re: [PATCH] union: Resolve collisions by stable-sort'ing them.
(name . Attila Lendvai)(address . attila@lendvai.name)(address . 50878@debbugs.gnu.org)
93c0bb40dcf98c72b896b837b8be6b06f71161bb.camel@gmail.com
Hi Attila

Am Donnerstag, den 30.09.2021, 08:10 +0000 schrieb Attila Lendvai:
Toggle quote (22 lines)
> > > - (let* ((original-files (list->vector files))
> > > - (count (vector-length original-files))
> > > - (stripped-files (vector-map (lambda (_ el)
> > > - (strip-store-file-name
> > > el))
> > > - original-files))
> > > - (indices (vector-unfold values count)))
> > >
> > > - (stable-sort! indices
> > > - (lambda (a b)
> > > - (string> (vector-ref stripped-files a)
> > > - (vector-ref stripped-files b))))
> > > - (vector-ref original-files (vector-ref indices 0))))
> >
> > Instead of stable-sort!-ing the indices of a vector, what about
> > stable-sort!-ing (map strip-store-file-name original-files) in more
> > or less one go?
>
> the hash also needs to be dropped from the path for sorting to be
> useful, but the return value must be the full path, hence the
> complexity with sorting the indices, pointing both to the full paths
> and the cut parts.
This thing has received some replies already to which I already dropped
a comment, so I'll try not to repeat myself here. If something is
still unclear, go down the sub-thread started by Maxime.

Toggle quote (19 lines)
> > I don't think the default collision resolver ought to sort the
> > files.
> >
> > The rationale behind ignoring certain collisions, e.g. icon caches
> > relies on the fact that Guix will use the correct files because
> > they are put first in the manifest. The hooks themselves have no
> > special names that could put them "always first" and profiles are
> > themselves union-built.
> >
> > I do however support the addition of sorting methods as collision
> > resolvers in general and would welcome a way of doing so for
> > profiles
> > pre-hook.
>
> please note that i almost completely lack the knowledge of the
> relevant internals.
>
> with that in mind, do i read you right? this should add a new
> exported resolver funtion, and leave the default one as it was?
Yes, I think that is safer than actually overriding the default for now
at least.

Toggle quote (6 lines)
> and use the new resolver somewhere (where?) that will affect only the
> union of profiles, i.e. user visible effecs when installing software?
> (because DIRECTORY-UNION is called in other contexts also?)
>
> thanks for the insights! i'd appreciate a bit more higher level
> guidance, and then i'll resend the patch accordingly.
I must admit, I'm at a loss of knowledge myself here. Perhaps someone
else can jive in and tell you which union-build is done without hooks,
but I fear there might also be none. For now, simply having the
resolver is in my opinion enough if we don't find the right location as
well. We can default it later with "first-come-first-serve" exceptions
where needed.

Cheers
A
A
Attila Lendvai wrote on 3 Oct 2021 14:43
[PATCH 1/4] guix: build: Promote local define-inline to a define-constant util.
(address . 50878@debbugs.gnu.org)(name . Attila Lendvai)(address . attila@lendvai.name)
20211003124303.8277-1-attila@lendvai.name
* guix/build/utils.scm: Moved/renamed define-inline from grafts.scm to an
exported define-constant util.
---
guix/build/graft.scm | 5 +----
guix/build/utils.scm | 12 ++++++++++++
2 files changed, 13 insertions(+), 4 deletions(-)

Toggle diff (48 lines)
diff --git a/guix/build/graft.scm b/guix/build/graft.scm
index f04c35fa74..daac958d4f 100644
--- a/guix/build/graft.scm
+++ b/guix/build/graft.scm
@@ -44,10 +44,7 @@
 ;;;
 ;;; Code:
 
-(define-syntax-rule (define-inline name val)
-  (define-syntax name (identifier-syntax val)))
-
-(define-inline hash-length 32)
+(define-constant hash-length 32)
 
 (define nix-base32-char?
   (cute char-set-contains?
diff --git a/guix/build/utils.scm b/guix/build/utils.scm
index 419c10195b..f3d913aa70 100644
--- a/guix/build/utils.scm
+++ b/guix/build/utils.scm
@@ -73,6 +73,8 @@
             list->search-path-as-string
             which
 
+            define-constant
+
             every*
             alist-cons-before
             alist-cons-after
@@ -112,6 +114,16 @@
 
             locale-category->string))
 
+
+;;;
+;;; Syntax
+;;;
+
+;; Note that in its current form VAL doesn't get evaluated, just simply
+;; inlined. TODO?
+(define-syntax-rule (define-constant name val)
+  (define-syntax name (identifier-syntax val)))
+
 
 ;;;
 ;;; Guile 2.0 compatibility later.
-- 
2.33.0
A
A
Attila Lendvai wrote on 3 Oct 2021 14:43
[PATCH 2/4] guix: build: Avoid using magic literals in the code for hash length.
(address . 50878@debbugs.gnu.org)(name . Attila Lendvai)(address . attila@lendvai.name)
20211003124303.8277-2-attila@lendvai.name
* guix/build/utils.scm (%store-hash-string-length): New constant.
(store-path-prefix-length): Factor out the calculation of the total store
prefix length.
* guix/build/graft.scm (hash-length): Use it.
---
guix/build/graft.scm | 2 +-
guix/build/utils.scm | 11 +++++++++--
2 files changed, 10 insertions(+), 3 deletions(-)

Toggle diff (51 lines)
diff --git a/guix/build/graft.scm b/guix/build/graft.scm
index daac958d4f..281dbaba6f 100644
--- a/guix/build/graft.scm
+++ b/guix/build/graft.scm
@@ -44,7 +44,7 @@
 ;;;
 ;;; Code:
 
-(define-constant hash-length 32)
+(define-constant hash-length %store-hash-string-length)
 
 (define nix-base32-char?
   (cute char-set-contains?
diff --git a/guix/build/utils.scm b/guix/build/utils.scm
index f3d913aa70..4009c137b8 100644
--- a/guix/build/utils.scm
+++ b/guix/build/utils.scm
@@ -44,6 +44,7 @@
                ;; <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=26805#16>.
                delete)
   #:export (%store-directory
+            %store-hash-string-length
             store-file-name?
             strip-store-file-name
             package-name->name+version
@@ -154,15 +155,21 @@
   (or (getenv "NIX_STORE")
       "/gnu/store"))
 
+(define-constant %store-hash-string-length 32)
+
 (define (store-file-name? file)
   "Return true if FILE is in the store."
   (string-prefix? (%store-directory) file))
 
+(define (store-path-prefix-length)
+  (+ 2         ; the slash after %store-directory, and the dash after the hash
+     (string-length (%store-directory))
+     %store-hash-string-length))
+
 (define (strip-store-file-name file)
   "Strip the '/gnu/store' and hash from FILE, a store file name.  The result
 is typically a \"PACKAGE-VERSION\" string."
-  (string-drop file
-               (+ 34 (string-length (%store-directory)))))
+  (string-drop file (store-path-prefix-length)))
 
 (define (package-name->name+version name)
   "Given NAME, a package name like \"foo-0.9.1b\", return two values:
-- 
2.33.0
A
A
Attila Lendvai wrote on 3 Oct 2021 14:43
[PATCH 3/4] guix: build: Factor out and export default-collision-resolver.
(address . 50878@debbugs.gnu.org)(name . Attila Lendvai)(address . attila@lendvai.name)
20211003124303.8277-3-attila@lendvai.name
This prepares the stage for new collision resolvers, but no semantic changes.

* guix/build/union.scm (resolve-collision/pick-first): New function.
(resolve-collision-and-maybe-warn): New function.
(default-collision-resolver): New function.
---
guix/build/union.scm | 16 ++++++++++------
guix/gexp.scm | 2 +-
2 files changed, 11 insertions(+), 7 deletions(-)

Toggle diff (60 lines)
diff --git a/guix/build/union.scm b/guix/build/union.scm
index 961ac3298b..9e8c2af4f5 100644
--- a/guix/build/union.scm
+++ b/guix/build/union.scm
@@ -27,7 +27,7 @@
   #:use-module (rnrs io ports)
   #:export (union-build
 
-            warn-about-collision
+            default-collision-resolver
 
             relative-file-name
             symlink-relative))
@@ -102,10 +102,11 @@ identical, #f otherwise."
   ;; applications via 'glib-or-gtk-build-system'.
   '("icon-theme.cache" "gschemas.compiled"))
 
-(define (warn-about-collision files)
-  "Handle the collision among FILES by emitting a warning and choosing the
-first one of THEM."
-  (let ((file (first files)))
+(define (resolve-collision/pick-first files)
+  (first files))
+
+(define (resolve-collision-and-maybe-warn files resolver)
+  (let ((file (resolver files)))
     (unless (member (basename file) %harmless-collisions)
       (format (current-error-port)
               "~%warning: collision encountered:~%~{  ~a~%~}"
@@ -113,11 +114,14 @@ first one of THEM."
       (format (current-error-port) "warning: choosing ~a~%" file))
     file))
 
+(define (default-collision-resolver files)
+  (resolve-collision-and-maybe-warn files resolve-collision/pick-first))
+
 (define* (union-build output inputs
                       #:key (log-port (current-error-port))
                       (create-all-directories? #f)
                       (symlink symlink)
-                      (resolve-collision warn-about-collision))
+                      (resolve-collision default-collision-resolver))
   "Build in the OUTPUT directory a symlink tree that is the union of all the
 INPUTS, using SYMLINK to create symlinks.  As a special case, if
 CREATE-ALL-DIRECTORIES?, creates the subdirectories in the output directory to
diff --git a/guix/gexp.scm b/guix/gexp.scm
index f3d278b3e6..32e8748443 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -1983,7 +1983,7 @@ This yields an 'etc' directory containing these two files."
 
 (define* (directory-union name things
                           #:key (copy? #f) (quiet? #f)
-                          (resolve-collision 'warn-about-collision))
+                          (resolve-collision 'default-collision-resolver))
   "Return a directory that is the union of THINGS, where THINGS is a list of
 file-like objects denoting directories.  For example:
 
-- 
2.33.0
A
A
Attila Lendvai wrote on 3 Oct 2021 14:43
[PATCH 4/4] WIP guix: build: Add resolve-collision/alphanumeric-last for union.
(address . 50878@debbugs.gnu.org)(name . Attila Lendvai)(address . attila@lendvai.name)
20211003124303.8277-4-attila@lendvai.name
It is currently not used anywhere, only exported. The tests are boken, because
guile is too old in the test environment, at least on 'x86_64-linux' (guile
2.0.9 doesn't have srfi-43, aka vectors). Probably it's also broken because
testing errors with `no code for module (guix build utils)`.

* guix/build/union.scm (resolve-collision/alphanumeric-last): New function.
* guix/build/utils.scm (compare-strings-ignoring-store-path-prefix): New function.
---

I think the previous 3 patches in this patchset are worthy of inclusion,
but this one is more of a good idea than a worked out change, to be picked
up later, if at all.

The primary issue is that the test framework uses a guile that is too old,
but it's also not used anywhere. It would be nice if this was used for
resolving conflicts for profiles, i.e. for the user's bin/ directory.

guix/build/union.scm | 12 ++++++++++++
guix/build/utils.scm | 27 +++++++++++++++++++++++++++
tests/union.scm | 9 +++++++++
3 files changed, 48 insertions(+)

Toggle diff (104 lines)
diff --git a/guix/build/union.scm b/guix/build/union.scm
index 9e8c2af4f5..339af7576c 100644
--- a/guix/build/union.scm
+++ b/guix/build/union.scm
@@ -19,15 +19,18 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix build union)
+  #:use-module (guix build utils)
   #:use-module (ice-9 match)
   #:use-module (ice-9 format)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-43)
   #:use-module (rnrs bytevectors)
   #:use-module (rnrs io ports)
   #:export (union-build
 
             default-collision-resolver
+            resolve-collision/alphanumeric-last
 
             relative-file-name
             symlink-relative))
@@ -102,6 +105,15 @@ identical, #f otherwise."
   ;; applications via 'glib-or-gtk-build-system'.
   '("icon-theme.cache" "gschemas.compiled"))
 
+(define (resolve-collision/alphanumeric-last files)
+  ;; Let's do a stable-sort, so that multiple foo-1.2.3/bin/foo variants will
+  ;; predictably resolve to the highest versioned one.
+  (let ((files-vector (list->vector files)))
+    (stable-sort! files-vector
+                  (lambda (a b)
+                    (> 0 (compare-strings-ignoring-store-path-prefix a b))))
+    (vector-ref files-vector 0)))
+
 (define (resolve-collision/pick-first files)
   (first files))
 
diff --git a/guix/build/utils.scm b/guix/build/utils.scm
index 4009c137b8..1ae0244b04 100644
--- a/guix/build/utils.scm
+++ b/guix/build/utils.scm
@@ -47,6 +47,7 @@
             %store-hash-string-length
             store-file-name?
             strip-store-file-name
+            compare-strings-ignoring-store-path-prefix
             package-name->name+version
             parallel-job-count
 
@@ -171,6 +172,32 @@
 is typically a \"PACKAGE-VERSION\" string."
   (string-drop file (store-path-prefix-length)))
 
+(define (compare-strings-ignoring-store-path-prefix a b)
+  (let ((a-length (string-length a))
+        (b-length (string-length b)))
+    (do ((i (store-path-prefix-length) (+ i 1)))
+        ((not (and (< i a-length)
+                   (< i b-length)
+                   (char=? (string-ref a i)
+                           (string-ref b i))))
+         (cond
+          ((= a-length b-length)
+           (if (= i a-length)      ; we reached the end without any difference
+               0
+               (- (char->integer (string-ref a i))
+                  (char->integer (string-ref b i)))))
+          ((> a-length b-length)
+           (if (= i b-length)   ; we reached the end of B without a difference
+               1
+               (- (char->integer (string-ref a i))
+                  (char->integer (string-ref b i)))))
+          (else                 ; i.e. (< a-length b-length)
+           (if (= i a-length)   ; we reached the end of A without a difference
+               -1
+               (- (char->integer (string-ref a i))
+                  (char->integer (string-ref b i)))))))
+      '())))
+
 (define (package-name->name+version name)
   "Given NAME, a package name like \"foo-0.9.1b\", return two values:
 \"foo\" and \"0.9.1b\".  When the version part is unavailable, NAME and
diff --git a/tests/union.scm b/tests/union.scm
index a8387edf42..cbf8840793 100644
--- a/tests/union.scm
+++ b/tests/union.scm
@@ -204,4 +204,13 @@
    ("/a/b" "/a/b/c/d"   => "c/d")
    ("/a/b/c" "/a/d/e/f" => "../../d/e/f")))
 
+(test-assert "resolve-collision/alphanumeric-last sorts alphanumerically"
+  (string=
+   ((@@ (guix build union) resolve-collision/alphanumeric-last)
+     (list "/gnu/store/c0000000000000000000000000000000-idris-0.0.0/bin/idris"
+           "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"
+           "/gnu/store/z0000000000000000000000000000000-idris-1.3.5/bin/idris"
+           "/gnu/store/00000000000000000000000000000000-idris-1.3.3/bin/idris"))
+   "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"))
+
 (test-end)
-- 
2.33.0
A
A
Attila Lendvai wrote on 3 Oct 2021 14:59
(No Subject)
(name . 50878@debbugs.gnu.org)(address . 50878@debbugs.gnu.org)
4ULxo0ijtifihUKvZWxBWjvn5e59PpVKYoAa5-rYM5yw2zswQAeUhdrkWoTfG1k3PP5gPjVulxe4xTTGy7XgK_Mc--PMemsFhHzvh25cf4w=@lendvai.name
sorry for being slow with understanding the suggested solution. i have automatically dismissed everything with leaky abstractions, and that made me blind to it.

i didn't realize that i could either introduce constants for the hash length, and the store path, or straight out export a special comparator function to be used.

either way, i have organized the patches so that the first 3 are useful, and the 4th one is more of a demo/inspiration for prosperity.

apply 1-3 as you see fit, and ignore or finish the 4th.

- attila
PGP: 5D5F 45C7 DFCD 0A39
Attachment: file
?