Hi Mark, Mark H Weaver skribis: >> The risks of bugs I can think of are: missed ASCII references (a >> regression, whereby some ASCII references would not get rewritten), > > This is indeed a risk whenever we modify the grafting code, which is > written for efficiency, not clarity. I've tried to be careful, and have > checked that my newly grafted system and user profiles do not retain > references to ungrafted code, modulo the following pre-existing issues: > > > (ibus-daemon launches ungrafted subprocesses) > > > (Chunked store references in .zo files in Racket 8) OK. >> and unrelated UTF-{16,32}-base32-looking references getting rewritten. >> >> I guess the latter is very unlikely because only sequences found in the >> replacement table may be rewritten, right? > > Yes, that's correct. > >> The former should be caught by ‘tests/grafts.scm’ but we could also >> check the closure of real-world systems, for instance, to make sure it >> doesn’t refer to ungrafted things. > > As I mention above, I've already done so for my own GNOME-based Guix > system. Yes, we should be safe. >> Do you know how this affects performance? > > I have not yet measured it, but subjectively, I do not notice any > obvious difference in speed. > > I expect that the main performance impact is that large blocks of NULs > will be processed more slowly by the current draft version of the new > grafting code. That's because NULs can now be part of a nix hash, and > therefore the new grafting code can only advance 1 byte position when > seeing a NUL, whereas previously it would skip ahead 33 positions in > that case. > > If desired, the handling of NULs could be made more efficient, at the > cost of a bit more complexity. When seeing a NUL, we could check the > adjacent bytes to see if this is part of a nix-base32 character in > UTF-16 or UTF-32 encoding. If not, we could skip ahead. Let’s keep it this way for now; we can always revisit later if we feel the need for it. >> For clarity, perhaps you can define a top-level procedure for the test >> and call it from ‘for-each’. > > Good idea. I'd like to optimize the tests a bit before pushing this. > They take a fairly long time to run, and lead to a huge 1.5G grafts.log > file. It might be sufficient to avoid 'test-equal' here. Ah yes, rather use ‘test-assert’ or some such to avoid the huge log file. :-) > Below, I've attached my current revision of this draft patch, which > incorporates your suggestions regarding the main code. What remains is > to improve the tests. LGTM! Feel free to push this version or an improved one. I think it’s good to have it in the upcoming release, and if it’s pushed sooner, we’ll have more time to react in case something’s wrong. Thank you! Ludo’.