From debbugs-submit-bounces@debbugs.gnu.org Thu Apr 08 06:13:50 2021 Received: (at 33848) by debbugs.gnu.org; 8 Apr 2021 10:13:50 +0000 Received: from localhost ([127.0.0.1]:46041 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lURfa-00005I-4u for submit@debbugs.gnu.org; Thu, 08 Apr 2021 06:13:50 -0400 Received: from eggs.gnu.org ([209.51.188.92]:46538) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lURfY-00004y-9z for 33848@debbugs.gnu.org; Thu, 08 Apr 2021 06:13:49 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:37374) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lURfP-00005Z-UQ; Thu, 08 Apr 2021 06:13:40 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=58442 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1lURfP-00057c-Gj; Thu, 08 Apr 2021 06:13:39 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Mark H Weaver Subject: Re: bug#33848: Store references in SBCL-compiled code are "invisible" References: <87r2e8jpfx.fsf@gnu.org> <87o9979gfn.fsf@gnu.org> <87tvizgghs.fsf@ambrevar.xyz> <87k1juaomo.fsf@gnu.org> <87muoqhk62.fsf@ambrevar.xyz> <87zhsq8wkj.fsf@gnu.org> <87d0pmhbgn.fsf@ambrevar.xyz> <87r2e28tkv.fsf@gnu.org> <874laygkiy.fsf@ambrevar.xyz> <87lfa5eymf.fsf@ambrevar.xyz> <87tuoscsk9.fsf@gnu.org> <87im57b8u7.fsf@ambrevar.xyz> <87czvebky2.fsf@netris.org> <87eefu30a4.fsf@gnu.org> <87im56l6es.fsf@yamatai> <87wntm8j18.fsf@ambrevar.xyz> <87a6qil4b1.fsf@yamatai> <87a6qiz5b3.fsf@ambrevar.xyz> <871rbtc3j5.fsf@netris.org> <87r1js9udv.fsf@netris.org> <87h7kj7j7x.fsf@netris.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 19 Germinal an 229 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Thu, 08 Apr 2021 12:13:37 +0200 In-Reply-To: <87h7kj7j7x.fsf@netris.org> (Mark H. Weaver's message of "Tue, 06 Apr 2021 07:19:51 -0400") Message-ID: <87ft01axta.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 33848 Cc: Guillaume Le Vaillant , Pierre Neidhardt , 33848@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) Hi Mark, Mark H Weaver skribis: > From 6eec36e66d20d82fe02c6de793422875477b90d8 Mon Sep 17 00:00:00 2001 > From: Mark H Weaver > Date: Fri, 2 Apr 2021 18:36:23 -0400 > Subject: [PATCH] DRAFT: grafts: Support rewriting UTF-16 and UTF-32 store > references. > > * guix/build/graft.scm (replace-store-references): Add support for > finding and rewriting UTF-16 and UTF-32 store references. > * tests/grafts.scm: Add tests. Please add a =E2=80=9CFixes=E2=80=9D line in the commit log. I=E2=80=99m not reviewing the code in depth and I trust your judgment. The risks of bugs I can think of are: missed ASCII references (a regression, whereby some ASCII references would not get rewritten), and unrelated UTF-{16,32}-base32-looking references getting rewritten. I guess the latter is very unlikely because only sequences found in the replacement table may be rewritten, right? The former should be caught by =E2=80=98tests/grafts.scm=E2=80=99 but we co= uld also check the closure of real-world systems, for instance, to make sure it doesn=E2=80=99t refer to ungrafted things. Do you know how this affects performance? Some superficial comments: > +(define (possible-utf16-hash? buffer i w) > + (and (<=3D (* 2 hash-length) (- i w)) > + (let loop ((j (+ 1 (- i (* 2 hash-length))))) > + (or (>=3D j i) > + (and (zero? (bytevector-u8-ref buffer j)) > + (loop (+ j 2))))))) > + > +(define (possible-utf32-hash? buffer i w) > + (and (<=3D (* 4 hash-length) (- i w)) > + (let loop ((j (+ 1 (- i (* 4 hash-length))))) > + (or (>=3D j i) > + (and (zero? (bytevector-u8-ref buffer j)) > + (zero? (bytevector-u8-ref buffer (+ j 1))) > + (zero? (bytevector-u8-ref buffer (+ j 2))) > + (loop (+ j 4))))))) > + > +(define (insert-nuls char-size bv) Perhaps add short docstrings for clarity. > +(for-each > + (lambda (char-size1) > + (for-each > + (lambda (char-size2) > + (for-each > + (lambda (gap) > + (for-each > + (lambda (offset) > + (test-equal (format #f "replace-store-references, char-sizes = ~a ~a, gap ~s, offset ~a" > + char-size1 char-size2 gap offset) > + (string-append (make-string offset #\=3D) > + (nul-expand (string-append "/gnu/store/" > + (make-string 32 #= \6) > + "-BlahBlaH") > + char-size1) > + gap > + (nul-expand (string-append "/gnu/store/" > + (make-string 32 #= \8) > + "-SoMeTHiNG") > + char-size2) > + (list->string (map integer->char (iota 77 33= )))) > + > + ;; Create input data where the right-hand-size of the dash = ("-something" > + ;; here) goes beyond the end of the internal buffer of > + ;; 'replace-store-references'. > + (let* ((content (string-append (make-string offset #\= =3D) > + (nul-expand (string-appe= nd "/gnu/store/" > + = (make-string 32 #\5) > + = "-blahblah") > + char-size1) > + gap > + (nul-expand (string-appe= nd "/gnu/store/" > + = (make-string 32 #\7) > + = "-something") > + char-size2) > + (list->string > + (map integer->char (iot= a 77 33))))) > + (replacement (alist->vhash > + `((,(make-string 32 #\5) > + . ,(string->utf8 (string-append > + (make-string 32 #= \6) > + "-BlahBlaH"))) > + (,(make-string 32 #\7) > + . ,(string->utf8 (string-append > + (make-string 32 #= \8) > + "-SoMeTHiNG")))))= )) > + (call-with-output-string > + (lambda (output) > + ((@@ (guix build graft) replace-store-references) > + (open-input-string content) output > + replacement > + "/gnu/store")))))) > + ;; offsets to test > + (map (lambda (i) (- buffer-size (* 40 char-size1) i)) > + (iota 30)))) > + ;; gaps > + '("" "-" " " "a"))) > + ;; char-size2 values to test > + '(1 2))) > + ;; char-size1 values to test > + '(1 2 4)) For clarity, perhaps you can define a top-level procedure for the test and call it from =E2=80=98for-each=E2=80=99. Modulo these very minor issues, it looks like it=E2=80=99s ready to go! Thank you, Ludo=E2=80=99.