[PATCH 0/7] Lzip support for 'guix publish' and 'guix substitute'

Done

Details

2 participants

Ludovic Courtès
Pierre Neidhardt

Owner: unassigned

Submitted by: Ludovic Courtès

Severity: normal

Ludovic Courtès wrote on 24 May 2019 15:31

Recipients:(address . guix-patches@gnu.org)

Message-ID:20190524133159.22568-1-ludo@gnu.org

Hi!

As a followup to Pierre’s work on (guix lzlib), these patches implement

‘lzip’ support for ‘guix substitute’ and ‘guix publish’. With these,

you can now run:

./pre-inst-env guix publish -Clzip …

on one side, and on another machine:

./pre-inst-env guix-daemon --build-users-group=guixbuild

and from there the client machine should be able to fetch

lzip-compressed substitutes.

These patches do not address the transitioning issue that we discussed

earlier, where we have clients lacking lzip support talking to an

lzip-capable server. As discussed earlier, clients will have to send

a special HTTP header, ‘X-Guix-Accept-Encoding’.

For the server-side, I’m still hesitating between implementing it in ‘guix

publish’ or simply running two instances of ‘guix publish’ side-by-side

(one gzip and one lzip) and letting nginx dispatch between the two. :-)

Note that we’ll have to adjust our nginx mirror configs to take that

header into account!

Comments?

Ludo’.

Ludovic Courtès (7):

lzlib: Add 'make-lzip-input-port/compressed'.

utils: Test 'compressed-port' and 'decompressed-port' for both gzip

and xz.

utils: Support compression and decompression with lzip.

publish: Add support for lzip.

self: Add dependency on lzlib.

gnu: guix: Add dependency on lzlib.

lzlib: 'lzread!' never returns more than it was asked for.

.dir-locals.el | 2 +

doc/guix.texi | 25 +++++--

gnu/packages/package-management.scm | 1 +

guix/lzlib.scm | 101 ++++++++++++++++++++++------

guix/scripts/publish.scm | 84 +++++++++++++++++------

guix/self.scm | 13 +++-

guix/tests.scm | 1 +

guix/utils.scm | 27 ++++++--

tests/lzlib.scm | 10 +++

tests/publish.scm | 36 ++++++++++

tests/utils.scm | 62 +++++++++++------

11 files changed, 284 insertions(+), 78 deletions(-)

2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-2-ludo@gnu.org

* tests/utils.scm (test-compression/decompression): New procedure.
<top level>: Call it for both 'xz and 'gzip.
---
 tests/utils.scm | 61 +++++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 22 deletions(-)

Toggle diff (76 lines)diff --git a/tests/utils.scm b/tests/utils.scm
index 3015b21b23..7d55107fda 100644
--- a/tests/utils.scm
+++ b/tests/utils.scm
@@ -174,30 +174,47 @@
       (any (compose (negate zero?) cdr waitpid)
            pids))))
 
-(test-assert "compressed-port, decompressed-port, non-file"
-  (let ((data (call-with-input-file (search-path %load-path "guix.scm")
-                get-bytevector-all)))
-    (let*-values (((compressed pids1)
-                   (compressed-port 'xz (open-bytevector-input-port data)))
-                  ((decompressed pids2)
-                   (decompressed-port 'xz compressed)))
-      (and (every (compose zero? cdr waitpid)
-                  (append pids1 pids2))
-           (equal? (get-bytevector-all decompressed) data)))))
+(define (test-compression/decompression method run?)
+  "Test METHOD, a symbol such as 'gzip.  Call RUN? to determine whether to
+skip these tests."
+  (unless (run?) (test-skip 1))
+  (test-assert (format #f "compressed-port, decompressed-port, non-file [~a]"
+                       method)
+    (let ((data (call-with-input-file (search-path %load-path "guix.scm")
+                  get-bytevector-all)))
+      (let*-values (((compressed pids1)
+                     (compressed-port method (open-bytevector-input-port data)))
+                    ((decompressed pids2)
+                     (decompressed-port method compressed)))
+        (and (every (compose zero? cdr waitpid)
+                    (pk 'pids method (append pids1 pids2)))
+             (let ((result (get-bytevector-all decompressed)))
+               (pk 'len method
+                   (if (bytevector? result)
+                       (bytevector-length result)
+                       result)
+                   (bytevector-length data))
+               (equal? result data))))))
 
-(false-if-exception (delete-file temp-file))
-(test-assert "compressed-output-port + decompressed-port"
-  (let* ((file (search-path %load-path "guix/derivations.scm"))
-         (data (call-with-input-file file get-bytevector-all))
-         (port (open-file temp-file "w0b")))
-    (call-with-compressed-output-port 'xz port
-      (lambda (compressed)
-        (put-bytevector compressed data)))
-    (close-port port)
+  (false-if-exception (delete-file temp-file))
+  (unless (run?) (test-skip 1))
+  (test-assert (format #f "compressed-output-port + decompressed-port [~a]"
+                       method)
+    (let* ((file (search-path %load-path "guix/derivations.scm"))
+           (data (call-with-input-file file get-bytevector-all))
+           (port (open-file temp-file "w0b")))
+      (call-with-compressed-output-port method port
+        (lambda (compressed)
+          (put-bytevector compressed data)))
+      (close-port port)
 
-    (bytevector=? data
-                  (call-with-decompressed-port 'xz (open-file temp-file "r0b")
-                    get-bytevector-all))))
+      (bytevector=? data
+                    (call-with-decompressed-port method (open-file temp-file "r0b")
+                      get-bytevector-all)))))
+
+(for-each test-compression/decompression
+          '(gzip xz)
+          (list (const #t) (const #f)))
 
 ;; This is actually in (guix store).
 (test-equal "store-path-package-name"
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-1-ludo@gnu.org

* guix/lzlib.scm (lzwrite!, make-lzip-input-port/compressed): New
procedures.
* tests/lzlib.scm ("make-lzip-input-port/compressed"): New test.
* guix/tests.scm (%seed): Export.
---
 guix/lzlib.scm  | 62 +++++++++++++++++++++++++++++++++++++++++++++++++
 guix/tests.scm  |  1 +
 tests/lzlib.scm | 10 ++++++++
 3 files changed, 73 insertions(+)

Toggle diff (129 lines)diff --git a/guix/lzlib.scm b/guix/lzlib.scm
index a6dac46049..48927c6262 100644
--- a/guix/lzlib.scm
+++ b/guix/lzlib.scm
@@ -1,5 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2019 Pierre Neidhardt <mail@ambrevar.xyz>
+;;; Copyright © 2019 Ludovic Courtès <ludo@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -23,9 +24,11 @@
   #:use-module (ice-9 match)
   #:use-module (system foreign)
   #:use-module (guix config)
+  #:use-module (srfi srfi-11)
   #:export (lzlib-available?
             make-lzip-input-port
             make-lzip-output-port
+            make-lzip-input-port/compressed
             call-with-lzip-input-port
             call-with-lzip-output-port
             %default-member-length-limit
@@ -515,6 +518,23 @@ the end-of-stream has been reached."
         (loop rd)))
     read))
 
+(define (lzwrite! encoder source source-offset source-count
+                  target target-offset target-count)
+  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
+TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
+number of bytes read from SOURCE, and the number of bytes written to TARGET."
+  (define read
+    (if (< 0 (lz-compress-write-size encoder))
+        (match (lz-compress-write encoder source source-offset source-count)
+          (0 (lz-compress-finish encoder) 0)
+          (n n))
+        0))
+
+  (let loop ()
+    (match (lz-compress-read encoder target target-offset target-count)
+      (0       (loop))
+      (written (values read written)))))
+
 (define* (lzwrite encoder bv lz-port
                   #:optional (start 0) (count (bytevector-length bv)))
   "Write up to COUNT bytes from BV at offset START into LZ-PORT.  Return
@@ -597,6 +617,48 @@ port is closed."
                                     (lz-compress-close encoder)
                                     (close-port port))))
 
+(define* (make-lzip-input-port/compressed port
+                                          #:key
+                                          (level %default-compression-level))
+  "Return an input port that compresses data read from PORT, with the given LEVEL.
+PORT is automatically closed when the resulting port is closed."
+  (define encoder (apply lz-compress-open
+                         (car (assoc-ref %compression-levels level))))
+
+  (define input-buffer (make-bytevector 8192))
+  (define input-len 0)
+  (define input-offset 0)
+
+  (define input-eof? #f)
+
+  (define (read! bv start count)
+    (cond
+     (input-eof?
+      (lz-compress-read encoder bv start count))
+     ((= input-offset input-len)
+      (match (get-bytevector-n! port input-buffer 0
+                                (bytevector-length input-buffer))
+        ((? eof-object?)
+         (set! input-eof? #t)
+         (lz-compress-finish encoder))
+        (count
+         (set! input-offset 0)
+         (set! input-len count)))
+      (read! bv start count))
+     (else
+      (let-values (((read written)
+                    (lzwrite! encoder
+                              input-buffer input-offset
+                              (- input-len input-offset)
+                              bv start count)))
+        (set! input-offset (+ input-offset read))
+        written))))
+
+  (make-custom-binary-input-port "lzip-input/compressed"
+                                 read! #f #f
+                                 (lambda ()
+                                   (close-port port))))
+
 (define* (call-with-lzip-input-port port proc)
   "Call PROC with a port that wraps PORT and decompresses data read from it.
 PORT is closed upon completion."
diff --git a/guix/tests.scm b/guix/tests.scm
index 35ebf8464d..66d60e964e 100644
--- a/guix/tests.scm
+++ b/guix/tests.scm
@@ -33,6 +33,7 @@
   #:use-module (web uri)
   #:export (open-connection-for-tests
             with-external-store
+            %seed
             random-text
             random-bytevector
             file=?
diff --git a/tests/lzlib.scm b/tests/lzlib.scm
index cf53a9417d..543622bb45 100644
--- a/tests/lzlib.scm
+++ b/tests/lzlib.scm
@@ -108,4 +108,14 @@
 (test-assert* "Bytevector of size relative to Lzip internal buffers (1MiB+1)"
   (compress-and-decompress (random-bytevector (1+ (* 1024 1024)))))
 
+(test-assert "make-lzip-input-port/compressed"
+  (let* ((len        (pk 'len (+ 10 (random 4000 %seed))))
+         (data       (random-bytevector len))
+         (compressed (make-lzip-input-port/compressed
+                      (open-bytevector-input-port data)))
+         (result     (call-with-lzip-input-port compressed
+                                                get-bytevector-all)))
+    (pk (bytevector-length result) (bytevector-length data))
+    (bytevector=? result data)))
+
 (test-end)
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 3/7] utils: Support compression and decompression with lzip.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-3-ludo@gnu.org

* guix/utils.scm (lzip-port): New procedure.
(decompressed-port, compressed-port, compressed-output-port): Add 'lzip
case.
* tests/utils.scm <top level>: Call 'test-compression/decompression' for
'lzip as well.
---
 guix/utils.scm  | 27 ++++++++++++++++++++++-----
 tests/utils.scm |  5 +++--
 2 files changed, 25 insertions(+), 7 deletions(-)

Toggle diff (90 lines)diff --git a/guix/utils.scm b/guix/utils.scm
index ed1a418cca..709cdf9353 100644
--- a/guix/utils.scm
+++ b/guix/utils.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2013, 2014, 2015 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2014 Eric Bavier <bavier@member.fsf.org>
 ;;; Copyright © 2014 Ian Denhardt <ian@zenhack.net>
@@ -169,6 +169,17 @@ buffered data is lost."
               (close-port out)
               (loop in (cons child pids)))))))))
 
+(define (lzip-port proc port . args)
+  "Return the lzip port produced by calling PROC (a symbol) on PORT and ARGS.
+Raise an error if lzlib support is missing."
+  (let* ((lzlib       (false-if-exception (resolve-interface '(guix lzlib))))
+         (supported?  (and lzlib
+                           ((module-ref lzlib 'lzlib-available?)))))
+    (if supported?
+        (let ((make-port (module-ref lzlib proc)))
+          (values (make-port port) '()))
+        (error "lzip compression not supported" lzlib))))
+
 (define (decompressed-port compression input)
   "Return an input port where INPUT is decompressed according to COMPRESSION,
 a symbol such as 'xz."
@@ -177,17 +188,21 @@ a symbol such as 'xz."
     ('bzip2        (filtered-port `(,%bzip2 "-dc") input))
     ('xz           (filtered-port `(,%xz "-dc") input))
     ('gzip         (filtered-port `(,%gzip "-dc") input))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-input-port input)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define (compressed-port compression input)
-  "Return an input port where INPUT is decompressed according to COMPRESSION,
+  "Return an input port where INPUT is compressed according to COMPRESSION,
 a symbol such as 'xz."
   (match compression
     ((or #f 'none) (values input '()))
     ('bzip2        (filtered-port `(,%bzip2 "-c") input))
     ('xz           (filtered-port `(,%xz "-c") input))
     ('gzip         (filtered-port `(,%gzip "-c") input))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define (call-with-decompressed-port compression port proc)
   "Call PROC with a wrapper around PORT, a file port, that decompresses data
@@ -244,7 +259,9 @@ program--e.g., '(\"--fast\")."
     ('bzip2        (filtered-output-port `(,%bzip2 "-c" ,@options) output))
     ('xz           (filtered-output-port `(,%xz "-c" ,@options) output))
     ('gzip         (filtered-output-port `(,%gzip "-c" ,@options) output))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-output-port output)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define* (call-with-compressed-output-port compression port proc
                                            #:key (options '()))
diff --git a/tests/utils.scm b/tests/utils.scm
index 7d55107fda..7c8f7c09d0 100644
--- a/tests/utils.scm
+++ b/tests/utils.scm
@@ -23,6 +23,7 @@
   #:use-module (guix utils)
   #:use-module ((guix store) #:select (%store-prefix store-path-package-name))
   #:use-module ((guix search-paths) #:select (string-tokenize*))
+  #:use-module ((guix lzlib) #:select (lzlib-available?))
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-64)
@@ -213,8 +214,8 @@ skip these tests."
                       get-bytevector-all)))))
 
 (for-each test-compression/decompression
-          '(gzip xz)
-          (list (const #t) (const #f)))
+          '(gzip xz lzip)
+          (list (const #t) (const #f) lzlib-available?))
 
 ;; This is actually in (guix store).
 (test-equal "store-path-package-name"
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 4/7] publish: Add support for lzip.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-4-ludo@gnu.org

* guix/scripts/publish.scm (show-help, %options): Support '-C METHOD'
and '-C METHOD:LEVEL'.
(default-compression): New procedure.
(bake-narinfo+nar): Add lzip.
(nar-response-port): Likewise.
(string->compression-type): New procedure.
(make-request-handler): Generalize /nar/gzip handler to handle /nar/lzip
as well.
* tests/publish.scm ("/nar/lzip/*"): New test.
("/*.narinfo with lzip compression"): New test.
* doc/guix.texi (Invoking guix publish): Document it.
(Requirements): Mention lzlib.
---
 .dir-locals.el           |  2 +
 doc/guix.texi            | 25 +++++++++---
 guix/scripts/publish.scm | 84 +++++++++++++++++++++++++++++-----------
 tests/publish.scm        | 36 +++++++++++++++++
 4 files changed, 119 insertions(+), 28 deletions(-)

Toggle diff (275 lines)diff --git a/.dir-locals.el b/.dir-locals.el
index 550e06ef09..f1196fd781 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -53,6 +53,8 @@
    (eval . (put 'call-with-decompressed-port 'scheme-indent-function 2))
    (eval . (put 'call-with-gzip-input-port 'scheme-indent-function 1))
    (eval . (put 'call-with-gzip-output-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-input-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-output-port 'scheme-indent-function 1))
    (eval . (put 'signature-case 'scheme-indent-function 1))
    (eval . (put 'emacs-batch-eval 'scheme-indent-function 0))
    (eval . (put 'emacs-batch-edit-file 'scheme-indent-function 1))
diff --git a/doc/guix.texi b/doc/guix.texi
index f176bb0163..b0de5632e7 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -757,6 +757,11 @@ Support for build offloading (@pxref{Daemon Offload Setup}) and
 @uref{https://github.com/artyom-poptsov/guile-ssh, Guile-SSH},
 version 0.10.2 or later.
 
+@item
+When @url{https://www.nongnu.org/lzip/lzlib.html, lzlib} is available, lzlib
+substitutes can be used and @command{guix publish} can compress substitutes
+with lzlib.
+
 @item
 When @url{http://www.bzip.org, libbz2} is available,
 @command{guix-daemon} can use it to compress build logs.
@@ -9656,12 +9661,20 @@ accept connections from any interface.
 Change privileges to @var{user} as soon as possible---i.e., once the
 server socket is open and the signing key has been read.
 
-@item --compression[=@var{level}]
-@itemx -C [@var{level}]
-Compress data using the given @var{level}.  When @var{level} is zero,
-disable compression.  The range 1 to 9 corresponds to different gzip
-compression levels: 1 is the fastest, and 9 is the best (CPU-intensive).
-The default is 3.
+@item --compression[=@var{method}[:@var{level}]]
+@itemx -C [@var{method}[:@var{level}]]
+Compress data using the given @var{method} and @var{level}.  @var{method} is
+one of @code{lzip} and @code{gzip}; when @var{method} is omitted, @code{gzip}
+is used.
+
+When @var{level} is zero, disable compression.  The range 1 to 9 corresponds
+to different compression levels: 1 is the fastest, and 9 is the best
+(CPU-intensive).  The default is 3.
+
+Usually, @code{lzip} compresses noticeably better than @code{gzip} for a small
+increase in CPU usage; see
+@uref{https://nongnu.org/lzip/lzip_benchmark.html,benchmarks on the lzip Web
+page}.
 
 Unless @option{--cache} is used, compression occurs on the fly and
 the compressed streams are not
diff --git a/guix/scripts/publish.scm b/guix/scripts/publish.scm
index a236f3e45c..9e74d789ce 100644
--- a/guix/scripts/publish.scm
+++ b/guix/scripts/publish.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015 David Thompson <davet@gnu.org>
-;;; Copyright © 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -51,6 +51,7 @@
   #:use-module (guix store)
   #:use-module ((guix serialization) #:select (write-file))
   #:use-module (guix zlib)
+  #:autoload   (guix lzlib) (lzlib-available?)
   #:use-module (guix cache)
   #:use-module (guix ui)
   #:use-module (guix scripts)
@@ -74,8 +75,8 @@ Publish ~a over HTTP.\n") %store-directory)
   (display (G_ "
   -u, --user=USER        change privileges to USER as soon as possible"))
   (display (G_ "
-  -C, --compression[=LEVEL]
-                         compress archives at LEVEL"))
+  -C, --compression[=METHOD:LEVEL]
+                         compress archives with METHOD at LEVEL"))
   (display (G_ "
   -c, --cache=DIRECTORY  cache published items to DIRECTORY"))
   (display (G_ "
@@ -121,6 +122,9 @@ Publish ~a over HTTP.\n") %store-directory)
   ;; Since we compress on the fly, default to fast compression.
   (compression 'gzip 3))
 
+(define (default-compression type)
+  (compression type 3))
+
 (define (actual-compression item requested)
   "Return the actual compression used for ITEM, which may be %NO-COMPRESSION
 if ITEM is already compressed."
@@ -153,18 +157,28 @@ if ITEM is already compressed."
                             name)))))
         (option '(#\C "compression") #f #t
                 (lambda (opt name arg result)
-                  (match (if arg (string->number* arg) 3)
-                    (0
-                     (alist-cons 'compression %no-compression result))
-                    (level
-                     (if (zlib-available?)
-                         (alist-cons 'compression
-                                     (compression 'gzip level)
-                                     result)
-                         (begin
-                           (warning (G_ "zlib support is missing; \
-compression disabled~%"))
-                           result))))))
+                  (let* ((colon (string-index arg #\:))
+                         (type  (cond
+                                 (colon (string-take arg colon))
+                                 ((string->number arg) "gzip")
+                                 (else arg)))
+                         (level (if colon
+                                    (string->number*
+                                     (string-drop arg (+ 1 colon)))
+                                    (or (string->number arg) 3))))
+                    (match level
+                      (0
+                       (alist-cons 'compression %no-compression result))
+                      (level
+                       (match (string->compression-type type)
+                         ((? symbol? type)
+                          (alist-cons 'compression
+                                      (compression type level)
+                                      result))
+                         (_
+                          (warning (G_ "~a: unsupported compression type~%")
+                                   type)
+                          result)))))))
         (option '(#\c "cache") #t #f
                 (lambda (opt name arg result)
                   (alist-cons 'cache arg result)))
@@ -483,6 +497,13 @@ requested using POOL."
          #:level (compression-level compression)
          #:buffer-size (* 128 1024))
        (rename-file (string-append nar ".tmp") nar))
+      ('lzip
+       ;; Note: the file port gets closed along with the lzip port.
+       (call-with-lzip-output-port (open-output-file (string-append nar ".tmp"))
+         (lambda (port)
+           (write-file item port))
+         #:level (compression-level compression))
+       (rename-file (string-append nar ".tmp") nar))
       ('none
        ;; Cache nars even when compression is disabled so that we can
        ;; guarantee the TTL (see <https://bugs.gnu.org/28664>.)
@@ -687,6 +708,9 @@ example: \"/foo/bar\" yields '(\"foo\" \"bar\")."
      (make-gzip-output-port (response-port response)
                             #:level level
                             #:buffer-size (* 64 1024)))
+    (($ <compression> 'lzip level)
+     (make-lzip-output-port (response-port response)
+                            #:level level))
     (($ <compression> 'none)
      (response-port response))
     (#f
@@ -761,12 +785,23 @@ blocking."
   http-write
   (@@ (web server http) http-close))
 
+(define (string->compression-type string)
+  "Return a symbol denoting the compression method expressed by STRING; return
+#f if STRING doesn't match any supported method."
+  (match string
+    ("gzip" (and (zlib-available?) 'gzip))
+    ("lzip" (and (lzlib-available?) 'lzip))
+    (_      #f)))
+
 (define* (make-request-handler store
                                #:key
                                cache pool
                                narinfo-ttl
                                (nar-path "nar")
                                (compression %no-compression))
+  (define compression-type?
+    string->compression-type)
+
   (define nar-path?
     (let ((expected (split-and-decode-uri-path nar-path)))
       (cut equal? expected <>)))
@@ -815,13 +850,18 @@ blocking."
           ;; is restarted with different compression parameters.
 
           ;; /nar/gzip/<store-item>
-          ((components ... "gzip" store-item)
-           (if (and (nar-path? components) (zlib-available?))
-               (let ((compression (match compression
-                                    (($ <compression> 'gzip)
-                                     compression)
-                                    (_
-                                     %default-gzip-compression))))
+          ((components ... (? compression-type? type) store-item)
+           (if (nar-path? components)
+               (let* ((compression-type (string->compression-type type))
+                      (compression (match compression
+                                     (($ <compression> type)
+                                      (if (eq? type compression-type)
+                                          compression
+                                          (default-compression
+                                            compression-type)))
+                                     (_
+                                      (default-compression
+                                        compression-type)))))
                  (if cache
                      (render-nar/cached store cache request store-item
                                         #:ttl narinfo-ttl
diff --git a/tests/publish.scm b/tests/publish.scm
index 097ac036e0..10bc859695 100644
--- a/tests/publish.scm
+++ b/tests/publish.scm
@@ -36,6 +36,7 @@
   #:use-module (gcrypt pk-crypto)
   #:use-module ((guix pki) #:select (%public-key-file %private-key-file))
   #:use-module (guix zlib)
+  #:use-module (guix lzlib)
   #:use-module (web uri)
   #:use-module (web client)
   #:use-module (web response)
@@ -229,6 +230,19 @@ FileSize: ~a~%"
                (string-append "/nar/gzip/" (basename %item))))))
     (get-bytevector-n nar (bytevector-length %gzip-magic-bytes))))
 
+(unless (lzlib-available?)
+  (test-skip 1))
+(test-equal "/nar/lzip/*"
+  "bar"
+  (call-with-temporary-output-file
+   (lambda (temp port)
+     (let ((nar (http-get-port
+                 (publish-uri
+                  (string-append "/nar/lzip/" (basename %item))))))
+       (call-with-lzip-input-port nar
+         (cut restore-file <> temp)))
+     (call-with-input-file temp read-string))))
+
 (unless (zlib-available?)
   (test-skip 1))
 (test-equal "/*.narinfo with compression"
@@ -251,6 +265,28 @@ FileSize: ~a~%"
                   (_ #f)))
               (recutils->alist body)))))
 
+(unless (lzlib-available?)
+  (test-skip 1))
+(test-equal "/*.narinfo with lzip compression"
+  `(("StorePath" . ,%item)
+    ("URL" . ,(string-append "nar/lzip/" (basename %item)))
+    ("Compression" . "lzip"))
+  (let ((thread (with-separate-output-ports
+                 (call-with-new-thread
+                  (lambda ()
+                    (guix-publish "--port=6790" "-Clzip"))))))
+    (wait-until-ready 6790)
+    (let* ((url  (string-append "http://localhost:6790/"
+                                (store-path-hash-part %item) ".narinfo"))
+           (body (http-get-port url)))
+      (filter (lambda (item)
+                (match item
+                  (("Compression" . _) #t)
+                  (("StorePath" . _)  #t)
+                  (("URL" . _) #t)
+                  (_ #f)))
+              (recutils->alist body)))))
+
 (unless (zlib-available?)
   (test-skip 1))
 (test-equal "/*.narinfo for a compressed file"
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 5/7] self: Add dependency on lzlib.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-5-ludo@gnu.org

* guix/self.scm (compiled-guix): Pass #:lzlib to 'make-config.scm'.
(make-config.scm): Add #:lzlib and honor it.
(specification->package): Add "lzlib".
---
 guix/self.scm | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Toggle diff (61 lines)diff --git a/guix/self.scm b/guix/self.scm
index 6d7569ec19..17d289e486 100644
--- a/guix/self.scm
+++ b/guix/self.scm
@@ -57,6 +57,7 @@
       ("guile-gcrypt"  (ref '(gnu packages gnupg) 'guile-gcrypt))
       ("gnutls"     (ref '(gnu packages tls) 'gnutls))
       ("zlib"       (ref '(gnu packages compression) 'zlib))
+      ("lzlib"      (ref '(gnu packages compression) 'lzlib))
       ("gzip"       (ref '(gnu packages compression) 'gzip))
       ("bzip2"      (ref '(gnu packages compression) 'bzip2))
       ("xz"         (ref '(gnu packages compression) 'xz))
@@ -646,6 +647,7 @@ Info manual."
                         (guile-version (effective-version))
                         (guile-for-build (default-guile))
                         (zlib (specification->package "zlib"))
+                        (lzlib (specification->package "lzlib"))
                         (gzip (specification->package "gzip"))
                         (bzip2 (specification->package "bzip2"))
                         (xz (specification->package "xz"))
@@ -800,6 +802,7 @@ Info manual."
                  #:extra-modules
                  `(((guix config)
                     => ,(make-config.scm #:zlib zlib
+                                         #:lzlib lzlib
                                          #:gzip gzip
                                          #:bzip2 bzip2
                                          #:xz xz
@@ -897,7 +900,7 @@ Info manual."
                                       (variables rest ...))))))
     (variables %localstatedir %storedir %sysconfdir)))
 
-(define* (make-config.scm #:key zlib gzip xz bzip2
+(define* (make-config.scm #:key zlib lzlib gzip xz bzip2
                           (package-name "GNU Guix")
                           (package-version "0")
                           (bug-report-address "bug-guix@gnu.org")
@@ -919,7 +922,7 @@ Info manual."
                                %store-database-directory
                                %config-directory
                                %libz
-                               ;; TODO: %liblz
+                               %liblz
                                %gzip
                                %bzip2
                                %xz))
@@ -966,7 +969,11 @@ Info manual."
 
                    (define %libz
                      #+(and zlib
-                            (file-append zlib "/lib/libz"))))
+                            (file-append zlib "/lib/libz")))
+
+                   (define %liblz
+                     #+(and lzlib
+                            (file-append lzlib "/lib/liblz"))))
 
                ;; Guile 2.0 *requires* the 'define-module' to be at the
                ;; top-level or the 'toplevel-ref' in the resulting .go file are
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 6/7] gnu: guix: Add dependency on lzlib.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-6-ludo@gnu.org

* gnu/packages/package-management.scm (guix)[inputs]: Add LZLIB.

---

gnu/packages/package-management.scm | 1 +

1 file changed, 1 insertion(+)

Toggle diff (14 lines)diff --git a/gnu/packages/package-management.scm b/gnu/packages/package-management.scm
index a356a6dab7..985fa801b1 100644
--- a/gnu/packages/package-management.scm
+++ b/gnu/packages/package-management.scm
@@ -272,6 +272,7 @@
        `(("bzip2" ,bzip2)
          ("gzip" ,gzip)
          ("zlib" ,zlib)                           ;for 'guix publish'
+         ("lzlib" ,lzlib)            ;for 'guix publish' and 'guix substitute'
 
          ("sqlite" ,sqlite)
          ("libgcrypt" ,libgcrypt)
-- 
2.21.0

Ludovic Courtès wrote on 24 May 2019 15:42

[PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.

Recipients:(address . 35880@debbugs.gnu.org)

Message-ID:20190524134238.22802-7-ludo@gnu.org

Fixes a bug whereby 'lzread!' could return more than COUNT.

* guix/lzlib.scm (lzread!): Rewrite in a semi-functional style.

---

guix/lzlib.scm | 39 +++++++++++++++++++--------------------

1 file changed, 19 insertions(+), 20 deletions(-)

Toggle diff (55 lines)diff --git a/guix/lzlib.scm b/guix/lzlib.scm
index 48927c6262..0e384f6cb8 100644
--- a/guix/lzlib.scm
+++ b/guix/lzlib.scm
@@ -494,29 +494,28 @@ perhaps not yet read."
 
 
 ;; High level functions.
-(define* (lzread! decoder file-port bv
+
+(define* (lzread! decoder port bv
                   #:optional (start 0) (count (bytevector-length bv)))
-  "Read up to COUNT bytes from FILE-PORT into BV at offset START.  Return the
+  "Read up to COUNT bytes from PORT into BV at offset START.  Return the
 number of uncompressed bytes actually read; it is zero if COUNT is zero or if
 the end-of-stream has been reached."
-  ;; WARNING: Because we don't alternate between lz-reads and lz-writes, we can't
-  ;; process more than lz-decompress-write-size from the file-port.
-  (when (> count (lz-decompress-write-size decoder))
-    (set! count (lz-decompress-write-size decoder)))
-  (let ((file-bv (get-bytevector-n file-port count)))
-    (unless (eof-object? file-bv)
-      (lz-decompress-write decoder file-bv 0 (bytevector-length file-bv))))
-  (let ((read 0))
-    (let loop ((rd 0))
-      (if (< start (bytevector-length bv))
-          (begin
-            (set! rd (lz-decompress-read decoder bv start (- (bytevector-length bv) start)))
-            (set! start (+ start rd))
-            (set! read (+ read rd)))
-          (set! rd 0))
-      (unless (= rd 0)
-        (loop rd)))
-    read))
+  (define (feed-decoder! decoder)
+    ;; Feed DECODER with data read from PORT.
+    (match (get-bytevector-n port (lz-decompress-write-size decoder))
+      ((? eof-object? eof) eof)
+      (bv (lz-decompress-write decoder bv))))
+
+  (let loop ((read 0)
+             (start start))
+    (cond ((< read count)
+           (match (lz-decompress-read decoder bv start (- count read))
+             (0 (if (eof-object? (feed-decoder! decoder))
+                    read
+                    (loop read start)))
+             (n (loop (+ read n) (+ start n)))))
+          (else
+           read))))
 
 (define (lzwrite! encoder source source-offset source-count
                   target target-offset target-count)
-- 
2.21.0

Pierre Neidhardt wrote on 25 May 2019 19:24

Re: [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:87d0k6o3am.fsf@ambrevar.xyz

As an Lzip enthusiast, I have some questions ;)

I see you are using make-lzip-input-port/compressed in a subsequent

patch, but this does not map how it's done for gzip et al., the latter

being invoked via it's system command "gzip -c ...". Why did you decide

to do it differently for lzip?

Much of the code induced by make-lzip-input-port/compressed seems to

repeat the lzread! / lzwrite business, maybe there is a way to factor

some of it?

Toggle quote (17 lines)> +(define (lzwrite! encoder source source-offset source-count
> +                  target target-offset target-count)
> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
> +  (define read
> +    (if (< 0 (lz-compress-write-size encoder))
> +        (match (lz-compress-write encoder source source-offset source-count)
> +          (0 (lz-compress-finish encoder) 0)
> +          (n n))
> +        0))
> +
> +  (let loop ()
> +    (match (lz-compress-read encoder target target-offset target-count)
> +      (0       (loop))
> +      (written (values read written)))))

Why looping on 0? If there is no byte to read, wouldn't this loop indefinitely?

Pierre Neidhardt

https://ambrevar.xyz/

Pierre Neidhardt wrote on 25 May 2019 19:27

Re: [PATCH 3/7] utils: Support compression and decompression with lzip.

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:87a7fao360.fsf@ambrevar.xyz

This is the part where you use make-lzip-input-port/compressed:

Toggle quote (14 lines)>  (define (compressed-port compression input)
> -  "Return an input port where INPUT is decompressed according to COMPRESSION,
> +  "Return an input port where INPUT is compressed according to COMPRESSION,
>  a symbol such as 'xz."
>    (match compression
>      ((or #f 'none) (values input '()))
>      ('bzip2        (filtered-port `(,%bzip2 "-c") input))
>      ('xz           (filtered-port `(,%xz "-c") input))
>      ('gzip         (filtered-port `(,%gzip "-c") input))
> -    (else          (error "unsupported compression scheme" compression))))
> +    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
> +                           '()))
> +    (_             (error "unsupported compression scheme" compression))))

So why not doing like for the others?

Pierre Neidhardt

https://ambrevar.xyz/

Pierre Neidhardt wrote on 25 May 2019 19:31

Re: [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:877eaeo2zr.fsf@ambrevar.xyz

Toggle quote (2 lines)

> Fixes a bug whereby 'lzread!' could return more than COUNT.

Hmm... But why is this a bug? lzread! returns the number of

_uncompressed_ bytes, while COUNT is number of _compressed_ bytes

written, so it's often expected that the former is more than the latter.

(By the way, nice rewrite, I like it much better than my C->Scheme

translation ;p)

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 26 May 2019 21:51

Re: [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:87ftp1m1te.fsf@gnu.org

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (11 lines)> As an Lzip enthusiast, I have some questions ;)
>
> I see you are using make-lzip-input-port/compressed in a subsequent
> patch, but this does not map how it's done for gzip et al., the latter
> being invoked via it's system command "gzip -c ...".  Why did you decide
> to do it differently for lzip?
>
> Much of the code induced by make-lzip-input-port/compressed seems to
> repeat the lzread! / lzwrite business, maybe there is a way to factor
> some of it?

Actually, ‘make-lzip-input-port/compressed’ exists solely so we can have

‘compressed-port’ for lzip, which in turn allows us to write tests.

It uses (guix lzlib) instead of invoking the ‘lzip’ command because we

can. :-) Invoking commands is not as nice, because it’s more

expensive, requires us to spawn an additional process when the input is

not a file port (e.g., it’s a string port), and it’s forking is not

possible in multi-threaded programs like ‘guix publish’.

Toggle quote (19 lines)>> +(define (lzwrite! encoder source source-offset source-count
>> +                  target target-offset target-count)
>> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
>> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
>> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
>> +  (define read
>> +    (if (< 0 (lz-compress-write-size encoder))
>> +        (match (lz-compress-write encoder source source-offset source-count)
>> +          (0 (lz-compress-finish encoder) 0)
>> +          (n n))
>> +        0))
>> +
>> +  (let loop ()
>> +    (match (lz-compress-read encoder target target-offset target-count)
>> +      (0       (loop))
>> +      (written (values read written)))))
>
> Why looping on 0?  If there is no byte to read, wouldn't this loop indefinitely?

Hmm, good point. The idea is that ‘lzwrite!’ should return 0 only on

end-of-file, but then the loop should include reading more from SOURCE.

I’ll follow up on this one.

Thanks!

Ludo’.

Ludovic Courtès wrote on 26 May 2019 21:52

Re: [PATCH 3/7] utils: Support compression and decompression with lzip.

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:878sutm1sg.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (18 lines)> This is the part where you use make-lzip-input-port/compressed:
>
>>  (define (compressed-port compression input)
>> -  "Return an input port where INPUT is decompressed according to COMPRESSION,
>> +  "Return an input port where INPUT is compressed according to COMPRESSION,
>>  a symbol such as 'xz."
>>    (match compression
>>      ((or #f 'none) (values input '()))
>>      ('bzip2        (filtered-port `(,%bzip2 "-c") input))
>>      ('xz           (filtered-port `(,%xz "-c") input))
>>      ('gzip         (filtered-port `(,%gzip "-c") input))
>> -    (else          (error "unsupported compression scheme" compression))))
>> +    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
>> +                           '()))
>> +    (_             (error "unsupported compression scheme" compression))))
>
> So why not doing like for the others?

See my other previous reply. :-)

Ludovic Courtès wrote on 26 May 2019 21:54

Re: [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:87zhn9kn3p.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (4 lines)

>> Fixes a bug whereby 'lzread!' could return more than COUNT.

> Hmm... But why is this a bug?

Because then the ‘read!’ method of the custom binary input port could

return more than ‘count’, which is understandably not permitted.

Toggle quote (3 lines)

> (By the way, nice rewrite, I like it much better than my C->Scheme

> translation ;p)

Heheh, I initially tried to minimize changes but I found it easier to

reason about this version.

Ludo’.

Pierre Neidhardt wrote on 26 May 2019 22:57

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880@debbugs.gnu.org)

Message-ID:87r28llyr4.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (9 lines)

> Pierre Neidhardt <mail@ambrevar.xyz> skribis:

>>> Fixes a bug whereby 'lzread!' could return more than COUNT.

>> Hmm... But why is this a bug?

> Because then the ‘read!’ method of the custom binary input port could

> return more than ‘count’, which is understandably not permitted.

That's the part where I'm a bit confused because we deal with compressed

data here.

So when we say "(read count)", does COUNT refer to the compressed or

uncompressed data?

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 26 May 2019 23:28

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:87a7f8kird.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (17 lines)> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Pierre Neidhardt <mail@ambrevar.xyz> skribis:
>>
>>>> Fixes a bug whereby 'lzread!' could return more than COUNT.
>>>
>>> Hmm... But why is this a bug?
>>
>> Because then the ‘read!’ method of the custom binary input port could
>> return more than ‘count’, which is understandably not permitted.
>
> That's the part where I'm a bit confused because we deal with compressed
> data here.
>
> So when we say "(read count)", does COUNT refer to the compressed or
> uncompressed data?

We have this:

(define* (make-lzip-input-port port)

(define decoder (lz-decompress-open))

(define (read! bv start count)

(lzread! decoder port bv start count))

(make-custom-binary-input-port "lzip-input" read! #f #f

(lambda () …)))

Here ‘read!’ must return an integer between 1 and COUNT; it must return

0 if and only if the end-of-file is reached.

IOW, ‘lzread!’ must return the number of uncompressed bytes of BV that

it consumed, and that number is necessarily <= COUNT.

Does that make sense?

Thanks,

Ludo’.

Pierre Neidhardt wrote on 27 May 2019 09:00

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880@debbugs.gnu.org)

Message-ID:87a7f8mlfw.fsf@ambrevar.xyz

It does make sense, but then don't we have the same issue with zlib.scm:

Toggle snippet (9 lines)

(define gzread!

(let ((proc (zlib-procedure int "gzread" (list '* '* unsigned-int))))

(lambda* (gzfile bv #:optional (start 0) (count (bytevector-length bv)))

"Read up to COUNT bytes from GZFILE into BV at offset START. Return the

number of uncompressed bytes actually read; it is zero if COUNT is zero or if

the end-of-stream has been reached."

...

I initially tried to mimic zlib.scm and this part confused me a lot back then.

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 27 May 2019 12:00

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:87pno4z06h.fsf@gnu.org

Hi Pierre,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (2 lines)

> It does make sense, but then don't we have the same issue with zlib.scm:

Which issue?

Toggle quote (10 lines)> (define gzread!
>   (let ((proc (zlib-procedure int "gzread" (list '* '* unsigned-int))))
>     (lambda* (gzfile bv #:optional (start 0) (count (bytevector-length bv)))
>       "Read up to COUNT bytes from GZFILE into BV at offset START.  Return the
> number of uncompressed bytes actually read; it is zero if COUNT is zero or if
> the end-of-stream has been reached."
>  ...
>
> I initially tried to mimic zlib.scm and this part confused me a lot back then.

There’s a key difference: the ‘gzread’ etc. API is high-level and easy
to use, but it wants a file descriptor to read from (thus a file port in
Scheme land.)

That’s enough for ‘guix publish’, which writes gzipped data to files,
but that’s not enough for ‘guix substitute’, which can read data from
non-file ports (e.g., chunked-encoding ports or TLS ports from the HTTP
client.)

HTH,
Ludo’.

Ludovic Courtès wrote on 27 May 2019 17:45

Re: [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880@debbugs.gnu.org)

Message-ID:87blznsxym.fsf@gnu.org

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (2 lines)

> Pierre Neidhardt <mail@ambrevar.xyz> skribis:

[...]

Toggle quote (23 lines)>>> +(define (lzwrite! encoder source source-offset source-count
>>> +                  target target-offset target-count)
>>> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
>>> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
>>> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
>>> +  (define read
>>> +    (if (< 0 (lz-compress-write-size encoder))
>>> +        (match (lz-compress-write encoder source source-offset source-count)
>>> +          (0 (lz-compress-finish encoder) 0)
>>> +          (n n))
>>> +        0))
>>> +
>>> +  (let loop ()
>>> +    (match (lz-compress-read encoder target target-offset target-count)
>>> +      (0       (loop))
>>> +      (written (values read written)))))
>>
>> Why looping on 0?  If there is no byte to read, wouldn't this loop indefinitely?
>
> Hmm, good point.  The idea is that ‘lzwrite!’ should return 0 only on
> end-of-file, but then the loop should include reading more from SOURCE.
> I’ll follow up on this one.

I noticed that ‘lz-compress-read’ is documented to return a “strictly

positive integer”, so I’m changing it to this:

Toggle snippet (20 lines)

(define (lzwrite! encoder source source-offset source-count

target target-offset target-count)

"Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to

TARGET-COUNT bytes into TARGET at TARGET-OFFSET. Return two values: the

number of bytes read from SOURCE, and the non-zero number of bytes written to

TARGET."

(define read

(if (< 0 (lz-compress-write-size encoder))

(match (lz-compress-write encoder source source-offset source-count)

(0 (lz-compress-finish encoder) 0)

(n n))

0))

(define written

;; Note: 'lz-compress-read' promises to return a non-zero integer.

(lz-compress-read encoder target target-offset target-count))

(values read written))

Let me know what you think!

Ludo’.

Pierre Neidhardt wrote on 27 May 2019 18:24

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880@debbugs.gnu.org)

Message-ID:87ef4jlvbr.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (3 lines)

> I noticed that ‘lz-compress-read’ is documented to return a “strictly

> positive integer”, so I’m changing it to this:

The docstring (that I wrote) says "non-negative positive integer."

"positive" is a typo (sorry about that), it should read "non-negative

integer" since the return value can be zero.

In general, lzlib's "reads" and "writes" don't give any guarantee about

the size of bytes that are actually processed. You need to loop over

the calls until some condition is met, see the "finish(ed)" functions.

Here in particular, it's not clear that lz-compress-read is going to read

all the bytes in the encoder buffer. Maybe that's OK for this

particular functions if we don't expect TARGET-COUNT to be reached. See

http://www.nongnu.org/lzip/manual/lzlib_manual.html#Buffering

That said, if the encoder buffer is not empty, I think lz-compress-read

should always return something >0.

Yup, lzlib is very low-level! :p

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 27 May 2019 22:53

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:87lfyrr554.fsf@gnu.org

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (3 lines)

> That said, if the encoder buffer is not empty, I think lz-compress-read

> should always return something >0.

Yes, probably. The docstring for ‘lz-compress-read’ says:

"Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.

Return the number of uncompressed bytes written, a strictly positive integer."

^~~~~~~~~~~~~~~~~

However, the lzlib manual doesn’t say that for ‘LZ_compress_read’ (info

"(lzlib) Compression functions").

But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’

can just call ‘lzwrite!’ again with more data when that happens, so I’ve

done that.

And I pushed the whole thing! :-)

I think it’d be good to let people play with it in their personal

setups.

Next up: multi-compression support in ‘guix publish’ (possibly?) so we

can smoothly transition on our build farms.

Thanks!

Ludo’.

Closed

Pierre Neidhardt wrote on 27 May 2019 23:12

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880-done@debbugs.gnu.org)

Message-ID:87sgszk3ev.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (5 lines)

>> That said, if the encoder buffer is not empty, I think lz-compress-read

>> should always return something >0.

> Yes, probably. The docstring for ‘lz-compress-read’ says:

Oops, I read the docstring of lz-DEcompress-read. My bad.

Toggle quote (4 lines)

> "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.

> Return the number of uncompressed bytes written, a strictly positive integer."

> ^~~~~~~~~~~~~~~~~

Bigger oops! This comes from a copy-paste of the gzip docstring which I

forgot to update properly (I did for the decompression functions, but

not for the compression functions). The docstrings should be fixed.

Toggle quote (4 lines)

> But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’

> can just call ‘lzwrite!’ again with more data when that happens, so I’ve

> done that.

This could work, but I've had some headaches on such assumptions

before. Tests are very necessary here to validate those assumptions ;)

The thing is that we are not using lzlib as it is meant to be used

(i.e. with the finish* functions) because of the functional approach of

the binary ports which don't really play well with the procedural

approach of the C library.

Toggle quote (2 lines)

> And I pushed the whole thing! :-)

Hurray! Can't wait to say lz-compressed archives coming to Guix! :)

Pierre Neidhardt

https://ambrevar.xyz/

Closed

Ludovic Courtès wrote on 28 May 2019 09:52

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:877eab3tjq.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (17 lines)> Ludovic Courtès <ludo@gnu.org> writes:
>
>>> That said, if the encoder buffer is not empty, I think lz-compress-read
>>> should always return something >0.
>>
>> Yes, probably.  The docstring for ‘lz-compress-read’ says:
>
> Oops, I read the docstring of lz-DEcompress-read.  My bad.
>
>>     "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.
>>   Return the number of uncompressed bytes written, a strictly positive integer."
>>                                                      ^~~~~~~~~~~~~~~~~
>
> Bigger oops!  This comes from a copy-paste of the gzip docstring which I
> forgot to update properly (I did for the decompression functions, but
> not for the compression functions).  The docstrings should be fixed.

I fixed this one in e13354a7ca5a0d5e28e02c4cfce6fecb1ab770e4.

Toggle quote (7 lines)

>> But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’

>> can just call ‘lzwrite!’ again with more data when that happens, so I’ve

>> done that.

> This could work, but I've had some headaches on such assumptions

> before. Tests are very necessary here to validate those assumptions ;)

Definitely!

Toggle quote (5 lines)

> The thing is that we are not using lzlib as it is meant to be used

> (i.e. with the finish* functions) because of the functional approach of

> the binary ports which don't really play well with the procedural

> approach of the C library.

I think we’re using it the way it’s meant to be used, roughly along the

lines of the examples of its manual (info "(lzlib) Examples").

(I/O ports are not very “functional”.)

Toggle quote (4 lines)

>> And I pushed the whole thing! :-)

> Hurray! Can't wait to say lz-compressed archives coming to Guix! :)

I’ve updated the ‘guix’ package so people can start using

‘guix publish -C lzip’ and fetch substitute from there.

Thanks for making it possible!

Ludo’.

Closed

Pierre Neidhardt wrote on 28 May 2019 10:46

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880-done@debbugs.gnu.org)

Message-ID:87k1ebj7a8.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (6 lines)

>> Bigger oops! This comes from a copy-paste of the gzip docstring which I

>> forgot to update properly (I did for the decompression functions, but

>> not for the compression functions). The docstrings should be fixed.

> I fixed this one in e13354a7ca5a0d5e28e02c4cfce6fecb1ab770e4.

Thanks!

Toggle quote (10 lines)>> The thing is that we are not using lzlib as it is meant to be used
>> (i.e. with the finish* functions) because of the functional approach of
>> the binary ports which don't really play well with the procedural
>> approach of the C library.
>
> I think we’re using it the way it’s meant to be used, roughly along the
> lines of the examples of its manual (info "(lzlib) Examples").
>
> (I/O ports are not very “functional”.)

In the sense that we define "read!" and "write!" functions which don't

allow us to call the "finish" functions properly.

So maybe we are following the doc too "roughly" :p

This has multiple implications, e.g. it impedes support for multiple

members, (which is OK as long as we don't accept archives produced by a

third-party) and more importantly it lifts the guarantee that the

library is going to work as per the documentation.

Toggle quote (9 lines)

>>> And I pushed the whole thing! :-)

>> Hurray! Can't wait to say lz-compressed archives coming to Guix! :)

> I’ve updated the ‘guix’ package so people can start using

> ‘guix publish -C lzip’ and fetch substitute from there.

> Thanks for making it possible!

And thanks for doing most of the work! :D

Pierre Neidhardt

https://ambrevar.xyz/

Closed

Ludovic Courtès wrote on 28 May 2019 15:47

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:871s0i1yji.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (2 lines)

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

Toggle quote (20 lines)>>> The thing is that we are not using lzlib as it is meant to be used
>>> (i.e. with the finish* functions) because of the functional approach of
>>> the binary ports which don't really play well with the procedural
>>> approach of the C library.
>>
>> I think we’re using it the way it’s meant to be used, roughly along the
>> lines of the examples of its manual (info "(lzlib) Examples").
>>
>> (I/O ports are not very “functional”.)
>
> In the sense that we define  "read!" and "write!" functions which don't
> allow us to call the "finish" functions properly.
>
> So maybe we are following the doc too "roughly" :p
>
> This has multiple implications, e.g. it impedes support for multiple
> members, (which is OK as long as we don't accept archives produced by a
> third-party) and more importantly it lifts the guarantee that the
> library is going to work as per the documentation.

I’m not sure I follow.  I think ‘make-lzip-input-port/compressed’
corresponds to Example 2 in the manual (info "(lzlib) Examples"),
‘make-lzip-output-port’ corresponds to Example 1, and
‘make-lzip-input-port’ corresponds to Example 4 (with the exception that
‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means
to tell whether we’re done processing input.)

What do you think is not done as documented?

Thanks,
Ludo’.

Closed

Pierre Neidhardt wrote on 29 May 2019 16:57

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880-done@debbugs.gnu.org)

Message-ID:87a7f5wbow.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (7 lines)

> I’m not sure I follow. I think ‘make-lzip-input-port/compressed’

> corresponds to Example 2 in the manual (info "(lzlib) Examples"),

> ‘make-lzip-output-port’ corresponds to Example 1, and

> ‘make-lzip-input-port’ corresponds to Example 4 (with the exception that

> ‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means

> to tell whether we’re done processing input.)

Example 4 is:

1) LZ_decompress_open

2) go to step 5 if LZ_decompress_write_size returns 0

3) LZ_decompress_write

4) if no more data to write, call LZ_decompress_finish

5) LZ_decompress_read

5a) optionally, if LZ_decompress_member_finished returns 1, read

final values for member with LZ_decompress_data_crc, etc.

6) go back to step 2 until LZ_decompress_finished returns 1

7) LZ_decompress_close

In `lzread!', we don't call lz-decompress-finished? nor do we loop on

lz-decompress-finished.

This only works for decompression of single-member archive, but the

documentation does not say that.

Toggle snippet (5 lines)

(match (get-bytevector-n port (lz-decompress-write-size decoder))

((? eof-object? eof) eof)

(bv (lz-decompress-write decoder bv)))

In the above if lz-decompress-write-size returns 0, we won't be reading

anything (infinite loop?). While I understand this should not happen in

practice, the documentation of the library does not give such guarantees.

Antonio told me that explicitly.

Toggle snippet (6 lines)

(match (lz-decompress-read decoder bv start (- count read))

(0 (if (eof-object? (feed-decoder! decoder))

read

(loop read start)))

I'm not sure I understand the above: if we read nothing, then we try

again? This might loop forever.

What do you think?

Pierre Neidhardt

https://ambrevar.xyz/

Closed

Ludovic Courtès wrote on 31 May 2019 22:54

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:87v9xqny4u.fsf@gnu.org

Hello,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (24 lines)> Ludovic Courtès <ludo@gnu.org> writes:
>
>> I’m not sure I follow.  I think ‘make-lzip-input-port/compressed’
>> corresponds to Example 2 in the manual (info "(lzlib) Examples"),
>> ‘make-lzip-output-port’ corresponds to Example 1, and
>> ‘make-lzip-input-port’ corresponds to Example 4 (with the exception that
>> ‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means
>> to tell whether we’re done processing input.)
>
> Example 4 is:
>
>      1) LZ_decompress_open
>      2) go to step 5 if LZ_decompress_write_size returns 0
>      3) LZ_decompress_write
>      4) if no more data to write, call LZ_decompress_finish
>      5) LZ_decompress_read
>      5a) optionally, if LZ_decompress_member_finished returns 1, read
>          final values for member with LZ_decompress_data_crc, etc.
>      6) go back to step 2 until LZ_decompress_finished returns 1
>      7) LZ_decompress_close
>
> In `lzread!', we don't call lz-decompress-finished? nor do we loop on
> lz-decompress-finished.

Indeed, we’re missing a call to ‘lz-decompress-finish’, so lzlib could

in theory think there’s still data coming, and so fail to produce more

output, possibly leading to an infinite loop.

I think the ‘lzread!’ loop should look like this (the tests still pass

with this):

Toggle snippet (16 lines)  (let loop ((read 0)
             (start start))
    (cond ((< read count)
           (match (lz-decompress-read decoder bv start (- count read))
             (0 (cond ((lz-decompress-finished? decoder)
                       read)
                      ((eof-object? (feed-decoder! decoder))
                       (lz-decompress-finish decoder)
                       (loop read start))
                      (else                       ;read again
                       (loop read start))))
             (n (loop (+ read n) (+ start n)))))
          (else
           read)))

That way, I believe all cases are correctly handled.

WDYT?

Toggle quote (11 lines)> This only works for decompression of single-member archive, but the
> documentation does not say that.
>
>     (match (get-bytevector-n port (lz-decompress-write-size decoder))
>       ((? eof-object? eof) eof)
>       (bv (lz-decompress-write decoder bv)))
>
>
> In the above if lz-decompress-write-size returns 0, we won't be reading
> anything (infinite loop?).

We’re calling ‘feed-decoder!’ iff ‘lz-decompress-read’ returned 0; when

that happens ‘lz-decompress-write-size’ must return a strictly positive

number. Otherwise there’s an inconsistency.

Toggle quote (8 lines)

> (match (lz-decompress-read decoder bv start (- count read))

> (0 (if (eof-object? (feed-decoder! decoder))

> read

> (loop read start)))

> I'm not sure I understand the above: if we read nothing, then we try

> again?

No: if we read *something*, we try again; if we read nothing, we return.

Thanks for your careful review, much appreciated!

Ludo’.

Closed

Pierre Neidhardt wrote on 1 Jun 2019 08:02

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880-done@debbugs.gnu.org)

Message-ID:87ef4dzvuy.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (20 lines)> I think the ‘lzread!’ loop should look like this (the tests still pass
> with this):
>
> --8<---------------cut here---------------start------------->8---
>   (let loop ((read 0)
>              (start start))
>     (cond ((< read count)
>            (match (lz-decompress-read decoder bv start (- count read))
>              (0 (cond ((lz-decompress-finished? decoder)
>                        read)
>                       ((eof-object? (feed-decoder! decoder))
>                        (lz-decompress-finish decoder)
>                        (loop read start))
>                       (else                       ;read again
>                        (loop read start))))
>              (n (loop (+ read n) (+ start n)))))
>           (else
>            read)))
> --8<---------------cut here---------------end--------------->8---

Looks good to me!

Toggle quote (10 lines)>>            (match (lz-decompress-read decoder bv start (- count read))
>>              (0 (if (eof-object? (feed-decoder! decoder))
>>                     read
>>                     (loop read start)))
>>
>> I'm not sure I understand the above: if we read nothing, then we try
>> again?
>
> No: if we read *something*, we try again; if we read nothing, we return.

If we read nothing _and_ it is not an EOF (it can be an empty vector),

then we loop indefinitely, no?

Toggle quote (2 lines)

> Thanks for your careful review, much appreciated!

You are welcome, thanks for your invaluable work!

Pierre Neidhardt

https://ambrevar.xyz/

Closed

Ludovic Courtès wrote on 1 Jun 2019 11:41

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:87ef4dk5g8.fsf@gnu.org

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (24 lines)> Ludovic Courtès <ludo@gnu.org> writes:
>
>> I think the ‘lzread!’ loop should look like this (the tests still pass
>> with this):
>>
>> --8<---------------cut here---------------start------------->8---
>>   (let loop ((read 0)
>>              (start start))
>>     (cond ((< read count)
>>            (match (lz-decompress-read decoder bv start (- count read))
>>              (0 (cond ((lz-decompress-finished? decoder)
>>                        read)
>>                       ((eof-object? (feed-decoder! decoder))
>>                        (lz-decompress-finish decoder)
>>                        (loop read start))
>>                       (else                       ;read again
>>                        (loop read start))))
>>              (n (loop (+ read n) (+ start n)))))
>>           (else
>>            read)))
>> --8<---------------cut here---------------end--------------->8---
>
> Looks good to me!

OK, committed!

Toggle quote (13 lines)>>>            (match (lz-decompress-read decoder bv start (- count read))
>>>              (0 (if (eof-object? (feed-decoder! decoder))
>>>                     read
>>>                     (loop read start)))
>>>
>>> I'm not sure I understand the above: if we read nothing, then we try
>>> again?
>>
>> No: if we read *something*, we try again; if we read nothing, we return.
>
> If we read nothing _and_ it is not an EOF (it can be an empty vector),
> then we loop indefinitely, no?

‘feed-decoder!’ cannot return an empty bytevector because

‘lz-decompress-write-size’ necessarily returns a strictly positive

integer at this point.

(Imperative programming is hard! :-))

Thanks,

Ludo’.

Closed

Pierre Neidhardt wrote on 1 Jun 2019 11:58

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35880-done@debbugs.gnu.org)

Message-ID:8736ktzkxs.fsf@ambrevar.xyz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (4 lines)

> ‘feed-decoder!’ cannot return an empty bytevector because

> ‘lz-decompress-write-size’ necessarily returns a strictly positive

> integer at this point.

I'm not sure that's true: if the buffer is full and the next

lz-decompress-read does not read anything, then the buffer will still be

full and lz-decompress-write-size will return 0.

The specs don't guarantee that lz-decompress-read will always read

something.

But that's the only assumption we are making I believe, and it's fair :)

Pierre Neidhardt

https://ambrevar.xyz/

Closed

Ludovic Courtès wrote on 1 Jun 2019 14:21

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 35880-done@debbugs.gnu.org)

Message-ID:8736ktijha.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

Toggle quote (10 lines)> Ludovic Courtès <ludo@gnu.org> writes:
>
>> ‘feed-decoder!’ cannot return an empty bytevector because
>> ‘lz-decompress-write-size’ necessarily returns a strictly positive
>> integer at this point.
>
> I'm not sure that's true: if the buffer is full and the next
> lz-decompress-read does not read anything, then the buffer will still be
> full and lz-decompress-write-size will return 0.

Hmm I don’t think that’s a reasonable scenario, otherwise it would mean

that you’re in a deadlock anyway (decoder buffer is already full yet it

does not contain enough data to actually decode anything.)

Dunno, I’m rather confident here but we’ll see if that causes any

troubles.

Thanks,

Ludo’.

Closed

Your comment

This issue is archived.

To comment on this conversation send an email to 35880@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it

mumi current 35880

Then, you may apply the latest patchset in this issue (with sign off)

mumi am -- -s

Or, compose a reply to this issue

mumi compose

Or, send patches to this issue

mumi send-email *.patch

is:open	open issues
is:done	closed issues
submitter:<who>	search issue submitter
author:<who>	search by message author
date:yesterday..now	search by issue date
mdate:3m..2d	search by message date