[PATCH 1/2] Add (guix lzlib).

Done

3 participants

Ludovic Courtès
Pierre Neidhardt
Tobias Geerinckx-Rice

Owner: unassigned

Submitted by: Pierre Neidhardt

Severity: normal

Pierre Neidhardt wrote on 10 Mar 2019 19:02

Recipients:(address . guix-patches@gnu.org)

Message-ID:20190310180209.11578-1-mail@ambrevar.xyz

* guix/lzlib.scm, tests/lzlib.scm: New files.
* Makefile.am (MODULES): Add guix/lzlib.scm.
(SCM_TESTS): Add tests/lzlib.scm.
* m4/guix.m4 (GUIX_LIBLZ_LIBDIR): New macro.
* configure.ac (LIBLZ_LIBDIR): Use it.  Define and substitute
'LIBLZ'.
* guix/config.scm.in (%liblz): New variable.
---
 Makefile.am        |   2 +
 configure.ac       |  11 +
 guix/config.scm.in |   7 +-
 guix/lzlib.scm     | 592 +++++++++++++++++++++++++++++++++++++++++++++
 m4/guix.m4         |  12 +
 tests/lzlib.scm    |  62 +++++
 6 files changed, 685 insertions(+), 1 deletion(-)
 create mode 100644 guix/lzlib.scm
 create mode 100644 tests/lzlib.scm

diff --git a/Makefile.am b/Makefile.am
index cf35770ba7..fd48c57a8d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -101,6 +101,7 @@ MODULES =					\
   guix/cve.scm					\
   guix/workers.scm				\
   guix/zlib.scm					\
+  guix/lzlib.scm				\
   guix/build-system.scm				\
   guix/build-system/android-ndk.scm		\
   guix/build-system/ant.scm			\
@@ -389,6 +390,7 @@ SCM_TESTS =					\
   tests/cve.scm					\
   tests/workers.scm				\
   tests/zlib.scm				\
+  tests/lzlib.scm				\
   tests/file-systems.scm			\
   tests/uuid.scm				\
   tests/system.scm				\
diff --git a/configure.ac b/configure.ac
index 5d70de4beb..edfe807ddd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -258,6 +258,17 @@ AC_MSG_CHECKING([for zlib's shared library name])
 AC_MSG_RESULT([$LIBZ])
 AC_SUBST([LIBZ])
 
+dnl Library name of lzlib suitable for 'dynamic-link'.
+GUIX_LIBLZ_LIBDIR([liblz_libdir])
+if test "x$liblz_libdir" = "x"; then
+  LIBLZ="liblz"
+else
+  LIBLZ="$liblz_libdir/liblz"
+fi
+AC_MSG_CHECKING([for lzlib's shared library name])
+AC_MSG_RESULT([$LIBLZ])
+AC_SUBST([LIBLZ])
+
 dnl Check for Guile-SSH, for the (guix ssh) module.
 GUIX_CHECK_GUILE_SSH
 AM_CONDITIONAL([HAVE_GUILE_SSH],
diff --git a/guix/config.scm.in b/guix/config.scm.in
index d2ec9921c6..0808947ddd 100644
--- a/guix/config.scm.in
+++ b/guix/config.scm.in
@@ -37,7 +37,8 @@
             %libz
             %gzip
             %bzip2
-            %xz))
+            %xz
+            %liblz))
 
 ;;; Commentary:
 ;;;
@@ -103,4 +104,8 @@
 (define %xz
   "@XZ@")
 
+(define %liblz
+  ;; TODO: Set this dynamically.
+  "/gnu/store/8db7vivi8p9mpkbphb8xy8gh2bkwc4iz-lzlib-1.11/lib/liblz")
+
 ;;; config.scm ends here
diff --git a/guix/lzlib.scm b/guix/lzlib.scm
new file mode 100644
index 0000000000..abab3f761c
--- /dev/null
+++ b/guix/lzlib.scm
@@ -0,0 +1,592 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2019 Pierre Neidhardt <mail@ambrevar.xyz>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (guix lzlib)
+  #:use-module (rnrs bytevectors)
+  #:use-module (rnrs arithmetic bitwise)
+  #:use-module (ice-9 binary-ports)
+  #:use-module (ice-9 match)
+  #:use-module (system foreign)
+  #:use-module (guix config)
+  #:export (lzlib-available?
+            make-lzip-input-port
+            make-lzip-output-port
+            call-with-lzip-input-port
+            call-with-lzip-output-port
+            %default-member-length-limit
+            %default-compression-level))
+
+;;; Commentary:
+;;;
+;;; Bindings to the lzlib / liblz API.
+;;;
+;;; Code:
+
+(define %lzlib
+  ;; File name of lzlib's shared library.  When updating via 'guix pull',
+  ;; '%liblz' might be undefined so protect against it.
+  (delay (dynamic-link (if (defined? '%liblz)
+                           %liblz
+                           "liblz"))))
+
+(define (lzlib-available?)
+  "Return true if lzlib is available, #f otherwise."
+  (false-if-exception (force %lzlib)))
+
+(define (lzlib-procedure ret name parameters)
+  "Return a procedure corresponding to C function NAME in liblz, or #f if
+either lzlib or the function could not be found."
+  (match (false-if-exception (dynamic-func name (force %lzlib)))
+    ((? pointer? ptr)
+     (pointer->procedure ret ptr parameters))
+    (#f
+     #f)))
+
+(define-wrapped-pointer-type <lz-decoder>
+  ;; Scheme counterpart of the 'LZ_Decoder' opaque type.
+  lz-decoder?
+  pointer->lz-decoder
+  lz-decoder->pointer
+  (lambda (obj port)
+    (format port "#<lz-decoder ~a>"
+            (number->string (object-address obj) 16))))
+
+(define-wrapped-pointer-type <lz-encoder>
+  ;; Scheme counterpart of the 'LZ_Encoder' opaque type.
+  lz-encoder?
+  pointer->lz-encoder
+  lz-encoder->pointer
+  (lambda (obj port)
+    (format port "#<lz-encoder ~a>"
+            (number->string (object-address obj) 16))))
+
+(define %error-number-ok
+  ;; TODO: How do we get the values of a C enum?
+  0)
+
+
+;; Compression bindings.
+
+(define lz-compress-open
+  (let ((proc (lzlib-procedure '* "LZ_compress_open" (list int int uint64))))
+    ;; TODO: member-size default is INT64_MAX.  Is there a better way to do this with Guile?
+    (lambda* (dictionary-size match-length-limit #:optional (member-size #x7FFFFFFFFFFFFFFF))
+      "Initializes the internal stream state for compression and returns a
+pointer that can only be used as the encoder argument for the other
+lz-compress functions, or a null pointer if the encoder could not be
+allocated.
+
+See the manual: (lzlib) Compression functions."
+      (let ((encoder-ptr (proc dictionary-size match-length-limit member-size)))
+        (if (not (= (lz-compress-error encoder-ptr) -1))
+            (pointer->lz-encoder encoder-ptr)
+            (throw 'lzlib-error 'lz-compress-open))))))
+
+(define lz-compress-close
+  (let ((proc (lzlib-procedure int "LZ_compress_close" '(*))))
+    (lambda (encoder)
+      "Close encoder.  ENCODER can no longer be used as an argument to any
+lz-compress function. "
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-close ret)
+            ret)))))
+
+(define lz-compress-finish
+  (let ((proc (lzlib-procedure int "LZ_compress_finish" '(*))))
+    (lambda (encoder)
+      "Use this function to tell that all the data for this member have
+already been written (with the `lz-compress-write' function).  It is safe to
+call `lz-compress-finish' as many times as needed.  After all the produced
+compressed data have been read with `lz-compress-read' and
+`lz-compress-member-finished?' returns #t, a new member can be started with
+'lz-compress-restart-member'."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-compress-finish (lz-compress-error encoder)))))))
+
+(define lz-compress-restart-member
+  (let ((proc (lzlib-procedure int "LZ_compress_restart_member" (list '* uint64))))
+    (lambda (encoder member-size)
+      "Use this function to start a new member in a multimember data stream.
+Call this function only after `lz-compress-member-finished?' indicates that the
+current member has been fully read (with the `lz-compress-read' function)."
+      (let ((ret (proc (lz-encoder->pointer encoder) member-size)))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-compress-restart-member
+                 (lz-compress-error encoder)))))))
+
+(define lz-compress-sync-flush
+  (let ((proc (lzlib-procedure int "LZ_compress_sync_flush" (list '*))))
+    (lambda (encoder)
+      "Use this function to make available to `lz-compress-read' all the data
+already written with the `lz-compress-write' function.  First call
+`lz-compress-sync-flush'.  Then call 'lz-compress-read' until it returns 0.
+
+Repeated use of `lz-compress-sync-flush' may degrade compression ratio,
+so use it only when needed. "
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-compress-sync-flush
+                 (lz-compress-error encoder)))))))
+
+(define lz-compress-read
+  (let ((proc (lzlib-procedure int "LZ_compress_read" (list '* '* int))))
+    (lambda* (encoder lzfile-bv #:optional (start 0) (count (bytevector-length lzfile-bv)))
+      "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.
+Return the number of compressed bytes read, a non-negative integer."
+      (let ((ret (proc (lz-encoder->pointer encoder)
+                       (bytevector->pointer lzfile-bv start)
+                       count)))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-read (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-write
+  (let ((proc (lzlib-procedure int "LZ_compress_write" (list '* '* int))))
+    (lambda* (encoder bv #:optional (start 0) (count (bytevector-length bv)))
+      "Write up to COUNT bytes from BV to the encoder stream.  Return the
+number of uncompressed bytes written, a non-negative integer."
+      (let ((ret (proc (lz-encoder->pointer encoder)
+                       (bytevector->pointer bv start)
+                       count)))
+        (if (< ret 0)
+            (throw 'lzlib-error 'lz-compress-write (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-write-size
+  (let ((proc (lzlib-procedure int "LZ_compress_write_size" '(*))))
+    (lambda (encoder)
+      "The maximum number of bytes that can be immediately written through the
+`lz-compress-write' function.
+
+It is guaranteed that an immediate call to `lz-compress-write' will accept a
+SIZE up to the returned number of bytes. "
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-write-size (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-error
+  (let ((proc (lzlib-procedure int "LZ_compress_errno" '(*))))
+    (lambda (encoder)
+      "ENCODER can be a Scheme object or a pointer."
+      (let* ((error-number (proc (if (lz-encoder? encoder)
+                                     (lz-encoder->pointer encoder)
+                                     encoder))))
+        error-number))))
+
+(define lz-compress-finished?
+  (let ((proc (lzlib-procedure int "LZ_compress_finished" '(*))))
+    (lambda (encoder)
+      "Return #t if all the data have been read and `lz-compress-close' can
+be safely called. Otherwise return #f."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (match ret
+          (1 #t)
+          (0 #f)
+          (_ (throw 'lzlib-error 'lz-compress-finished? (lz-compress-error encoder))))))))
+
+(define lz-compress-member-finished?
+  (let ((proc (lzlib-procedure int "LZ_compress_member_finished" '(*))))
+    (lambda (encoder)
+      "Return #t if the current member, in a multimember data stream, has
+been fully read and 'lz-compress-restart-member' can be safely called.
+Otherwise return #f."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (match ret
+          (1 #t)
+          (0 #f)
+          (_ (throw 'lzlib-error 'lz-compress-member-finished? (lz-compress-error encoder))))))))
+
+(define lz-compress-data-position
+  (let ((proc (lzlib-procedure uint64 "LZ_compress_data_position" '(*))))
+    (lambda (encoder)
+      "Return the number of input bytes already compressed in the current
+member."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-data-position
+                   (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-member-position
+  (let ((proc (lzlib-procedure uint64 "LZ_compress_member_position" '(*))))
+    (lambda (encoder)
+      "Return the number of compressed bytes already produced, but perhaps
+not yet read, in the current member."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-member-position
+                   (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-total-in-size
+  (let ((proc (lzlib-procedure uint64 "LZ_compress_total_in_size" '(*))))
+    (lambda (encoder)
+      "Return the total number of input bytes already compressed."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-total-in-size
+                   (lz-compress-error encoder))
+            ret)))))
+
+(define lz-compress-total-out-size
+  (let ((proc (lzlib-procedure uint64 "LZ_compress_total_out_size" '(*))))
+    (lambda (encoder)
+      "Return the total number of compressed bytes already produced, but
+perhaps not yet read."
+      (let ((ret (proc (lz-encoder->pointer encoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-compress-total-out-size
+                   (lz-compress-error encoder))
+            ret)))))
+
+
+;; Decompression bindings.
+
+(define lz-decompress-open
+  (let ((proc (lzlib-procedure '* "LZ_decompress_open" '())))
+    (lambda ()
+      "Initializes the internal stream state for decompression and returns a
+pointer that can only be used as the decoder argument for the other
+lz-decompress functions, or a null pointer if the decoder could not be
+allocated.
+
+See the manual: (lzlib) Decompression functions."
+      (let ((decoder-ptr (proc)))
+        (if (not (= (lz-decompress-error decoder-ptr) -1))
+            (pointer->lz-decoder decoder-ptr)
+            (throw 'lzlib-error 'lz-decompress-open))))))
+
+(define lz-decompress-close
+  (let ((proc (lzlib-procedure int "LZ_decompress_close" '(*))))
+    (lambda (decoder)
+      "Close decoder.  DECODER can no longer be used as an argument to any
+lz-decompress function. "
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-decompress-close ret)
+            ret)))))
+
+(define lz-decompress-finish
+  (let ((proc (lzlib-procedure int "LZ_decompress_finish" '(*))))
+    (lambda (decoder)
+      "Use this function to tell that all the data for this stream
+have already been written (with the `lz-decompress-write' function).  It is
+safe to call `lz-decompress-finish' as many times as needed."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-decompress-finish (lz-decompress-error decoder)))))))
+
+(define lz-decompress-reset
+  (let ((proc (lzlib-procedure int "LZ_decompress_reset" '(*))))
+    (lambda (decoder)
+      "Resets the internal state of DECODER as it was just after opening it
+with the `lz-decompress-open' function.  Data stored in the internal buffers
+is discarded.  Position counters are set to 0."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-decompress-reset
+                 (lz-decompress-error decoder)))))))
+
+(define lz-decompress-sync-to-member
+  (let ((proc (lzlib-procedure int "LZ_decompress_sync_to_member" '(*))))
+    (lambda (decoder)
+      "Resets the error state of DECODER and enters a search state that lasts
+until a new member header (or the end of the stream) is found.  After a
+successful call to `lz-decompress-sync-to-member', data written with
+`lz-decompress-write' will be consumed and 'lz-decompress-read' will return 0
+until a header is found.
+
+This function is useful to discard any data preceding the first member, or to
+discard the rest of the current member, for example in case of a data
+error.  If the decoder is already at the beginning of a member, this function
+does nothing."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (when (= ret -1)
+          (throw 'lzlib-error 'lz-decompress-sync-to-member
+                 (lz-decompress-error decoder)))))))
+
+(define lz-decompress-read
+  (let ((proc (lzlib-procedure int "LZ_decompress_read" (list '* '* int))))
+    (lambda* (decoder file-bv #:optional (start 0) (count (bytevector-length file-bv)))
+      "Read up to COUNT bytes from the decoder stream, storing the results in FILE-BV.
+Return the number of uncompressed bytes read, a non-negative integer."
+      (let ((ret (proc (lz-decoder->pointer decoder)
+                       (bytevector->pointer file-bv start)
+                       count)))
+        (if (< ret 0)
+            (throw 'lzlib-error 'lz-decompress-read (lz-decompress-error decoder))
+            ret)))))
+
+(define lz-decompress-write
+  (let ((proc (lzlib-procedure int "LZ_decompress_write" (list '* '* int))))
+    (lambda* (decoder bv #:optional (start 0) (count (bytevector-length bv)))
+      "Write up to COUNT bytes from BV to the decoder stream.  Return the
+number of compressed bytes written, a non-negative integer."
+      (let ((ret (proc (lz-decoder->pointer decoder)
+                       (bytevector->pointer bv start)
+                       count)))
+        (if (< ret 0)
+            (throw 'lzlib-error 'lz-decompress-write (lz-decompress-error decoder))
+            ret)))))
+
+(define lz-decompress-write-size
+  (let ((proc (lzlib-procedure int "LZ_decompress_write_size" '(*))))
+    (lambda (decoder)
+      "Return the maximum number of bytes that can be immediately written
+through the `lz-decompress-write' function.
+
+It is guaranteed that an immediate call to `lz-decompress-write' will accept a
+SIZE up to the returned number of bytes. "
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-decompress-write-size (lz-decompress-error decoder))
+            ret)))))
+
+(define lz-decompress-error
+  (let ((proc (lzlib-procedure int "LZ_decompress_errno" '(*))))
+    (lambda (decoder)
+      "DECODER can be a Scheme object or a pointer."
+      (let* ((error-number (proc (if (lz-decoder? decoder)
+                                     (lz-decoder->pointer decoder)
+                                     decoder))))
+        error-number))))
+
+(define lz-decompress-finished?
+  (let ((proc (lzlib-procedure int "LZ_decompress_finished" '(*))))
+    (lambda (decoder)
+      "Return #t if all the data have been read and `lz-decompress-close' can
+be safely called.  Otherwise return #f."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (match ret
+          (1 #t)
+          (0 #f)
+          (_ (throw 'lzlib-error 'lz-decompress-finished? (lz-decompress-error decoder))))))))
+
+(define lz-decompress-member-finished?
+  (let ((proc (lzlib-procedure int "LZ_decompress_member_finished" '(*))))
+    (lambda (decoder)
+      "Return #t if the current member, in a multimember data stream, has
+been fully read and `lz-decompress-restart-member' can be safely called.
+Otherwise return #f."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (match ret
+          (1 #t)
+          (0 #f)
+          (_ (throw 'lzlib-error 'lz-decompress-member-finished? (lz-decompress-error decoder))))))))
+
+(define lz-decompress-member-version
+  (let ((proc (lzlib-procedure int "LZ_decompress_member_version" '(*))))
+    (lambda (decoder)
+      "Return the version of current member from member header."
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+        (if (= ret -1)
+            (throw 'lzlib-error 'lz-decompress-member-version
+                   (lz-decompress-error decoder))
+            ret)))))
+
+(define lz-decompress-dictionary-size
+  (let ((proc (lzlib-procedure int "LZ_decompress_dictionary_size" '(*))))
+    (lambda (decoder)
+      (let ((ret (proc (lz-decoder->pointer decoder))))
+   

[The rest of this message was truncated.]

Pierre Neidhardt wrote on 10 Mar 2019 19:09

[PATCH 2/2] dir-locals.el: Add 'call-with-lzip-input-port' and 'call-with-lzip-output-port' keywords.

Recipients:(address . 34807@debbugs.gnu.org)

Message-ID:20190310180905.14459-1-mail@ambrevar.xyz

* .dir-locals.el: Add indentation rules for 'call-with-lzip-input-port' and
'call-with-lzip-output-port'.
---
 .dir-locals.el | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.dir-locals.el b/.dir-locals.el
index 550e06ef09..f1196fd781 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -53,6 +53,8 @@
    (eval . (put 'call-with-decompressed-port 'scheme-indent-function 2))
    (eval . (put 'call-with-gzip-input-port 'scheme-indent-function 1))
    (eval . (put 'call-with-gzip-output-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-input-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-output-port 'scheme-indent-function 1))
    (eval . (put 'signature-case 'scheme-indent-function 1))
    (eval . (put 'emacs-batch-eval 'scheme-indent-function 0))
    (eval . (put 'emacs-batch-edit-file 'scheme-indent-function 1))
-- 
2.20.1

Ludovic Courtès wrote on 22 Mar 2019 22:35

Re: [bug#34807] [PATCH 1/2] Add (guix lzlib).

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 34807@debbugs.gnu.org)

Message-ID:8736ne3855.fsf@gnu.org

Hello,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> * guix/lzlib.scm, tests/lzlib.scm: New files.
> * Makefile.am (MODULES): Add guix/lzlib.scm.
> (SCM_TESTS): Add tests/lzlib.scm.
> * m4/guix.m4 (GUIX_LIBLZ_LIBDIR): New macro.
> * configure.ac (LIBLZ_LIBDIR): Use it.  Define and substitute
> 'LIBLZ'.
> * guix/config.scm.in (%liblz): New variable.

This looks really nice!

Please update ‘make-config.scm’ in (guix self) so that it defines
‘%liblz’ as well (setting it to #f for now).

> +(define %liblz
> +  ;; TODO: Set this dynamically.
> +  "/gnu/store/8db7vivi8p9mpkbphb8xy8gh2bkwc4iz-lzlib-1.11/lib/liblz")

You can already put "@LIBLZ@" here.
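
For reference, a minimal sketch of what that definition could look like, mirroring the existing `%xz` entry in guix/config.scm.in. It assumes that `configure` substitutes `LIBLZ` through `AC_SUBST`, as the configure.ac hunk of the patch does:

```scheme
;; guix/config.scm.in (template; 'configure' replaces @LIBLZ@).
(define %liblz
  ;; File name of lzlib's shared library, suitable for 'dynamic-link'.
  "@LIBLZ@")
```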

> +(define %lzlib
> +  ;; File name of lzlib's shared library.  When updating via 'guix pull',
> +  ;; '%liblz' might be undefined so protect against it.

Updating ‘make-config.scm’ will fix it.

> +(define %error-number-ok
> +  ;; TODO: How do we get the values of a C enum?

See the thread on guix-devel.

> +(define lz-compress-open
> +  (let ((proc (lzlib-procedure '* "LZ_compress_open" (list int int uint64))))
> +    ;; TODO: member-size default is INT64_MAX.  Is there a better way to do this with Guile?
> +    (lambda* (dictionary-size match-length-limit #:optional (member-size #x7FFFFFFFFFFFFFFF))

You could write (- (expt 2 63) 1) I guess for clarity, but what you wrote is OK.

Is it also the case on 32-bit platforms?
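
As a quick sketch of the two spellings: Guile integers are exact and arbitrary-precision, so both forms denote the same number. The name `%int64-max` below is made up for illustration and is not part of the patch. The `uint64` foreign type is 64 bits wide on every platform, so the literal itself does not depend on the word size; whether it matches the C prototype on 32-bit systems is the open question.

```scheme
;; Hypothetical name, not part of the patch.
(define %int64-max
  ;; Largest value of a signed 64-bit integer (INT64_MAX in C).
  (- (expt 2 63) 1))

;; Compare with the hexadecimal literal used in the patch.
(display (= %int64-max #x7FFFFFFFFFFFFFFF))
(newline)
```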

> +(define lz-compress-finish
> +  (let ((proc (lzlib-procedure int "LZ_compress_finish" '(*))))
> +    (lambda (encoder)
> +      "Use this function to tell that all the data for this member have
> +already been written (with the `lz-compress-write' function).  It is safe to
> +call `lz-compress-finish' as many times as needed.  After all the produced
> +compressed data have been read with `lz-compress-read' and
> +`lz-compress-member-finished?' returns #t, a new member can be started with
> +'lz-compress-restart-member'."

For docstrings, the convention in GNU and Guix is to use the imperative
mood and to explicitly refer to the arguments there, like:

  "Tell ENCODER that all the data for this member have already been
  written. …"

(Same for the other docstrings that start with “Use this function.”)
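
A sketch of that convention applied to one procedure. The body is a stub for illustration only (the real procedure calls into liblz), and the second docstring sentence is adapted from the patch:

```scheme
;; Stub for illustration only: the real procedure calls into liblz.
(define (lz-compress-finish encoder)
  "Tell ENCODER that all the data for this member have already been
written.  It is safe to call this procedure as many times as needed."
  (if #f #f))

;; The docstring is what 'procedure-documentation' and ,describe show.
(display (procedure-documentation lz-compress-finish))
(newline)
```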

> +(define* (lzread! decoder file-port bv
> +                 #:optional (start 0) (count (bytevector-length bv)))
> +  "Read up to COUNT bytes from FILE-PORT into BV at offset START.  Return the
> +number of uncompressed bytes actually read; it is zero if COUNT is zero or if
> +the end-of-stream has been reached."
> +  (let* ((written 0)
> +         (read 0)
> +         (chunk (* 64 1024))
> +         (file-bv (get-bytevector-n file-port count)))
> +    (if (eof-object? file-bv)
> +        0
> +        (begin
> +          (while (and (< 0 (lz-decompress-write-size decoder))
> +                      (< written (bytevector-length file-bv)))
> +            (set! written (lz-decompress-write decoder file-bv written (- (bytevector-length file-bv) written))))
> +          ;; TODO: When should we call `lz-decompress-finish'?
> +          ;; (lz-decompress-finish decoder)
> +          ;; TODO: Loop?
> +          (set! read (lz-decompress-read decoder bv start
> +                                         (- (bytevector-length bv) start)))

It’s worth figuring out. :-)

Are there examples or the code of the actual ‘lzip’ command that could help?

> +dnl GUIX_LIBLZ_LIBDIR VAR
> +dnl
> +dnl Attempt to determine liblz's LIBDIR; store the result in VAR.
> +AC_DEFUN([GUIX_LIBLZ_LIBDIR], [
> +  AC_REQUIRE([PKG_PROG_PKG_CONFIG])
> +  AC_CACHE_CHECK([lzlib's library directory],
> +    [guix_cv_liblz_libdir],
> +    dnl TODO: This fails because lzlib has no pkg-config.
> +    [guix_cv_liblz_libdir="`$PKG_CONFIG lzlib --variable=libdir 2> /dev/null`"])
> +  $1="$guix_cv_liblz_libdir"
> +])

I think you could do something like this in the body of ‘AC_CACHE_CHECK’
(untested):

  old_LIBS="$LIBS"
  LIBS="-llz"
  AC_LINK_IFELSE([LZ_decompress_open();],
    [guix_cv_liblz_libdir="`ldd conftest$EXEEXT | grep liblz | sed '-es/.*=> \(.*\) .*$/\1/g'`"])
  LIBS="$old_LIBS"

Regarding testing, it’s easy to get this sort of binding subtly wrong
IME. :-)  I’d suggest looking at things like:

  1. Passing short input bytevectors, large input bytevectors, and input
     that’s equal to liblz’s internal buffer size or off by one.

  2. File descriptors: strace guile while doing
     ‘call-with-lzip-input-port’ and ‘call-with-lzip-output-port’ and
     make sure that file descriptors are not left open and are not
     closed prematurely either.  (This is particularly important for
     long-running processes like ‘guix publish’.)
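
The first point could be sketched as below. This is only an outline under stated assumptions: the 64 KiB buffer size is a guess borrowed from the chunk size in `lzread!`, and `identity-round-trip` is a placeholder to be swapped for a procedure that really compresses and decompresses through the lzip ports.

```scheme
(use-modules (rnrs bytevectors)
             (srfi srfi-1))

;; Assumed internal buffer size; the real value should come from liblz.
(define %assumed-buffer-size (* 64 1024))

(define %test-sizes
  ;; Short inputs, a large input, and the buffer size plus or minus one.
  (list 0 1 42
        (- %assumed-buffer-size 1)
        %assumed-buffer-size
        (+ %assumed-buffer-size 1)
        (* 5 %assumed-buffer-size)))

(define (make-test-data size)
  "Return a bytevector of SIZE bytes filled with a deterministic pattern."
  (let ((bv (make-bytevector size)))
    (let loop ((i 0))
      (when (< i size)
        (bytevector-u8-set! bv i (modulo (* i 31) 256))
        (loop (+ i 1))))
    bv))

(define (round-trip-ok? round-trip size)
  "Return #t if ROUND-TRIP, a procedure that compresses and then
decompresses a bytevector, gives back SIZE bytes of test data unchanged."
  (let ((data (make-test-data size)))
    (bytevector=? data (round-trip data))))

;; Placeholder: swap in a procedure built on the lzip ports.
(define (identity-round-trip bv)
  (bytevector-copy bv))

(display (every (lambda (size)
                  (round-trip-ok? identity-round-trip size))
                %test-sizes))
(newline)
```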

But overall, modulo the small issues above, it looks pretty much ready
to me.

Thank you!

Ludo’.

Pierre Neidhardt wrote on 1 May 2019 18:46

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34807@debbugs.gnu.org)

Message-ID:87pnp2f7gr.fsf@ambrevar.xyz

Hi,

thanks for the review.  I've worked on it and I've managed to address
almost all issues.

Now I'm stuck with the stream decompression.

Lzip expects some special terminating bytes for each member.  In
tests/lzlib.scm, we produce a compressed stream and decompress it in
parallel.  But more often than not, before the compression is done, the
decompression will exhaust the port's bytes and terminate prematurely.  I
don't know what to do in this case.  From the Guile manual:

 -- Scheme Procedure: make-custom-binary-input-port id read!
          get-position set-position! close
     Return a new custom binary input port(1) named ID (a string) whose
     input is drained by invoking READ! and passing it a bytevector, an
     index where bytes should be written, and the number of bytes to
     read.  The ‘read!’ procedure must return an integer indicating the
     number of bytes read, or ‘0’ to indicate the end-of-file.

The decompression will sometimes decompress 0 bytes (when it's faster
than the compression).  But if I return 0 in lzread!, then the custom
port will be closed too early, before we could decompress the
terminating bytes.

Is there a way to wait on the port instead of reading 0 bytes?

Note that lzip can test whether the decompressed stream is terminated or
not with lz-decompress-member-finished?.
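
The end-of-file convention from the manual excerpt can be seen in isolation with a small sketch that involves no lzlib at all. `bytevector->custom-port` is a made-up helper for illustration. Once its `read!` returns 0, the port reports end-of-file, which is why a `read!` that returns 0 while the decoder merely has nothing ready yet would presumably be taken as the end of the stream.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Made-up helper for illustration; not part of the patch.
(define (bytevector->custom-port source)
  "Return a custom binary input port that serves the bytes of SOURCE."
  (define position 0)
  (define (read! bv start count)
    ;; Copy at most COUNT bytes into BV at START.  Returning 0 tells the
    ;; port machinery that end-of-file was reached.
    (let ((n (min count (- (bytevector-length source) position))))
      (bytevector-copy! source position bv start n)
      (set! position (+ position n))
      n))
  (make-custom-binary-input-port "sketch" read! #f #f #f))

(define port (bytevector->custom-port (u8-list->bytevector '(1 2 3))))

;; Drain the port, then check that a further read yields the EOF object.
(display (get-bytevector-all port))
(newline)
(display (eof-object? (get-u8 port)))
(newline)
```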

Pierre Neidhardt

https://ambrevar.xyz/

Pierre Neidhardt wrote on 2 May 2019 11:16

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34807@debbugs.gnu.org)

Message-ID:8736lxdxn6.fsf@ambrevar.xyz

OK, I think I've figured it out.  The issue above was a red herring.
I think I've got it to work, I need to do more testing though.
Stay tuned.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

Ludovic Courtès wrote on 4 May 2019 11:11

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 34807@debbugs.gnu.org)

Message-ID:87lfzm7fdz.fsf@gnu.org

Hi,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> OK, I think I've figured it out.  The issue above was a red herring.
> I think I've got it to work, I need to do more testing though.
> Stay tuned.

OK. :-)

Good to see progress on this front!

Ludo’.

Pierre Neidhardt wrote on 4 May 2019 12:23

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34807@debbugs.gnu.org)

Message-ID:878svm5xic.fsf@ambrevar.xyz

Right on time, I just finished it!

- I've been in touch with Antonio, Lzip's maintainer, for more than a
  week and now I'm confident that I have a decent understanding of the
  library.

- Your m4 suggestion didn't work.  I've included a comment.  We need to
  fix it before merging.  I'm not the right person for this job I'm
  afraid :p  Ludo?

- The convenience functions do not support multi-member archives.
  Multi-member archives are mostly useful for parallelization, but we
  don't use that in Guix, so it's OK.  Should it be required some day,
  we would need to implement it, which requires a little bit more work.
  I've documented all that.

- The implementation of lzread! is subpar because I understood a
  subtlety a bit too late.  But that's alright, it does not affect
  performance nor reliability.

- I've included 11 tests covering all your suggestions.

- I haven't strace'd the Guile process.  The code regarding ports is
  identical to zlib.scm, so it's unlikely there would be an issue in
  this area.  I have never done this before, so out of curiosity, how do
  you run a specific Guix tests without going through `make'?

Next steps? :D

Pierre Neidhardt

https://ambrevar.xyz/

Attachment: 0001-Add-guix-lzlib.patch

From 7dd8f4207657ae7ad178c21a45f74bef6cc0a314 Mon Sep 17 00:00:00 2001
From: Pierre Neidhardt <mail@ambrevar.xyz>
Date: Sun, 10 Mar 2019 16:40:41 +0100
Subject: [PATCH 2/2] dir-locals.el: Add 'call-with-lzip-input-port' and
 'call-with-lzip-output-port' keywords.

* .dir-locals.el: Add indentation rules for 'call-with-lzip-input-port' and
'call-with-lzip-output-port'.
---
 .dir-locals.el | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.dir-locals.el b/.dir-locals.el
index 550e06ef09..f1196fd781 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -53,6 +53,8 @@
    (eval . (put 'call-with-decompressed-port 'scheme-indent-function 2))
    (eval . (put 'call-with-gzip-input-port 'scheme-indent-function 1))
    (eval . (put 'call-with-gzip-output-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-input-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-output-port 'scheme-indent-function 1))
    (eval . (put 'signature-case 'scheme-indent-function 1))
    (eval . (put 'emacs-batch-eval 'scheme-indent-function 0))
    (eval . (put 'emacs-batch-edit-file 'scheme-indent-function 1))
-- 
2.21.0

Ludovic Courtès wrote on 4 May 2019 23:09

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 34807@debbugs.gnu.org)

Message-ID:87ef5e0vvv.fsf@gnu.org

Hello!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Right on time, I just finished it!
>
> - I've been in touch with Antonio, Lzip's maintainer, for more than a
>   week and now I'm confident that I have a decent understanding of the
>   library.
>
> - Your m4 suggestion didn't work.  I've included a comment.  We need to
>   fix it before merging.  I'm not the right person for this job I'm
>   afraid :p  Ludo?

Sure, I can do it.

> - The convenience functions do not support multi-member archives.
>   Multi-member archives are mostly useful for parallelization, but we
>   don't use that in Guix, so it's OK.  Should it be required some day,
>   we would need to implement it, which requires a little bit more work.
>   I've documented all that.
>
> - The implementation of lzread! is subpar because I understood a
>   subtlety a bit too late.  But that's alright, it does not affect
>   performance nor reliability.
>
> - I've included 11 tests covering all your suggestions.
>
> - I haven't strace'd the Guile process.  The code regarding ports is
>   identical to zlib.scm, so it's unlikely there would be an issue in
>   this area.  I have never done this before, so out of curiosity, how do
>   you run a specific Guix test without going through `make'?
>
> Next steps? :D

This looks all good to me!

I was about to apply it and add the Autoconf machinery, but I thought we
could also make it a separate project that could be beneficial to other
Guilers out there (like we did with Guile-Gcrypt and Guile-Git).

Incidentally that would also avoid the need for adding the ‘%liblz’
variable in (guix config), which simplifies things a bit.

WDYT?

If you want to take that route, I’m happy to help with the Autotools
machinery (or you could use ‘hall’ from the ‘guile-hall’ package to do
that for you.)

If you don’t feel like taking that route (or at least not yet ;-)),
that’s OK for me too, I don’t feel strongly either way.

Thoughts?

Thank you!

Ludo’.

Pierre Neidhardt wrote on 4 May 2019 23:39

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 34807@debbugs.gnu.org)

Message-ID:87o94h527a.fsf@ambrevar.xyz

Hi!

It's definitely the ideal route. Something like guile-compress or
guile-archive, with a high-level abstraction for a collection of
bindings including zlib and lzlib for now.

Sadly I don't have the time for it at the moment. Unless you do (:p) I
suggest we add a TODO item and keep it for later.

Regarding guix publish and the farms, what shall we do?

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 6 May 2019 23:18

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)(address . 34807@debbugs.gnu.org)

Message-ID:87a7fzffip.fsf@gnu.org

Hi Pierre,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> It's definitely the ideal route. Something like guile-compress or
> guile-archive, with a high-level abstraction for a collection of
> bindings including zlib and lzlib for now.
>
> Sadly I don't have the time for it at the moment. Unless you do (:p) I
> suggest we add a TODO item and keep it for later.

Sounds good!

Below are the Autoconf-related changes I made. Committed!

We’ll take care of (guix self) when (guix lzlib) is actually used by
other parts of the code.

> Regarding guix publish and the farms, what shall we do?

I think we should arrange for the client part, ‘guix substitute’, to be
ready to lzip-decode as soon as it talks to an lzip-capable server.

Then we should add support in ‘guix publish’. At some later point, we’d
deploy it on the build farms.

For this migration to be incremental, we need (1) clients to be able to
transparently switch to lzip when it’s available, and (2) servers to be
able to produce both lzip archives (for new clients) and gzip archives
(for old clients) during the transition period.

That’s a bit of work in ‘guix publish’. It’ll be extra CPU and storage
usage on the build farm since during the transition period it’d have to
produce and store both gzip and lzip archives for each store item. I
don’t really see any way around that, though.

A difficulty is that narinfos currently include a fixed compression
scheme:

$ wget -q -O - https://ci.guix.info/nrkm1683p1cqnkcmhlmhiig9q9qd7xqh.narinfo | head -3
StorePath: /gnu/store/nrkm1683p1cqnkcmhlmhiig9q9qd7xqh-sed-4.5
URL: nar/gzip/nrkm1683p1cqnkcmhlmhiig9q9qd7xqh-sed-4.5
Compression: gzip
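
For illustration, the "Key: value" lines shown above can be read with a few lines of Guile. This is a minimal sketch only: `narinfo-fields' and `narinfo-compression' are hypothetical helpers written for this example, not procedures from Guix, and the sketch ignores everything a real narinfo parser must handle (signatures, validation, and so on).

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper (not part of Guix): turn "Key: value" lines into
;; an association list.  Lines without a colon are skipped.
(define (narinfo-fields text)
  (filter-map
   (lambda (line)
     (let ((colon (string-index line #\:)))
       (and colon
            (cons (string-trim-both (substring line 0 colon))
                  (string-trim-both (substring line (+ colon 1)))))))
   (string-split text #\newline)))

;; Return the value of the 'Compression' field, or #f when it is absent.
(define (narinfo-compression text)
  (assoc-ref (narinfo-fields text) "Compression"))

;; The three lines from the transcript above.
(define sample
  "StorePath: /gnu/store/nrkm1683p1cqnkcmhlmhiig9q9qd7xqh-sed-4.5
URL: nar/gzip/nrkm1683p1cqnkcmhlmhiig9q9qd7xqh-sed-4.5
Compression: gzip")

(narinfo-compression sample)
```

A client could dispatch on that value to pick a decoder. Since each narinfo carries exactly one such field, the server has to decide which variant to send.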

So, depending on the client, ‘guix publish’ should return either a
narinfo-for-gzip or a narinfo-for-lzip. To make it possible, new
clients could send an extra HTTP header, say ‘X-Guix-Compression’, that
would specify their preferred compression method(s). ‘guix publish’
would take that into account when replying.
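
The server-side choice described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the comma-separated value format of the proposed ‘X-Guix-Compression’ header, the procedure names, the list of supported methods, and the "nar/lzip/..." URL layout (extrapolated from the "nar/gzip/..." URL in the narinfo above). None of it is code from ‘guix publish’.

```scheme
(use-modules (srfi srfi-1))

;; Methods this hypothetical server can produce, in no particular order.
(define %supported-methods '("lzip" "gzip"))

;; "lzip, gzip" -> ("lzip" "gzip").  HEADER-VALUE is #f when the client
;; did not send the header at all (an old client).
(define (parse-preferences header-value)
  (if header-value
      (filter (lambda (s) (not (string-null? s)))
              (map string-trim-both (string-split header-value #\,)))
      '()))

;; Pick the first client preference the server supports.  Clients that
;; send no header, or only unknown methods, get gzip.
(define (select-compression header-value)
  (or (find (lambda (method) (member method %supported-methods))
            (parse-preferences header-value))
      "gzip"))

;; Build the narinfo 'URL' field for the chosen method (assumed layout).
(define (narinfo-url method hash-and-name)
  (string-append "nar/" method "/" hash-and-name))

(select-compression "lzip, gzip")   ; new client
(select-compression #f)             ; old client, no header
```

Falling back to gzip when the header is missing is what keeps old clients working during the transition period.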

How does that sound?

Thanks,

Ludo’.

Attachment: file

Tobias Geerinckx-Rice wrote on 7 May 2019 01:28

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:87v9ynf9iu.fsf@nckx

Ludo',

Ludovic Courtès wrote:

> So, depending on the client, ‘guix publish’ should return either a
> narinfo-for-gzip or a narinfo-for-lzip.  To make it possible, new
> clients could send an extra HTTP header, say ‘X-Guix-Compression’, that
> would specify their preferred compression method(s).  ‘guix publish’
> would take that into account when replying.

There's a standard[0] HTTP header for that: ‘Accept-Encoding’.

Unfortunately (and for reasons that I cannot fathom), it doesn't
use standard MIME types, but pseudostandard strings like ‘gzip’
and ‘br’. We can boldly add ‘lzip’ to that :-)

Similarly, servers can send ‘Content-Encoding’[1] HTTP headers,
but I don't see a need for it here.
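
To make the ‘Accept-Encoding’ idea concrete, here is a rough Guile sketch that orders the codings in a header value by their optional "q" weight. The procedure names are made up for this example, ‘lzip’ is not a registered content coding (it is the bold addition suggested above), and the parsing is deliberately simplified: it does not treat "*" or "identity" specially.

```scheme
(use-modules (srfi srfi-1))

;; "gzip;q=0.8" -> ("gzip" . 0.8).  A missing or unreadable q parameter
;; counts as weight 1.
(define (parse-coding item)
  (let* ((parts  (map string-trim-both (string-split item #\;)))
         (name   (string-downcase (car parts)))
         (qparam (find (lambda (p) (string-prefix? "q=" p)) (cdr parts)))
         (q      (or (and qparam (string->number (substring qparam 2)))
                     1)))
    (cons name q)))

;; Return coding names sorted by decreasing weight.  Entries with q=0
;; (explicitly refused) and empty entries are dropped; ties keep the
;; order in which the client listed them.
(define (accepted-codings header-value)
  (map car
       (stable-sort
        (filter (lambda (c)
                  (and (not (string-null? (car c)))
                       (> (cdr c) 0)))
                (map parse-coding (string-split header-value #\,)))
        (lambda (a b) (> (cdr a) (cdr b))))))

(accepted-codings "gzip;q=0.5, lzip")
```

A server would then walk that list and pick the first coding it can produce.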

Kind regards,

T G-R

[0]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding
[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding

Pierre Neidhardt wrote on 7 May 2019 09:02

Recipients:(address . 34807@debbugs.gnu.org)

Message-ID:87d0ku21da.fsf@ambrevar.xyz

All good.

I'm very busy (with Next browser) these days, so I won't have much time.
Maybe I can give (1) a shot (lzip-decoding for clients), but I don't think
I'll have time for the guix publish part for a while.

Anyone?

Pierre Neidhardt

https://ambrevar.xyz/

Ludovic Courtès wrote on 7 May 2019 10:19

Recipients:(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)

Message-ID:87sgtq8yns.fsf@gnu.org

Tobias Geerinckx-Rice <me@tobias.gr> skribis:

> Ludovic Courtès wrote:
>> So, depending on the client, ‘guix publish’ should return either a
>> narinfo-for-gzip or a narinfo-for-lzip.  To make it possible, new
>> clients could send an extra HTTP header, say ‘X-Guix-Compression’,
>> that
>> would specify their preferred compression method(s).  ‘guix publish’
>> would take that into account when replying.
>
> There's a standard[0] HTTP header for that: ‘Accept-Encoding’.
>
> Unfortunately (and for reasons that I cannot fathom), it doesn't use
> standard MIME types, but pseudostandard strings like ‘gzip’ and ‘br’.
> We can boldly add ‘lzip’ to that :-)

Well, that’s why I thought about using a new header. :-)

Ludo’.

Ludovic Courtès wrote on 7 May 2019 17:44

Recipients:(name . Pierre Neidhardt)(address . mail@ambrevar.xyz)

Message-ID:87lfzi46cv.fsf@gnu.org

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> I'm very busy (with Next browser) these days, so I won't have much time.
> Maybe I can give (1) a shot (lzip-decoding for clients), but I don't think
> I'll have time for the guix publish part for a while.

I’ll take a look at it, probably after 1.0.1.

Anyway, we can close this issue and open new ones for the remaining
bits.

Ludo’.

Closed

Pierre Neidhardt wrote on 7 May 2019 17:51

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:875zqmz2in.fsf@ambrevar.xyz

OK, feel free to open the corresponding issues and forward me the
messages, I'll see what I can do.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

Closed

This issue is archived.

To comment on this conversation send an email to 34807@debbugs.gnu.org