[PATCH 0/5] Reduce the size of gnu/packages/*.go files

  • Open
Details
2 participants
  • Ludovic Courtès
  • Simon Tournier
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
Ludovic Courtès wrote on 15 Apr 17:27 +0200
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
cover.1713194148.git.ludo@gnu.org
Hello!

As a followup to https://issues.guix.gnu.org/70280, I started looking
into the size of Guix itself, and in particular that of gnu/packages/*.go
files.

What follows is a bunch of tricks to reduce code bloat, achieving ~14%
reduction in the size of .go files (~18% if we look at gnu/packages
alone).

About 60% of those files are in the 64–128 KiB range. Since ELF sections
are currently 64 KiB-aligned (see ‘*lcm-page-size*’ in Guile), we would
save space by ensuring these are sparse files. To do that, we’ll need to
detect holes when restoring nars and/or to change the nar format to
preserve holes, while also ensuring that when the daemon copies files
around, it also preserves holes. Work for later!
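
To make the alignment argument concrete, here is a small sketch of the
arithmetic. It assumes a hypothetical file whose first section ends at
20 KiB and whose next section must start on a 64 KiB boundary. The names
‘align-up’ and ‘padding-after’ are illustrative and are not part of Guile
or Guix.

```scheme
;; Assumed alignment, matching the 64 KiB figure mentioned above.
(define %alignment (* 64 1024))

(define (align-up offset alignment)
  "Return the smallest multiple of ALIGNMENT that is >= OFFSET."
  (* alignment (quotient (+ offset alignment -1) alignment)))

(define (padding-after offset alignment)
  "Return the number of padding bytes between OFFSET and the next boundary."
  (- (align-up offset alignment) offset))

;; A section ending at 20 KiB is followed by 44 KiB of padding, which a
;; sparse file could represent as a hole instead of allocated blocks.
(display (padding-after (* 20 1024) %alignment))
(newline)
```

Under these assumptions, more than two thirds of the first 64 KiB would
be padding, which is the space a hole-preserving store could reclaim.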

Thoughts?

Ludo’.

Ludovic Courtès (5):
records: Do not inline throws for ABI mismatches.
packages: Reduce bloat induced by ‘sanitize-inputs’.
records: Do not inline the constructor.
packages: ‘define-public’ replacement calls ‘module-export!’ directly.
packages: Reduce code bloat due to list allocation in input fields.

guix/packages.scm | 53 +++++++++++++++++++++++++++++++++++--------
guix/records.scm | 58 ++++++++++++++++++++++++++++++++---------------
2 files changed, 83 insertions(+), 28 deletions(-)


base-commit: cd45294d576975a3bff2f755764a3f46f09ea6f9
--
2.41.0
Ludovic Courtès wrote on 15 Apr 17:37 +0200
[PATCH 1/5] records: Do not inline throws for ABI mismatches.
(address . 70398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
a169679444f30d2d10c71438a00167c857ca50ef.1713194148.git.ludo@gnu.org
* guix/records.scm (record-abi-mismatch-error): New procedure.
(abi-check): Use it.

Change-Id: I49936599716e117b8fbf26fb9d8f462bbbb8e88b
---
guix/records.scm | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/guix/records.scm b/guix/records.scm
index f4d12a861d..48637ea0a4 100644
--- a/guix/records.scm
+++ b/guix/records.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012-2023 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012-2024 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2018 Mark H Weaver <mhw@netris.org>
;;;
;;; This file is part of GNU Guix.
@@ -61,6 +61,11 @@ (define-syntax record-error
(string-append "% " (symbol->string type-name)
" abi-cookie")))))
+ (define (record-abi-mismatch-error type)
+ (throw 'record-abi-mismatch-error 'abi-check
+ "~a: record ABI mismatch; recompilation needed"
+ (list type) '()))
+
(define (abi-check type cookie)
"Return syntax that checks that the current \"application binary
interface\" (ABI) for TYPE is equal to COOKIE."
@@ -68,9 +73,7 @@ (define-syntax record-error
#`(unless (eq? current-abi #,cookie)
;; The source file where this exception is thrown must be
;; recompiled.
- (throw 'record-abi-mismatch-error 'abi-check
- "~a: record ABI mismatch; recompilation needed"
- (list #,type) '()))))
+ (record-abi-mismatch-error #,type))))
(define* (report-invalid-field-specifier name bindings
#:optional parent-form)
--
2.41.0
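
A minimal sketch of the technique in patch 1/5, outside of
guix/records.scm: the macro expands to a short call to an ordinary
procedure instead of expanding to the full ‘throw’ form at every use
site. The macro name ‘check-abi’ and the cookie values are hypothetical;
only the ‘throw’ arguments mirror the patch.

```scheme
;; Out-of-line error procedure: its body is compiled once.
(define (record-abi-mismatch-error type)
  (throw 'record-abi-mismatch-error 'abi-check
         "~a: record ABI mismatch; recompilation needed"
         (list type) '()))

;; Hypothetical macro: each expansion contains only a comparison and a
;; procedure call, not the whole 'throw' form.
(define-syntax-rule (check-abi type current expected)
  (unless (eq? current expected)
    (record-abi-mismatch-error 'type)))

;; Matching cookies: nothing happens.
(check-abi <demo> 42 42)

;; Mismatching cookies: the exception is raised by the shared procedure.
(define result
  (catch 'record-abi-mismatch-error
    (lambda ()
      (check-abi <demo> 42 43)
      'no-error)
    (lambda (key . args)
      'mismatch)))
```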
Ludovic Courtès wrote on 15 Apr 17:37 +0200
[PATCH 3/5] records: Do not inline the constructor.
(address . 70398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
f09974509ebd1aeffed10a2db08720bcedd039b5.1713194148.git.ludo@gnu.org
Struct initialization uses one instruction per field, which contributes
to code bloat in the case of package modules. With this change, the
‘.rtl-text’ section of ‘gnu/packages/tex.go’ goes from 7,334,508 B to
6,356,592 B (-13%; -7% on the whole file size), which alone is still
larger than the source file (4.2 MB).

* guix/records.scm (make-syntactic-constructor)[record-inheritance]: Use
CTOR instead of ‘make-struct/no-tail’.
Pass ABI-COOKIE as the first argument to CTOR.
(define-record-type*): Define CTOR-PROCEDURE and pass it to
‘make-syntactic-constructor’.

Change-Id: Ifd7b4e884e9fbf21c43fb4c3ad963126ef5cb476
---
guix/records.scm | 47 +++++++++++++++++++++++++++++++++--------------
1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/guix/records.scm b/guix/records.scm
index 48637ea0a4..dca1e3c2e7 100644
--- a/guix/records.scm
+++ b/guix/records.scm
@@ -164,16 +164,16 @@ (define-syntax make-syntactic-constructor
(record-error 'name s "extraneous field initializers ~a"
unexpected)))
- #`(make-struct/no-tail type
- #,@(map (lambda (field index)
- (or (field-inherited-value field)
- (if (innate-field? field)
- (wrap-field-value
- field (field-default-value field))
- #`(struct-ref #,orig-record
- #,index))))
- '(expected ...)
- (iota (length '(expected ...))))))
+ #`(ctor #,abi-cookie
+ #,@(map (lambda (field index)
+ (or (field-inherited-value field)
+ (if (innate-field? field)
+ (wrap-field-value
+ field (field-default-value field))
+ #`(struct-ref #,orig-record
+ #,index))))
+ '(expected ...)
+ (iota (length '(expected ...))))))
(define (thunked-field? f)
(memq (syntax->datum f) 'thunked))
@@ -249,8 +249,8 @@ (define-syntax make-syntactic-constructor
(cond ((lset= eq? fields '(expected ...))
#`(let* #,(field-bindings
#'((field value) (... ...)))
- #,(abi-check #'type abi-cookie)
- (ctor #,@(map field-value '(expected ...)))))
+ (ctor #,abi-cookie
+ #,@(map field-value '(expected ...)))))
((pair? (lset-difference eq? fields
'(expected ...)))
(record-error 'name s
@@ -435,7 +435,13 @@ (define-syntax define-record-type*
(sanitizers (filter-map field-sanitizer
#'((field properties ...) ...)))
(cookie (compute-abi-cookie field-spec)))
- (with-syntax (((field-spec* ...)
+ (with-syntax ((ctor-procedure
+ (datum->syntax
+ #'ctor
+ (symbol-append (string->symbol " %")
+ (syntax->datum #'ctor)
+ '-procedure/abi-check)))
+ ((field-spec* ...)
(map field-spec->srfi-9 field-spec))
((field-type ...)
(map (match-lambda
@@ -502,7 +508,20 @@ (define-syntax define-record-type*
#'id)))))))
thunked-field-accessor ...
delayed-field-accessor ...
- (make-syntactic-constructor type syntactic-ctor ctor
+
+ (define ctor-procedure
+ ;; This procedure is *not* inlined, to reduce code bloat
+ ;; (struct initialization takes at least one instruction per
+ ;; field).
+ (case-lambda
+ ((cookie field ...)
+ (unless (eq? cookie #,cookie)
+ (record-abi-mismatch-error type))
+ (ctor field ...))
+ (_
+ (record-abi-mismatch-error type))))
+
+ (make-syntactic-constructor type syntactic-ctor ctor-procedure
(field ...)
#:abi-cookie #,cookie
#:thunked #,thunked
--
2.41.0
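
The shape of the non-inlined constructor introduced by patch 3/5 can be
sketched with a plain SRFI-9 record. This is only an illustration:
‘<point>’, ‘%point-cookie’ and ‘make-point/abi-check’ are made-up names,
and the real code derives the cookie from the field specification.

```scheme
(use-modules (srfi srfi-9))

(define-record-type <point>
  (make-point x y)
  point?
  (x point-x)
  (y point-y))

;; Hypothetical ABI cookie; Guix computes one from the field specs.
(define %point-cookie 1234)

(define (record-abi-mismatch-error type)
  (throw 'record-abi-mismatch-error 'abi-check
         "~a: record ABI mismatch; recompilation needed"
         (list type) '()))

;; Ordinary procedure: call sites pass the cookie they were compiled
;; against plus the field values, and the check happens in one place.
(define make-point/abi-check
  (case-lambda
    ((cookie x y)
     (unless (eq? cookie %point-cookie)
       (record-abi-mismatch-error <point>))
     (make-point x y))
    (_
     ;; Wrong number of arguments: the caller was compiled against a
     ;; different field list.
     (record-abi-mismatch-error <point>))))

(define p (make-point/abi-check 1234 1 2))
```

A call site now consists of a single procedure call, whatever the number
of fields, at the cost of one extra call at run time.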
Ludovic Courtès wrote on 15 Apr 17:37 +0200
[PATCH 4/5] packages: ‘define-public’ replacement calls ‘module-export!’ directly.
(address . 70398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
723be7e4e70628f32862b8c1043eee641c4ccf8c.1713194148.git.ludo@gnu.org
This reduces code bloat and loading overhead for package modules, which
use ‘define-public’ extensively.

* guix/packages.scm (define-public*): Use ‘define’ followed by
‘module-export!’ directly instead of ‘define-public’.

Change-Id: I7f56d46b391c1e3eeeb0b9a08a9d34b5de341245
---
guix/packages.scm | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/guix/packages.scm b/guix/packages.scm
index bd6724cdd4..6c697bcc67 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -482,7 +482,8 @@ (define-syntax-parameter current-definition-location
(define-syntax define-public*
(lambda (s)
"Like 'define-public' but set 'current-definition-location' for the
-lexical scope of its body."
+lexical scope of its body. (This also disables notification of \"module
+observers\", but this is unlikely to affect anyone.)"
(define location
(match (syntax-source s)
(#f #f)
@@ -499,10 +500,21 @@ (define-syntax define-public*
(syntax-case s ()
((_ prototype body ...)
- #`(define-public prototype
- (syntax-parameterize ((current-definition-location
- (lambda (s) #,location)))
- body ...))))))
+ (with-syntax ((name (syntax-case #'prototype ()
+ ((id _ ...) #'id)
+ (id #'id))))
+ #`(begin
+ (define prototype
+ (syntax-parameterize ((current-definition-location
+ (lambda (s) #,location)))
+ body ...))
+
+ ;; Note: Use 'module-export!' directly to avoid emitting a
+ ;; 'call-with-deferred-observers' call for each 'define-public*'
+ ;; instance, which is not only pointless but also contributes to
+ ;; code bloat and to load-time overhead in package modules.
+ (eval-when (expand load eval)
+ (module-export! (current-module) '(name)))))))))
(define-syntax validate-texinfo
(let ((validate? (getenv "GUIX_UNINSTALLED")))
--
2.41.0
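
Stripped of the location tracking, the change in patch 4/5 amounts to the
following pattern. ‘define-exported’ is a hypothetical name used here for
illustration; ‘module-export!’ and ‘current-module’ are the Guile
procedures used by the patch.

```scheme
(define-syntax-rule (define-exported name value)
  (begin
    (define name value)
    ;; Export NAME directly rather than through 'define-public'.
    (eval-when (expand load eval)
      (module-export! (current-module) '(name)))))

;; Define 'answer' and add it to the public interface of the current
;; module.
(define-exported answer 42)
```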
Ludovic Courtès wrote on 15 Apr 17:37 +0200
[PATCH 2/5] packages: Reduce bloat induced by ‘sanitize-inputs’.
(address . 70398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
5668e959834c21809c42c1556359eb65bf285caf.1713194148.git.ludo@gnu.org
At -O1, peval does the bulk of the optimization work and it cannot
reduce things like (null? (list 1 2)), unlike what happens in CPS at
-O2. Thus, reduce the part of ‘sanitize-inputs’ that’s inlined.

* guix/packages.scm (maybe-add-input-labels): New procedure.
(sanitize-inputs): Turn into a macro; use ‘maybe-add-input-labels’.

Change-Id: Id2283bb5a2f5d714722200bdcfe0b0bfa606923f
---
guix/packages.scm | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/guix/packages.scm b/guix/packages.scm
index 930b1a3b0e..bd6724cdd4 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012-2023 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012-2024 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2014, 2015, 2017, 2018, 2019 Mark H Weaver <mhw@netris.org>
;;; Copyright © 2015 Eric Bavier <bavier@member.fsf.org>
;;; Copyright © 2016 Alex Kost <alezost@gmail.com>
@@ -430,15 +430,26 @@ (define %cuirass-supported-systems
;; <https://lists.gnu.org/archive/html/guix-devel/2017-03/msg00790.html>.
(fold delete %supported-systems '("mips64el-linux" "powerpc-linux" "riscv64-linux")))
-(define-inlinable (sanitize-inputs inputs)
- "Sanitize INPUTS by turning it into a list of name/package tuples if it's
-not already the case."
- (cond ((null? inputs) inputs)
+(define (maybe-add-input-labels inputs)
+ "Add labels to INPUTS unless it already has them."
+ (cond ((null? inputs)
+ inputs)
((and (pair? (car inputs))
(string? (caar inputs)))
inputs)
(else (map add-input-label inputs))))
+(define-syntax sanitize-inputs
+ ;; This is written as a macro rather than as a 'define-inlinable' procedure
+ ;; because as of Guile 3.0.9, peval can handle (null? '()) but not
+ ;; (null? (list x y z)); that residual 'null?' test contributes to code
+ ;; bloat.
+ (syntax-rules (quote)
+ "Sanitize INPUTS by turning it into a list of name/package tuples if it's
+not already the case."
+ ((_ '()) '())
+ ((_ inputs) (maybe-add-input-labels inputs))))
+
(define-syntax current-location-vector
(lambda (s)
"Like 'current-source-location' but expand to a literal vector with
--
2.41.0
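
The key trick in patch 2/5 is that ‘quote’ appears in the literals list
of ‘syntax-rules’, so a literal '() is recognized at macro-expansion time
and no run-time test is emitted for it. A self-contained sketch follows;
‘add-label’ is a simplified stand-in for Guix's ‘add-input-label’ that
works on symbols rather than packages, and ‘sanitize’ and
‘maybe-add-labels’ are illustrative names.

```scheme
;; Simplified stand-in: label a symbol with its own name.
(define (add-label x)
  (list (symbol->string x) x))

;; Out-of-line procedure holding the run-time checks.
(define (maybe-add-labels inputs)
  (cond ((null? inputs)
         inputs)
        ((and (pair? (car inputs))
              (string? (caar inputs)))
         inputs)
        (else (map add-label inputs))))

(define-syntax sanitize
  (syntax-rules (quote)
    ;; A literal empty list is handled entirely at expansion time.
    ((_ '()) '())
    ;; Everything else becomes a single procedure call.
    ((_ inputs) (maybe-add-labels inputs))))

(define no-inputs (sanitize '()))
(define labeled (sanitize (list 'gcc 'make)))
(define kept (sanitize (list (list "gcc" 'gcc))))
```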
Ludovic Courtès wrote on 15 Apr 17:37 +0200
[PATCH 5/5] packages: Reduce code bloat due to list allocation in input fields.
(address . 70398@debbugs.gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
e66ee292ea3368424d1ec904a45c804f5fa81879.1713194148.git.ludo@gnu.org
* guix/packages.scm (add-input-labels): New procedure.
(sanitize-inputs): Add case for (list …).

Change-Id: Ice8241508ded51efd38867b97ca19c262b8c4363
---
guix/packages.scm | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/guix/packages.scm b/guix/packages.scm
index 6c697bcc67..3a4f547d6b 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -439,16 +439,26 @@ (define (maybe-add-input-labels inputs)
inputs)
(else (map add-input-label inputs))))
+(define (add-input-labels . inputs)
+ "Add labels to all of INPUTS."
+ (map add-input-label inputs))
+
(define-syntax sanitize-inputs
;; This is written as a macro rather than as a 'define-inlinable' procedure
;; because as of Guile 3.0.9, peval can handle (null? '()) but not
;; (null? (list x y z)); that residual 'null?' test contributes to code
;; bloat.
- (syntax-rules (quote)
+ (syntax-rules (quote list)
"Sanitize INPUTS by turning it into a list of name/package tuples if it's
not already the case."
((_ '()) '())
- ((_ inputs) (maybe-add-input-labels inputs))))
+ ((_ (list args ...))
+ ;; As of 3.0.9, (list ...) is open-coded, which can lead to a long list
+ ;; of instructions. To reduce code bloat in package modules where input
+ ;; fields may create such lists, move list allocation to the callee.
+ (add-input-labels args ...))
+ ((_ inputs)
+ (maybe-add-input-labels inputs))))
(define-syntax current-location-vector
(lambda (s)
--
2.41.0
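
The effect of the new (list args ...) clause in patch 5/5 is that the
call site passes the elements as plain arguments and the callee's rest
parameter builds the list. A sketch under a simplifying assumption:
inputs are symbols, and ‘add-label’, ‘add-labels’ and ‘sanitize’ are
illustrative names rather than the Guix procedures.

```scheme
;; Simplified stand-in: label a symbol with its own name.
(define (add-label x)
  (list (symbol->string x) x))

;; The rest parameter collects the arguments into a list inside the
;; callee, so call sites do not contain list-building code.
(define (add-labels . inputs)
  (map add-label inputs))

(define-syntax sanitize
  (syntax-rules (list)
    ;; (sanitize (list a b c)) becomes (add-labels a b c).
    ((_ (list args ...)) (add-labels args ...))
    ;; Any other expression is evaluated as is in this sketch.
    ((_ inputs) (map add-label inputs))))

(define direct (sanitize (list 'gcc 'make)))
(define computed (sanitize (cdr '(skipped gcc make))))
```

Both forms produce the same labeled list; only the first one avoids
building the list at the call site.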
Ludovic Courtès wrote on 15 Apr 18:10 +0200
Re: [bug#70398] [PATCH 0/5] Reduce the size of gnu/packages/*.go files
(address . 70398@debbugs.gnu.org)
87cyqq1ua3.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> wrote:

> What follows is a bunch of tricks to reduce code bloat, achieving ~14%
> reduction in the size of .go files (~18% if we look at gnu/packages
> alone).

On this topic, you may also like this earlier post:

  https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html

Ludo’.
Ludovic Courtès wrote on 15 Apr 18:24 +0200
(address . 70398@debbugs.gnu.org)
87sezmzj9i.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> wrote:

> What follows is a bunch of tricks to reduce code bloat, achieving ~14%
> reduction in the size of .go files (~18% if we look at gnu/packages
> alone).

On this topic, you may also like this earlier post:


Ludo’.
Simon Tournier wrote on 15 Apr 19:56 +0200
Re: [bug#70398] [PATCH 5/5] packages: Reduce code bloat due to list allocation in input fields.
(name . Ludovic Courtès)(address . ludo@gnu.org)
87zftuik7b.fsf@gmail.com
Hi Ludo,

On Mon., 15 April 2024 at 17:37, Ludovic Courtès <ludo@gnu.org> wrote:

> + ((_ (list args ...))
> + ;; As of 3.0.9, (list ...) is open-coded, which can lead to a long list
> + ;; of instructions. To reduce code bloat in package modules where input
> + ;; fields may create such lists, move list allocation to the callee.
> + (add-input-labels args ...))

I am not sure I understand “(list ...) is open-coded, which can lead
to a long list of instructions.” It is irrelevant for .go size, but
why not something like:

  ((_ (list args . rest))
   (apply add-input-labels (append args rest)))

It would not change the .go size, but it would change the run time if
it’s a long list, no?

Cheers,
simon
Simon Tournier wrote on 15 Apr 20:06 +0200
Re: [bug#70398] [PATCH 0/5] Reduce the size of gnu/packages/*.go files
(name . Ludovic Courtès)(address . ludo@gnu.org)
87wmoyijqg.fsf@gmail.com
Hi,

On Mon., 15 April 2024 at 17:27, Ludovic Courtès <ludo@gnu.org> wrote:

> What follows is a bunch of tricks to reduce code bloat, achieving ~14%
> reduction in the size of .go files (~18% if we look at gnu/packages
> alone).

I have not checked that the reduction is ~18%. From my understanding,
the patch set LGTM, modulo an unrelated comment about ellipsis and a
potential quadratic performance penalty.


> About 60% of those files are in the 64–128 KiB range. Since ELF sections
> are currently 64 KiB-aligned (see ‘*lcm-page-size*’ in Guile), we would
> save space by ensuring these are sparse files. To do that, we’ll need to
> detect holes when restoring nars and/or to change the nar format to
> preserve holes, while also ensuring that when the daemon copies files
> around, it also preserves holes. Work for later!

Since [1], I think that compiling a generic Guile record for <package>
is touching the limits of the DSL. :-) In other words, I think the
binary (compiled) representation of <package> records should be specific
and thus optimized. Work for after later. ;-)

Cheers,
simon


1: How many bytes do we add (closure of guix) when adding one new package?
Simon Tournier <zimon.toutoune@gmail.com>
Thu, 25 May 2023 20:24:30 +0200
id:87r0r4uv4x.fsf@gmail.com
Simon Tournier wrote on 15 Apr 20:49 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
87sezmihqb.fsf@gmail.com
Hi,

On Mon., 15 April 2024 at 18:10, Ludovic Courtès <ludo@gnu.org> wrote:
> Ludovic Courtès <ludo@gnu.org> wrote:
>
>> What follows is a bunch of tricks to reduce code bloat, achieving ~14%
>> reduction in the size of .go files (~18% if we look at gnu/packages
>> alone).
>
> On this topic, you may also like this earlier post:
>
> https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html

And unrelated to this patch set, let me also mention this other thread
[1], comparing (btrfs):

# compsize /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
Processed 503 files, 1317 regular extents (1317 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed   Referenced
TOTAL       27%            40M           144M         144M
none       100%            10M            10M          10M
zstd        22%            30M           133M         133M

# compsize /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
Processed 530 files, 1169 regular extents (1169 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed   Referenced
TOTAL       19%            22M           116M         116M
none       100%            32K            32K          32K
zstd        19%            22M           116M         116M

Compared to (ext4):

145M /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
117M /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu

Perhaps these package .go files could be stored compressed and
decompressed on the fly when needed.

Cheers,
simon


1: Re: How many bytes do we add (closure of guix) when adding one new package?
Guillaume Le Vaillant <glv@posteo.net>
Wed, 31 May 2023 12:47:09 +0000
id:87h6rsll5i.fsf@kitej
Ludovic Courtès wrote on 15 Apr 22:31 +0200
Re: [bug#70398] [PATCH 5/5] packages: Reduce code bloat due to list allocation in input fields.
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
877cgyz7tz.fsf@gnu.org
Simon Tournier <zimon.toutoune@gmail.com> wrote:

> On Mon., 15 April 2024 at 17:37, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> + ((_ (list args ...))
>> + ;; As of 3.0.9, (list ...) is open-coded, which can lead to a long list
>> + ;; of instructions. To reduce code bloat in package modules where input
>> + ;; fields may create such lists, move list allocation to the callee.
>> + (add-input-labels args ...))
>
> I am not sure I understand “(list ...) is open-coded, which can lead
> to a long list of instructions.”

This:

scheme@(guile-user)> ,c (lambda () (list 1 2 3 4))

[...]

8 (allocate-words/immediate 0 2)
9 (scm-set!/immediate 0 0 2)
10 (scm-set!/immediate 0 1 1)
11 (allocate-words/immediate 2 2)
12 (scm-set!/immediate 2 0 3)
13 (scm-set!/immediate 2 1 0)
14 (allocate-words/immediate 3 2)
15 (scm-set!/immediate 3 0 4)
16 (scm-set!/immediate 3 1 2)
17 (allocate-words/immediate 4 2)
18 (scm-set!/immediate 4 0 5)
19 (scm-set!/immediate 4 1 3)

> Well, irrelevant for .go size but why not something like:
>
> ((_ (list args . rest))
> (apply add-input-labels (append args rest)))

That’s more code and I’m really trying hard to minimize generated code.
:-)

Ludo’.
Simon Tournier wrote on 22 Apr 02:15 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
87frve6ymj.fsf@gmail.com
Hi,

On Mon., 15 April 2024 at 22:31, Ludovic Courtès <ludo@gnu.org> wrote:

>> I am not sure I understand “(list ...) is open-coded, which can lead
>> to a long list of instructions.”
>
> This:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> ,c (lambda () (list 1 2 3 4))
>
> [...]
>
> 8 (allocate-words/immediate 0 2)
> 9 (scm-set!/immediate 0 0 2)
> 10 (scm-set!/immediate 0 1 1)
> 11 (allocate-words/immediate 2 2)
> 12 (scm-set!/immediate 2 0 3)
> 13 (scm-set!/immediate 2 1 0)
> 14 (allocate-words/immediate 3 2)
> 15 (scm-set!/immediate 3 0 4)
> 16 (scm-set!/immediate 3 1 2)
> 17 (allocate-words/immediate 4 2)
> 18 (scm-set!/immediate 4 0 5)
> 19 (scm-set!/immediate 4 1 3)
> --8<---------------cut here---------------end--------------->8---
>
>> Well, irrelevant for .go size but why not something like:
>>
>> ((_ (list args . rest))
>> (apply add-input-labels (append args rest)))
>
> That’s more code and I’m really trying hard to minimize generated code.
> :-)

Thanks for explaining. :-) Yeah, that makes sense.

Cheers,
simon