[PATCH] lint: Speed up the formatting linter.

  • Done
  • quality assurance status badge
Details
2 participants
  • Ludovic Courtès
  • Christopher Baines
Owner
unassigned
Submitted by
Christopher Baines
Severity
normal
C
C
Christopher Baines wrote on 28 Oct 2023 16:35
(address . guix-patches@gnu.org)
4499b0c65aa2b2578b1d2efd17cd9f91d97fd2a0.1698503714.git.mail@cbaines.net
By storing the bytes to seek to for the start of each line the first time you
want to check a package in a file, rather than figuring out where the package
starts each time.

This cuts down the time to run guix lint -c formatting from 450 seconds to 13
seconds.

* guix/lint.scm (report-formatting-issues): If %check-formatting-seek-lookup
is a hash table, store vlist's in it to map from a line number to a byte to
seek to.
(%check-formatting-seek-lookup): New parameter.
* guix/scripts/lint.scm (guix-lint): Enable faster formatting linting, when
linting all packages.

Change-Id: I34e4d3acfbb1e14e026d2e7f712ba8d22b56c147
---
guix/lint.scm | 44 ++++++++++++++++++++++++++++++++++++++++++-
guix/scripts/lint.scm | 3 +++
2 files changed, 46 insertions(+), 1 deletion(-)

Toggle diff (101 lines)
diff --git a/guix/lint.scm b/guix/lint.scm
index 7ccf52dec1..d94b4026c6 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -68,6 +68,7 @@ (define-module (guix lint)
svn-multi-reference-user-name
svn-multi-reference-password)
#:use-module (guix import stackage)
+ #:use-module (ice-9 vlist)
#:use-module (ice-9 match)
#:use-module (ice-9 regex)
#:use-module (ice-9 format)
@@ -109,6 +110,7 @@ (define-module (guix lint)
check-license
check-vulnerabilities
check-for-updates
+ %check-formatting-seek-lookup
check-formatting
check-archival
check-profile-collisions
@@ -1839,6 +1841,40 @@ (define* (report-formatting-issues package file starting-line
#:key (reporters %formatting-reporters))
"Report white-space issues in FILE starting from STARTING-LINE, and report
them for PACKAGE."
+ (define (seek-to-line port line)
+ (let ((offset
+ (vlist-ref
+ (or (hash-ref (%check-formatting-seek-lookup) file)
+ (call-with-input-file file
+ (lambda (port)
+ (let* ((buf-length 80)
+ (buf (make-string buf-length)))
+ (let loop ((byte-lookup-list '(0)))
+ (let* ((rv (%read-delimited! "\n" buf #t port))
+ (terminator (car rv))
+ (nchars (cdr rv)))
+ (cond
+ ((eof-object? terminator)
+ (let ((byte-lookup-vlist
+ (list->vlist
+ (reverse byte-lookup-list))))
+ (hash-set! (%check-formatting-seek-lookup)
+ file
+ byte-lookup-vlist)
+ byte-lookup-vlist))
+
+ ((not terminator)
+ (loop byte-lookup-list))
+
+ (nchars
+ (loop (cons
+ (ftell port)
+ byte-lookup-list))))))))))
+ (- line 1))))
+ (set-port-line! port line)
+ (seek port offset SEEK_SET)
+ line))
+
(define (sexp-last-line port)
;; Return the last line of the sexp read from PORT or an estimate thereof.
(define &failure (list 'failure))
@@ -1857,7 +1893,10 @@ (define* (report-formatting-issues package file starting-line
(call-with-input-file file
(lambda (port)
- (let loop ((line-number 1)
+ (let loop ((line-number
+ (if (%check-formatting-seek-lookup)
+ (seek-to-line port starting-line)
+ 1))
(last-line #f)
(warnings '()))
(let ((line (read-line port)))
@@ -1879,6 +1918,9 @@ (define* (report-formatting-issues package file starting-line
(report package line line-number))
reporters)))))))))))
+(define %check-formatting-seek-lookup
+ (make-parameter #f))
+
(define (check-formatting package)
"Check the formatting of the source code of PACKAGE."
(let ((location (package-location package)))
diff --git a/guix/scripts/lint.scm b/guix/scripts/lint.scm
index ee3de51fb1..219c3b91be 100644
--- a/guix/scripts/lint.scm
+++ b/guix/scripts/lint.scm
@@ -222,6 +222,9 @@ (define-command (guix-lint . args)
(lambda (store)
(cond
((null? args)
+ ;; Enable fast seeking to lines for the check-formatting linter
+ (%check-formatting-seek-lookup (make-hash-table))
+
(fold-packages (lambda (p r) (run-checkers p checkers
#:store store)) '()))
(else

base-commit: c3cf04d05b452fee549bb84b323d056fd30cef45
--
2.41.0
L
L
Ludovic Courtès wrote on 5 Nov 2023 15:35
(name . Christopher Baines)(address . mail@cbaines.net)
87zfzs1d5h.fsf@gnu.org
Hi,

Christopher Baines <mail@cbaines.net> skribis:

Toggle quote (7 lines)
> By storing the bytes to seek to for the start of each line the first time you
> want to check a package in a file, rather than figuring out where the package
> starts each time.
>
> This cuts down the time to run guix lint -c formatting from 450 seconds to 13
> seconds.

Excellent!

Toggle quote (34 lines)
> + (define (seek-to-line port line)
> + (let ((offset
> + (vlist-ref
> + (or (hash-ref (%check-formatting-seek-lookup) file)
> + (call-with-input-file file
> + (lambda (port)
> + (let* ((buf-length 80)
> + (buf (make-string buf-length)))
> + (let loop ((byte-lookup-list '(0)))
> + (let* ((rv (%read-delimited! "\n" buf #t port))
> + (terminator (car rv))
> + (nchars (cdr rv)))
> + (cond
> + ((eof-object? terminator)
> + (let ((byte-lookup-vlist
> + (list->vlist
> + (reverse byte-lookup-list))))
> + (hash-set! (%check-formatting-seek-lookup)
> + file
> + byte-lookup-vlist)
> + byte-lookup-vlist))
> +
> + ((not terminator)
> + (loop byte-lookup-list))
> +
> + (nchars
> + (loop (cons
> + (ftell port)
> + byte-lookup-list))))))))))
> + (- line 1))))
> + (set-port-line! port line)
> + (seek port offset SEEK_SET)
> + line))

Two things: (1) it’s a bit hard to read, in part due to long
identifiers, and (2) it would be nice to keep this facility orthogonal,
outside the checker.

As it turns out, a similar facility is available in (guix utils),
exposed via ‘go-to-location’. Would it be possible to use it here?

Toggle quote (5 lines)
> + (let loop ((line-number
> + (if (%check-formatting-seek-lookup)
> + (seek-to-line port starting-line)
> + 1))

Answering myself: I guess ‘seek-to-line’ can be replaced by
(go-to-location port starting-line 0).

Thanks!

Ludo’.
C
C
Christopher Baines wrote on 5 Nov 2023 19:17
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 66796-done@debbugs.gnu.org)
87pm0ojc7q.fsf@cbaines.net
Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (62 lines)
> Hi,
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> By storing the bytes to seek to for the start of each line the first time you
>> want to check a package in a file, rather than figuring out where the package
>> starts each time.
>>
>> This cuts down the time to run guix lint -c formatting from 450 seconds to 13
>> seconds.
>
> Excellent!
>
>> + (define (seek-to-line port line)
>> + (let ((offset
>> + (vlist-ref
>> + (or (hash-ref (%check-formatting-seek-lookup) file)
>> + (call-with-input-file file
>> + (lambda (port)
>> + (let* ((buf-length 80)
>> + (buf (make-string buf-length)))
>> + (let loop ((byte-lookup-list '(0)))
>> + (let* ((rv (%read-delimited! "\n" buf #t port))
>> + (terminator (car rv))
>> + (nchars (cdr rv)))
>> + (cond
>> + ((eof-object? terminator)
>> + (let ((byte-lookup-vlist
>> + (list->vlist
>> + (reverse byte-lookup-list))))
>> + (hash-set! (%check-formatting-seek-lookup)
>> + file
>> + byte-lookup-vlist)
>> + byte-lookup-vlist))
>> +
>> + ((not terminator)
>> + (loop byte-lookup-list))
>> +
>> + (nchars
>> + (loop (cons
>> + (ftell port)
>> + byte-lookup-list))))))))))
>> + (- line 1))))
>> + (set-port-line! port line)
>> + (seek port offset SEEK_SET)
>> + line))
>
> Two things: (1) it’s a bit hard to read, in part due to long
> identifiers, and (2) it would be nice to keep this facility orthogonal,
> outside the checker.
>
> As it turns out, a similar facility is available in (guix utils),
> exposed via ‘go-to-location’. Would it be possible to use it here?
>
>> + (let loop ((line-number
>> + (if (%check-formatting-seek-lookup)
>> + (seek-to-line port starting-line)
>> + 1))
>
> Answering myself: I guess ‘seek-to-line’ can be replaced by
> (go-to-location port starting-line 0).

Cool, this simplifies the change a lot, I've pushed the amended version
to master as aa98a976079101318d53b412fef6c722bb4332c9.

Thanks,

Chris
-----BEGIN PGP SIGNATURE-----

iQKlBAEBCgCPFiEEPonu50WOcg2XVOCyXiijOwuE9XcFAmVH3GlfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNF
ODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcRHG1haWxAY2Jh
aW5lcy5uZXQACgkQXiijOwuE9Xfz7g/7B5Y5SFSnjIU9G5M+eZqFOz/FQcUSgobp
FWfziNHnc3VpCvEd9f//oVFwkGJtDwP/aex9kCmTeaK76itNJsiu0TI2TpaReCcR
l/me1agFyjR2j9mpf3cOdr9VFbenQ1ZryuolZxulNNTee4+fRTOKDTqOarDufc4I
WjPKGse6BV1qLA4Ppt1f4OtjfwzyRMCes3HzIrxdUAYSAksj4qTJ33Kc7fDIn6jo
o4m78JCIMZWDsKIUlW6VOZ6wSLK/R4xZB8nOW2l4d+HQYjniPvujOO7DZzRt6oYj
EJNdMSAmjuNKybFhBwzqS7Vs3OAf9UgdM+7mwBbEPyLMlXC238K/ssuVo/885nCe
yDklanS2hz4ugVPqWSj9kpy9fVPbQGtbDJXnyWrGY+UaG3Tzbmkcci8L/kDjyFSk
99iSVTunm0oOqZkaWC/MowVVamrtiwrJ3amwalqQpvGdoDbFFan7LtJQohu70rEb
y+G2Z6gzlQL2SaRfRiazUhgFdJ5qsvlNwRZu9USRFWQ2SP2wo6ipjKhFy2Zonu5F
szQyxWNIuPlowfkDSrncvsUasDL4Tc3EHPQ1LH866EygSaeaqpxduESp9aT0/UmR
TT+eRX/bY5aubmTPEci1mhmUZXq998vrl1Pzq2vOnL9NIJvrDibTpsSzdHnZBQD6
zNvPdrBjGxQ=
=VFYY
-----END PGP SIGNATURE-----

Closed
?
Your comment

This issue is archived.

To comment on this conversation send an email to 66796@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it
mumi current 66796
Then, you may apply the latest patchset in this issue (with sign off)
mumi am -- -s
Or, compose a reply to this issue
mumi compose
Or, send patches to this issue
mumi send-email *.patch