[PATCH] packages: 'patch-and-repack' returns a directory when given a directory.

  • Done
  • quality assurance status badge
Details
2 participants
  • Ludovic Courtès
  • Maxim Cournoyer
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
L
L
Ludovic Courtès wrote on 15 Jan 2021 14:15
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20210115131548.8792-1-ludo@gnu.org
Previously, 'patch-and-repack' would always create a tar.xz archive as a
result, even if the input was a directory (a checkout). This change
reduces gratuitous CPU and storage overhead.

* guix/packages.scm (patch-and-repack)[tarxz-name]: Remove 'checkout?' case.
[build](repack): New procedure, with "tar" invocation formerly at the
top level.
If SOURCE is a directory, call 'copy-recursively'; otherwise, call
'repack'.
Change NAME to ORIGINAL-FILE-NAME when it matches 'checkout?'.
---
guix/packages.scm | 65 ++++++++++++++++++++++++++---------------------
1 file changed, 36 insertions(+), 29 deletions(-)

Hi!

This change is a followup to a recent IRC discussion: it makes
‘patch-and-repack’ preserve the “directoriness” of its input.

It conflicts with other changes Maxim posted at
https://issues.guix.gnu.org/45773, though that could be addressed.

Thoughts?

Ludo’.

Toggle diff (98 lines)
diff --git a/guix/packages.scm b/guix/packages.scm
index 4caaa9cb79..cd2cded9ee 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2014, 2015, 2017, 2018 Mark H Weaver <mhw@netris.org>
;;; Copyright © 2015 Eric Bavier <bavier@member.fsf.org>
;;; Copyright © 2016 Alex Kost <alezost@gmail.com>
@@ -635,11 +635,9 @@ specifies modules in scope when evaluating SNIPPET."
(define (tarxz-name file-name)
;; Return a '.tar.xz' file name based on FILE-NAME.
- (let ((base (cond ((numeric-extension? file-name)
- original-file-name)
- ((checkout? file-name)
- (string-drop-right file-name 9))
- (else (file-sans-extension file-name)))))
+ (let ((base (if (numeric-extension? file-name)
+ original-file-name
+ (file-sans-extension file-name))))
(string-append base
(if (equal? (file-extension base) "tar")
".xz"
@@ -689,6 +687,29 @@ specifies modules in scope when evaluating SNIPPET."
(lambda (name)
(not (member name '("." "..")))))))
+ (define (repack directory output)
+ ;; Write to OUTPUT a compressed tarball containing DIRECTORY.
+ (unless tar-supports-sort?
+ (call-with-output-file ".file_list"
+ (lambda (port)
+ (for-each (lambda (name)
+ (format port "~a~%" name))
+ (find-files directory
+ #:directories? #t
+ #:fail-on-error? #t)))))
+
+ (apply invoke #+(file-append tar "/bin/tar")
+ "cvfa" output
+ ;; Avoid non-determinism in the archive. Set the mtime
+ ;; to 1 as is the case in the store (software like gzip
+ ;; behaves differently when it stumbles upon mtime = 0).
+ "--mtime=@1"
+ "--owner=root:0" "--group=root:0"
+ (if tar-supports-sort?
+ `("--sort=name" ,directory)
+ '("--no-recursion"
+ "--files-from=.file_list"))))
+
;; Encoding/decoding errors shouldn't be silent.
(fluid-set! %default-port-conversion-strategy 'error)
@@ -742,30 +763,16 @@ specifies modules in scope when evaluating SNIPPET."
(chdir "..")
- (unless tar-supports-sort?
- (call-with-output-file ".file_list"
- (lambda (port)
- (for-each (lambda (name)
- (format port "~a~%" name))
- (find-files directory
- #:directories? #t
- #:fail-on-error? #t)))))
- (apply invoke
- (string-append #+tar "/bin/tar")
- "cvfa" #$output
- ;; Avoid non-determinism in the archive. Set the mtime
- ;; to 1 as is the case in the store (software like gzip
- ;; behaves differently when it stumbles upon mtime = 0).
- "--mtime=@1"
- "--owner=root:0"
- "--group=root:0"
- (if tar-supports-sort?
- `("--sort=name"
- ,directory)
- '("--no-recursion"
- "--files-from=.file_list")))))))
+ ;; If SOURCE is a directory (such as a checkout), return a
+ ;; directory. Otherwise create a tarball.
+ (if (file-is-directory? #+source)
+ (copy-recursively directory #$output
+ #:log (%make-void-port "w"))
+ (repack directory #$output))))))
- (let ((name (tarxz-name original-file-name)))
+ (let ((name (if (checkout? original-file-name)
+ original-file-name
+ (tarxz-name original-file-name))))
(gexp->derivation name build
#:graft? #f
#:system system
--
2.30.0
M
M
Maxim Cournoyer wrote on 15 Jan 2021 23:14
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 45891@debbugs.gnu.org)
878s8tluev.fsf@gmail.com
Hello!

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (4 lines)
> Previously, 'patch-and-repack' would always create a tar.xz archive as a
> result, even if the input was a directory (a checkout). This change
> reduces gratuitous CPU and storage overhead.

I like it!

Note that on core-updates, xz compression is relatively fast on modern
machines as it can do multi-threading. About space the savings; could
the 'temporary' pristine source be cleared from the store always? This
would prevent keeping nonfree cruft under /gnu/store until the next
garbage collection run, for those sources that are cleaned up.

Toggle quote (46 lines)
> * guix/packages.scm (patch-and-repack)[tarxz-name]: Remove 'checkout?' case.
> [build](repack): New procedure, with "tar" invocation formerly at the
> top level.
> If SOURCE is a directory, call 'copy-recursively'; otherwise, call
> 'repack'.
> Change NAME to ORIGINAL-FILE-NAME when it matches 'checkout?'.
> ---
> guix/packages.scm | 65 ++++++++++++++++++++++++++---------------------
> 1 file changed, 36 insertions(+), 29 deletions(-)
>
> Hi!
>
> This change is a followup to a recent IRC discussion: it makes
> ‘patch-and-repack’ preserve the “directoriness” of its input.
>
> It conflicts with other changes Maxim posted at
> <https://issues.guix.gnu.org/45773>, though that could be addressed.
>
> Thoughts?
>
> Ludo’.
>
> diff --git a/guix/packages.scm b/guix/packages.scm
> index 4caaa9cb79..cd2cded9ee 100644
> --- a/guix/packages.scm
> +++ b/guix/packages.scm
> @@ -1,5 +1,5 @@
> ;;; GNU Guix --- Functional package management for GNU
> -;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
> +;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
> ;;; Copyright © 2014, 2015, 2017, 2018 Mark H Weaver <mhw@netris.org>
> ;;; Copyright © 2015 Eric Bavier <bavier@member.fsf.org>
> ;;; Copyright © 2016 Alex Kost <alezost@gmail.com>
> @@ -635,11 +635,9 @@ specifies modules in scope when evaluating SNIPPET."
>
> (define (tarxz-name file-name)
> ;; Return a '.tar.xz' file name based on FILE-NAME.
> - (let ((base (cond ((numeric-extension? file-name)
> - original-file-name)
> - ((checkout? file-name)
> - (string-drop-right file-name 9))
> - (else (file-sans-extension file-name)))))
> + (let ((base (if (numeric-extension? file-name)
> + original-file-name
> + (file-sans-extension file-name))))

This is not new code, but I'm wondering what's the purpose of
numeric-extension? What kind of files does it expect to catch? Also,
what happened to stripping the '-checkout' suffix that used to be done?
It doesn't seem like it will happen anymore.

Toggle quote (74 lines)
> (string-append base
> (if (equal? (file-extension base) "tar")
> ".xz"
> @@ -689,6 +687,29 @@ specifies modules in scope when evaluating SNIPPET."
> (lambda (name)
> (not (member name '("." "..")))))))
>
> + (define (repack directory output)
> + ;; Write to OUTPUT a compressed tarball containing DIRECTORY.
> + (unless tar-supports-sort?
> + (call-with-output-file ".file_list"
> + (lambda (port)
> + (for-each (lambda (name)
> + (format port "~a~%" name))
> + (find-files directory
> + #:directories? #t
> + #:fail-on-error? #t)))))
> +
> + (apply invoke #+(file-append tar "/bin/tar")
> + "cvfa" output
> + ;; Avoid non-determinism in the archive. Set the mtime
> + ;; to 1 as is the case in the store (software like gzip
> + ;; behaves differently when it stumbles upon mtime = 0).
> + "--mtime=@1"
> + "--owner=root:0" "--group=root:0"
> + (if tar-supports-sort?
> + `("--sort=name" ,directory)
> + '("--no-recursion"
> + "--files-from=.file_list"))))
> +
> ;; Encoding/decoding errors shouldn't be silent.
> (fluid-set! %default-port-conversion-strategy 'error)
>
> @@ -742,30 +763,16 @@ specifies modules in scope when evaluating SNIPPET."
>
> (chdir "..")
>
> - (unless tar-supports-sort?
> - (call-with-output-file ".file_list"
> - (lambda (port)
> - (for-each (lambda (name)
> - (format port "~a~%" name))
> - (find-files directory
> - #:directories? #t
> - #:fail-on-error? #t)))))
> - (apply invoke
> - (string-append #+tar "/bin/tar")
> - "cvfa" #$output
> - ;; Avoid non-determinism in the archive. Set the mtime
> - ;; to 1 as is the case in the store (software like gzip
> - ;; behaves differently when it stumbles upon mtime = 0).
> - "--mtime=@1"
> - "--owner=root:0"
> - "--group=root:0"
> - (if tar-supports-sort?
> - `("--sort=name"
> - ,directory)
> - '("--no-recursion"
> - "--files-from=.file_list")))))))
> + ;; If SOURCE is a directory (such as a checkout), return a
> + ;; directory. Otherwise create a tarball.
> + (if (file-is-directory? #+source)
> + (copy-recursively directory #$output
> + #:log (%make-void-port "w"))
> + (repack directory #$output))))))
>
> - (let ((name (tarxz-name original-file-name)))
> + (let ((name (if (checkout? original-file-name)
> + original-file-name
> + (tarxz-name original-file-name))))
> (gexp->derivation name build
> #:graft? #f
> #:system system

Was these cases (tar archive source derivation, directory source
derivation) already covered by tests under tests/packages.cm? How did
you otherwise test it? World rebuilding changes are not fun to test
without unit tests.

I've run that test module like this:

$ make check TESTS=tests/packages.scm SCM_LOG_DRIVER_FLAGS=' --brief=no' VERBOSE=1

with your change and it passed.

Thanks,

Maxim
L
L
Ludovic Courtès wrote on 16 Jan 2021 22:28
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 45891@debbugs.gnu.org)
87lfcsy3jd.fsf@gnu.org
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (12 lines)
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Previously, 'patch-and-repack' would always create a tar.xz archive as a
>> result, even if the input was a directory (a checkout). This change
>> reduces gratuitous CPU and storage overhead.
>
> I like it!
>
> Note that on core-updates, xz compression is relatively fast on modern
> machines as it can do multi-threading. About space the savings; could
> the 'temporary' pristine source be cleared from the store always?

No, it’s not possible—the GC will remove what’s unreachable when it
eventually runs.

Toggle quote (14 lines)
>> (define (tarxz-name file-name)
>> ;; Return a '.tar.xz' file name based on FILE-NAME.
>> - (let ((base (cond ((numeric-extension? file-name)
>> - original-file-name)
>> - ((checkout? file-name)
>> - (string-drop-right file-name 9))
>> - (else (file-sans-extension file-name)))))
>> + (let ((base (if (numeric-extension? file-name)
>> + original-file-name
>> + (file-sans-extension file-name))))
>
> This is not new code, but I'm wondering what's the purpose of
> numeric-extension?

It’s for file names like “hello-2.10”, where you wouldn’t want to strip
“.10”. (Such file names should be rare, but it’s not impossible.)

Toggle quote (4 lines)
> What kind of files does it expect to catch? Also, what happened to
> stripping the '-checkout' suffix that used to be done? It doesn't
> seem like it will happen anymore.

Stripping “-checkout” is no longer necessary because for these we keep
the original name.

Toggle quote (13 lines)
>> - (let ((name (tarxz-name original-file-name)))
>> + (let ((name (if (checkout? original-file-name)
>> + original-file-name
>> + (tarxz-name original-file-name))))
>> (gexp->derivation name build
>> #:graft? #f
>> #:system system
>
> Was these cases (tar archive source derivation, directory source
> derivation) already covered by tests under tests/packages.cm? How did
> you otherwise test it? World rebuilding changes are not fun to test
> without unit tests.

In this case I rebuilt the world and tested ‘guix build -S’ on a
git-fetch package with a snippet, but I agree that’s super expensive (I
tested the handful of commits I recently pushed to ‘core-updates’ at the
same time.)

There are no unit tests specifically for this procedure, but I think
we’ll quickly find out if it doesn’t behave as intended.

WDYT?

Thanks,
Ludo’.
M
M
Maxim Cournoyer wrote on 18 Jan 2021 07:24
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 45891@debbugs.gnu.org)
87v9bukbjx.fsf@gmail.com
Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (19 lines)
> Hi Maxim,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Previously, 'patch-and-repack' would always create a tar.xz archive as a
>>> result, even if the input was a directory (a checkout). This change
>>> reduces gratuitous CPU and storage overhead.
>>
>> I like it!
>>
>> Note that on core-updates, xz compression is relatively fast on modern
>> machines as it can do multi-threading. About space the savings; could
>> the 'temporary' pristine source be cleared from the store always?
>
> No, it’s not possible—the GC will remove what’s unreachable when it
> eventually runs.

OK.

Toggle quote (18 lines)
>
>>> (define (tarxz-name file-name)
>>> ;; Return a '.tar.xz' file name based on FILE-NAME.
>>> - (let ((base (cond ((numeric-extension? file-name)
>>> - original-file-name)
>>> - ((checkout? file-name)
>>> - (string-drop-right file-name 9))
>>> - (else (file-sans-extension file-name)))))
>>> + (let ((base (if (numeric-extension? file-name)
>>> + original-file-name
>>> + (file-sans-extension file-name))))
>>
>> This is not new code, but I'm wondering what's the purpose of
>> numeric-extension?
>
> It’s for file names like “hello-2.10”, where you wouldn’t want to strip
> “.10”. (Such file names should be rare, but it’s not impossible.)

Ah! Thanks for explaining.

Toggle quote (30 lines)
>> What kind of files does it expect to catch? Also, what happened to
>> stripping the '-checkout' suffix that used to be done? It doesn't
>> seem like it will happen anymore.
>
> Stripping “-checkout” is no longer necessary because for these we keep
> the original name.
>
>>> - (let ((name (tarxz-name original-file-name)))
>>> + (let ((name (if (checkout? original-file-name)
>>> + original-file-name
>>> + (tarxz-name original-file-name))))
>>> (gexp->derivation name build
>>> #:graft? #f
>>> #:system system
>>
>> Was these cases (tar archive source derivation, directory source
>> derivation) already covered by tests under tests/packages.cm? How did
>> you otherwise test it? World rebuilding changes are not fun to test
>> without unit tests.
>
> In this case I rebuilt the world and tested ‘guix build -S’ on a
> git-fetch package with a snippet, but I agree that’s super expensive (I
> tested the handful of commits I recently pushed to ‘core-updates’ at the
> same time.)
>
> There are no unit tests specifically for this procedure, but I think
> we’ll quickly find out if it doesn’t behave as intended.
>
> WDYT?

LGTM. Feel free to push! About my related change that we thought was
conflicting with this one; it at least applies on top of your change,
and only one test fails currently, I'm working on a fix.

Feel free to push to core-updates!

Thank you,

Maxim
L
L
Ludovic Courtès wrote on 18 Jan 2021 15:56
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 45891-done@debbugs.gnu.org)
87o8hmjnta.fsf@gnu.org
Hi!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (4 lines)
> LGTM. Feel free to push! About my related change that we thought was
> conflicting with this one; it at least applies on top of your change,
> and only one test fails currently, I'm working on a fix.

Alright.

Toggle quote (2 lines)
> Feel free to push to core-updates!

Done, thanks!

Ludo’.
Closed
?