[RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)

  • Done
Details
2 participants
  • Denis 'GNUtoo' Carikli
  • Christopher Baines
Owner
unassigned
Submitted by
Denis 'GNUtoo' Carikli
Severity
normal
Denis 'GNUtoo' Carikli wrote on 23 Dec 2022 23:07
(address . guix-patches@gnu.org)(name . Denis 'GNUtoo' Carikli)(address . GNUtoo@cyberdimension.org)
20221220052349.4965-1-GNUtoo@cyberdimension.org
Hi,

Here are two small patches.

The first one adds #:substitutable? to the copy build system.

I don't know how to check whether it works as intended, though.
It's similar to commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
("build-system/go: Add #:substitutable? argument."), so normally
it shouldn't be an issue, but it would be best if someone could
double-check it, as it would avoid keeping very large
substitutes around.

The second patch adds a ZIM file. I'll most likely send more
patches adding further ZIM file packages (about 10) later on. I
prefer doing it this way as it avoids having to deal with
potential rebases breaking if there is something wrong with my
second patch.

Denis 'GNUtoo' Carikli (2):
build-system/copy: Add #:substitutable? argument.
gnu: Add wikipedia_en_all_maxi

gnu/local.mk | 1 +
gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
guix/build-system/copy.scm | 4 +-
3 files changed, 90 insertions(+), 1 deletion(-)
create mode 100644 gnu/packages/zim-files.scm


base-commit: c193b5203b31246a6d74270c8086c45851561947
--
2.38.1
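
One possible way to check the first patch, as a minimal sketch. It assumes both patches are applied to a Guix checkout and that the code is entered at `./pre-inst-env guix repl`. It lowers the package to a derivation and asks whether substitutes are allowed for it. With grafting disabled, this should only compute derivations, without downloading or building the ZIM file.

```scheme
;; Sketch only: assumes (gnu packages zim-files) from patch 2/2 is
;; available and that patch 1/2 is applied.
(use-modules (guix store)
             (guix derivations)
             (guix packages)
             (gnu packages zim-files))

(with-store store
  ;; #:graft? #f avoids any build that grafting might trigger.
  (let ((drv (package-derivation store wikipedia-en-all-maxi
                                 (%current-system)
                                 #:graft? #f)))
    ;; Expected to be #f if #:substitutable? #f reached the derivation,
    ;; and #t if the argument was silently ignored.
    (substitutable-derivation? drv)))
```

If this evaluates to #t for the package from patch 2/2, the keyword is not being passed down to `gexp->derivation`.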


Denis 'GNUtoo' Carikli wrote on 23 Dec 2022 23:20
[PATCH v1 1/2] build-system/copy: Add #:substitutable? argument.
(address . 60288@debbugs.gnu.org)(name . Denis 'GNUtoo' Carikli)(address . GNUtoo@cyberdimension.org)
20221223222024.13805-1-GNUtoo@cyberdimension.org
* guix/build-system/copy.scm (copy-build): Add 'substitutable?'
argument.
---
guix/build-system/copy.scm | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/guix/build-system/copy.scm b/guix/build-system/copy.scm
index 4894ba46fb..bb4d2daaa8 100644
--- a/guix/build-system/copy.scm
+++ b/guix/build-system/copy.scm
@@ -96,7 +96,8 @@ (define* (copy-build name inputs
                      (target #f)
                      (imported-modules %copy-build-system-modules)
                      (modules '((guix build copy-build-system)
-                                (guix build utils))))
+                                (guix build utils)))
+                     (substitutable? #t))
   "Build SOURCE using INSTALL-PLAN, and with INPUTS."
   (define builder
     (with-imported-modules imported-modules
@@ -129,6 +130,7 @@ (define builder
     (gexp->derivation name builder
                       #:system system
                       #:target #f
+                      #:substitutable? substitutable?
                       #:guile-for-build guile)))
 
 (define copy-build-system
--
2.38.1
Denis 'GNUtoo' Carikli wrote on 23 Dec 2022 23:20
[PATCH v1 2/2] gnu: Add wikipedia_en_all_maxi
(address . 60288@debbugs.gnu.org)(name . Denis 'GNUtoo' Carikli)(address . GNUtoo@cyberdimension.org)
20221223222024.13805-2-GNUtoo@cyberdimension.org
* gnu/packages/zim-files.scm (wikipedia-en-all-maxi): New variable.
---
gnu/local.mk | 1 +
gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
2 files changed, 87 insertions(+)
create mode 100644 gnu/packages/zim-files.scm

diff --git a/gnu/local.mk b/gnu/local.mk
index 5b8944f568..8957554fc2 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -643,6 +643,7 @@ GNU_SYSTEM_MODULES = \
%D%/packages/xfce.scm \
%D%/packages/zig.scm \
%D%/packages/zile.scm \
+ %D%/packages/zim-files.scm \
%D%/packages/zwave.scm \
\
%D%/services.scm \
diff --git a/gnu/packages/zim-files.scm b/gnu/packages/zim-files.scm
new file mode 100644
index 0000000000..49b7accb52
--- /dev/null
+++ b/gnu/packages/zim-files.scm
@@ -0,0 +1,86 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2022 Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (gnu packages zim-files)
+  #:use-module (gnu packages)
+  #:use-module (guix build-system copy)
+  #:use-module (guix download)
+  #:use-module (guix gexp)
+  #:use-module (guix utils)
+  #:use-module ((guix licenses) #:prefix license:)
+  #:use-module (guix packages))
+
+;;; Commentary:
+;;;
+;;; Many Guix contributors have a tendency to update packages in this
+;;; way: they only update the package revision and then launch a build
+;;; that fails just to make Guix tell them the right base32 hash. They
+;;; then update the base32 hash and launch the build again.
+;;;
+;;; However some ZIM files are quite big. At the time of writing,
+;;; wikipedia_en_all_maxi_2022-05.zim is about 89 GiB.
+;;;
+;;; So this approach will be time consuming as the second time Guix
+;;; will restart downloading the same file from scratch.
+;;;
+;;; The solution to this issue is to download the SHA-256 checksum
+;;; file (simply append .sha256 to the URL of the ZIM file). It
+;;; contains a line like this:
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021 wikipedia_en_all_maxi_2022-05.zim
+;;;
+;;; You can then use this hash to compute the base32 with nix-hash:
+;;; $ nix-hash --type sha256 --to-base32 \
+;;;     f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021
+;;; 08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi
+
+(define-public wikipedia-en-all-maxi
+  (package
+    (name "wikipedia-en-all-maxi")
+    (version "2022-05")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://mirror.download.kiwix.org/zim/wikipedia/"
+                    (string-replace-substring name "-" "_")
+                    "_" version ".zim"))
+              (sha256
+               (base32
+                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
+    (build-system copy-build-system)
+    (arguments
+     (list
+      ;; We are not (yet) generating the zim file, so it doesn't make sense to
+      ;; build substitutes.
+      #:substitutable? #f
+      ;; If we use kiwix-serve, the path of the ZIM file needs to be passed to
+      ;; it. And if the filename has a version in it, we'd need to update the
+      ;; path manually each time the package is updated. We also need to
+      ;; change the filename to match the package name.
+      #:install-plan #~'((#$(string-append
+                             (string-replace-substring name "-" "_")
+                             "_" version ".zim")
+                          #$(string-append "share/" name ".zim")))))
+    (synopsis
+     "Complete English Wikipedia packed in a ZIM file, for offline usage with
+Kiwix")
+    (description
+     "Wikipedia is a free Encyclopedia. This is the English version. It
+contains all the articles, and all the medias (images, etc) present in
+the articles in a scaled down resolution.")
+    (home-page "https://en.wikipedia.org/wiki/Main_Page")
+    (license license:cc-by-sa3.0)))
--
2.38.1
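
The hexadecimal-to-base32 step described in the commentary of the patch can probably also be done without nix-hash, using the (guix base16) and (guix base32) modules that ship with Guix. A minimal sketch, assuming it is run in `guix repl`; the variable name `hex-digest` is only illustrative:

```scheme
;; Convert the published hexadecimal SHA-256 into the "nix-base32"
;; encoding that the (base32 "...") form of an origin expects.
(use-modules (guix base16)
             (guix base32))

(define hex-digest
  ;; Taken from the .sha256 file quoted in the commentary.
  "f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021")

(display (bytevector->nix-base32-string
          (base16-string->bytevector hex-digest)))
(newline)
```

The printed string should be the same one that the commentary shows as the output of nix-hash, so the 89 GiB file does not have to be downloaded just to learn its hash.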
Christopher Baines wrote on 28 Dec 2022 19:10
Re: [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
(name . Denis 'GNUtoo' Carikli)(address . GNUtoo@cyberdimension.org)
87bknn2y67.fsf@cbaines.net
Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org> writes:

> Here are two small patches.
>
> The first one adds #:substitutable? to the copy build system.
>
> I don't know how to check whether it works as intended, though.
> It's similar to commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
> ("build-system/go: Add #:substitutable? argument."), so normally
> it shouldn't be an issue, but it would be best if someone could
> double-check it, as it would avoid keeping very large
> substitutes around.
>
> The second patch adds a ZIM file. I'll most likely send more
> patches adding further ZIM file packages (about 10) later on. I
> prefer doing it this way as it avoids having to deal with
> potential rebases breaking if there is something wrong with my
> second patch.
>
> Denis 'GNUtoo' Carikli (2):
> build-system/copy: Add #:substitutable? argument.
> gnu: Add wikipedia_en_all_maxi

I haven't looked at this in detail, but one comment on the QA
failures. Building the package for this large file involves copying it
from the store, to another place in the store. This requires 2x the
space which this large file takes up, which is a pretty wasteful
approach.

This is the reason behind the build failures I've seen, the build
machines run out of space when attempting the file copy. Maybe an
alternative if you want to have a package would be to symlink to the
source. That way, there's only a large file and a symlink in the store,
rather than two copies of the same large file.

Chris
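
The symlink idea could look roughly like the following. This is an untested sketch, not the patch author's code: it keeps copy-build-system (so it relies on patch 1/2 for #:substitutable?), deletes the 'unpack phase so the file is not copied into the build directory, and replaces 'install with a phase that links to the `source` store item instead of copying it.

```scheme
;; Untested sketch of "symlink to the source"; the phase bodies are an
;; assumption, the remaining fields are copied from patch 2/2.
(use-modules (guix packages)
             (guix download)
             (guix gexp)
             (guix build-system copy)
             ((guix licenses) #:prefix license:))

(define-public wikipedia-en-all-maxi
  (package
    (name "wikipedia-en-all-maxi")
    (version "2022-05")
    (source (origin
              (method url-fetch)
              (uri (string-append
                    "https://mirror.download.kiwix.org/zim/wikipedia/"
                    "wikipedia_en_all_maxi_" version ".zim"))
              (sha256
               (base32
                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
    (build-system copy-build-system)
    (arguments
     (list
      #:substitutable? #f               ;needs patch 1/2
      #:phases
      #~(modify-phases %standard-phases
          ;; Do not copy the large file into the build directory.
          (delete 'unpack)
          (replace 'install
            (lambda* (#:key source #:allow-other-keys)
              (let ((share (string-append #$output "/share")))
                (mkdir-p share)
                ;; SOURCE is the store file name of the downloaded ZIM
                ;; file; only a symlink ends up in the output.
                (symlink source
                         (string-append share "/" #$name ".zim"))))))))
    (synopsis
     "Complete English Wikipedia packed in a ZIM file, for offline usage with
Kiwix")
    (description
     "Wikipedia is a free Encyclopedia. This is the English version. It
contains all the articles, and all the medias (images, etc) present in
the articles in a scaled down resolution.")
    (home-page "https://en.wikipedia.org/wiki/Main_Page")
    (license license:cc-by-sa3.0)))
```

With this layout the store would hold the ZIM file once, plus an output containing a single symlink, instead of two full copies.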

Denis 'GNUtoo' Carikli wrote on 30 Dec 2022 00:19
(name . Christopher Baines)(address . mail@cbaines.net)
20221230001950.4cc86d04@primary_laptop
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail@cbaines.net> wrote:
> I haven't looked at this in detail, but one comment on the QA
> failures. Building the package for this large file involves copying it
> from the store, to another place in the store. This requires 2x the
> space which this large file takes up, which is a pretty wasteful
> approach.
Not only that, but it also takes a very long time to do that copy on
slower machines with an encrypted rootfs.

> This is the reason behind the build failures I've seen, the build
> machines run out of space when attempting the file copy. Maybe an
> alternative if you want to have a package would be to symlink to the
> source. That way, there's only a large file and a symlink in the
> store, rather than two copies of the same large file.
I'll try that. I hope that guix gc will not garbage collect the source
though.

Do you know if it's possible just to have a source package somehow
(and download the source to a specific filename) and not copy anything
at all?

Denis.
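
On downloading the source under a specific file name: an origin has a `file-name` field, and an origin can be lowered on its own, without any package around it. A minimal sketch, assuming it is run in `guix repl`; the variable name `wikipedia-en-all-maxi-zim` is made up for the example:

```scheme
;; Sketch: a bare origin whose store item gets a fixed, version-less
;; file name.
(use-modules (guix store)
             (guix derivations)
             (guix gexp)
             (guix packages)
             (guix download))

(define wikipedia-en-all-maxi-zim
  (origin
    (method url-fetch)
    (uri (string-append
          "https://mirror.download.kiwix.org/zim/wikipedia/"
          "wikipedia_en_all_maxi_2022-05.zim"))
    ;; Name of the resulting store item, independent of the URL.
    (file-name "wikipedia-en-all-maxi.zim")
    (sha256
     (base32
      "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))

(with-store store
  ;; Lowering only computes the derivation; the download itself would
  ;; happen when that derivation is built.
  (let ((drv (run-with-store store
               (lower-object wikipedia-en-all-maxi-zim))))
    (derivation->output-path drv)))
```

This only answers the file-name part of the question. An origin by itself cannot be installed into a profile, so a package that wraps or links to it would presumably still be needed for that.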


Denis 'GNUtoo' Carikli wrote on 2 Jan 2023 21:01
(name . Christopher Baines)(address . mail@cbaines.net)
20230102210141.5aef6d97@primary_laptop
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail@cbaines.net> wrote:
> Maybe an alternative if you want to have a package would be to
> symlink to the source.
The issue is that I don't know how to refer to the source in a
situation like that.

I didn't really find good examples of that. So far the best I saw
was either to define (source [...]) once and reuse it in multiple
packages, or to reuse the source of another package with
(package-source <package name>) like in linux.scm.

The GNU build system copies the source into the current directory,
so I really have no idea what to do here. We might also need to add
the source to inputs, native-inputs or propagated-inputs somehow so
that it is not garbage collected once the ZIM file is installed. Is
propagated-inputs the way to go?

Denis.
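
On the garbage collection question: the daemon scans each build output for store file names, and symlink targets are part of what gets scanned. An output that contains a symlink to the source should therefore hold a reference to it, and the source stays alive as long as the output is reachable from a GC root, without propagated-inputs. A sketch for checking this, assuming the package has been changed to symlink to its source and has already been built:

```scheme
;; Sketch: list what the built output refers to.  Assumes a variant of
;; the package from patch 2/2 that symlinks to its source.
(use-modules (guix store)
             (guix derivations)
             (guix packages)
             (gnu packages zim-files))

(with-store store
  (let* ((drv (package-derivation store wikipedia-en-all-maxi
                                  (%current-system)
                                  #:graft? #f))
         (out (derivation->output-path drv)))
    (if (valid-path? store out)
        ;; The store file name of the ZIM source is expected to appear
        ;; in this list.
        (references store out)
        'not-built-yet)))
```

If the source shows up in the returned list, installing the package into a profile should be enough to protect the ZIM file from `guix gc`.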


Denis 'GNUtoo' Carikli wrote on 5 Dec 2023 15:58
(no subject)
(name . GNU bug tracker automated control server)(address . control@debbugs.gnu.org)
20231205155839.0dc06fab@primary_laptop
close 60288

