guix-daemon fails to copy 4+GB file to store

Status: Done
Participants: Efraim Flashner, Ludovic Courtès, Ricardo Wurmus
Owner: unassigned
Submitted by: Ricardo Wurmus
Severity: important

Ricardo Wurmus wrote on 11 May 12:52 +0200
(address . bug-guix@gnu.org)
87msowpqm2.fsf@elephly.net
The guix-daemon's libutil/util.cc uses copy_file_range to copy a
downloaded file into the store. copy_file_range fails on files larger
than 4GB with an error like this:

guix build: error: short write in copy_file_range `15' to `16': No such file or directory

The man page for copy_file_range says that it could return EFBIG when
the range exceeds the maximum range. The daemon code does not check any
limits and will attempt to copy the whole file.

I believe our code ought to check the value of st.size and fall back to
a boring copy if it exceeds some "reasonable" value.

This is where copy_file_range is used:
https://git.savannah.gnu.org/cgit/guix.git/tree/nix/libutil/util.cc#n382

Here is a little reproducer:

(use-modules (guix download)
             (guix packages)
             (guix build-system trivial))

(package
  (name "chungus")
  (version "1")
  (source
   (origin
     (method url-fetch)
     (uri "http://localhost:1111/chungus")
     (sha256
      (base32 "0nx67d4ls2nfwcfdmg81vf240z6lpwpdqypssr1wzn3hyz4szci4"))))
  (build-system trivial-build-system)
  (home-page "")
  (synopsis "")
  (description "")
  (license #f))

# generate a big file
dd bs=1M count=4096 if=/dev/zero of=/tmp/chungus
# serve it
guix shell woof -- woof -i 127.0.0.1 -p 1111 -c 1 /tmp/chungus
# build the source derivation
guix build --no-grafts -Sf bug.scm
# observe the error
# guix build: error: short write in copy_file_range `15' to `16': No such file or directory

--
Ricardo
Efraim Flashner wrote on 12 May 09:12 +0200
(name . Ricardo Wurmus)(address . rekado@elephly.net)
ZkBrywRMBFbtlZqU@3900XT
On Sat, May 11, 2024 at 12:52:53PM +0200, Ricardo Wurmus wrote:
> The guix-daemon's libutil/util.cc uses copy_file_range to copy a
> downloaded file into the store. copy_file_range fails on files larger
> than 4GB with an error like this:
>
> guix build: error: short write in copy_file_range `15' to `16': No such file or directory
>
> The man page for copy_file_range says that it could return EFBIG when
> the range exceeds the maximum range. The daemon code does not check any
> limits and will attempt to copy the whole file.
>
> I believe our code ought to check the value of st.size and fall back to
> a boring copy if it exceeds some "reasonable" value.
>
> This is where copy_file_range is used:
> https://git.savannah.gnu.org/cgit/guix.git/tree/nix/libutil/util.cc#n382
>
> Here is a little reproducer:
>
> (use-modules (guix download)
>              (guix packages)
>              (guix build-system trivial))
>
> (package
>   (name "chungus")
>   (version "1")
>   (source
>    (origin
>      (method url-fetch)
>      (uri "http://localhost:1111/chungus")
>      (sha256
>       (base32 "0nx67d4ls2nfwcfdmg81vf240z6lpwpdqypssr1wzn3hyz4szci4"))))
>   (build-system trivial-build-system)
>   (home-page "")
>   (synopsis "")
>   (description "")
>   (license #f))
>
> --8<---------------cut here---------------start------------->8---
> # generate a big file
> dd bs=1M count=4096 if=/dev/zero of=/tmp/chungus
> # serve it
> guix shell woof -- woof -i 127.0.0.1 -p 1111 -c 1 /tmp/chungus
> # build the source derivation
> guix build --no-grafts -Sf bug.scm
> # observe the error
> # guix build: error: short write in copy_file_range `15' to `16': No such file or directory
> --8<---------------cut here---------------end--------------->8---
>

This sounds like a similar failure to bug 65714 that I ran into with
guix copy, but I wasn't able to diagnose it.



--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted


Ludovic Courtès wrote on 13 May 11:05 +0200
control message for bug #70877
(address . control@debbugs.gnu.org)
87frumdquy.fsf@gnu.org
severity 70877 important
quit
Ludovic Courtès wrote on 13 May 12:10 +0200
Re: bug#70877: guix-daemon fails to copy 4+GB file to store
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 70877@debbugs.gnu.org)
87v83ic99h.fsf@gnu.org
Hi,

Thanks for the bug report and nice reproducer!

Ricardo Wurmus <rekado@elephly.net> writes:

> The guix-daemon's libutil/util.cc uses copy_file_range to copy a
> downloaded file into the store. copy_file_range fails on files larger
> than 4GB with an error like this:
>
> guix build: error: short write in copy_file_range `15' to `16': No such file or directory
>
> The man page for copy_file_range says that it could return EFBIG when
> the range exceeds the maximum range. The daemon code does not check any
> limits and will attempt to copy the whole file.
>
> I believe our code ought to check the value of st.size and fall back to
> a boring copy if it exceeds some "reasonable" value.

The system call leading to this error message looks like this:

copy_file_range(15, NULL, 16, NULL, 4294967297, 0) = 2147479552

… which is precisely 2 GiB - 4 KiB.

Reading the man page, it’s entirely fine: like ‘write’,
‘copy_file_range’ might copy less than asked for, so it’s really a
mistake of mine to assume that short writes can’t happen. Presumably
there’s an internal limit here we’re reaching that explains why it won’t
copy more than 2 GiB at once.
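
The retry pattern this implies can be sketched in isolation. This is a minimal illustration, not the daemon's code: `copyAll` and `CopyStep` are hypothetical names, and `CopyStep` stands in for a single call such as `copy_file_range(sourceFd, NULL, destinationFd, NULL, length, 0)`. The sketch also stops when a step reports 0 bytes, which is what `copy_file_range` returns at end of file, so a source that ends early cannot make the loop spin.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for one copy call.  It receives the number of
// bytes still wanted and returns the number actually copied, 0 at end
// of input, or a negative value on error.
using CopyStep = std::function<std::int64_t(std::uint64_t)>;

// Keep calling `step` until `total` bytes are copied, the input ends,
// or an error occurs.  Returns the number of bytes copied.
std::uint64_t copyAll(std::uint64_t total, const CopyStep &step)
{
    std::uint64_t copied = 0;
    while (copied < total) {
        std::int64_t n = step(total - copied);
        if (n < 0)
            throw std::runtime_error("copy step failed");
        if (n == 0)
            break;  // end of input: stop instead of looping forever
        copied += static_cast<std::uint64_t>(n);
    }
    return copied;
}
```

A caller would compare the return value with `total` to detect a source that turned out shorter than its reported size.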

With the following change, we get:

newfstatat(15, "", {st_mode=S_IFREG|0644, st_size=4294967297, ...}, AT_EMPTY_PATH) = 0
copy_file_range(15, NULL, 16, NULL, 4294967297, 0) = 2147479552
copy_file_range(15, NULL, 16, NULL, 2147487745, 0) = 2147479552
copy_file_range(15, NULL, 16, NULL, 8193, 0) = 8193
fchown(16, 30001, 30000) = 0
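
The lengths in this trace can be checked with a little arithmetic. The sketch below assumes each call copies at most 2 GiB - 4 KiB, the value observed above; `kPerCallCap` and `requestedLengths` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Largest count a single call returned in the trace: 2 GiB - 4 KiB.
constexpr std::uint64_t kPerCallCap = (1ULL << 31) - 4096;
static_assert(kPerCallCap == 2147479552ULL, "matches the traced return value");

// Return the length requested by each successive call when every call
// copies at most `cap` bytes of what remains.  `cap` must be non-zero.
std::vector<std::uint64_t> requestedLengths(std::uint64_t total, std::uint64_t cap)
{
    std::vector<std::uint64_t> lengths;
    std::uint64_t copied = 0;
    while (copied < total) {
        std::uint64_t remaining = total - copied;
        lengths.push_back(remaining);
        copied += remaining < cap ? remaining : cap;
    }
    return lengths;
}
```

For a total of 4294967297 bytes this yields 4294967297, 2147487745 and 8193, the three lengths passed to copy_file_range in the trace.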

Could you confirm that it works for you?

Thanks,
Ludo’.
From efd9f3383756df9959651125c0f2e2e769630851 Mon Sep 17 00:00:00 2001
Message-ID: <efd9f3383756df9959651125c0f2e2e769630851.1715594931.git.ludo@gnu.org>
From: Ludovic Courtès <ludo@gnu.org>
Date: Mon, 13 May 2024 12:02:30 +0200
Subject: [PATCH] daemon: Loop over ‘copy_file_range’ upon short writes.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit


* nix/libutil/util.cc (copyFile): Loop over ‘copy_file_range’ instead of
throwing upon short write.

Reported-by: Ricardo Wurmus <rekado@elephly.net>
Change-Id: Id7b8a65ea59006c2d91bc23732309a68665b9ca0
---
nix/libutil/util.cc | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/nix/libutil/util.cc b/nix/libutil/util.cc
index 578d6572934..3206dea11b1 100644
--- a/nix/libutil/util.cc
+++ b/nix/libutil/util.cc
@@ -397,9 +397,14 @@ static void copyFile(int sourceFd, int destinationFd)
} else {
if (result < 0)
throw SysError(format("copy_file_range `%1%' to `%2%'") % sourceFd % destinationFd);
- if (result < st.st_size)
- throw SysError(format("short write in copy_file_range `%1%' to `%2%'")
- % sourceFd % destinationFd);
+
+ /* If 'copy_file_range' copied less than requested, try again. */
+ for (ssize_t copied = result; copied < st.st_size; copied += result) {
+ result = copy_file_range(sourceFd, NULL, destinationFd, NULL,
+ st.st_size - copied, 0);
+ if (result < 0)
+ throw SysError(format("copy_file_range `%1%' to `%2%'") % sourceFd % destinationFd);
+ }
}
}

base-commit: 89cd778f6a45cd9b43a4dc1f236dcd0a87af955c
--
2.41.0
Ricardo Wurmus wrote on 13 May 14:09 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 70877@debbugs.gnu.org)
87wmnxoqus.fsf@elephly.net
Ludovic Courtès <ludo@gnu.org> writes:

> Could you confirm that it works for you?

I've applied this locally, started the new daemon, and used it to build
the 4+GB source code derivation of a big package that used to fail
before. It works now. Thank you!

--
Ricardo
Ludovic Courtès wrote on 13 May 16:34 +0200
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 70877@debbugs.gnu.org)
874jb1aih9.fsf@gnu.org
Ricardo Wurmus <rekado@elephly.net> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Could you confirm that it works for you?
>
> I've applied this locally, started the new daemon, and used it to build
> the 4+GB source code derivation of a big package that used to fail
> before. It works now. Thank you!

Pushed as 7757fdd491862fa5c33f1f894503346b89898a01.

I’ll update the ‘guix’ package to make the fix available.

Thanks for testing!

Ludo’.
Ludovic Courtès wrote on 13 May 18:24 +0200
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 70877-done@debbugs.gnu.org)
87ikzh7k95.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> writes:

> Pushed as 7757fdd491862fa5c33f1f894503346b89898a01.
>
> I’ll update the ‘guix’ package to make the fix available.

Done in 58be9a79e2862d5fa9842d73f498ce2e5442b9ce.

Ludo'.
Closed
Ludovic Courtès wrote on 15 May 00:26 +0200
874jb02fol.fsf@gnu.org
BTW, the newly updated ‘guix’ package is 8% smaller, as a result of

$ guix describe
Generation 302 May 12 2024 23:29:11 (current)
guix 89cd778
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 89cd778f6a45cd9b43a4dc1f236dcd0a87af955c
$ guix size guix |head -2
store item total self
/gnu/store/r96xq0064nqf43ygcr7z9lgb18vrd1wa-guix-1.4.0-18.4c94b9e 705.8 400.6 56.8%
$ ./pre-inst-env guix size guix |head -2
store item total self
/gnu/store/mcw1d2zy96is5ymjj903i3bi5a0qdwr5-guix-1.4.0-19.7ca9809 673.8 368.7 54.7%
$ git log |head -1
commit 58be9a79e2862d5fa9842d73f498ce2e5442b9ce

Ludo’.

This issue is archived.

To comment on this conversation send an email to 70877@debbugs.gnu.org
