Integer overflow on Guix GC size calculation

  • Open
Details
5 participants
  • Bengt Richter
  • Ekaitz Zarraga
  • Liliana Marie Prikler
  • Ludovic Courtès
  • Maxime Devos
Owner
unassigned
Submitted by
Ekaitz Zarraga
Severity
normal
Ekaitz Zarraga wrote on 1 Feb 2022 15:06
Hi,

I noticed there's some kind of weird integer overflow in the size calculation of `guix gc`:

[17592186042896 MiB] deleting '/gnu/store/j2s6kva8l20m6rjj10bnknq99jc9rg6w-ghc-random-1.2.0-builder'
[17592186042896 MiB] deleting '/gnu/store/8nx7zzj629qvv1533c48bl19wrkwcjh2-curl-7.79.1-doc'
[17592186042897 MiB] deleting '/gnu/store/dcsi13588yyjws76s1wjc7h5spnjd2vn-ghc-kan-extensions-5.2.3-builder'
[17592186042897 MiB] deleting '/gnu/store/5zrhw6kvb8wd3n6lazpblqzsg92y320b-ghc-sop-core-0.5.0.1-builder'
[17592186042897 MiB] deleting '/gnu/store/l2ya1z3la9qfdj9139f09q3djs36lw3l-ghc-aeson-pretty-0.8.9-builder'
[17592186042897 MiB] deleting '/gnu/store/8a8nbfxq508r7qywkhaw8jj8hicpfjh8-ghc-prelude-extras-0.4.0.3-builder'
[17592186042897 MiB] deleting '/gnu/store/wbz6vkiz7cq8c531xvb31lxm28nz332i-ghc-8.10.7'
[19 MiB] deleting '/gnu/store/i5np7ifiabg333g2l8ycmvbhimnrrx8k-ghc-8.10.7-doc'
[170 MiB] deleting '/gnu/store/yx9zjw9118gzfcx33adfwy6kghrzxkys-ghc-pem-0.2.4-builder'
[170 MiB] deleting '/gnu/store/pinvkg1x5rpsgm95zhn50l6xq7rly43f-perl-test-output-1.033.drv'
[170 MiB] deleting '/gnu/store/k1bdc950d62g1pw4yw1khgmfr32m3rpm-perl-sub-exporter-0.988.drv'
Liliana Marie Prikler wrote on 1 Feb 2022 22:20
On Tuesday, 01.02.2022 at 14:06 +0000, Ekaitz Zarraga wrote:
> Hi,
>
> I noticed there's some kind of weird integer overflow in the size
> calculation of `guix gc`:
>
> [17592186042896 MiB] deleting '/gnu/store/j2s6kva8l20m6rjj10bnknq99jc9rg6w-ghc-random-1.2.0-builder'
> [17592186042896 MiB] deleting '/gnu/store/8nx7zzj629qvv1533c48bl19wrkwcjh2-curl-7.79.1-doc'
> [17592186042897 MiB] deleting '/gnu/store/dcsi13588yyjws76s1wjc7h5spnjd2vn-ghc-kan-extensions-5.2.3-builder'
> [17592186042897 MiB] deleting '/gnu/store/5zrhw6kvb8wd3n6lazpblqzsg92y320b-ghc-sop-core-0.5.0.1-builder'
> [17592186042897 MiB] deleting '/gnu/store/l2ya1z3la9qfdj9139f09q3djs36lw3l-ghc-aeson-pretty-0.8.9-builder'
> [17592186042897 MiB] deleting '/gnu/store/8a8nbfxq508r7qywkhaw8jj8hicpfjh8-ghc-prelude-extras-0.4.0.3-builder'
> [17592186042897 MiB] deleting '/gnu/store/wbz6vkiz7cq8c531xvb31lxm28nz332i-ghc-8.10.7'
> [19 MiB] deleting '/gnu/store/i5np7ifiabg333g2l8ycmvbhimnrrx8k-ghc-8.10.7-doc'
> [170 MiB] deleting '/gnu/store/yx9zjw9118gzfcx33adfwy6kghrzxkys-ghc-pem-0.2.4-builder'
> [170 MiB] deleting '/gnu/store/pinvkg1x5rpsgm95zhn50l6xq7rly43f-perl-test-output-1.033.drv'
> [170 MiB] deleting '/gnu/store/k1bdc950d62g1pw4yw1khgmfr32m3rpm-perl-sub-exporter-0.988.drv'
I find it somewhat concerning that you've accumulated more than 2^64
bytes of garbage. Are some items counted double here?
Other than that, that's pretty normal size_t wraparound semantics. I
don't think that number is used for anything other than displaying.
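
To make the wraparound concrete, here is a minimal Python sketch that emulates an unsigned 64-bit byte counter printed in MiB. The helper names (`wrap_u64`, `as_mib`) are invented for illustration and are not taken from the Guix daemon; the only assumption is that the total is kept in an unsigned 64-bit integer.

```python
MASK64 = (1 << 64) - 1   # largest value an unsigned 64-bit integer can hold
MIB = 1 << 20            # bytes per MiB


def wrap_u64(value):
    """Reduce an arbitrary integer modulo 2^64, as unsigned 64-bit arithmetic does."""
    return value & MASK64


def as_mib(total_bytes):
    """Whole MiB, as a progress line such as '[N MiB] deleting ...' would show."""
    return total_bytes // MIB


# A counter that ends up 1 MiB "below zero" wraps to just under 2^64 bytes.
total = wrap_u64(5 * MIB - 6 * MIB)
print(as_mib(total))  # 17592186044415, i.e. 2^44 - 1
```

The values in the report sit roughly 1500 MiB below that ceiling, which is what a counter that went slightly negative would look like.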

Cheers
Ekaitz Zarraga wrote on 1 Feb 2022 22:54
Hi,

------- Original Message -------

On Tuesday, February 1st, 2022 at 10:20 PM, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:
>
> I find it somewhat concerning that you've accumulated more than 2^64
> bytes of garbage.

I'm a dirty boy.

> Are some items counted double here?

The number started growing from 0 and then that appeared and it continued
smoothly from the previous. It's like something went bad in the middle.

> Other than that, that's pretty normal size_t wraparound semantics. I
> don't think that number is used for anything other than displaying.

Showing wrong information to the people that use the program is pretty
weird. The program still works but showing wrong data is worse than
not showing it in my opinion.
I'll take a look and try to see if I can fix it.

> Cheers

Best,
Ekaitz
Maxime Devos wrote on 2 Feb 2022 11:05
Re: bug#53696: Integer overflow on Guix GC size calculation
Ekaitz Zarraga wrote on Tue 01-02-2022 at 14:06 [+0000]:
> [17592186042897 MiB] deleting '/gnu/store/wbz6vkiz7cq8c531xvb31lxm28nz332i-ghc-8.10.7'

For comparison, this is about 16 exbibyte.
According to <https://en.wikipedia.org/wiki/Byte#Multiple-byte_units>,
that's more than the global monthly Internet traffic in 2004.

According to https://what-if.xkcd.com/31/, 16 exbibyte would be about
17 million solid-state disks. Even though this ignores deduplication,
this seems rather expensive.
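
The 16 exbibyte figure can be checked with a few lines of arithmetic. This is only a sketch of the unit conversion; the variable names are chosen here for illustration.

```python
MIB = 2**20   # bytes per MiB
EIB = 2**60   # bytes per EiB

reported_mib = 17592186042897          # value shown in the `guix gc` output
reported_bytes = reported_mib * MIB

in_eib = reported_bytes / EIB          # slightly below 16
shortfall = 2**64 - reported_bytes     # distance from 2^64 in bytes

print(in_eib)
print(shortfall)                       # 1592786944, which is 1519 MiB
```

So the reported total is 2^64 bytes minus about 1.5 GiB, which points to a wrapped or sign-extended value rather than a real amount of data.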

My guess is that the size of a store item was misrecorded somewhere.

Greetings,
Maxime.
Bengt Richter wrote on 2 Feb 2022 13:04
Hi Maxime, Ekaitz, et al,

On +2022-02-02 11:05:31 +0100, Maxime Devos wrote:
> Ekaitz Zarraga wrote on Tue 01-02-2022 at 14:06 [+0000]:
> > [17592186042897 MiB] deleting '/gnu/store/wbz6vkiz7cq8c531xvb31lxm28nz332i-ghc-8.10.7'
>
> For comparison, this is about 16 exbibyte.
> According to <https://en.wikipedia.org/wiki/Byte#Multiple-byte_units>,
> that's more than the global monthly Internet traffic in 2004.
>
> According to <https://what-if.xkcd.com/31/>, 16 exbibyte would be about
> 17 million solid-state disks. Even though this ignores deduplication,
> this seems rather expensive.
>
> My guess is that the size of a store item was misrecorded somewhere.
>
> Greetings,
> Maxime.

s/misrecorded/mis-defined-in-record/ ?
Wild guessing follows:

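--8<---------------cut here---------------start------------->8---
$ guile --no-auto-compile -c '(use-modules (ice-9 format))(format #t "~20x\n~20x\n~20d\n" (* 17592186042897 (expt 2 20)) #xa1100000 #xa1100000)';
ffffffffa1100000
a1100000
2702180352
--8<---------------cut here---------------end--------------->8---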

It looks to me like a 32-bit unsigned int should have been turned to 64-bit unsigned long or bigint
but somehow got cast/interpreted as signed, becoming signed 64-bit long,
which then in turn was seen by the print as 64-bit unsigned long.
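
The chain of conversions guessed at above can be emulated in Python as a sketch. The function name `sign_extend_32_to_64` is made up for this illustration, and nothing here is taken from the daemon's source; it only shows that the hypothesis reproduces the reported number.

```python
def sign_extend_32_to_64(value):
    """Read the low 32 bits of value as a signed int32, then return the
    64-bit two's-complement pattern of that number as an unsigned integer."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:          # top bit set: negative when read as int32
        low -= 1 << 32
    return low & 0xFFFFFFFFFFFFFFFF


size = 0xA1100000                 # 2702180352 bytes, larger than 2^31 - 1
widened = sign_extend_32_to_64(size)

print(hex(widened))               # 0xffffffffa1100000
print(widened // 2**20)           # 17592186042897
```

A size below 2^31 passes through unchanged, so only items larger than about 2 GiB would trigger the effect under this hypothesis.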

I don't know, but if records are being used, perhaps some slot integer-widening logic
might be involved? Or a mis-defined int slot that should have been long to accommodate
big > 31-bit positive integers?

Just guessing wildly -- I think I saw something about records and defining their fields
as fixed C ints or longs.

--
Regards,
Bengt Richter
Liliana Marie Prikler wrote on 2 Feb 2022 20:45
Re: Integer overflow on Guix GC size calculation
Hi,

On Tuesday, 01.02.2022 at 21:54 +0000, Ekaitz Zarraga wrote:
>
> Hi,
>
> ------- Original Message -------
>
> On Tuesday, February 1st, 2022 at 10:20 PM, Liliana Marie Prikler
> <liliana.prikler@gmail.com> wrote:
> >
> > I find it somewhat concerning that you've accumulated more than
> > 2^64 bytes of garbage.
>
> I'm a dirty boy.
>
> > Are some items counted double here?
>
> The number started growing from 0 and then that appeared and it
> continued smoothly from the previous. It's like something went bad in
> the middle.
What do you mean by "the middle"? Do you mean the jump back to 0 or do you
mean something before those lines? If you did encounter a "self-correcting"
bit-flip that'd be one thing, but to me it appears as though you have
some very large storage on your hands. Would you mind me asking where
you purchased that disk?

> > Other than that, that's pretty normal size_t wraparound semantics.
> > I don't think that number is used for anything other than
> > displaying.
>
> Showing wrong information to the people that use the program is
> pretty weird. The program still works but showing wrong data is worse
> than not showing it in my opinion.
> I'll take a look and try to see if I can fix it.
I mean we could switch to GMP for those numbers, but it doesn't make
sense. Ext4 volume size is capped at 2^60, which is still pretty well
below 2^64. Even BTRFS can't get larger than that. So unless you have
a distributed store, I'd hazard a guess that such numbers ought not to
even appear.

Cheers
Ekaitz Zarraga wrote on 2 Feb 2022 20:51
Hi,

On Wednesday, February 2nd, 2022 at 8:45 PM, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:

> Hi,
>
> On Tuesday, 01.02.2022 at 21:54 +0000, Ekaitz Zarraga wrote:
>
> > Hi,
> >
> > ------- Original Message -------
> >
> > On Tuesday, February 1st, 2022 at 10:20 PM, Liliana Marie Prikler
> >
> > liliana.prikler@gmail.com wrote:
> >
> > > I find it somewhat concerning that you've accumulated more than
> > >
> > > 2^64 bytes of garbage.
> >
> > I'm a dirty boy.
> >
> > > Are some items counted double here?
> >
> > The number started growing from 0 and then that appeared and it
> > continued smoothly from the previous. It's like something went bad in
> > the middle.
>
> What do you mean by "the middle"? Do you mean the jump back to 0 or do you
> mean something before those lines? If you did encounter a "self-correcting"
> bit-flip that'd be one thing, but to me it appears as though you have
> some very large storage on your hands. Would you mind me asking where
> you purchased that disk?

I mean something like:

0
1
2
4
8
10
12
HUGE_NUMBER
HUGE_NUMBER
...
HUGE_NUMBER
15
20
...

It's like it corrected itself. It happened in "low numbers" (less than a
hundred).

I just say this if it helps in the correction. It's very funny, still :3
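
One way such a sequence could arise, purely as a hypothesis, is a running total kept in unsigned 64-bit arithmetic where one item's size is misread as a negative number. The sketch below uses invented per-item sizes (the list `deltas_mib` is hypothetical, not measured data) chosen so that the displayed values resemble the ones in the report.

```python
MASK64 = (1 << 64) - 1
MIB = 1 << 20


def add_u64(total, delta):
    """Add delta to total with unsigned 64-bit wraparound."""
    return (total + delta) & MASK64


# Hypothetical sizes in MiB; the negative entry stands for one item
# whose size was misread as a negative number.
deltas_mib = [4, 4, 4, -1532, 1, 1538, 151]

total = 0
shown = []
for delta in deltas_mib:
    total = add_u64(total, delta * MIB)
    shown.append(total // MIB)   # what a '[N MiB]' progress line would print

print(shown)
# [4, 8, 12, 17592186042896, 17592186042897, 19, 170]
```

Under that assumption the total "corrects itself" as soon as enough real bytes have been added to carry it past 2^64 again, which would fit a large item such as ghc-8.10.7 being deleted right before the numbers return to normal.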
Liliana Marie Prikler wrote on 2 Feb 2022 21:04
Hi,

On Wednesday, 02.02.2022 at 19:51 +0000, Ekaitz Zarraga wrote:
> I mean something like:
>
> 0
> 1
> 2
> 4
> 8
> 10
> 12
> HUGE_NUMBER
> HUGE_NUMBER
> ...
> HUGE_NUMBER
> 15
> 20
> ...
>
> It's like it corrected itself. It happened in "low numbers" (less
> than a
> hundred).
>
> I just say this if it helps in the correction. It's very funny, still
> :3
Thanks, that wasn't clear from your original report. As I hinted at in
my previous message, I think you'd get such results through well-placed
bit flips. I'm not aware of Guix itself intentionally or otherwise
causing those, but bit flips are a problem on any modern hardware and
thus I'm sure such glitches will be encountered.
Bengt Richter wrote on 2 Feb 2022 23:46
Re: bug#53696: Integer overflow on Guix GC size calculation
Sorry for following up my own post, but maybe it wasn't clear
why I printed (* 17592186042897 (expt 2 20)) in hex ?

That is the value of [17592186042897 MiB] that you've been discussing.
(expt 2 20) is one MiB

Does that make
> --8<---------------cut here---------------start------------->8---
> $ guile --no-auto-compile -c '(use-modules (ice-9 format))(format #t "~20x\n~20x\n~20d\n" (* 17592186042897 (expt 2 20)) #xa1100000 #xa1100000)';
> ffffffffa1100000
> a1100000
> 2702180352
> --8<---------------cut here---------------end--------------->8---
a little clearer?

The discussion seems to be continuing, but no mention of the above.
How come?

Feeling ignored, and top-posting in desperation ;/

CC-ing ludo, who will instantly know where to fix it, if he hasn't already.


On +2022-02-02 13:04:41 +0100, Bengt Richter wrote:
> Hi Maxime, Ekaitz, et al,
>
> On +2022-02-02 11:05:31 +0100, Maxime Devos wrote:
> > Ekaitz Zarraga wrote on Tue 01-02-2022 at 14:06 [+0000]:
> > > [17592186042897 MiB] deleting '/gnu/store/wbz6vkiz7cq8c531xvb31lxm28nz332i-ghc-8.10.7'
> >
> > For comparison, this is about 16 exbibyte.
> > According to <https://en.wikipedia.org/wiki/Byte#Multiple-byte_units>,
> > that's more than the global monthly Internet traffic in 2004.
> >
> > According to <https://what-if.xkcd.com/31/>, 16 exbibyte would be about
> > 17 million solid-state disks. Even though this ignores deduplication,
> > this seems rather expensive.
> >
> > My guess is that the size of a store item was misrecorded somewhere.
> >
> > Greetings,
> > Maxime.
>
> s/misrecorded/mis-defined-in-record/ ?
> Wild guessing follows:
>
> --8<---------------cut here---------------start------------->8---
> $ guile --no-auto-compile -c '(use-modules (ice-9 format))(format #t "~20x\n~20x\n~20d\n" (* 17592186042897 (expt 2 20)) #xa1100000 #xa1100000)';
> ffffffffa1100000
> a1100000
> 2702180352
> --8<---------------cut here---------------end--------------->8---
>
> It looks to me like a 32-bit unsigned int should have been turned to 64-bit unsigned long or bigint
> but somehow got cast/interpreted as signed, becoming signed 64-bit long,
> which then in turn was seen by the print as 64-bit unsigned long.
>
> I don't know, but if records are being used, perhaps some slot integer-widening logic
> might be involved? Or a mis-defined int slot that should have been long to accommodate
> big > 31-bit positive integers?
>
> Just guessing wildly -- I think I saw something about records and defining their fields
> as fixed C ints or longs.
>
> --
> Regards,
> Bengt Richter
>
>
>
Ludovic Courtès wrote on 8 Feb 2022 11:01
Hi!

Bengt Richter <bokr@bokr.com> writes:

> It looks to me like a 32-bit unsigned int should have been turned to 64-bit unsigned long or bigint
> but somehow got cast/interpreted as signed, becoming signed 64-bit long,
> which then in turn was seen by the print as 64-bit unsigned long.

That could be the explanation; there were a couple of bugs along these
lines fixed recently:


It would have been good to see which store item came right before
the huge number. Could it be that it was texlive-texmf, or flightgear,
or repeat-masker, the three packages whose store item size exceeds 2^31?

Ekaitz, what guix-daemon version are you running?

Thanks,
Ludo’.
Ekaitz Zarraga wrote on 8 Feb 2022 11:52
> Ekaitz, what guix-daemon version are you running?

$ guix --version
guix (GNU Guix) c5000dcc375229ff42727f090d4243107d3a04a6
Copyright (C) 2022 the Guix authors
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

TY!
Ludovic Courtès wrote on 10 Feb 2022 21:25
Ekaitz Zarraga <ekaitz@elenq.tech> writes:

>> Ekaitz, what guix-daemon version are you running?
>
> $ guix --version

What about the daemon though? As in:

sudo /proc/$(pidof guix-daemon |cut -f1 -d ' ')/exe --version

Thanks,
Ludo’.
Ekaitz Zarraga wrote on 10 Feb 2022 21:27
> Ekaitz Zarraga ekaitz@elenq.tech writes:
>
> > > Ekaitz, what guix-daemon version are you running?
> >
> > $ guix --version
>
> What about the daemon though? As in:
>
> sudo /proc/$(pidof guix-daemon |cut -f1 -d ' ')/exe --version
>
> Thanks,
>
> Ludo’.

ouch... sorry!

$ sudo /proc/$(pidof guix-daemon |cut -f1 -d ' ')/exe --version
Password:
guix-daemon (GNU Guix) 1.3.0-23.a27e47f