[security] Chunked store references in .zo files in Racket 8

  • Done
  • quality assurance status badge
Details
4 participants
  • Léo Le Bouter
  • Ludovic Courtès
  • Mark H Weaver
  • Philip McGrath
Owner
unassigned
Submitted by
Mark H Weaver
Severity
normal
M
M
Mark H Weaver wrote on 6 Apr 2021 13:06
(address . bug-guix@gnu.org)
87k0pf7jti.fsf@netris.org
On my system, Racket 8.0 contains a *.zo file that contains a *chunked*
store reference. As a result, it retains a reference to the ungrafted
Gtk+, and therefore to the ungrafted glib, cairo, and libx11.

The file is:

/gnu/store/…-racket-8.0/share/racket/pkgs/gui-lib/mred/private/wx/gtk/compiled/gtk3_rkt.zo,

and here's the relevant excerpt:

Toggle snippet (12 lines)
mhw@jojen ~$ hexdump -C /gnu/store/…-racket-8.0/share/racket/pkgs/gui-lib/mred/private/wx/gtk/compiled/gtk3_rkt.zo | grep -B2 -A6 /gnu/
00000cf0 c0 06 23 00 06 36 02 31 c7 c6 46 25 02 61 7f 0b |..#..6.1..F%.a..|
00000d00 48 c7 c5 06 a3 01 28 67 03 32 01 08 0c 00 f0 23 |H.....(g.2.....#|
00000d10 05 00 58 11 1e 26 48 2f 67 6e 75 2f 73 74 6f 72 |..X..&H/gnu/stor|
00000d20 65 2f 6e 32 63 6e 70 32 66 69 76 78 71 31 30 6b |e/n2cnp2fivxq10k|
00000d30 78 71 61 6c 63 76 32 71 34 31 77 7a 73 79 6a 39 |xqalcv2q41wzsyj9|
00000d40 79 64 62 01 d0 2b 2d 33 2e 32 34 2e 32 34 2f 6c |ydb..+-3.24.24/l|
00000d50 69 62 04 00 f0 1f 67 74 6b 2d 33 2e 73 6f 00 0e |ib....gtk-3.so..|
00000d60 11 1f 07 02 12 23 12 24 0c 26 00 15 06 41 0b 40 |.....#.$.&...A.@|
00000d70 00 1d 11 20 26 1e 5b 2e 2e 2e 61 74 65 2f 77 78 |... &.[...ate/wx|

The referenced store item is this:

/gnu/store/n2cnp2fivxq10kxqalcv2q41wzsyj9yd-gtk+-3.24.24

Notice that in the .zo file, there are three additional bytes inserted
before the dash ("-").

This store reference is seen by the Guix scanner, because the nix hash
is stored contiguously. However, it is *not* seen by the grafter.

Note that the grafter assumes that the entire store item name will be
stored contiguously. The current implementation only finds hashes that
are immediately followed by a dash ("-"), and moreover assumes that nix
hashes will never occur except within the corresponding store item name.

In this case, the reference was simply ignored, because the dash was
separated from the hash. If the extra junk had been inserted *after*
the dash, the grafter would have made a mess of things. It would have
(incorrectly) assumed that the rest of the expected store item name
followed the dash, and inappropriately written the replacement string
over the unexpected bytes.

With this case in mind, I think we can no longer safely assume that the
bytes following a nix hash will be as we expect. As a general
principle, I think that *every* byte that the grafter modifies should
first be checked against its expected value. That should allow us to
catch problems like this early, and avoid non-obvious breakage cropping
up.

What do you think?

Mark
L
L
Léo Le Bouter wrote on 6 Apr 2021 19:32
(address . control@debbugs.gnu.org)
b86c3395f3a73d4e1cd484c1314851e148639a1b.camel@zaclys.net
tags 47614 + security
quit
-----BEGIN PGP SIGNATURE-----

iQIzBAABCgAdFiEEFIvLi9gL+xax3g6RRaix6GvNEKYFAmBsmykACgkQRaix6GvN
EKa/nhAAtOQyRlQlNgsiGll+PLaY+GFo7WxWmwCD0KhNCQb45ZA0rU+hVgVIMGjI
QtrrK9UkC3q5vH2ABIGQ99ZyINqeMQR3AqPX9YFGmxgdiz3XswybRywfS2aR8YdT
clTJlYQbkLQsvHc7UZrTYPYN87u80oIEnggqrr6N5n64B5G9u6Wdo9fCBCkuq+Xq
IXheUymuF6QAwmC17wrri2Z+EjYkUIwdzHndL8TMVgXaKKieAiKWFZKM/oH1dlWL
/zigAaLDnljrhVqfIBQ8ajK/07YOb110QYPYL6o7tZkENXj2kuHE833M/b+Xmt5c
H23IIP7sACrhjSg9u1v/j6ldsg33rXnpZ8GHwHbH/OeBBQCBpHoA/l+ENTdImkGG
DU2vR3UPk/5epnvXfFl9VtO6TO5PB1tin6wv7VnCrN0k8eVJ3yMzClnLV1mj4Y89
rUa/8hrELipO9FBlD32REKU+VJb9KClqVrDyoekwm/EG02z/O14BgnOEvpluFBOf
knx2XSyS59FzPkLj48k1tJVUd8EsmX+g4humy3UJ23hIqIGiITKFzFY6i08NBynN
wEWYoT4ek5/0fcM9gU6ejjMky4i2n1tYJDiSFiz8vH6emrOZ8FWoINHDtG9S7onw
WDVILUQcbOr37jJJt84UaB+la4T/FxbmVBw2KPnupnWvn1SqdWs=
=OjIR
-----END PGP SIGNATURE-----


L
L
Léo Le Bouter wrote on 6 Apr 2021 19:39
e9234acf1e9dff8e5a0fc0ff078fc9e2f201e9a4.camel@zaclys.net
I think that probably replacing arbitrary paths in built binaries is a
risky and maybe unreliable engineering choice and that mechanisms
inside kernels should be preferred to give processes a different view
of the file system (retaining the path but changing the contents of the
folder).

OTOH, what would be wrong with replacing hashes directly without
expecting them to be next to anything else?

Léo
-----BEGIN PGP SIGNATURE-----

iQIzBAABCgAdFiEEFIvLi9gL+xax3g6RRaix6GvNEKYFAmBsnN4ACgkQRaix6GvN
EKYdyQ//ZM1NkwAVwlz9jvAGlquG4PZtpNVMo9nKwCtDFNkdH24qcw0C4B9vDcz5
eSQl1zdd/Umo4brKd7namErlRBUS6C5HtsT763R8XbT8oY6qQhPYbl7fy445H3DK
D/vXRDZBMPAkmeX5W6bn9h+ZOULy1PB4iQXZ+/rleq+SvE7PGVbN2FKt7I2/mEn0
ft11Xf8XcDbD7IlKRgPcudYBJ7Eb8ibRjO4n9iluILxoZ6ST/rZHsn5XFFl4SuT/
O/a0NqoaFs5rd8aQcx2S3oTyRlpSDeR7o7IpKLGgZgjxCijzW0X6hEoo/d0QPd7Y
87srlKLNzj1KTP3UoOy5yYEEuw0lIplB+Jmzri93ncBEEDthWiHgpAfpDn26lbFz
DIFsLB07vL10QxrDgsGsGEpgFnmA/L0jeJCD+2PlrPNMovhYi9lypsdpBpDcoSOZ
aaPQXIdkwOo9iPybXkWI+eGRV+vMGwG1vli/v4YvoJXBh+eJwwi2d/mKfQzgt+EZ
dON/KbrVgVgtiyhic5ADKVn1t6xJQboqz075EAVwB7unH/XHpLY+TXRY4kogWfUi
XQLkSzG5MjOuNAI0WgUTz9IfXsQuFeHx4E1iYdSIwmZl+vos6YGkVykMsKHdIRn5
9Zuwt9Kl6hF8viKb4Plmhf/biCdKq3S/bxBybRrd7Tbd3OlopPo=
=CakI
-----END PGP SIGNATURE-----


M
M
Mark H Weaver wrote on 6 Apr 2021 23:27
87tuojqf0r.fsf@netris.org
Hi Léo,

Léo Le Bouter <lle-bout@zaclys.net> writes:

Toggle quote (6 lines)
> I think that probably replacing arbitrary paths in built binaries is a
> risky and maybe unreliable engineering choice and that mechanisms
> inside kernels should be preferred to give processes a different view
> of the file system (retaining the path but changing the contents of the
> folder).

I've had thoughts along these lines myself, but I don't think it can
work properly. The fundamental problem is that in general, each process
includes shared objects from many different Guix packages. There would
need to be a mechanism to determine, when looking up a file, which Guix
package that file lookup was originating from (or whether it was coming
from a file name provided by the user), in order to determine which
"view of the file system" to use for purposes of that lookup. There's
no way to determine this reliably.

For example, when Emacs stats a file, there's no way to automatically
determine which view of the file system to use for that file lookup. If
the file being stat'd is a file that the user asked to look at, it
should use the user's view of the file system. If Emacs is trying to
load one of its own dependent libraries, it should see the file system
view associated with the dependencies of Emacs. If some code in
GnuTLS's shared library (loaded by Emacs) performs a file lookup, it
should see the GnuTLS file system view. See the problem?

I've come to think that the Guix approach is the most "correct"
approach, given the APIs that our existing body of software was written
for. (If we rewrote our software from scratch with different APIs, we
would have more options here, but that would be crazy :)

Toggle quote (3 lines)
> OTOH, what would be wrong with replacing hashes directly without
> expecting them to be next to anything else?

Personally, I would find that limitation acceptable, and that's fairly
close to what our grafter originally did (although my fast grafting code
always assumed that a "-" would follow the hash). However, we've since
become accustomed to being able to have replacements with different
version numbers. That's a nice feature.

Anyway, I doubt that imposing such a limitation would adequately solve
the problem here of chunked references in Racket 8, because I suspect
that Racket 8 could split store references at arbitrary points in the
string. I doubt that we can safely assume that the hash component of
store references will be stored contiguously in *.zo files.

What do you think?

Thanks,
Mark
L
L
Léo Le Bouter wrote on 7 Apr 2021 00:18
9b7e130d5b993a0376698e07f5f9346a5775604f.camel@zaclys.net
On Tue, 2021-04-06 at 17:27 -0400, Mark H Weaver wrote:
Toggle quote (27 lines)
> Hi Léo,
>
> Léo Le Bouter <lle-bout@zaclys.net> writes:
>
> > I think that probably replacing arbitrary paths in built binaries
> > is a
> > risky and maybe unreliable engineering choice and that mechanisms
> > inside kernels should be preferred to give processes a different
> > view
> > of the file system (retaining the path but changing the contents of
> > the
> > folder).
>
> I've had thoughts along these lines myself, but I don't think it can
> work properly. The fundamental problem is that in general, each
> process
> includes shared objects from many different Guix packages. There
> would
> need to be a mechanism to determine, when looking up a file, which
> Guix
> package that file lookup was originating from (or whether it was
> coming
> from a file name provided by the user), in order to determine which
> "view of the file system" to use for purposes of that
> lookup. There's
> no way to determine this reliably.

Is it really that big a deal if it's impossible to access the ungrafted
/gnu/store item? If really required we could document a way to disable
it temporarily maybe? Do we need a specific view for each and every
package? I am thinking that overriding the view to the store item
that's a result of a package with a replacement field globally would be
sufficient.

Toggle quote (13 lines)
> > OTOH, what would be wrong with replacing hashes directly without
> > expecting them to be next to anything else?
>
> Personally, I would find that limitation acceptable, and that's
> fairly
> close to what our grafter originally did (although my fast grafting
> code
> always assumed that a "-" would follow the hash). However, we've
> since
> become accustomed to being able to have replacements with different
> version numbers. That's a nice feature.
>

Version numbers, agree, I didnt realize that replacing the program name
and version was also required there. However I am thinking we could
fake (or alias, with a symlink) the version in the store item name on
purpose so that it remains the same while pointing to something with a
newer version, it would actually be better that way because we wouldnt
have to think about retaining identical version string length during
grafts.

Toggle quote (7 lines)
> Anyway, I doubt that imposing such a limitation would adequately
> solve
> the problem here of chunked references in Racket 8, because I suspect
> that Racket 8 could split store references at arbitrary points in the
> string. I doubt that we can safely assume that the hash component of
> store references will be stored contiguously in *.zo files.

Indeed, is the format for the string references in .zo files documented
anywhere? Is there hope you think we can recognize and automatically
rewrite these strings?

Thanks,
Léo
-----BEGIN PGP SIGNATURE-----

iQIzBAABCgAdFiEEFIvLi9gL+xax3g6RRaix6GvNEKYFAmBs3jUACgkQRaix6GvN
EKZahA//XEDSzqC2DXejbNOrKOnf7mqTm57Btvfr7F0PkCu/4PXBa+jAlORbJxgO
tyMEgL6vOK9RRgugUVtThHsKP7+FUIKABB6wSJLSf9AOGo+a0A4lrUdNLBkip71m
8Ol4nZ3UQM7ZwlMFM3SePYpzYrOWbU1Q++NCb1DeCcsUvCD+3gHYhp0p69UVgn46
HWk8Ue2D6Q3K7zQtlxP6so27R5rc+zv2zzKWm4wi/HSRzR9kWD9GK0bvYRAsQPFU
JQc08xJzNil/FeJTeQUKQHwKImiqSWZC26z74aMGpxIIavwk7sXz5565UvL0KZXr
SxcOxALEJsre6t9SqE90C9F/7Tkhu11FaPd75okFPNPTKiBQUB98Ywsmzg3sioGA
OgIfVTYpz31tPBKxphV+UkMoIUxjJCG7Jc0JOI5waNz86a0D5lCFkekGq7Szieh7
QTddZCYYJrn2ApTgMFvNf2HchJYA/KISjXx6RuJArXgc3JjkXJvfm2MOZ+bYfwnW
D51GDzgjIInJXgn6tG0Pm6siL+AHNGmaLzBg8pDsdtZ3leh95zSq6U6J/qG1FBef
bHNSaon0aO3b7pD1w7PpExi/+mqGxNziEKeYeY3njwssZzckbAVKr2qi/kmLfiyz
ZbueI8ERJBdZpyodFAwGR26NieJlgjVUHXhYQOvOxmtPjZjsgvE=
=/wmS
-----END PGP SIGNATURE-----


P
P
Philip McGrath wrote on 7 Apr 2021 03:48
[security] Chunked store references in .zo files in Racket 8 #47614
(address . 47614@debbugs.gnu.org)
2abc59d0-905e-ab0c-ae25-bf572f34fcd5@philipmcgrath.com
Ah, I see the thread for https://issues.guix.gnu.org/47614wasn't cc'ed
here:


-------- Forwarded Message --------
Subject: Re: Racket 8 and store references (was [security] Chunked store
references in .zo files in Racket 8 #47614)
Date: Tue, 6 Apr 2021 21:38:57 -0400
From: Philip McGrath <philip@philipmcgrath.com>
To: Jack Hill <jackhill@jackhill.us>, Mark H Weaver <mhw@netris.org>
CC: guix-devel@gnu.org

Indeed, I expect this is a more precise diagnosis of the same problem.
My patch in https://issues.guix.gnu.org/47180solves it by putting the
store references (search paths for foreign libraries) in a configuration
data file that isn't compiled, so they don't end up in .zo files in the
first place.

The .zo format is intentionally undocumented and subject to breaking
change, including from different compilation options. At a minimum, a
change to the Racket version number signals a breaking change to
compiled code (e.g. Git is now at 8.0.0.13, so 13 breaking changes since
the release). Internally, I don't know all the details, but the normal
8.0 .zo format has a Racket layer around the Chez Scheme object format,
which seems to be very complex: it looks like it supports
user-configurable compression at the granularity of the individual
object within an object file. So it seems much better to avoid rewriting
.zo files altogether.

-Philip

On 4/6/21 9:20 PM, Jack Hill wrote:
Toggle quote (17 lines)
> On Tue, 6 Apr 2021, Mark H Weaver wrote:
>
>> Anyway, I doubt that imposing such a limitation would adequately solve
>> the problem here of chunked references in Racket 8, because I suspect
>> that Racket 8 could split store references at arbitrary points in the
>> string.  I doubt that we can safely assume that the hash component of
>> store references will be stored contiguously in *.zo files.
>
> Mark and everyone,
>
> I wanted to spin off a subthread on guix-devel, to make you aware of
> another problem that we've run into with reference in .zo getting
> mangled: https://issues.guix.gnu.org/47180
>
> Best,
> Jack
>
M
M
Mark H Weaver wrote on 13 Apr 2021 23:27
Re: bug#47614: [security] Chunked store references in .zo files in Racket 8
87a6q1swmt.fsf@netris.org
Hi Léo,

Léo Le Bouter <lle-bout@zaclys.net> writes:

Toggle quote (22 lines)
> On Tue, 2021-04-06 at 17:27 -0400, Mark H Weaver wrote:
>>
>> Léo Le Bouter <lle-bout@zaclys.net> writes:
>>
>> > I think that probably replacing arbitrary paths in built binaries
>> > is a risky and maybe unreliable engineering choice and that
>> > mechanisms inside kernels should be preferred to give processes a
>> > different view of the file system (retaining the path but changing
>> > the contents of the folder).
>>
>> I've had thoughts along these lines myself, but I don't think it can
>> work properly. The fundamental problem is that in general, each
>> process includes shared objects from many different Guix packages.
>> There would need to be a mechanism to determine, when looking up a
>> file, which Guix package that file lookup was originating from (or
>> whether it was coming from a file name provided by the user), in
>> order to determine which "view of the file system" to use for
>> purposes of that lookup. There's no way to determine this reliably.
>
> Is it really that big a deal if it's impossible to access the ungrafted
> /gnu/store item?

It's a fair question, and reasonable people may disagree, but I would
personally find it quite troubling to not be able to confidently and
straightforwardly examine files in /gnu/store without wondering if my
tools were showing me something else.

Anyway, this would be a very radical change in Guix, and I think this
bug report is not the best place to discuss it. If you'd like to persue
this idea further, I suggest starting a thread on 'guix-devel'.

Toggle quote (17 lines)
>> > OTOH, what would be wrong with replacing hashes directly without
>> > expecting them to be next to anything else?
>>
>> Personally, I would find that limitation acceptable, and that's
>> fairly close to what our grafter originally did (although my fast
>> grafting code always assumed that a "-" would follow the hash).
>> However, we've since become accustomed to being able to have
>> replacements with different version numbers. That's a nice feature.
>
> Version numbers, agree, I didnt realize that replacing the program name
> and version was also required there. However I am thinking we could
> fake (or alias, with a symlink) the version in the store item name on
> purpose so that it remains the same while pointing to something with a
> newer version, it would actually be better that way because we wouldnt
> have to think about retaining identical version string length during
> grafts.

This idea is the subject of https://bugs.gnu.org/43984, and it's
certainly doable. The main disadvantage I see is that file system
lookups in grafted store items would become less efficient, because more
symbolic links would need to be followed. Anyway, if you'd like to
persue this idea further, let's discuss it in that other bug report.

Toggle quote (11 lines)
>> Anyway, I doubt that imposing such a limitation would adequately
>> solve the problem here of chunked references in Racket 8, because I
>> suspect that Racket 8 could split store references at arbitrary
>> points in the string. I doubt that we can safely assume that the
>> hash component of store references will be stored contiguously in
>> *.zo files.
>
> Indeed, is the format for the string references in .zo files documented
> anywhere? Is there hope you think we can recognize and automatically
> rewrite these strings?

According to Philip McGrath, "The .zo format is intentionally
undocumented and subject to breaking change, including from different
compilation options." See https://bugs.gnu.org/47614#19.

Thanks,
Mark
L
L
Ludovic Courtès wrote on 16 Apr 2021 17:46
(name . Philip McGrath)(address . philip@philipmcgrath.com)
87blae44gx.fsf_-_@gnu.org
Hi Philip and all,

Philip McGrath <philip@philipmcgrath.com> skribis:

Toggle quote (6 lines)
> Indeed, I expect this is a more precise diagnosis of the same
> problem. My patch in https://issues.guix.gnu.org/47180 solves it by
> putting the store references (search paths for foreign libraries) in a
> configuration data file that isn't compiled, so they don't end up in
> .zo files in the first place.

IIUC, now that https://issues.guix.gnu.org/47180 has been closed, this
bug is fixed. Am I right?

Thanks,
Ludo’.
P
P
Philip McGrath wrote on 16 Apr 2021 21:46
(name . Ludovic Courtès)(address . ludo@gnu.org)
884230f7-1d73-003f-cfb1-4ee01d38605e@philipmcgrath.com
Hi Ludo’,

On 4/16/21 11:46 AM, Ludovic Courtès wrote:
Toggle quote (3 lines)
> IIUC, now that https://issues.guix.gnu.org/47180 has been closed, this
> bug is fixed. Am I right?

Yes, it seems to be fixed with respect to Racket, though Mark mentioned
here in https://issues.guix.gnu.org/47064#9 (also now fixed) some
broader implications for the grafting code.

-Philip
M
M
Mark H Weaver wrote on 17 Apr 2021 11:25
(address . 47614-done@debbugs.gnu.org)
87h7k58dop.fsf@netris.org
Ludovic Courtès <ludo@gnu.org> writes:
Toggle quote (3 lines)
> IIUC, now that https://issues.guix.gnu.org/47180 has been closed,
> this bug is fixed. Am I right?

Yes, I believe so. All store items referenced by Racket now seem to be
properly grafted, so I'm closing this bug now.

The more general issue with the grafting code--namely that since commit
57bdd79e48, it no longer has the desirable property of checking every
byte against an expected value before rewriting it, which can lead to
silent corruption of files such as Racket .zo files if any store items
references sneak in--can be addressed in another bug report.

Thanks,
Mark
Closed
?