[PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Open

Details

5 participants

Andreas Enge
Danny Milosavljevic
Efraim Flashner
Ludovic Courtès
Marius Bakke

Owner: unassigned

Submitted by: Danny Milosavljevic

Severity: normal

Danny Milosavljevic wrote on 24 Sep 2020 16:12

Recipients:(address . guix-patches@gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200924141211.21649-1-dannym@scratchpost.org

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 52 ++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

Toggle diff (63 lines)diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..5a62c9df3a 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,57 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; QEMU transparent emulation is in somewhat of a pickle sometimes.
+                 ;; There is no support in the kernel syscalls of specifying what
+                 ;; kind of userspace you are emulating.  Some parts of the
+                 ;; structures passed back-and-forth between kernel and guest
+                 ;; userspace can change size (including size of individual fields).
+                 ;;
+                 ;; One of the affected structures is "struct dirent".  The ext4
+                 ;; file system puts a 64 bit hash into "d_off" on the kernel side.
+                 ;; If the guest system's glibc is 32 bit it is going to be very
+                 ;; confused (it does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "include/dirent.h" "a")))
+                   (display "
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the

Danny Milosavljevic wrote on 24 Sep 2020 16:16

Recipients:(address . 43591@debbugs.gnu.org)

Message-ID:20200924161607.0c196f98@scratchpost.org

On Thu, 24 Sep 2020 16:12:11 +0200

Danny Milosavljevic <dannym@scratchpost.org> wrote:

Toggle quote (6 lines)

> + ;; Note: We might want to avoid using 64 bit when bootstrapping

> + ;; using mescc (since mescc doesn't directly support 64 bit

> + ;; values)--but then bootstrapping has to be done on a

> + ;; file system other than ext4, or on ext4 with the feature

> + ;; "dir_index" disabled.

Or on a real 32 bit machine, where there is no problem with mismatching struct
sizes between emulated guest glibc and host system kernel, because there is no
emulation going on.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9sqicACgkQ5xo1VCww

uqXO7wf+MGIbf5TXP6EEZ+ay4VWLji+EfEdvvv37LSSvlGBtAIyDY2spjc2dcvcd

yBA8o04jMes+JyCFWTpkZxbk5uwpEQxlSbmxtBtnjLnK7+1txVzweONiP590AMa2

9SApQbqpTr6mJrdLuaX+ATU54XXqAQ/Ah+wHWwoFeSk7catsAmhIEIOcbWB1zOMO

G+Mf+xR1l3GInukjUm3FvtJfHFd5B9nzgLeDKXeboXKdSvL90Hm5LYgIDxQkQ8J+

Iua9O5XPR200M4UMfkJVuD+clLP3bt63qmFUE/Z557woIqeA7l6lY4ee8kf8ylR1

Y8piTqrroEewYjqC1/xqCtXOB8Gliw==

=6+cv

-----END PGP SIGNATURE-----

Marius Bakke wrote on 24 Sep 2020 20:17

Re: [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Recipients:(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:87363759at.fsf@gnu.org

Danny Milosavljevic <dannym@scratchpost.org> writes:

Toggle quote (15 lines)> +                 ;; QEMU transparent emulation is in somewhat of a pickle sometimes.
> +                 ;; There is no support in the kernel syscalls of specifying what
> +                 ;; kind of userspace you are emulating.  Some parts of the
> +                 ;; structures passed back-and-forth between kernel and guest
> +                 ;; userspace can change size (including size of individual fields).
> +                 ;;
> +                 ;; One of the affected structures is "struct dirent".  The ext4
> +                 ;; file system puts a 64 bit hash into "d_off" on the kernel side.
> +                 ;; If the guest system's glibc is 32 bit it is going to be very
> +                 ;; confused (it does check whether d_off fits into the structure
> +                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
> +                 ;; fails).
> +                 ;; This manifests itself in simple directory reads not working
> +                 ;; anymore in parts of cmake, for example.

Note that for CMake in particular, this problem will be fixed in 3.19:

https://gitlab.kitware.com/cmake/cmake/-/issues/20568

As mentioned in that issue, and which this patch states on no uncertain

terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

Toggle quote (21 lines)> +                 ;;
> +                 ;; There is a very simple and complete way to avoid this problem:
> +                 ;; Just always use 64 bit offsets in user space programs (also
> +                 ;; on 32 bit machines).
> +                 ;;
> +                 ;; Note: We might want to avoid using 64 bit when bootstrapping
> +                 ;; using mescc (since mescc doesn't directly support 64 bit
> +                 ;; values)--but then bootstrapping has to be done on a
> +                 ;; file system other than ext4, or on ext4 with the feature
> +                 ;; "dir_index" disabled.
> +                 ;;
> +                 ;; The change below does not affect 64 bit users.
> +                 ;;
> +                 ;; See <https://issues.guix.gnu.org/43513>.
> +                 (let ((port (open-file "include/dirent.h" "a")))
> +                   (display "
> +#if __SIZEOF_LONG__ < 8
> +#ifndef __USE_FILE_OFFSET64
> +#undef readdir
> +#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@

Won't this break _everything_ that uses readdir() without 64-bit

offsets? Or does that @@ string get substituted by the glibc build

system somehow.

Toggle quote (11 lines)> +#endif
> +#endif
> +" port)
> +                   (close-port port))
> +                 ;; This file includes <dirent.h> and thus checks sanity already.
> +                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
> +                 (substitute* "posix/glob.c"
> +                  (("(#[ ]*define[ ][ ]*readdir)") "
> +#undef readdir
> +#define readdir"))

Can you file a bug report upstream about the duplicate definition(s)?

Enforcing this restriction in glibc feels rather sledgehammer-y.  Would
it make sense to introduce a GCC warning instead?  I'm sure there are
legitimate uses of smaller file offsets (i.e. embedded).  A GCC warning
will still break -Werror, but that's a lot more manageable than breaking
almost every use of readdir() on 32-bit platforms.

Danny Milosavljevic wrote on 24 Sep 2020 22:27

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200924222711.2f22281a@scratchpost.org

Hi Marius,

On Thu, 24 Sep 2020 20:17:14 +0200

Marius Bakke <marius@gnu.org> wrote:

Toggle quote (3 lines)

> As mentioned in that issue, and which this patch states on no uncertain

> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

Yeah. The problem is how to find all those places where we need to add it,

and how to keep finding them as we maintain stuff (read: as upstream changes

stuff).

Toggle quote (3 lines)

> Won't this break _everything_ that uses readdir() without 64-bit

> offsets?

That's the goal of that patch. Because it won't work at runtime anyway, when

run on a guix x86_64 build machine, building for ARM (which is the usual way

we build stuff for ARM).

Maybe we can find out whether dirent.h is #included for compiling for one of

the main Guix platforms (I prefer this latter solution, if possible).

It's either that or we stop qemu transparent emulation on the build farm,

because the result isn't reliable. This time it was "caught" because of

an overabundance of caution on behalf of the glibc people. We won't be

so lucky next time.

Toggle quote (3 lines)

> Or does that @@ string get substituted by the glibc build

> system somehow.

No. It's specifically made so downstream users of glibc on guix can't use

readdir() on 32-bit systems without enabling large file support.

It totally could use some more #if about whether it's building stuff for

an actual guix main platform (i.e. not on embedded).

Toggle quote (2 lines)

> Can you file a bug report upstream about the duplicate definition(s)?

It's not really a problem for them. But I can try anyway.

Toggle quote (6 lines)

> Enforcing this restriction in glibc feels rather sledgehammer-y. Would

> it make sense to introduce a GCC warning instead? I'm sure there are

> legitimate uses of smaller file offsets (i.e. embedded). A GCC warning

> will still break -Werror, but that's a lot more manageable than breaking

> almost every use of readdir() on 32-bit platforms.

I thought about a gcc #warning. But I don't want it to warn when readdir

isn't used in the first place but dirent.h ended up being included by some

dependency. I guess we could use a deprecation warning on readdir() instead.

Can this warning print a custom message?

What do you think?

In any case, I'd like to have some overview of how bad the problem is and

thus I'd like to have a wip branch with this patch being built entirely

for 32 bits via x86_64 (not sure whether that includes i686 btw).

Is it possible to force using these x86_64 machines in the build farm?

Toggle quote (2 lines)

>I'm sure there are legitimate uses of smaller file offsets (i.e. embedded).

On guix? With our glibc? Do we support that?

Toggle quote (3 lines)

>A GCC warning will still break -Werror, but that's a lot more manageable

>than breaking almost every use of readdir() on 32-bit platforms.

These uses of readdir() are not reliable the way we are building guix on

cuirass.

But I'm all for refining this patch, if there are suggestions on how.

Actually, I think a warning is not strong enough. This bug made it all the

way down to json-c (!) before finally being detected. That should not happen.

And if the build farm would have selected an actual ARM machine, the build

would have succeeded and no one on x86_64 could have reproduced the result.

That would have been real bad.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9tAR8ACgkQ5xo1VCww

uqUQqAgAhPm5IvAyqQ+RtKLzcd/JTSlzR8/lbc6RVhuea0/IiYZJHjFKcY9YTbWs

3c9Rvk0zI8XvzXs8ViLqVRwMDf+fFW/KBFKWgkyfpF784T12m/4epJE/3s4v1ihy

UC1WGxS5xTs2PkXcL9aG+FpwV+5zMROV7ge4he0wGJUnB1BtkvYh3PWzVoqeM1Wt

+M8I0anOi7NLc3EHWJDCv+RjcRrI1f2HBXlWprc6rxBGnVpRJrtHZSQHrUrTOQF1

IB8YK12qB6FbDy/DZDaMwLiY4x+sonKQ1+CTJkkG6irJ+Xh/MtHgNj9rWVOXzsCE

3RG3MNjdAx3o/08r262LrNxRFITu1w==

=9PzS

-----END PGP SIGNATURE-----

Marius Bakke wrote on 25 Sep 2020 01:11

Recipients:(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 43591@debbugs.gnu.org)

Message-ID:87tuvm4vop.fsf@gnu.org

Danny Milosavljevic <dannym@scratchpost.org> writes:

Toggle quote (19 lines)> Hi Marius,
>
> On Thu, 24 Sep 2020 20:17:14 +0200
> Marius Bakke <marius@gnu.org> wrote:
>
>> As mentioned in that issue, and which this patch states on no uncertain
>> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.
>
> Yeah.  The problem is how to find all those places where we need to add it,
> and how to keep finding them as we maintain stuff (read: as upstream changes
> stuff).
>
>> Won't this break _everything_ that uses readdir() without 64-bit
>> offsets?
>
> That's the goal of that patch.  Because it won't work at runtime anyway, when
> run on a guix x86_64 build machine, building for ARM (which is the usual way
> we build stuff for ARM).

Why is this not an issue with i686 on x86_64 kernels? Or armhf on

AArch64? This problem only manifests when using QEMUs syscall

emulation, right?

Toggle quote (8 lines)

> Maybe we can find out whether dirent.h is #included for compiling for one of

> the main Guix platforms (I prefer this latter solution, if possible).

> It's either that or we stop qemu transparent emulation on the build farm,

> because the result isn't reliable. This time it was "caught" because of

> an overabundance of caution on behalf of the glibc people. We won't be

> so lucky next time.

Arguably running code for foreign architectures through QEMU binfmt is
something of a hack.  Mandating that every package *must* be patched to
support it seems user-hostile.  I'm more in favor of dropping it on the
build farm, or just keep fixing things on a per-package basis.

A less user-hostile solution could perhaps be to (setenv "CFLAGS"
"-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.
Not sure whether that could cause any adverse effects.  But again, I
don't like the idea of optimizing for QEMUs user-mode emulation.

Danny Milosavljevic wrote on 25 Sep 2020 12:20

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200925122004.38275411@scratchpost.org

Hi Marius,

On Fri, 25 Sep 2020 01:11:18 +0200

Marius Bakke <marius@gnu.org> wrote:

Toggle quote (3 lines)

> Arguably running code for foreign architectures through QEMU binfmt is

> something of a hack.

The problem here is not qemu. It's not even the size of the whole struct

dirent that's the problem, or endianness, or alignment, or any of that stuff.

The problem is that if you have a 64-bit hash value then you can't fit it into

a 32-bit field.

You have to make the field bigger.

It's the only non-insane way to fix this and still be able to use the host

filesystem from the guest.

This bug is making readdir randomly and silently succeed or fail.

Note: The bug I'm referring to is caused by having 32 bit offsets in user space

in the first place--nothing else. That is the root of the problem, not qemu,

not ext4, not the kernel. Having 32 bit offsets in user space when kernel space

uses 64 bit offsets is just weird (note: the kernel always uses 64 bit

offsets--also on 32 bit architectures!).

Note: This also affects 9p, which has been recently "fixed" in the Linux kernel.

Toggle quote (2 lines)

>Why is this not an issue with i686 on x86_64 kernels?

I'm not sure. I'll check.

Toggle quote (2 lines)

>Or armhf on AArch64?

I cannot check that, but it totally should be a problem there.

It has nothing to do with qemu specifically, and IMO qemu is right in not

"fixing" it (because it's NOT a problem in qemu), and in expecting

32 bit guests not to use the getdents64 system call :P

But even using getdents (no 64) as a flag what size d_off the caller

expects is an ugly workaround.

Not because of the flag, but because of what you have to do if it's set.

The right fix is to switch to LFS (and thus also 64 bit d_off) on 32 bit guests.

It's 2020; normal systems have drives a lot bigger than 4 GiB. Even Raspberrys

have bigger drives.

Also, just for the record, glibc always calls getdents64, never getdents.

Using getdents64 and then being surprised when a 64 bit result comes back is

seriously weird. And they are still defending it!

If they didn't do that, at least the kernel would know something is up,

and qemu would still pass through getdents as getdents to the host kernel and

getdents64 as getdents64 to the host kernel. Also, qemu could warn on calls

to getdents (and not on calls to getdents64).

There are no downsides to knowing what the user expects.

But even then, user space probably would have LFS disabled. That means

as soon as user space mounts something from NFS or 9p or whatever on their

(let's say embedded) platform, they would have this problem again,

and it would be very difficult to find what happened. Why keep this problem

alive?

Toggle quote (2 lines)

>This problem only manifests when using QEMUs syscall emulation, right?

No.

It's a fundamental problem with the slot you want to store something in

being too small for it.

We just didn't see it yet, because usually the system *silently returns* halfway

through the readdir loop and does not tell us about any problems once it

encounters any d_off >= 2**32 along the way.

In the "json-c" bug, luckily json-c or cmake actually NEEDED a file that had

not been returned because of that. That is not a given.

Toggle quote (2 lines)

>Mandating that every package *must* be patched to support it seems user-hostile.

Ok, then how about a flag that users can set in order to let it compile

regardless? But then they shouldn't use that flag in any (non-cross-compiled)

guix packages.

We can mention it in the news so people are not surprised.

(Thinking about it, the flag could just be "-D_FILE_OFFSET_BITS=32". Because

when you are setting it explicitly, you probably know what you are doing)

It really depends on whether we officially support running 32 bit guests

on 64 bit hosts. Qemu has nothing to do with it.

Since the majority of our build farm runs qemu transparent emulation, and

since a majority of our ARM packages are built using it, it seems that we

decided to support 32 bit guests on 64 bit hosts. Well, then something like

this patch has to be done, otherwise we have Schrödinger's build system--and

no one likes that.

(I want to stress that the main problem is the case where readdir SUCCEEDS

even though there is a problem;

if readdir would always fail in the situation with a 32 bit guest on

64 bit host, that would be much less bad. But it doesn't always fail!)

I'd be really uneasy if the sanity check this patch introduces were only

enabled when explicitly setting a flag--for something as bad as this.

Especially for an error that silently makes it into base derivations and then

distributes all the way to who-knows-what normal client derivations.

Keeping the user from shooting himself in the foot by default if he doesn't

say "yes, please shoot me in the foot" beforehand is not "user-hostile".

We could also do a deprecation warning instead:

struct dirent* readdir(DIR* d) __attribute__((deprecated("Unsafe if not using large file support")));

f1.c:5:2: warning: 'readdir' is deprecated: Unsafe if not using large file support [-Wdeprecated-declarations]

But then the warning would not be emitted in the, for example, json-c package,

but in cmake. I would never have found the bug that way. I guess now we

know, but still... only emitting a warning seems weak for something this bad.

So I'm against it.

Toggle quote (5 lines)

> I'm more in favor of dropping it on the

> build farm,

>or just keep fixing things on a per-package basis.

That means you have to *find* the things first. The only reason glibc found

the problem this one time is because the d_off telldir pointer just happened

to be >= 2**32. If it's not bigger one time, and is bigger another time in

the same package build run, then you get weird unreproducible results.

That is horrible. A copy-recursively could leave half the files off and not

tell anyone. PLEASE let's do something about it. Also, let's delete all

our ARM substitutes we have so far.

Toggle quote (3 lines)

> A less user-hostile solution could perhaps be to (setenv "CFLAGS"

> "-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.

Sure, we can definitely do that in addition. It's a good idea. There can

be no ABI compatibility problem inside Guix if *all* Guix packages use that

flag.

While we are at it, we can do it on ALL architectures. Because that's what

we actually want. Should there be a 128 bit CPU, we still want all CPUs

(no matter how many bit registers the CPU has) use the same file offset

bit size.

But will that be picked up everywhere? cmake? Rust C extensions? Python

extensions?

Toggle quote (2 lines)

> Not sure whether that could cause any adverse effects.

It will. And if we want to find out where those effects would be, this

patch--or one with warnings (that one is only useful if there is a way to

automatically collect AND KEEP all warnings into one place)--is how to find

out.

Toggle quote (2 lines)

> But again, I don't like the idea of optimizing for QEMUs user-mode emulation.

It's not optimizing for qemu specifically. Qemu is not at fault here.

(guest) glibc is at fault. Because of the way we build it.

I've thought through a few possible fixes for this problem:

(1) Wait until the kernel adds a "I-have-space-for-x-bits" field to the

getdents64 system call and for qemu to actually use it. But how would qemu

know how much space the guest glibc has in its field?

But fine, let's say they have a crystal ball. Even then, what is ext4 going

to do with this request? Just cut the cookie somewhere and throw away half?

That's not safe. So this solution is right out.

(2) Have glibc use getdents on 32 bit and getdents64 on 64 bit. See above--only

now qemu can tell know much space the guest glibc has, and so can the host kernel

(which is much better).

Cutting the cookie is not safe. So this solution is still out, barely.

(3) Have qemu maintain a mapping table per file of which 64-bit d_off is

associated with which weird cookie qemu invents and remembers.

That doesn't help us in the case where you DO NOT USE qemu but still have

a 32 bit executable running on a 64 bit host (natively).

Therefore, this solution is out.

(4) Build user space with large file support. There are no downsides to this,

and as a distribution we can indeed decide only to support large file guix

packages. I argue that this is the right solution for all "Guix system" hosts,

because of the "SD card" argument above. Also, directories with Guix on it

aren't exactly small either. So you need it anyway. If there are embedded

platforms with disks smaller than 4 GiB used in guix, we can given those

glibcs an original dirent.h (they will fail loudly--so we will find them easily).

I prefer this solution.

(5) Build the qemu that emulates 32 bit ARM for 32 bit i686 (this is not a typo).

In that case the kernel would enable compatibility mode (X86_32 personality) and

it would work. I want to stress that this compatibility mode calculates

different hashes depending on the program that is requesting it, on the

same host, directory and entry. This is not something one would want,

especially when building things--the kernel is basically lying to you.

See kernel option CONFIG_X86_32--which we enable.

It would work okay in practice, because user space programs usually don't store

d_off. But (4) is just better.

P.S. by now, my local build found a binutils 2.35.1 bfd plugin, and

(libstdc++-v3/src/filesystem/dir.cc in) libstdc++ 7.5.0, that also don't use

large files and thus would also only work sporadically in random ways.

How many hundreds of packages does this affect that we don't know yet?

PPS. I get a lot of "grep: invalid option -o" messages when building

glibc-2.32--even without the patch. That's probably not good either.

PPPS. new patch will come soon.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9txFQACgkQ5xo1VCww

uqWCSwf/WwR2WWHPz4kO0hVdcEjmjMTwarKs1VF5DNs+pA6WwVN5nB2Nyp5El4mV

wwo4ik3BAzOicLBHwNYk6H+/RINRqFrtDqHUkC6dRo3zXzhmtgXSs2PajYie2ljr

eviZNBmZcwGdEitvv6E/9+K4fbDIIG6rhygWiY/x9YD+3/mdw8eTJ5Km1tnjCSZ4

uPT7CiWB43LuValm7R9k034N/dgix8bT8QTgLPQ25X7bhII1sX6Xs33d20eLCDen

cD1pKmcCvbVgh5ZGsG8BI7Lcp86AfUvfOVlZZlwNIFvSVS9DUMJ23sCiL48GGFV8

FYUb2ZvPyFdP60JzPykH/PlpTXg8nA==

=eqNk

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 25 Sep 2020 12:24

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200925122450.61df97da@scratchpost.org

On Thu, 24 Sep 2020 20:17:14 +0200

Marius Bakke <marius@gnu.org> wrote:

Toggle quote (3 lines)

> As mentioned in that issue, and which this patch states on no uncertain

> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

That is not a workaround. That is the correct fix.

The kernel uses 64 bit offsets on 32-bit platforms anyway.

So either they don't ask the kernel questions they don't want to know the

answer to, or they use 64 bit offsets, too.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9txXIACgkQ5xo1VCww

uqWFxgf/SFs4Qr2DuOGjlo/8lwAeGFPdnAWlun5d8yEeBcQCOnxPIi/XjeJNryM2

+T/81IBsByMIZlQWPPssYsFgqAqpPh4jG0ymYc77f3mJ4a4vqHzvbM7CWYRpxFO8

WBv42TsOl1szwOrwknF/Qqx7ERv2hNidkzCmqrAaOtnu5aLCrmdWX/UmeZkEZYJG

gTQUKOdhgUQTfJgzW9RbdeN/TmJRsfsQZGzq+65BFDg9HERU/LhtR5E704HmRiZx

lmmHR1ArysoEvhioX6aqaQxkFFgCeTn0IXtmSpaHZSsFHrsl7LAWx4V8DcOaAF8E

upmivdcdHAKPP6ULiRKpsYzceAwHPw==

=YvvO

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 25 Sep 2020 12:42

[PATCH v2 core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200925104243.2298-1-dannym@scratchpost.org

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 68 ++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

Toggle diff (79 lines)diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..284fa65d20 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,73 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; Linux kernel file offsets are always 64 bits.
+                 ;; But userspace can be built to use 32 bit offsets.
+                 ;;
+                 ;; "struct dirent", returned by readdir, uses d_off to store
+                 ;; such an "offset" that it got from the Linux kernel.
+                 ;; In the case of ext4 that "offset" is actually a 64 bit
+                 ;; cookie which includes a hash value.
+                 ;;
+                 ;; Therefore, there are cases where such an offset that it got
+                 ;; from the Linux kernel does not fit in the "struct dirent"
+                 ;; field "d_off".
+                 ;;
+                 ;; If the guest system's glibc is 32 bit AND uses 32 bit
+                 ;; file offsets it is going to be very confused.
+                 ;; It does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails, with errno == EOVERFLOW (which is undocumented and thus
+                 ;; an API error).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; This manifested in Guix when building stuff for
+                 ;; ARMHF on a x86_64 build host using QEMU transparent emulation.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).  The Linux kernel does that already
+                 ;; anyway.
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "dirent/dirent.h" "a")))
+                   (display "
+#ifndef _LIBC
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 32
+#warning \"Using -D_FILE_OFFSET_BITS=32 and using readdir is a bad idea, see <https://bugzilla.kernel.org/show_bug.cgi?id=205957>\"
+#else
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the

Danny Milosavljevic wrote on 25 Sep 2020 15:36

Re: [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200925153646.6ef95908@scratchpost.org

Toggle quote (4 lines)

> >Why is this not an issue with i686 on x86_64 kernels?

> I'm not sure. I'll check.

$ cat a00.c
#include <stdio.h>
#if defined( __ILP32__ )
#warning ILP32
#endif
int main() {
        return sizeof(off_t);
}
$ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 a00.c
a00.c:3:2: warning: #warning ILP32 [-Wcpp]
    3 | #warning ILP32
      |  ^~~~~~~
$ ./a00
$ echo $?
4

That means they are using the Linux kernel's X86_32 ABI.
It has its own getdents64 system call that returns another value for d_off.

$ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
a00.c:3:2: warning: #warning ILP32 [-Wcpp]
    3 | #warning ILP32
      |  ^~~~~~~
$ ./a00
$ echo $?
8

That is why __i686__ is not affected--at the cost of the kernel lying to us
about the file system.

Note that I also tried printing the actual d_off values[1]--on ILP32, even with
_FILE_OFFSET_BITS=64, the VALUES are still 32 bits, and the same values as
without _FILE_OFFSET_BITS.  The d_off SLOT gets bigger on _FILE_OFFSET_BITS=64.

(I also tried printing the actual d_off values[1] on x86_64 without any guix
environment -s, I get entirely different d_off values!!)

I also tried the former on native ARMHF--you get 32 bits d_off values.  And d_off
is always the same size as off_t.

off_t size changes depending on _FILE_OFFSET_BITS.

I do not have access to a real aarch64 machine--so no idea how it is there.
That would be the most interesting case, because those don't have X86_32,
so ILP32 is either not present or implemented differently.

ppc64 would also be interesting...

Test result of [1]:

system _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
-------------------------------------------------------------
x86_64 -                 8 Byte  8 Byte         8 Byte
i686   -                 4 Byte  4 Byte         4 Byte
i686   64                8 Byte  8 Byte         4 Byte
i686   32                4 Byte  4 Byte         4 Byte
i686   7                 4 Byte  4 Byte         4 Byte
armhf  -                 4 Byte  4 Byte         4 Byte
armhf  64                8 Byte  8 Byte         4 Byte
armhf  32                4 Byte  4 Byte         4 Byte
armhf  7                 4 Byte  4 Byte         4 Byte

This is all without qemu--in order to simplify the test case.

So I take it ext4 has some special compilation mode for 32 bits...

Could someone please test [1] on (real!) aarch64 and ppc64 and ppc32
machines?

[1] $ cat a00.c 
#include <stdio.h>
#include <errno.h>
#include <assert.h>
#include <dirent.h>
#if defined( __ILP32__ )
#warning ILP32
#endif

int main() {
        DIR* d;
        struct dirent* ent;
        d = opendir("/tmp");
        errno = 0;
        assert(sizeof(ent->d_off) == sizeof(off_t));
        while ((ent = readdir(d)) != NULL) {
                printf("%llu\n", (unsigned long long) ent->d_off);
        }
        if (errno)
                perror("readdir");
        return sizeof(off_t);

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9t8m4ACgkQ5xo1VCww

uqWmVQf9Gy9Y5HXJyFlOOnywiaIBQuhAmEry1kSYKv0CEwwlZDsMmNyYbKZeMtMS

AY5mw0zxZ215+47xTDqys5GOKKn6Iq6DgVw96Ts+sig945Crw52Aey/XIzrGrEaM

9sJ7xIIcLZuBUlOye9QtctB7tgDhCw9afjKfkB4nwnvPP3H6c+sOIcZiEH4OY3VQ

3xN40ErTBY+PovHojfvdiKy/7spTibocYxC5xWOaAXBIkaP6Xj1m7ySk96A4WF/8

WycJHuSsrVn/QEbens03akkCeME0ZDVZkQd0O12FZ3zL01aNB7EBZa3vjGxkcWSM

H4oBddMV8VfQ2CzJeK4nsNz8FmFU2Q==

=14ua

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 25 Sep 2020 17:33

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200925173320.593e9179@scratchpost.org

Hi,

I wrote a FUSE filesystem to test what happens with big d_off (I just

hard-or-ed a bitmask) and ran it on a real ARMHF machine, then made the program

from before([1] from before) look into that directory.

Result (on ARMHF, so real 32 bit machine!):

$ gcc --version

gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

$ gcc a00.c

$ ./a00

1737031971 .

1737032035 ..

1737032035 hello

$ gcc -D_FILE_OFFSET_BITS=64 a00.c

$ ./a.out

320255973458211 .

320255973458275 ..

320255973458275 hello

(Note: Guix gcc-toolchain 10 on ARMHF is still building from source--and

will continue to do so for some hours I guess)

I only had to patch fuse 2.9.4 (lib/fuse_lowlevel.c) to do this:

char *fuse_add_dirent(char *buf, const char *name, const struct stat *stbuf,

off_t off)

{

unsigned namelen = strlen(name);

unsigned entlen = FUSE_NAME_OFFSET + namelen;

unsigned entsize = fuse_dirent_size(namelen);

unsigned padlen = entsize - entlen;

struct fuse_dirent *dirent = (struct fuse_dirent *) buf;

dirent->ino = stbuf->st_ino;

dirent->off = off | 0x1234567890123; // !!!!

dirent->namelen = namelen;

dirent->type = (stbuf->st_mode & 0170000) >> 12;

strncpy(dirent->name, name, namelen);

if (padlen)

memset(buf + entlen, 0, padlen);

return buf + entsize;

}

(I DID NOT have to patch the kernel or even have root)

So it can happen that you get 64 bit d_off even on real 32 bit machines!

That's what I thought--but I still wanted to make sure.

And the same on Guix i686 (a00 is [1] from my previous e-mail):

$ ./a00-i686

readdir: Value too large for defined data type

$ ./a00-i686_flag_32

readdir: Value too large for defined data type

$ ./a00-i686_flag_64

320255973458211

320255973458275

So there you have it, even on i686--without emulating anything--you can get a

64 bit d_off value.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9uDcAACgkQ5xo1VCww

uqXHrAf7Bkl5DifcLI6I7rlRy3s0K7YaZv3P+59pkKv5rf/AhRFdOTfn6QHQ/meq

ml+xqF29R85vPLFSmjSsx2psqWPA1H5ujj0UrBOJ+5PqjIWbjzz77TnCgBNrnglP

kMsvxBUUe477SNa+uQ7TxzycTE1xV2jJ5Pyh5L+kx3JjSSYChVtrn0JvwWfhQSaD

x9A9zTzD3GGMTI1PE8uvhKC80PZi4gc8Ov1Qu6gcwBbsTTe82JZsSpd5xezdm4AO

6nvmuHVOaRZXOGNdvAI+Y7l6+EDisMqDNtzpFIEfPeAds0dcNkC8h2DaH547mN+T

ULenCL7NaTlOzXd4ST82iiVy6UQpMA==

=DzDT

-----END PGP SIGNATURE-----

Andreas Enge wrote on 25 Sep 2020 22:03

Recipients:(name . Marius Bakke)(address . marius@gnu.org)

Message-ID:20200925200323.GB5828@jurong

On Fri, Sep 25, 2020 at 01:11:18AM +0200, Marius Bakke wrote:

Toggle quote (5 lines)

> Arguably running code for foreign architectures through QEMU binfmt is

> something of a hack. Mandating that every package *must* be patched to

> support it seems user-hostile. I'm more in favor of dropping it on the

> build farm

Indeed it is weird we do not only compile packages natively on the build
farm. If we do not have enough aarch64 machines, would it make sense to buy
some more? How many would we need? Which are our options of machines that
can run the GNU system (as opposed to Guix on a foreign distro, and possibly
non-free firmware)?

Andreas

Danny Milosavljevic wrote on 26 Sep 2020 03:42

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200926034244.1aa3c4f6@scratchpost.org

Hi,

by now I also tested it on overdrive1, an aarch64 machine.

Toggle quote (4 lines)

> I do not have access to a real aarch64 machine--so no idea how it is there.

> That would be the most interesting case, because those don't have X86_32,

> so ILP32 is either not present or implemented differently.

I added it in the table below under "a64armhf".

ILP32 is not present on aarch64.

Test result of [1] updated:

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         4 Byte
i686     32                4 Byte  4 Byte         4 Byte
i686     7                 4 Byte  4 Byte         4 Byte
armhf    -                 4 Byte  4 Byte         4 Byte
armhf    64                8 Byte  8 Byte         4 Byte
armhf    32                4 Byte  4 Byte         4 Byte
armhf    7                 4 Byte  4 Byte         4 Byte
a64armhf -                 4 Byte  4 Byte         FAIL
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL
a64armhf 7                 4 Byte  4 Byte         FAIL

For a64armhf, the version with -D_FILE_OFFSET_BITS=64 is the only one where
readdir succeeds on my specially-prepared directory.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9unJQACgkQ5xo1VCww

uqXawgf9H6G51A5PzxdtOgSuUilzwEULll2McSYirbJ5kKgXMNZjWESrpEfsCQsX

p68j23tSxTaInQi1uEBd+ye9Qrt5oQl0xCN14y3KIygb+m8SgC04yMS7Q2VgbhF9

7CSYqjXRb2zkB4OTMVPtFl8p04EcHZvBuFx0p2bCNXfFupPJ2ELRqnNIAht4fInW

JZnD7SYfrh39rBjLFeHIswyeH25LeFysvkGlpPbbL0ZX6Ku1RU/h+B6AXMWIriv4

04/10Qya1dlGVNY1qxShgWrvVTEJuAlZ+L6xg0DjkGbbedjU0cywwxIJybmgoWFY

K7nIC7zvm1DvHjEIzLm/0spCmCCf2A==

=BlIh

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 26 Sep 2020 03:49

Recipients:(name . Marius Bakke)(address . marius@gnu.org)(address . 43591@debbugs.gnu.org)

Message-ID:20200926034912.6ad44194@scratchpost.org

On Sat, 26 Sep 2020 03:42:44 +0200

Danny Milosavljevic <dannym@scratchpost.org> wrote:

Toggle quote (30 lines)> Hi,
> 
> by now I also tested it on overdrive1, an aarch64 machine.
> 
> > I do not have access to a real aarch64 machine--so no idea how it is there.
> > That would be the most interesting case, because those don't have X86_32,
> > so ILP32 is either not present or implemented differently.  
> 
> I added it in the table below under "a64armhf".
> 
> ILP32 is not present on aarch64.
> 
> Test result of [1] updated:
> 
> system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
> ---------------------------------------------------------------
> x86_64   -                 8 Byte  8 Byte         8 Byte
> i686     -                 4 Byte  4 Byte         4 Byte
> i686     64                8 Byte  8 Byte         4 Byte
> i686     32                4 Byte  4 Byte         4 Byte
> i686     7                 4 Byte  4 Byte         4 Byte
> armhf    -                 4 Byte  4 Byte         4 Byte
> armhf    64                8 Byte  8 Byte         4 Byte
> armhf    32                4 Byte  4 Byte         4 Byte
> armhf    7                 4 Byte  4 Byte         4 Byte
> a64armhf -                 4 Byte  4 Byte         FAIL
> a64armhf 64                8 Byte  8 Byte         8 Byte
> a64armhf 32                4 Byte  4 Byte         FAIL
> a64armhf 7                 4 Byte  4 Byte         FAIL

aarch64 - 8 Byte 8 Byte 8 Byte

Toggle quote (2 lines)

> For a64armhf, the version with -D_FILE_OFFSET_BITS=64 is the only one where

> readdir succeeds on my specially-prepared directory.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9unhgACgkQ5xo1VCww

uqX0kAf/aQdWX0rI6ewzOIgGflwtqpn+EX4XCuan4RCip2QINbbjNtkPyizeteKN

0cNwZryEmqfrzCoBiO9UpXIb+n4/pUQn2o8jW/s1x6UJJR1SmWiX4bE28SffOqtU

HOudY9tEgildMf2GJG5XJzJy3XVzvPig93ttsGwOL/CSmVm6/WiEw1n16SJW/O4H

jla3/solrbfLhppngE61fHciv4UhDsaoMi9b6GvXaz7frlBnLZ0eaTAYTf6cfe3f

LoxKnx0GpRwYLIIGpQHIBgDlqC9k0ZCo2fC86oMb+ZQg8RFe8oolhmzU4fZO1fZ6

Puyn6V1vMF9+l91oXoyQiSA9+POL6w==

=CB0G

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 26 Sep 2020 12:50

Recipients:(name . Andreas Enge)(address . andreas@enge.fr)

Message-ID:20200926124758.28866f16@scratchpost.org

Hi Andreas,

On Fri, 25 Sep 2020 22:03:23 +0200

Andreas Enge <andreas@enge.fr> wrote:

Toggle quote (9 lines)

> On Fri, Sep 25, 2020 at 01:11:18AM +0200, Marius Bakke wrote:

> > Arguably running code for foreign architectures through QEMU binfmt is

> > something of a hack. Mandating that every package *must* be patched to

> > support it seems user-hostile. I'm more in favor of dropping it on the

> > build farm

> Indeed it is weird we do not only compile packages natively on the build

> farm.

I'm sorry that my earlier analysis mentioned qemu at all. qemu is not at fault

at all. The same happens if you run 32 bit code on 64 bit hosts without using

qemu! Except for one case: i686 on x86_64--where they have a compatibility

layer in the kernel that works around this problem (which I don't like

either--it obfuscates the problem).

For example a problem appears on:

- armhf on x86_64 host; fault not because of qemu

- armhf on aarch64 host; not using qemu in the first place

And it should appear also on (but I didn't test):

- i686 on aarch64 host; not using qemu in the first place

The problem always appears if the host is 64 bits and the guest is 32 bits,

no matter what cpus both are, except for the case "i686 on x86_64".

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9vHPQACgkQ5xo1VCww

uqWN+gf6At1d5aYujggHx0tZ53zlU+hUN7ok+tj8Ea51tkjINPKEwPjcmbdGjAs5

5OyXLLBQe+v1nbltp7nTMsryQelqWnYpzEerGw4wuFI2q/SQ3N9ARy1YXmBitTlc

+7nlCkx798uIIEBF9VYJGcTCXd+WPh9O7lqYnhwIMxzPqXvxfnBWmvnBWloMIFx7

Uw+wm/j4jTaI/RFI8/AtotPRfVubqXQhAMagW1vfB/1q0qyy9+v/VvsMO3u7iaKQ

75QdHmGZ1bMKEi4AIpP046Hn6Z0R5rALPk+EG+nOXqXtyq0/N386gF9b+OrCoDjK

vtASqeg4hV5Ot9ApjfbQmHw6XEQUXA==

=Btl3

-----END PGP SIGNATURE-----

Efraim Flashner wrote on 27 Sep 2020 08:43

Recipients:(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200927064313.GB1386@E5400

On Fri, Sep 25, 2020 at 03:36:46PM +0200, Danny Milosavljevic wrote:

Toggle quote (94 lines)> > >Why is this not an issue with i686 on x86_64 kernels?  
> > 
> > I'm not sure.  I'll check.
> 
> $ cat a00.c
> #include <stdio.h>
> #if defined( __ILP32__ )
> #warning ILP32
> #endif
> int main() {
>         return sizeof(off_t);
> }
> $ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 a00.c
> a00.c:3:2: warning: #warning ILP32 [-Wcpp]
>     3 | #warning ILP32
>       |  ^~~~~~~
> $ ./a00
> $ echo $?
> 4
> 
> That means they are using the Linux kernel's X86_32 ABI.
> It has its own getdents64 system call that returns another value for d_off.
> 
> $ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
> a00.c:3:2: warning: #warning ILP32 [-Wcpp]
>     3 | #warning ILP32
>       |  ^~~~~~~
> $ ./a00
> $ echo $?
> 8
> 
> That is why __i686__ is not affected--at the cost of the kernel lying to us
> about the file system.
> 
> Note that I also tried printing the actual d_off values[1]--on ILP32, even with
> _FILE_OFFSET_BITS=64, the VALUES are still 32 bits, and the same values as
> without _FILE_OFFSET_BITS.  The d_off SLOT gets bigger on _FILE_OFFSET_BITS=64.
> 
> (I also tried printing the actual d_off values[1] on x86_64 without any guix
> environment -s, I get entirely different d_off values!!)
> 
> I also tried the former on native ARMHF--you get 32 bits d_off values.  And d_off
> is always the same size as off_t.
> 
> off_t size changes depending on _FILE_OFFSET_BITS.
> 
> I do not have access to a real aarch64 machine--so no idea how it is there.
> That would be the most interesting case, because those don't have X86_32,
> so ILP32 is either not present or implemented differently.
> 
> ppc64 would also be interesting...
> 
> Test result of [1]:
> 
> system _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
> -------------------------------------------------------------
> x86_64 -                 8 Byte  8 Byte         8 Byte
> i686   -                 4 Byte  4 Byte         4 Byte
> i686   64                8 Byte  8 Byte         4 Byte
> i686   32                4 Byte  4 Byte         4 Byte
> i686   7                 4 Byte  4 Byte         4 Byte
> armhf  -                 4 Byte  4 Byte         4 Byte
> armhf  64                8 Byte  8 Byte         4 Byte
> armhf  32                4 Byte  4 Byte         4 Byte
> armhf  7                 4 Byte  4 Byte         4 Byte
> 
> This is all without qemu--in order to simplify the test case.
> 
> So I take it ext4 has some special compilation mode for 32 bits...
> 
> Could someone please test [1] on (real!) aarch64 and ppc64 and ppc32
> machines?
> 
> [1] $ cat a00.c 
> #include <stdio.h>
> #include <errno.h>
> #include <assert.h>
> #include <dirent.h>
> #if defined( __ILP32__ )
> #warning ILP32
> #endif
> 
> int main() {
>         DIR* d;
>         struct dirent* ent;
>         d = opendir("/tmp");
>         errno = 0;
>         assert(sizeof(ent->d_off) == sizeof(off_t));
>         while ((ent = readdir(d)) != NULL) {
>                 printf("%llu\n", (unsigned long long) ent->d_off);
>         }
>         if (errno)
>                 perror("readdir");
>         return sizeof(off_t);

missing } at the end

tested on my ibook G4:
(I believe for powerpc the 32-bit is defined as __powerpc__)

(ins)efraim@g4:~$ gcc -o a00 a00.c
(ins)efraim@g4:~$ ./a00
404218588
473424804
681064434
708508942
805784207
980330722
1080794671
1734802884
1909818320
1923689764
2019671154
2125602108
2147483647
(ins)efraim@g4:~$ echo $?
4
(ins)efraim@g4:~$ rm a00
(ins)efraim@g4:~$ gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
(ins)efraim@g4:~$ ./a00
404218588
473424804
681064434
708508942
805784207
980330722
1080794671
1734802884
1909818320
1923689764
2019671154
2125602108
2147483647
(ins)efraim@g4:~$ echo $?
8



-- 
Efraim Flashner   <efraim@flashner.co.il>   ????? ?????
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Danny Milosavljevic wrote on 29 Sep 2020 16:51

Recipients:(address . 43591@debbugs.gnu.org)(name . Marius Bakke)(address . marius@gnu.org)

Message-ID:20200929165145.5d6b4434@scratchpost.org

(Note: ILP32 is not present on aarch64)

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         FAIL*
i686     32                4 Byte  4 Byte         FAIL*
i686     7                 4 Byte  4 Byte         4 Byte
armhf    -                 4 Byte  4 Byte         4 Byte
armhf    64                8 Byte  8 Byte         4 Byte
armhf    32                4 Byte  4 Byte         4 Byte
armhf    7                 4 Byte  4 Byte         4 Byte
a64armhf -                 4 Byte  4 Byte         FAIL*
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL*
a64armhf 7                 4 Byte  4 Byte         FAIL* 
aarch64  -                 8 Byte  8 Byte         8 Byte

*: Using FUSE filesystem with big d_off value.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9zSgEACgkQ5xo1VCww

uqWFbAgAnExo/8+26P+b98euZLJ++EQl74LB++acJmgQl5Qq+upiUQz+QyTp9fNp

WZHfv7mP9z9pHcOsYsDnFWTdwPiSkYXkDAlqxHedqptA/d8ZjHgUg32JQ4Q7ZXxr

aCyhAk7ej0QYxfDXaduOAJ1rT+7wtBoRDO5XMSz6kKdL66c1UCHv/RhmhV0eq2FQ

kg23P0T+VzWxuII1O/uFgrPdawkKletbL1eGy0u9KncIQMEp2o9iPX0b0rmLRVfP

szOgrhY9lD/+VNnl61aB98OQLoWupXadKlI0sthXvjJYFuY/tXygfGlreqDSqKJr

HgX022VdkFn3OX9qW2sZHCvU89E0jg==

=Kiu1

-----END PGP SIGNATURE-----

Ludovic Courtès wrote on 29 Sep 2020 22:52

Recipients:(name . Marius Bakke)(address . marius@gnu.org)

Message-ID:87h7rg4879.fsf@gnu.org

Hi,

Marius Bakke <marius@gnu.org> skribis:

Toggle quote (5 lines)

> Arguably running code for foreign architectures through QEMU binfmt is

> something of a hack. Mandating that every package *must* be patched to

> support it seems user-hostile. I'm more in favor of dropping it on the

> build farm, or just keep fixing things on a per-package basis.

I’m fine with dropping things on the build farm; it’s just about

modifying machines-for-berlin.scm in maintenance.git. Any takers? :-)

Toggle quote (5 lines)

> A less user-hostile solution could perhaps be to (setenv "CFLAGS"

> "-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.

> Not sure whether that could cause any adverse effects. But again, I

> don't like the idea of optimizing for QEMUs user-mode emulation.

The above would override the default CFLAGS in Autoconf-generated

configure scripts (which is “-O2 -g”). So we’d have to be cautious.

But I think a global solution is preferable to adding

-D_FILE_OFFSET_BITS=64 to tens of packages.

WDYT?

Ludo’.

Danny Milosavljevic wrote on 30 Sep 2020 00:09

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20200930000934.6812b7c8@scratchpost.org

Hi Ludo,

On Tue, 29 Sep 2020 22:52:10 +0200

Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (10 lines)> Marius Bakke <marius@gnu.org> skribis:
> 
> > Arguably running code for foreign architectures through QEMU binfmt is
> > something of a hack.  Mandating that every package *must* be patched to
> > support it seems user-hostile.  I'm more in favor of dropping it on the
> > build farm, or just keep fixing things on a per-package basis.  
> 
> I’m fine with dropping things on the build farm; it’s just about
> modifying machines-for-berlin.scm in maintenance.git.  Any takers?  :-)

I don't know what "dropping things on the build farm" means in this context.

Dropping what exactly?

Toggle quote (3 lines)

> The above would override the default CFLAGS in Autoconf-generated

> configure scripts (which is “-O2 -g”).

That is correct. I'm currently working on v2 (testing a patchset already)

and I totally forgot to add "-g -O2" the first time around. Also, glibc

itself must NOT have -D_FILE_OFFSET_BITS=64 (it makes sense not to, too).

Toggle quote (4 lines)

>So we’d have to be cautious.

> But I think a global solution is preferable to adding

> -D_FILE_OFFSET_BITS=64 to tens of packages.

I agree.

I still would like to see what actually changes--and I think with

guix-data-services it should actually be possible to compare derivations

before-and-after and find out which derivations of which packages changed

at all because of the global -D_FILE_OFFSET_BITS=64. I'd like some help

using guix-data-services to find that out. Otherwise a risk estimation

cannot be done.

Technically, if a package used direct assembly offsets (for some unfathomable

reason), it could have an undetectable problem with the size change of off_t

(and also struct dirent). So examining the source code of the most essential

packages manually is still good. That's what I did in

branch wip-file-offsets-64.

I'm in the process of testing a patchset that globally sets

CFLAGS="-D_FILE_OFFSET_BITS=64 -g -O2"

instead.

That alone is not enough since there are a lot of non-autotools projects that

just ignore the environment variable entirely--not to mention languages other

than C.

I have access to bayfront--but I don't remember how to evaluate a new branch

there (the new branch is "wip-file-offset-bits-64-sledgehammer"). How does

it work?

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9zsJ4ACgkQ5xo1VCww

uqX2cQgAmvzuWUzAWHm6T9M58P3PhM7dlK5gtkowB3hgxzYk/KkpluX56HXE2P54

q+m78K5rumyDkoVVtMtsAQbrJLnd+HuHOrlwBipcTuBsCPJ6r7Fc4BMVbvfm/kvd

y7fxqsmDybtrynFpUNiQadQZNT//NaSRLqa7XKv7HWblmGYVwuQ0+fdj+IEqUv5X

3l8h/ziDg2vB10rmtmFzh9kjN/lGsYYfHlnY87iNqoYbtY4Klkg9xQiGpKULWdA4

tUATpX9JsHpAmMYN/Rja3Yk1inpMZ4aWpSdYETZGi90WvItyrp8tSlsK0uqh7/Ak

8u39izKY7Yx4fNKDJqtUiy2DHHQW9g==

=F4u0

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 3/5] gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-4-dannym@scratchpost.org

* gnu/packages/base.scm (glibc)[arguments]<#:phases>[set-FILE-OFFSET-BITS]:
Delete phase.
---
 gnu/packages/base.scm | 2 ++
 1 file changed, 2 insertions(+)

Toggle diff (13 lines)diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index 41976c5ab0..6f3c717b50 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -782,6 +782,8 @@ the store.")
 
       #:tests? #f                                 ; XXX
       #:phases (modify-phases %standard-phases
+                 ;; glibc itself should support both--so don't choose here.
+                 (delete 'set-FILE-OFFSET-BITS)
                  (add-before
                   'configure 'pre-configure
                   (lambda* (#:key inputs native-inputs outputs

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 0/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-1-dannym@scratchpost.org

Linux kernel file offsets are always 64 bits.
But userspace can be built to use 32 bit offsets.

"struct dirent", returned by readdir, uses d_off to store
such an "offset" that it got from the Linux kernel.
In the case of ext4 that "offset" is actually a 64 bit
hash value.

Therefore, there are cases where such an offset that it got
from the Linux kernel does not fit in the "struct dirent"
field "d_off".

If the guest system's glibc is 32 bit AND uses 32 bit
file offsets it is going to be very confused.
It does check whether d_off fits into the structure
it gives back to the user--and it doesn't fit.  Hence readdir
fails, with errno == EOVERFLOW (which is undocumented and thus
an API error).
This manifests itself in simple directory reads not working
anymore in parts of cmake, for example.

This happened in Guix when building stuff for
ARMHF on a x86_64 build host using QEMU transparent emulation.

There is a very simple and complete way to avoid this problem:
Just always use 64 bit offsets in user space programs (also
on 32 bit machines).

Danny Milosavljevic (5):
  gnu: glibc-final: Catch all cases of a glibc user not requesting
    64-bit offsets and then using readdir regardless.
  build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want.
  gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS.
  gnu: glibc-mesboot0: Do not explicitly set _FILE_OFFSET_BITS.
  gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want.

 gnu/packages/base.scm           |  2 +
 gnu/packages/commencement.scm   | 70 ++++++++++++++++++++++++++++++++-
 gnu/packages/crypto.scm         |  9 ++++-
 guix/build/gnu-build-system.scm | 13 +++++-
 4 files changed, 90 insertions(+), 4 deletions(-)

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 4/5] gnu: glibc-mesboot0: Do not explicitly set _FILE_OFFSET_BITS.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-5-dannym@scratchpost.org

* gnu/packages/commencement.scm (glibc-mesboot0)[arguments]<#:phases>[set-FILE-OFFSET-BITS]:
Delete phase.
---
 gnu/packages/commencement.scm | 2 ++
 1 file changed, 2 insertions(+)

Toggle diff (13 lines)diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index 284fa65d20..ad6a27ba6b 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -1398,6 +1398,8 @@ ac_cv_c_float_format='IEEE (little-endian)'
            ,(string-append "--prefix=" out)))
        #:phases
        (modify-phases %standard-phases
+         ;; glibc itself should support both--so don't choose here.
+         (delete 'set-FILE-OFFSET-BITS)
          (add-after 'unpack 'apply-boot-patch
            (lambda* (#:key inputs #:allow-other-keys)
              (and (let ((patch (assoc-ref inputs "boot-patch")))

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 2/5] build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-3-dannym@scratchpost.org

* guix/build/gnu-build-system.scm (set-FILE-OFFSET-BITS): New procedure.
(%standard-phases): Add it.
---
 guix/build/gnu-build-system.scm | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Toggle diff (31 lines)diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index d3347c9518..1a1c4627ab 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -60,6 +60,17 @@ See https://reproducible-builds.org/specs/source-date-epoch/."
   (setenv "SOURCE_DATE_EPOCH" "1")
   #t)
 
+(define* (set-FILE-OFFSET-BITS #:rest _)
+  "Define '_FILE_OFFSET_BITS' using the C preprocessor.
+This ensures that 32 bit and 64 bit user space both can handle results
+returned from a 64 bit kernel.
+See https://bugzilla.kernel.org/show_bug.cgi?id=205957."
+  ;; Setting CFLAGS here makes packages not default CFLAGS.
+  ;; Since glibc (and probably other packages) don't like being built without optimization, enable optimization here.
+  (setenv "CFLAGS" "-D_FILE_OFFSET_BITS=64 -g -O2")
+  (setenv "CXXFLAGS" "-D_FILE_OFFSET_BITS=64 -g -O2")
+  #t)
+
 (define (first-subdirectory directory)
   "Return the file name of the first sub-directory of DIRECTORY."
   (match (scandir directory
@@ -803,7 +814,7 @@ which cannot be found~%"
   ;; Standard build phases, as a list of symbol/procedure pairs.
   (let-syntax ((phases (syntax-rules ()
                          ((_ p ...) `((p . ,p) ...)))))
-    (phases set-SOURCE-DATE-EPOCH set-paths install-locale unpack
+    (phases set-SOURCE-DATE-EPOCH set-FILE-OFFSET-BITS set-paths install-locale unpack
             bootstrap
             patch-usr-bin-file
             patch-source-shebangs configure patch-generated-file-shebangs

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-2-dannym@scratchpost.org

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 68 ++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

Toggle diff (79 lines)diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..284fa65d20 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,73 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; Linux kernel file offsets are always 64 bits.
+                 ;; But userspace can be built to use 32 bit offsets.
+                 ;;
+                 ;; "struct dirent", returned by readdir, uses d_off to store
+                 ;; such an "offset" that it got from the Linux kernel.
+                 ;; In the case of ext4 that "offset" is actually a 64 bit
+                 ;; hash value.
+                 ;;
+                 ;; Therefore, there are cases where such an offset that it got
+                 ;; from the Linux kernel does not fit in the "struct dirent"
+                 ;; field "d_off".
+                 ;;
+                 ;; If the guest system's glibc is 32 bit AND uses 32 bit
+                 ;; file offsets it is going to be very confused.
+                 ;; It does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails, with errno == EOVERFLOW (which is undocumented and thus
+                 ;; an API error).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; This manifested in Guix when building stuff for
+                 ;; ARMHF on a x86_64 build host using QEMU transparent emulation.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).  The Linux kernel does that already
+                 ;; anyway.
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "dirent/dirent.h" "a")))
+                   (display "
+#ifndef _LIBC
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 32
+#warning \"Using -D_FILE_OFFSET_BITS=32 and using readdir is a bad idea, see <https://bugzilla.kernel.org/show_bug.cgi?id=205957>\"
+#else
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the

Danny Milosavljevic wrote on 30 Sep 2020 10:45

[PATCH core-updates v2 5/5] gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want.

Recipients:(address . 43591@debbugs.gnu.org)(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:20200930084512.31738-6-dannym@scratchpost.org

* gnu/packages/crypto.scm (rhash)[arguments]<#:make-flags>:
Explicity declare the _FILE_OFFSET_BITS we want.
---
 gnu/packages/crypto.scm | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Toggle diff (20 lines)diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm
index 028c140185..a10dd62e8b 100644
--- a/gnu/packages/crypto.scm
+++ b/gnu/packages/crypto.scm
@@ -845,8 +845,13 @@ BLAKE.")
                                       "/bin/" ,target "-gcc"))
                      '())))
        #:make-flags
-       ;; The binaries in /bin need some help finding librhash.so.0.
-       (list (string-append "LDFLAGS=-Wl,-rpath=" %output "/lib"))
+       (list ;; This package uses a configure script that is not from GNU
+             ;; autotools; it doesn't handle the environment variable
+             ;; CFLAGS (or for that matter the configure option).
+             ;; Therefore, directly pass it to make.
+             "CFLAGS=-D_FILE_OFFSET_BITS=64"
+             ;; The binaries in /bin need some help finding librhash.so.0.
+             (string-append "LDFLAGS=-Wl,-rpath=" %output "/lib"))
        #:test-target "test"             ; ‘make check’ just checks the sources
        #:phases
        (modify-phases %standard-phases

Ludovic Courtès wrote on 30 Sep 2020 11:32

Re: [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Recipients:(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:87tuvf1uet.fsf@gnu.org

Hi,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

Toggle quote (16 lines)> On Tue, 29 Sep 2020 22:52:10 +0200
> Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Marius Bakke <marius@gnu.org> skribis:
>> 
>> > Arguably running code for foreign architectures through QEMU binfmt is
>> > something of a hack.  Mandating that every package *must* be patched to
>> > support it seems user-hostile.  I'm more in favor of dropping it on the
>> > build farm, or just keep fixing things on a per-package basis.  
>> 
>> I’m fine with dropping things on the build farm; it’s just about
>> modifying machines-for-berlin.scm in maintenance.git.  Any takers?  :-)
>
> I don't know what "dropping things on the build farm" means in this context.
> Dropping what exactly?

Dropping emulated builds, or at least 32-bit emulated builds. We just

need to remove build machines from the file above.

Toggle quote (20 lines)>> The above would override the default CFLAGS in Autoconf-generated
>> configure scripts (which is “-O2 -g”).  
>
> That is correct.  I'm currently working on v2 (testing a patchset already)
> and I totally forgot to add "-g -O2" the first time around.  Also, glibc
> itself must NOT have -D_FILE_OFFSET_BITS=64 (it makes sense not to, too).
>
>>So we’d have to be cautious.
>> But I think a global solution is preferable to adding
>> -D_FILE_OFFSET_BITS=64 to tens of packages.
>
> I agree.
>
> I still would like to see what actually changes--and I think with
> guix-data-services it should actually be possible to compare derivations
> before-and-after and find out which derivations of which packages changed
> at all because of the global -D_FILE_OFFSET_BITS=64.  I'd like some help
> using guix-data-services to find that out.  Otherwise a risk estimation
> cannot be done.

A change in gnu-build-system would change all the derivations. I don’t

think the Data Service can help us here.

Toggle quote (6 lines)

> Technically, if a package used direct assembly offsets (for some unfathomable

> reason), it could have an undetectable problem with the size change of off_t

> (and also struct dirent). So examining the source code of the most essential

> packages manually is still good. That's what I did in

> branch wip-file-offsets-64.

Yeah.

Toggle quote (6 lines)

> I'm in the process of testing a patchset that globally sets

> CFLAGS="-D_FILE_OFFSET_BITS=64 -g -O2"

> instead.

OK.

Toggle quote (4 lines)

> That alone is not enough since there are a lot of non-autotools projects that

> just ignore the environment variable entirely--not to mention languages other

> than C.

Yeah…

I have mixed feelings: fixing packages one by one doesn’t sound great,

but OTOH setting the ‘CFLAGS’ environment variable globally can have

unexpected side effects in some cases (overriding package-specific

CFLAGS) and zero effects in other cases (for non-Autoconf packages or

badly-written ‘configure.ac’ files), both of which would be hard to

detect.

~~~

If we take a step back: what’s the problem? We have a problem with

emulated 32-bit architectures on 64-bit architectures. But that’s OK,

we can stop those emulations for now. Then we have packages that do not

support large files; it’s not great but evidently we can live with it.

:-) Ideally, we’d report it upstream when we encounter it.

So to me that hints at targeted fixes: fixing key packages like CMake

(roughly what you already did) where lack of large file support can be

problematic.

Thoughts?

Ludo’.

Danny Milosavljevic wrote on 30 Sep 2020 12:28

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20200930122821.1471d155@scratchpost.org

Hi Ludo,

On Wed, 30 Sep 2020 11:32:58 +0200

Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (3 lines)

> Dropping emulated builds, or at least 32-bit emulated builds. We just

> need to remove build machines from the file above.

Oh.

Do we have real armhf machines? (as in not aarch64)

*Looks at guix-maintenance* We do. Awesome. Then sure.

But just to be clear, WE MUST NOT USE aarch64 to build armhf as long as this

problem isn't fixed.

This problem has nothing to do with emulation.

Toggle quote (3 lines)

> A change in gnu-build-system would change all the derivations. I don’t

> think the Data Service can help us here.

I want to know what actually changes in the final binaries. Surely that works

somehow--guix data services or not.

Basically, for each package in Guix,

diff -r `~/broken-guix/pre-inst-env guix build $package` `~/fixed-guix/pre-inst-env guix build $package` || echo "affected: $package"

but after replacing references by deduplicated content addressed references

(for example if derivation A refers to files in derivation B, but derivation B

only changed the directory name it's in and not the content of the derivation,

then that should not count as a diff in A. That should happen recursively).

Basically ignore the directory names in /gnu/store and make new directory

names that are the hash of each directory's (recursive) CONTENTS (after

fixing references in that content, too, obviously)--as opposed to sources.

Then diff those.

If necessary, I can run that on my laptop--it will just take several weeks

and miss derivations I don't have in the first place.

Toggle quote (7 lines)

> I have mixed feelings: fixing packages one by one doesn’t sound great,

> but OTOH setting the ‘CFLAGS’ environment variable globally can have

> unexpected side effects in some cases (overriding package-specific

> CFLAGS) and zero effects in other cases (for non-Autoconf packages or

> badly-written ‘configure.ac’ files), both of which would be hard to

> detect.

The latter is easy to detect since I patched dirent.h in glibc exactly for that

reason. That way, glibc WON'T let you use it wrong (except if you explicitly

ask for it). On Guix systems, there is no legitimate reason to use it wrong

in the first place.

In my opinion, not having an automated way to tell us when a package is using

it wrong would be not doing our due diligence--how would you know we had

actually fixed the problem for good? You wouldn't know.

And you wouldn't have fixed the problem for good--I can tell you that much now.

There already was an essential package, rhash, which didn't pick up the

CFLAGS--and that's how I found it. It's easy.

About the unexpected side effects--yes, that's right. That's why we should get

a list of diff results (see above for the command) and then manually look at

the source code of those packages and their dependencies.

Toggle quote (2 lines)

> If we take a step back: what’s the problem?

It means we have no trustworthy i686 packages, which means we do not have a

trustworthy full source bootstrap using Mes (since that uses i686 executable

to bootstrap).

In practice, this problem is not so bad since the kernel on i686 has a compat

layer that hasn't been telling us the truth for d_off, so we should be "good".

But philosophically, we are doing it dead wrong.

Also, this won't work on armhf or any other 32 bit architecture--so there,

we would be both philosophically and practically wrong.

Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.

It totally DOES wind up telling us the truth sometimes (see my earlier test

table)--and then we have a problem.

Toggle quote (4 lines)

> Then we have packages that do not

> support large files; it’s not great but evidently we can live with it.

> :-) Ideally, we’d report it upstream when we encounter it.

I really don't care about large file support. That's mostly a bonus we get

while fixing this whole ordeal the right way. That said, maybe users

care they actually can store a 5 GiB file on their 4000 GiB drive on armhf :P

Toggle quote (4 lines)

> So to me that hints at targeted fixes: fixing key packages like CMake

> (roughly what you already did) where lack of large file support can be

> problematic.

As long as we patch glibc's dirent.h so it tells us when we are doing stupid

stuff, we can't go wrong much. So sure.

As I said, the main problem is FINDING the affected packages. It's not like

they'll fail building (in general). They'll just do weird stuff at runtime

instead. Case in point: cmake DID NOT fail building. And it totally is

broken.

Everything after that is easy.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl90XcUACgkQ5xo1VCww

uqXKxgf/foOlL/5z3NtiynHPzBn2e/EqziNnVXtSjZaPMdvmK4N09rvWBFLDZ3Rl

dOBfMTuwhiY63BbhmmSXngYdO8USLxRoEn1GCgcYrjLt2YzfxGPaQ5xyrXV0FhDR

l5UNQPNcqwk2qfK6kmmzg2qCoRG0C8+/MSt6QPGQLZ4U1JNfMuKqDFVoGR7sJW15

sLOH62x+WvIpOq8sXmHzeeguN1Aj01VaNvWWg5g0inBkqM5FcB1Q3JOxIAFnFsuf

hH2YGqOfqlz/o9hIYHYOSQt9FcwtuBVurz4fac0QmKnNlVQ8Qj2UKuD26LO5BBT3

YR3MH6yIPKc8tMSU2XnnmhnE2hPS6Q==

=/PXl

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 30 Sep 2020 18:55

Re: [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.

Recipients:(address . 43591@debbugs.gnu.org)

Message-ID:20200930185521.70952954@scratchpost.org

Still broken: boost.

In file included from ./boost/interprocess/detail/os_file_functions.hpp:40:0,

from ./boost/interprocess/mapped_region.hpp:29,

from libs/log/src/posix/ipc_reliable_message_queue.cpp:51:

./boost/interprocess/detail/os_file_functions.hpp:636:16: error: stray ‘@’ in program

while((de=::readdir(d))) {

[...]

build of /gnu/store/j29v8ph3asv4sywi0437hs6wwd9knvy2-boost-1.74.0.drv failed

That means that boost, or that source file, is not being compiled with

_FILE_OFFSET_BITS=64 .

Nowhere does the header file boost/interprocess/detail/os_file_functions.hpp

even check the value of _FILE_OFFSET_BITS.

./libs/filesystem/src/platform_config.hpp is responsible for setting up

_FILE_OFFSET_BITS in general.

That means that platform_config.hpp is not being #included in the former.

boost/interprocess/errors.hpp:// Parts of this code are taken from boost::filesystem library

boost/interprocess/errors.hpp:// See library home page at http://www.boost.org/libs/filesystem

But that means that this does not USE the "filesystem" library and neither does

it set _FILE_OFFSET_BITS like the "filesystem" library would.

All this happens even after I added a phase to gnu-build-system to set

_FILE_OFFSET_BITS unconditionally, so boost ignores that, too.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl90uHkACgkQ5xo1VCww

uqWXIQgAoDXHn/Nj7x2sGOXLLOAqD6Tbzk1ti0UhU15/ah9ygp/2aN6oVZOVclnz

7j7ppgdUPT78BaPrQ+LaPvaHmrZuHww7oNGQnIi6LmT47ExOZQwr59wHg/Q5lgx4

XVmWuWQ7tqE+WifAEffRR/TLGEJZwAe54LnwaS/QRtnW7T5gRUvJdgUeV8p0MgPa

f58AVS/zu3R3FSJuI4R0N1tlYZBZqNnykBM7n8gWhsacQKZjlrTNevEx6lS/scDd

yiIH9Po0Xej/B0CAa2fIhx56nBJFq5zdsXBBOV+gZIAvbCpxsOeF5mtOhBDijgKx

QemN1/Vzw56tC9soRwuKLJpvlHtnyg==

=xmR4

-----END PGP SIGNATURE-----

Ludovic Courtès wrote on 1 Oct 2020 09:14

Re: [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.

Recipients:(name . Danny Milosavljevic)(address . dannym@scratchpost.org)

Message-ID:87o8lmv2nx.fsf@gnu.org

Hi,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

Toggle quote (17 lines)> On Wed, 30 Sep 2020 11:32:58 +0200
> Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Dropping emulated builds, or at least 32-bit emulated builds.  We just
>> need to remove build machines from the file above.
>
> Oh.
>
> Do we have real armhf machines?  (as in not aarch64)
>
> *Looks at guix-maintenance*  We do.  Awesome.  Then sure.
>
> But just to be clear, WE MUST NOT USE aarch64 to build armhf as long as this
> problem isn't fixed.
>
> This problem has nothing to do with emulation.

Now I’m lost; I thought this had to do with qemu-user.

Could you propose a patch for maintenance.git?

Toggle quote (12 lines)> I want to know what actually changes in the final binaries.  Surely that works
> somehow--guix data services or not.
>
> Basically, for each package in Guix,
>
>   diff -r `~/broken-guix/pre-inst-env guix build $package` `~/fixed-guix/pre-inst-env guix build $package` || echo "affected: $package"
>
> but after replacing references by deduplicated content addressed references
> (for example if derivation A refers to files in derivation B, but derivation B
> only changed the directory name it's in and not the content of the derivation,
> then that should not count as a diff in A.  That should happen recursively).

Yeah, you “just” need to compare modulo store file names…

Toggle quote (12 lines)>> I have mixed feelings: fixing packages one by one doesn’t sound great,
>> but OTOH setting the ‘CFLAGS’ environment variable globally can have
>> unexpected side effects in some cases (overriding package-specific
>> CFLAGS) and zero effects in other cases (for non-Autoconf packages or
>> badly-written ‘configure.ac’ files), both of which would be hard to
>> detect.
>
> The latter is easy to detect since I patched dirent.h in glibc exactly for that
> reason.  That way, glibc WON'T let you use it wrong (except if you explicitly
> ask for it).  On Guix systems, there is no legitimate reason to use it wrong
> in the first place.

I’m very reluctant to patching public libc headers. Also, it’s not just

“our” problem, we should definitely discuss it with upstream and perhaps

propose your dirent.h patch.

I’m also not sure what you mean by “using it wrong”, what is “it”?

Building without _FILE_OFFSET_BITS=64 is still a valid option, albeit

one that is not recommended.

Toggle quote (4 lines)

> About the unexpected side effects--yes, that's right. That's why we should get

> a list of diff results (see above for the command) and then manually look at

> the source code of those packages and their dependencies.

A diff at one point in time (if we ever managed to get a usable diff) is

not enough: problems could pop up anytime. Setting ‘CFLAGS’ globally as

an environment variable seems risky.

Toggle quote (17 lines)>> If we take a step back: what’s the problem?
>
> It means we have no trustworthy i686 packages, which means we do not have a
> trustworthy full source bootstrap using Mes (since that uses i686 executable
> to bootstrap).
>
> In practice, this problem is not so bad since the kernel on i686 has a compat
> layer that hasn't been telling us the truth for d_off, so we should be "good".
> But philosophically, we are doing it dead wrong.
>
> Also, this won't work on armhf or any other 32 bit architecture--so there,
> we would be both philosophically and practically wrong.
>
> Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.
> It totally DOES wind up telling us the truth sometimes (see my earlier test
> table)--and then we have a problem.

Hmm I guess I need to re-read all that, I’m overwhelmed.

Thanks,

Ludo’.

Danny Milosavljevic wrote on 2 Oct 2020 09:18

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20201001120944.5059c650@scratchpost.org

Hi Ludo,

On Thu, 01 Oct 2020 09:14:10 +0200

Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (4 lines)

> > This problem has nothing to do with emulation.

> Now I’m lost; I thought this had to do with qemu-user.

I had thought so, too, a few weeks ago. But that's not the case.

It's not at all related to qemu.

The problem is a fundamental problem: a 64 bit value does NOT fit into a 32 bit

slot.

glibc uses getdents64 to get 64 bit dents and then acts all surprised and

errory when it gets 64 bit dents. (also on 32 bit glibc)

The same happens natively when using armhf on aarch64, without qemu-user.

That's what the table I sent was all about.

Calling getdents64 is not the problem--glibc has to do that, otherwise a 32 bit

glibc won't work RELIABLY on a 64 bit kernel anyway.

And emitting an error it does because we do not enable large file support.

This table (updated after I finally compiled a guix gcc-toolchain-10 with the

(unpatched) glibc in question on armhf):

system _FILE_OFFSET_BITS off_t d_off-sizeof d_off-values

---------------------------------------------------------------

x86_64 - 8 Byte 8 Byte 8 Byte

i686 - 4 Byte 4 Byte 4 Byte

i686 64 8 Byte 8 Byte FAIL*

i686 32 4 Byte 4 Byte FAIL*

i686 7 4 Byte 4 Byte 4 Byte

armhf - 4 Byte 4 Byte FAIL*

armhf 64 8 Byte 8 Byte 8 Byte

armhf 32 4 Byte 4 Byte FAIL*

armhf 7 4 Byte 4 Byte FAIL*

a64armhf - 4 Byte 4 Byte FAIL*

a64armhf 64 8 Byte 8 Byte 8 Byte

a64armhf 32 4 Byte 4 Byte FAIL*

a64armhf 7 4 Byte 4 Byte FAIL*

aarch64 - 8 Byte 8 Byte 8 Byte

*: Using FUSE filesystem with big d_off value.

None of those tests were done with qemu. They were all native.

That's why I wanted access to real aarch64 machines--otherwise I could have

done it with qemu on my x86_64 computer :P

Toggle quote (2 lines)

> I’m very reluctant to patching public libc headers.

Well, I don't like it either--that's why it's very very careful.  My patch
doesn't change anything that users experience at runtime and basically just
prevents developers from compiling something that is using readdir without
thinking about large files first (because they HAVE TO if their programs run
on a host kernel that has bigger d_off--there's no sane way around it).

If they absolutely want to, they can set _FILE_OFFSET_BITS=32 and it will
let them do it (the argument by Marius is that people might want to do that
on embedded.  But that means they'll sometimes have readdir fail--depending
on their setup (also on 32 bit kernels).  Embedded is not specially exempt
from ths bug ;) ).

I think that this patch is guix-specific in the sense that it happens
pretty often that we do "-s i686-linux" on x86_64, "-s armhf-linux" on
aarch64 and expect that to work.  And there's no qemu we could even patch in
those cases, because no qemu is used in the first place.

Toggle quote (4 lines)

> Also, it’s not just

> “our” problem, we should definitely discuss it with upstream and perhaps

> propose your dirent.h patch.

Sure. I think 15 years of migration path to 64 bit off_t was more than enough.

Now, I'd prefer if glibc made people choose _FILE_OFFSET_BITS explicitly on

32 bit. Everything else is a hack that WILL break unexpectedly. Users still

can choose _FILE_OFFSET_BITS=32, if they want.

Toggle quote (2 lines)

> I’m also not sure what you mean by “using it wrong”, what is “it”?

"it" is users calling readdir() without defining _FILE_OFFSET_BITS=64 in their

source file / Makefile.

This causes glibc to call getdents64 and then act all surprised when it gets

a 64 bit result back.

Toggle quote (9 lines)

> > Also, this won't work on armhf or any other 32 bit architecture--so there,

> > we would be both philosophically and practically wrong.

> >

> > Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.

> > It totally DOES wind up telling us the truth sometimes (see my earlier test

> > table)--and then we have a problem.

> Hmm I guess I need to re-read all that, I’m overwhelmed.

Yeah--it's understandable. I'm working on understanding and fixing this problem

for a hundred hours now--it took forever for me to get to the bottom of this,

too.

And in the beginning I, too, suspected qemu. But it's totally blameless.

Nothing should be changed in qemu-user or in our qemu binfmt service.

The fundamental problem is that POSIX specifies that telldir and seekdir must

exist, and return and take a LONG, respectively.

That means that glibc has to preserve d_off it got from getdents64 (size is

64 bits), otherwise how would seekdir work?

But the offset parameter of seekdir is standardized as LONG, which means that it

won't work in the first place on 32 bit when there is either a 64 bit kernel or

a filesystem that just happens to store bigger stuff.

So glibc chose to check whether the getdents64 d_off just happens to fit into

the LONG this time around it was called. I argue that that is insane. It

would be better to always fail, or never fail--not only fail on the first d_off

that is > 2**32. When that happens is a filesystem implementation detail :P

I think the assumption was that the kernel would store an actual offset into

d_off. But it doesn't--it stores a hash in the case of ext4 (and probably

in other cases).

And in any case, even if it was an offset, that is still an unsafe way to fix

the problem.

First, someone needs to fix the POSIX standard to say "off_t", not "long".

Then, distributions who want to use 32 bit userland on 64 bit kernel need

to enable large files globally. That is a choice a distribution has to make.

Not making a choice is a choice too--the behavior will be random, and if

my research in wip-file-offset-bits-64 is any indication then very

fundamental things will be broken at unexpected places, and usually it DOES

NOT result in a build failure (without my glibc patch). That basically

means that using 32 bit native on 64 bit kernel cannot be supported in Guix

if no choice is made.

If choice "yes" is made, one needs to have a way to find these

non-build-breaking using-readdir-wrong packages. How would an alternative

way to do this look?

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl921C0ACgkQ5xo1VCww

uqVaPgf/fIxl8BxRKuGGhS5XYcC2VjATPoqeCdJwh+ewtIhNUd//GAPVz8GLvLlW

kdyReYbVfqYTeQMxvRfurh7ASV3O64fiDMrAMkv1aYmjQXLZrviOI1NgTxDs4GcL

XgSv1sKFLJwuLYqXAlpE9msOkrFX6IPhnprqFfr7OIDQtM6OgBOkWoJ7yW+fU3b0

G+2P52cknDtqLBMFMDH0bGCbILNb+mrQPxYXkJzKYEydWKuErlKnclvLUI88siWX

BAg1LQ88jUjc86PmaJnwHU8nBPj2HrsfabcRNOIOMniSGRRYbFqr4slodVQ7nTzk

tCGBCx9D/Iuvdvll6Lmq7lFuL9c1rQ==

=559R

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 2 Oct 2020 10:12

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20201002101237.3fcf2ef9@scratchpost.org

Toggle quote (6 lines)

> My patch

> doesn't change anything that users experience at runtime and basically just

> prevents developers from compiling something that is using readdir without

> thinking about large files first (because they HAVE TO if their programs run

> on a host kernel that has bigger d_off--there's no sane way around it).

And just to be clear, they still SHOULD choose whether or not to support
large files even if the host kernel is 32 bit (!!!!)--see table.  It's
just less common there.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl924PUACgkQ5xo1VCww

uqX8ygf+ID5inG0pcZIPcfNYEiGq4/eQJoWpzzxGfmslDCVcDNoq13ey/z9t+xVW

5G4tBNh3ouz3G7rVwTA7FF3SOFvTGYeCPkF605kDDJgvzFSsC0TB3W4tCWZBC0b9

deOPmnfDHTVt+JCKl1YTgCYiMJO8M03zSyT3VWOkoQmjvSwO0poxCqLjWY50ZG6u

f6foIkLloMCZ27S7fBuieTkGoeDtS03PhEAvnMoMrYKMFLllmENAAgXR1n5/jDB7

esjyfiBjOxGZBWmj6a2acr0LKjIAn8yUCuhY++5AfHTsJatw+y2pqpvZwn+rLIYd

0/FGxwjKY3GFQi2oFi7VzVhf7tRvzw==

=z5Zb

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 2 Oct 2020 11:32

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20201002113238.33cdf7d2@scratchpost.org

Updated table after more tests:

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         FAIL*
i686     32                4 Byte  4 Byte         FAIL*
i686     7                 4 Byte  4 Byte         FAIL*
armhf    -                 4 Byte  4 Byte         FAIL*
armhf    64                8 Byte  8 Byte         8 Byte
armhf    32                4 Byte  4 Byte         FAIL*
armhf    7                 4 Byte  4 Byte         FAIL*
a64armhf -                 4 Byte  4 Byte         FAIL*
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL*
a64armhf 7                 4 Byte  4 Byte         FAIL* 
aarch64  -                 8 Byte  8 Byte         8 Byte

*: Using FUSE filesystem with big d_off value.

None of those tests were done with qemu.  They were all native.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9287cACgkQ5xo1VCww

uqUAdwgAkp3NKYy6+mUkcQNej0JlQ2sS1GAxbuN6joI/c9eeS9lI8AJYYDd0GnR6

usz1zUJjNToZgngCPyNwz5m+z5ZfFQaLrcqAzNQZUIstYlSSt+rls9kVea3WboZz

rFffZgbOUiEXpHB3udUT+meWqPoWQTVacZ9tELqoBbOfEmYGwKon9qeNuhoxqMqB

xjc35UGJIm+bWskBkpjGOS3xQ2TUSMtUI81pyaOb5BETTymCenjUuxszer5hEuLX

DNZIMsI2UH2Y9GShNod+8KoEyOBSbtLYidoseC7oNYfueHc1y+0MJ/mLp50/KpId

qaT4hf5M1JaxjZgKqZ+lA3oPA2hkKg==

=LHw0

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 2 Oct 2020 11:47

Recipients:

Message-ID:20201002114747.4da6c679@scratchpost.org

glibc bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=23960

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9290MACgkQ5xo1VCww

uqW+/Af9EJhkqXzaTnRfRYtCfIPjCW9i83WYTh75MFwwRSJAzVtrbSObZsvFYzOl

dS1byAbJIHPatHELOD2Fgko5fhX3snl0gmpfTRowjVJGa/ZwLBo8yxx8LVxhMP8U

hCOB+BNjFM7uXkJqvvy6N9mWdS3W9DBy3znZvQnJMYX8z9q7/UVzoSADbWKcZtFs

pbQEhsZvXhbCRZdRjXLun+VHaXKAMHcNJ137HNmdlk2LoaAuiBGsOf2cPyLnXMJA

WzlxqvK6TxiUIgG3FahiyheYzl73mQSjs4QQ2KUfq4faITZVlcYVrrv4dW7KyVKA

ghNyZ2S8/tAMyYttmuS13b/CzvLmDg==

=QFV7

-----END PGP SIGNATURE-----

Danny Milosavljevic wrote on 6 Oct 2020 17:39

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20201006173932.08003c1d@scratchpost.org

Hi Ludo,

On Thu, 01 Oct 2020 09:14:10 +0200

Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (2 lines)

> Could you propose a patch for maintenance.git?

OK--patch #43829.

Toggle quote (5 lines)

> >> I have mixed feelings: fixing packages one by one doesn’t sound great,

> >> but OTOH setting the ‘CFLAGS’ environment variable globally can have

> >> unexpected side effects in some cases (overriding package-specific

> >> CFLAGS)

Doing (setenv CFLAGS "-g -O2 -D_FILE_OFFSET_BITS=64") right after

phase 'set-SOURCE-DATE-EPOCH doesn't override package-specific CFLAGS.

Quite the opposite can happen, though. But:

Toggle quote (9 lines)

> > The latter is easy to detect since I patched dirent.h in glibc exactly for that

> > reason. That way, glibc WON'T let you use it wrong (except if you explicitly

> > ask for it). On Guix systems, there is no legitimate reason to use it wrong

> > in the first place.

> I’m very reluctant to patching public libc headers. Also, it’s not just

> “our” problem, we should definitely discuss it with upstream and perhaps

> propose your dirent.h patch.

I've reported it upstream.

However, GNU gcc and glibc support a lot of weird architectures--but Guix

system really doesn't. So it's much easier for us to get a good patch than

it is for them.

Toggle quote (4 lines)

> A diff at one point in time (if we ever managed to get a usable diff) is

> not enough: problems could pop up anytime. Setting ‘CFLAGS’ globally as

> an environment variable seems risky.

We are about 15 years late--so all other distributions already triggered

most of the bugs in that time. I don't think it's that bad anymore...

That's why I would prefer setting CFLAGS globally anyway.

And I don't have the energy to manually FIND AND fix however many packages

are affected otherwise.

Having this problem in 2020 is ridiculous--it's like someone accidentially

enabled a time machine...

The only reason this didn't fall on our head on x86_64 is because on 64 bit

systems something like it is the default anyway.

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl98j7QACgkQ5xo1VCww

uqUvrgf+IotLG5H1B4fcZhvssJhkOrqpzLJmq4n7YnaYiMuUmxAV94jLvyjHWz66

Ab9zW0EFg8eKpR7GpvXI3J35aPUk+aHUb6frhkbp7ky6ItN6IfJYpyNyHJZv3eb4

bmdRAe6uW2a/09oH/Zhtji1cI9CH2tMudfWGfhNUZJwXeM2ZE99HDr4vsSeu0gYu

wEwJMxBdRwnITYofQBbPxnbYGipzLObAErew5zMSpoC4RXPHWbUsNGgkuSlnpSLu

/nUeWfJNkqOLHUj8vsemTyHw+kZ5qVsaKqOpEfjLuqK5ebx65XoDmAezcHwV05d4

4L/P5clGSvZ2ZfD+V1SPGD/b2af7Dg==

=gZXy

-----END PGP SIGNATURE-----

Your comment

Commenting via the web interface is currently disabled.

To comment on this conversation send an email to 43591@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it

mumi current 43591

Then, you may apply the latest patchset in this issue (with sign off)

mumi am -- -s

Or, compose a reply to this issue

mumi compose

Or, send patches to this issue

mumi send-email *.patch

is:open	open issues
is:done	closed issues
submitter:<who>	search issue submitter
author:<who>	search by message author
date:yesterday..now	search by issue date
mdate:3m..2d	search by message date