libdrm fails to build on armhf-linux

  • Open
Details
3 participants
  • Andreas Enge
  • Mark H Weaver
  • Ricardo Wurmus
Owner: unassigned
Submitted by: Mark H Weaver
Severity: normal
Mark H Weaver wrote on 1 May 2019 23:41
(address . bug-guix@gnu.org)
87d0l1n97o.fsf@netris.org
Hydra failed two consecutive attempts to build libdrm on armhf-linux:

https://hydra.gnu.org/build/3481547#tabs-summary

Both build attempts were made on hydra-slave2, which is a Wandboard Quad
based on the Freescale i.MX6 SOC.

Collateral damage includes several hundred dependency failures,
including emacs-26.

Mark
Ricardo Wurmus wrote on 3 May 2019 07:29
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
87k1f8gl7k.fsf@elephly.net
Mark H Weaver <mhw@netris.org> writes:

> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>
> https://hydra.gnu.org/build/3481547#tabs-summary
>
> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
> based on the Freescale i.MX6 SOC.

This has built fine on berlin. We have a completed build for
/gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

--
Ricardo
Andreas Enge wrote on 5 May 2019 19:42
(name . Ricardo Wurmus)(address . rekado@elephly.net)
20190505174226.GA2205@jurong
On Fri, May 03, 2019 at 07:29:03AM +0200, Ricardo Wurmus wrote:
> > Both build attempts were made on hydra-slave2, which is a Wandboard Quad
> > based on the Freescale i.MX6 SOC.
> This has built fine on berlin. We have a completed build for
> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

I can confirm that it fails to build on redhill (a Novena machine) as
well (though I have tried it only once).

Andreas
Mark H Weaver wrote on 6 May 2019 00:41
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
87imuosew7.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

> Mark H Weaver <mhw@netris.org> writes:
>
>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>
>> https://hydra.gnu.org/build/3481547#tabs-summary
>>
>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>> based on the Freescale i.MX6 SOC.
>
> This has built fine on berlin. We have a completed build for
> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

What kind of hardware was it built on?

Note that the failure on Hydra was due to a timeout in the test suite:

https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload

All of the other tests completed within a few seconds, but the timeout
tripped after 1200 seconds. So, I'm not sure if it's simply that the
build hardware is too slow. It might have actually gotten stuck.
Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
there's not enough entropy available on the build machine.
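
To illustrate the entropy hypothesis, here is a minimal, hypothetical C
sketch (it is not taken from the libdrm test suite) of how a test that
seeds itself from /dev/random can hang on a headless, entropy-starved
builder, while the same read from /dev/urandom returns immediately:

    /* Hypothetical test helper, for illustration only. */
    #include <stdio.h>
    #include <stdlib.h>

    static unsigned int read_seed(const char *path)
    {
        unsigned int seed;
        FILE *f = fopen(path, "rb");

        if (f == NULL || fread(&seed, sizeof seed, 1, f) != 1) {
            perror(path);
            exit(EXIT_FAILURE);
        }
        fclose(f);
        return seed;
    }

    int main(void)
    {
        /* /dev/urandom never blocks; on kernels of that era /dev/random
           blocks until the kernel's entropy estimate is high enough, so
           swapping it in here is enough to reproduce a 1200-second
           timeout on an idle build machine. */
        printf("seed = %u\n", read_seed("/dev/urandom"));
        return 0;
    }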

Thanks,
Mark
Ricardo Wurmus wrote on 6 May 2019 09:15
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
874l68qcio.fsf@elephly.net
Hi Mark,

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Mark H Weaver <mhw@netris.org> writes:
>>
>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>
>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>
>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>> based on the Freescale i.MX6 SOC.
>>
>> This has built fine on berlin. We have a completed build for
>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>
> What kind of hardware was it built on?

I’m not sure. We’re using a few Overdrive 1000 machines that have quite
a bit more RAM than the other armhf nodes.

> Note that the failure on Hydra was due to a timeout in the test suite:
>
> https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload
>
> All of the other tests completed within a few seconds, but the timeout
> tripped after 1200 seconds. So, I'm not sure if it's simply that the
> build hardware is too slow. It might have actually gotten stuck.
> Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
> there's not enough entropy available on the build machine.

My guess is that it ran out of RAM and began thrashing. We’ve had at
least another build (nss) that worked fine on the Overdrive but failed
on other armhf machines.

--
Ricardo
Mark H Weaver wrote on 6 May 2019 10:22
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
8736lsq9ec.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

>> Ricardo Wurmus <rekado@elephly.net> writes:
>>
>>> Mark H Weaver <mhw@netris.org> writes:
>>>
>>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>>
>>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>>
>>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>>> based on the Freescale i.MX6 SOC.
>>>
>>> This has built fine on berlin. We have a completed build for
>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>
>> What kind of hardware was it built on?
>
> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
> a bit more RAM than the other armhf nodes.

Are there any other kinds of build slaves that build armhf binaries for
Berlin?

>> Note that the failure on Hydra was due to a timeout in the test suite:
>>
>> https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload
>>
>> All of the other tests completed within a few seconds, but the timeout
>> tripped after 1200 seconds. So, I'm not sure if it's simply that the
>> build hardware is too slow. It might have actually gotten stuck.
>> Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
>> there's not enough entropy available on the build machine.
>
> My guess is that it ran out of RAM and began thrashing. We’ve had at
> least another build (nss) that worked fine on the Overdrive but failed
> on other armhf machines.

All of the armhf build slaves on hydra.gnu.org have 4 gigabytes of RAM.
So does redhill, the armhf slave hosted by Andreas.

My Thinkpad X200 only has 4 gigabytes, and that's enough to build my
entire GNOME system from source code, including webkitgtk, icecat, rust,
nss, etc.

FWIW, I'd be very surprised if these libdrm test failures are due to
running out of memory.

Mark
Ricardo Wurmus wrote on 6 May 2019 13:06
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
8736lrrget.fsf@elephly.net
Mark H Weaver <mhw@netris.org> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>>> Ricardo Wurmus <rekado@elephly.net> writes:
>>>
>>>> Mark H Weaver <mhw@netris.org> writes:
>>>>
>>>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>>>
>>>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>>>
>>>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>>>> based on the Freescale i.MX6 SOC.
>>>>
>>>> This has built fine on berlin. We have a completed build for
>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>
>>> What kind of hardware was it built on?
>>
>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>> a bit more RAM than the other armhf nodes.
>
> Are there any other kinds of build slaves that build armhf binaries for
> Berlin?

Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
builds and we use the Qemu service on 5 of our x86_64 machines to build
for armhf. (We do the same for aarch64, but using 5 different nodes.)

“nss” failed its tests when built on x15.sjd.se, but it worked fine when
building on one of the Overdrives.

--
Ricardo
Mark H Weaver wrote on 6 May 2019 18:40
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
87a7fz7cyq.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

>>>>> This has built fine on berlin. We have a completed build for
>>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>>
>>>> What kind of hardware was it built on?
>>>
>>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>>> a bit more RAM than the other armhf nodes.
>>
>> Are there any other kinds of build slaves that build armhf binaries for
>> Berlin?
>
> Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
> builds and we use the Qemu service on 5 of our x86_64 machines to build
> for armhf. (We do the same for aarch64, but using 5 different nodes.)

So, many of the armhf builds are done in an emulator. This is exactly
what I was curious about. One problem with doing this is that tests
performed during these builds do not necessarily reflect what will
happen on real armhf hardware.

I'll give just one example of where this approach will fail badly: tests
of thread synchronization. The memory models used in ARM and x86_64 are
quite different, and an ARM emulator running on x86_64 will effectively
have a much stronger memory model than real ARM hardware does.

It's much harder to perform safe thread synchronization on ARM than on
x86_64. Many programmers use idioms that they believe are safe, and
which work on x86_64, but are buggy on many architectures with weaker
memory models. Those are the kinds of bugs that will *not* be
discovered by test suites when we perform the builds under QEMU.
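
To make this concrete, here is a small, hypothetical C sketch (it is not
taken from libdrm or nss) of the kind of flag-based hand-off I mean,
showing the broken plain-store idiom next to the C11 acquire/release
version that is correct on every architecture:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int data;
    static int ready;               /* broken: plain store, no ordering */
    static atomic_int ready_atomic; /* correct: C11 atomic flag */

    static void *producer(void *arg)
    {
        data = 42;

        /* Broken idiom: a consumer spinning on this plain flag could, on
           real ARM hardware, observe ready == 1 and still read a stale
           'data'.  On x86_64, and therefore under qemu-user on an x86_64
           host, the stronger memory model hides the bug. */
        ready = 1;

        /* Correct idiom: the release store pairs with the acquire load
           below and orders the 'data' store before the flag everywhere. */
        atomic_store_explicit(&ready_atomic, 1, memory_order_release);
        return arg;
    }

    static void *consumer(void *arg)
    {
        while (atomic_load_explicit(&ready_atomic,
                                    memory_order_acquire) == 0)
            ;                        /* spin until the producer is done */
        printf("data = %d\n", data); /* guaranteed to print 42 */
        return arg;
    }

    int main(void)
    {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

A test suite exercising the plain-'ready' variant would pass under QEMU
on an x86_64 host and only fail, intermittently, on real ARM hardware.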

I hope that we will soon phase out the practice of performing builds
within emulators.

In the meantime, it would be good to know which machine built 'libdrm'
for armhf. Was that information recorded?

> “nss” failed its tests when built on x15.sjd.se, but it worked fine when
> building on one of the Overdrives.

Can you find the failed NSS build log from the X15? It would be useful to
see which tests failed, and whether they're the same ones that failed on
hydra-slave3, which is a Novena with 4 GB of RAM. Here's the relevant
Hydra build page: <https://hydra.gnu.org/build/3484222>.

Thanks!
Mark
Ricardo Wurmus wrote on 6 May 2019 19:00
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
87a7fzfrhf.fsf@elephly.net
Hi Mark,

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>>>>>> This has built fine on berlin. We have a completed build for
>>>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>>>
>>>>> What kind of hardware was it built on?
>>>>
>>>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>>>> a bit more RAM than the other armhf nodes.
>>>
>>> Are there any other kinds of build slaves that build armhf binaries for
>>> Berlin?
>>
>> Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
>> builds and we use the Qemu service on 5 of our x86_64 machines to build
>> for armhf. (We do the same for aarch64, but using 5 different nodes.)
>
> So, many of the armhf builds are done in an emulator.

Only since a week or so ago; I added the qemu machines very recently.

> I hope that we will soon phase out the practice of performing builds
> within emulators.

I did this because the extra Overdrive boxen I bought still haven’t been
added to the build farm. To better deal with the build backlog I added
the Qemu machines. I also hope that we can soon use dedicated build
machines instead.

> In the meantime, it would be good to know which machine built 'libdrm'
> for armhf. Was that information recorded?

I don’t know where I would find that information.

>> “nss” failed its tests when built on x15.sjd.se, but it worked fine when
>> building on one of the Overdrives.
>
> Can you find the failed NSS build log from the X15? It would be useful to
> see which tests failed, and whether they're the same ones that failed on
> hydra-slave3, which is a Novena with 4 GB of RAM. Here's the relevant
> Hydra build page: <https://hydra.gnu.org/build/3484222>.

I didn’t keep track of the log. I built this manually on berlin to test
a workaround (which failed). The build log probably still sits on
berlin somewhere, but I cannot find it.

--
Ricardo