libdrm fails to build on armhf-linux

  • Open
Details
3 participants
  • Andreas Enge
  • Mark H Weaver
  • Ricardo Wurmus
Owner: unassigned
Submitted by: Mark H Weaver
Severity: normal
Mark H Weaver wrote on 1 May 2019 23:41
(address . bug-guix@gnu.org)
87d0l1n97o.fsf@netris.org
Hydra failed two consecutive attempts to build libdrm on armhf-linux:

https://hydra.gnu.org/build/3481547#tabs-summary

Both build attempts were made on hydra-slave2, which is a Wandboard Quad
based on the Freescale i.MX6 SOC.

Collateral damage includes several hundred dependency failures,
including emacs-26.

Mark
Ricardo Wurmus wrote on 3 May 2019 07:29
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
87k1f8gl7k.fsf@elephly.net
Mark H Weaver <mhw@netris.org> writes:

> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>
> https://hydra.gnu.org/build/3481547#tabs-summary
>
> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
> based on the Freescale i.MX6 SOC.

This has built fine on berlin. We have a completed build for
/gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

--
Ricardo
Andreas Enge wrote on 5 May 2019 19:42
(name . Ricardo Wurmus)(address . rekado@elephly.net)
20190505174226.GA2205@jurong
On Fri, May 03, 2019 at 07:29:03AM +0200, Ricardo Wurmus wrote:
> > Both build attempts were made on hydra-slave2, which is a Wandboard Quad
> > based on the Freescale i.MX6 SOC.
> This has built fine on berlin. We have a completed build for
> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

I can confirm that it fails to build on redhill (a Novena machine) as
well (though I have tried it only once).

Andreas
Mark H Weaver wrote on 6 May 2019 00:41
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
87imuosew7.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

> Mark H Weaver <mhw@netris.org> writes:
>
>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>
>> https://hydra.gnu.org/build/3481547#tabs-summary
>>
>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>> based on the Freescale i.MX6 SOC.
>
> This has built fine on berlin. We have a completed build for
> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.

What kind of hardware was it built on?

Note that the failure on Hydra was due to a timeout in the test suite:

https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload

All of the other tests completed within a few seconds, but the timeout
tripped after 1200 seconds. So, I'm not sure if it's simply that the
build hardware is too slow. It might have actually gotten stuck.
Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
there's not enough entropy available on the build machine.
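
To illustrate the entropy hypothesis, here is a minimal, hypothetical C
sketch (it is not taken from the libdrm test suite) of how a test that
seeds itself from /dev/random can hang on a headless, entropy-starved
builder, while the same read from /dev/urandom returns immediately:

    /* Hypothetical test helper, for illustration only. */
    #include <stdio.h>
    #include <stdlib.h>

    static unsigned int read_seed(const char *path)
    {
        unsigned int seed;
        FILE *f = fopen(path, "rb");

        if (f == NULL || fread(&seed, sizeof seed, 1, f) != 1) {
            perror(path);
            exit(EXIT_FAILURE);
        }
        fclose(f);
        return seed;
    }

    int main(void)
    {
        /* /dev/urandom never blocks; on kernels of that era /dev/random
           blocks until the kernel's entropy estimate is high enough, so
           swapping it in here is enough to reproduce a 1200-second
           timeout on an idle build machine. */
        printf("seed = %u\n", read_seed("/dev/urandom"));
        return 0;
    }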

Thanks,
Mark
Ricardo Wurmus wrote on 6 May 2019 09:15
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
874l68qcio.fsf@elephly.net
Hi Mark,

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Mark H Weaver <mhw@netris.org> writes:
>>
>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>
>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>
>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>> based on the Freescale i.MX6 SOC.
>>
>> This has built fine on berlin. We have a completed build for
>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>
> What kind of hardware was it built on?

I’m not sure. We’re using a few Overdrive 1000 machines that have quite
a bit more RAM than the other armhf nodes.

> Note that the failure on Hydra was due to a timeout in the test suite:
>
> https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload
>
> All of the other tests completed within a few seconds, but the timeout
> tripped after 1200 seconds. So, I'm not sure if it's simply that the
> build hardware is too slow. It might have actually gotten stuck.
> Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
> there's not enough entropy available on the build machine.

My guess is that it ran out of RAM and began thrashing. We’ve had at
least another build (nss) that worked fine on the Overdrive but failed
on other armhf machines.

--
Ricardo
Mark H Weaver wrote on 6 May 2019 10:22
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
8736lsq9ec.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

>> Ricardo Wurmus <rekado@elephly.net> writes:
>>
>>> Mark H Weaver <mhw@netris.org> writes:
>>>
>>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>>
>>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>>
>>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>>> based on the Freescale i.MX6 SOC.
>>>
>>> This has built fine on berlin. We have a completed build for
>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>
>> What kind of hardware was it built on?
>
> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
> a bit more RAM than the other armhf nodes.

Are there any other kinds of build slaves that build armhf binaries for
Berlin?

>> Note that the failure on Hydra was due to a timeout in the test suite:
>>
>> https://hydra.gnu.org/build/3481547/nixlog/6/tail-reload
>>
>> All of the other tests completed within a few seconds, but the timeout
>> tripped after 1200 seconds. So, I'm not sure if it's simply that the
>> build hardware is too slow. It might have actually gotten stuck.
>> Perhaps the test uses /dev/random (as opposed to /dev/urandom) and
>> there's not enough entropy available on the build machine.
>
> My guess is that it ran out of RAM and began thrashing. We’ve had at
> least another build (nss) that worked fine on the Overdrive but failed
> on other armhf machines.

All of the armhf build slaves on hydra.gnu.org have 4 gigabytes of RAM.
So does redhill, the armhf slave hosted by Andreas.

My Thinkpad X200 only has 4 gigabytes, and that's enough to build my
entire GNOME system from source code, including webkitgtk, icecat, rust,
nss, etc.

FWIW, I'd be very surprised if these libdrm test failures are due to
running out of memory.

Mark
Ricardo Wurmus wrote on 6 May 2019 13:06
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
8736lrrget.fsf@elephly.net
Mark H Weaver <mhw@netris.org> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>>> Ricardo Wurmus <rekado@elephly.net> writes:
>>>
>>>> Mark H Weaver <mhw@netris.org> writes:
>>>>
>>>>> Hydra failed two consecutive attempts to build libdrm on armhf-linux:
>>>>>
>>>>> https://hydra.gnu.org/build/3481547#tabs-summary
>>>>>
>>>>> Both build attempts were made on hydra-slave2, which is a Wandboard Quad
>>>>> based on the Freescale i.MX6 SOC.
>>>>
>>>> This has built fine on berlin. We have a completed build for
>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>
>>> What kind of hardware was it built on?
>>
>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>> a bit more RAM than the other armhf nodes.
>
> Are there any other kinds of build slaves that build armhf binaries for
> Berlin?

Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
builds and we use the Qemu service on 5 of our x86_64 machines to build
for armhf. (We do the same for aarch64, but using 5 different nodes.)

“nss” failed its tests when built on x15.sjd.se, but it worked fine when
building on one of the Overdrives.

--
Ricardo
Mark H Weaver wrote on 6 May 2019 18:40
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 35529@debbugs.gnu.org)
87a7fz7cyq.fsf@netris.org
Hi Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

>>>>> This has built fine on berlin. We have a completed build for
>>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>>
>>>> What kind of hardware was it built on?
>>>
>>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>>> a bit more RAM than the other armhf nodes.
>>
>> Are there any other kinds of build slaves that build armhf binaries for
>> Berlin?
>
> Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
> builds and we use the Qemu service on 5 of our x86_64 machines to build
> for armhf. (We do the same for aarch64, but using 5 different nodes.)

So, many of the armhf builds are done in an emulator. This is exactly
what I was curious about. One problem with doing this is that tests
performed during these builds do not necessarily reflect what will
happen on real armhf hardware.

I'll give just one example of where this approach will fail badly: tests
of thread synchronization. The memory models used in ARM and x86_64 are
quite different, and an ARM emulator running on x86_64 will effectively
have a much stronger memory model than real ARM hardware does.

It's much harder to perform safe thread synchronization on ARM than on
x86_64. Many programmers use idioms that they believe are safe, and
which work on x86_64, but are buggy on many architectures with weaker
memory models. Those are the kinds of bugs that will *not* be
discovered by test suites when we perform the builds under QEMU.
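
To make this concrete, here is a small, hypothetical C sketch (it is not
taken from libdrm or nss) of the kind of flag-based hand-off I mean,
showing the broken plain-store idiom next to the C11 acquire/release
version that is correct on every architecture:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int data;
    static int ready;               /* broken: plain store, no ordering */
    static atomic_int ready_atomic; /* correct: C11 atomic flag */

    static void *producer(void *arg)
    {
        data = 42;

        /* Broken idiom: a consumer spinning on this plain flag could, on
           real ARM hardware, observe ready == 1 and still read a stale
           'data'.  On x86_64, and therefore under qemu-user on an x86_64
           host, the stronger memory model hides the bug. */
        ready = 1;

        /* Correct idiom: the release store pairs with the acquire load
           below and orders the 'data' store before the flag everywhere. */
        atomic_store_explicit(&ready_atomic, 1, memory_order_release);
        return arg;
    }

    static void *consumer(void *arg)
    {
        while (atomic_load_explicit(&ready_atomic,
                                    memory_order_acquire) == 0)
            ;                        /* spin until the producer is done */
        printf("data = %d\n", data); /* guaranteed to print 42 */
        return arg;
    }

    int main(void)
    {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

A test suite exercising the plain-'ready' variant would pass under QEMU
on an x86_64 host and only fail, intermittently, on real ARM hardware.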

I hope that we will soon phase out the practice of performing builds
within emulators.

In the meantime, it would be good to know which machine built 'libdrm'
for armhf. Was that information recorded?

> “nss” failed its tests when built on x15.sjd.se, but it worked fine when
> building on one of the Overdrives.

Can you find the failed NSS build log from the X15? It would be useful to
see which tests failed, and whether they're the same ones that failed on
hydra-slave3, which is a Novena with 4 GB of RAM. Here's the relevant
Hydra build page: <https://hydra.gnu.org/build/3484222>.

Thanks!
Mark
Ricardo Wurmus wrote on 6 May 2019 19:00
(name . Mark H Weaver)(address . mhw@netris.org)(address . 35529@debbugs.gnu.org)
87a7fzfrhf.fsf@elephly.net
Hi Mark,

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>>>>>> This has built fine on berlin. We have a completed build for
>>>>>> /gnu/store/3c28p8b07709isd9jlcnnnyrpgz4ndz8-libdrm-2.4.97.
>>>>>
>>>>> What kind of hardware was it built on?
>>>>
>>>> I’m not sure. We’re using a few Overdrive 1000 machines that have quite
>>>> a bit more RAM than the other armhf nodes.
>>>
>>> Are there any other kinds of build slaves that build armhf binaries for
>>> Berlin?
>>
>> Yes. We have a Beagleboard (x15.sjd.se), which is set up for 2 parallel
>> builds and we use the Qemu service on 5 of our x86_64 machines to build
>> for armhf. (We do the same for aarch64, but using 5 different nodes.)
>
> So, many of the armhf builds are done in an emulator.

Only since a week or so ago; I added the qemu machines very recently.

> I hope that we will soon phase out the practice of performing builds
> within emulators.

I did this because the extra Overdrive boxen I bought still haven’t been
added to the build farm. To better deal with the build backlog I added
the Qemu machines. I also hope that we can soon use dedicated build
machines instead.

> In the meantime, it would be good to know which machine built 'libdrm'
> for armhf. Was that information recorded?

I don’t know where I would find that information.

>> “nss” failed its tests when built on x15.sjd.se, but it worked fine when
>> building on one of the Overdrives.
>
> Can you find the failed NSS build log from the X15? It would be useful to
> see which tests failed, and whether they're the same ones that failed on
> hydra-slave3, which is a Novena with 4 GB of RAM. Here's the relevant
> Hydra build page: <https://hydra.gnu.org/build/3484222>.

I didn’t keep track of the log. I built this manually on berlin to test
a workaround (which failed). The build log probably still sits on
berlin somewhere, but I cannot find it.

--
Ricardo