go-1.16 build failing on aarch64: "fatal error: runtime.newosproc"

Status: Done
Participants: Thiago Jung Bauermann, Sarah Morgensen, Leo Famulari
Owner: unassigned
Submitted by: Sarah Morgensen
Severity: normal
Sarah Morgensen wrote on 7 Aug 2021 07:04
(address . bug-guix@gnu.org)
861r75j164.fsf@mgsn.dev
Hello Guix,

I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
success tracking down the cause. It looks like the error is the same as
was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
[2], but I cannot tell what resolved the issue. I've attached the
relevant part of the build log; the full log is available at [0].

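Any ideas?

[0] https://ci.guix.gnu.org/build/949823/details
[1] https://ci.guix.gnu.org/build/71004/details
[2] https://ci.guix.gnu.org/build/19478/details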

starting phase `build'
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: runtime.newosproc

runtime stack:
runtime.throw(0x5342c6)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/panic.go:491 +0xa4
runtime.newosproc(0x10722000, 0x10732000)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/os_linux.c:170 +0xec
newm(0xa4080, 0x0)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:1157 +0xbc
runtime.newsysmon()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:169 +0x2c
runtime.onM(0x54c228)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:256 +0x74
runtime.mstart()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:818

goroutine 1 [running]:
runtime.switchtoM()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:193 +0x4 fp=0x1071c7c0 sp=0x1071c7bc
runtime.main()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.go:32 +0x4c fp=0x1071c7e4 sp=0x1071c7c0
runtime.goexit()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:1322 +0x4 fp=0x1071c7e4 sp=0x1071c7e4
Building Go cmd/dist using /gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003. ()
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: runtime.newosproc

runtime stack:
runtime.throw(0x5342c6)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/panic.go:491 +0xa4
runtime.newosproc(0x10722000, 0x10732000)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/os_linux.c:170 +0xec
newm(0xa4080, 0x0)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:1157 +0xbc
runtime.newsysmon()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:169 +0x2c
runtime.onM(0x54c228)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:256 +0x74
runtime.mstart()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:818

goroutine 1 [running]:
runtime.switchtoM()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:193 +0x4 fp=0x1071c7c0 sp=0x1071c7bc
runtime.main()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.go:32 +0x4c fp=0x1071c7e4 sp=0x1071c7c0
runtime.goexit()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:1322 +0x4 fp=0x1071c7e4 sp=0x1071c7e4
command "sh" "make.bash" "--no-banner" failed with status 2
builder for `/gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv' failed with exit code 1
@ build-failed /gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv - 1 builder for `/gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv' failed with exit code 1
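
For readers of the log above: the `errno=22` reported by `runtime.newosproc` is `EINVAL` ("Invalid argument") on Linux, meaning the thread-creating `clone` call was rejected because of its arguments rather than for lack of resources. A minimal sketch for decoding such a value follows; the helper name `describe_errno` is made up for illustration and is not part of Go or Guix.

```python
import errno
import os


def describe_errno(code):
    """Return a readable description of a numeric errno value.

    Hypothetical helper for reading build logs; it only consults the
    errno tables of the platform it runs on.
    """
    name = errno.errorcode.get(code, "unknown")
    return f"errno={code} ({name}): {os.strerror(code)}"


# The value reported twice in the build log above.
print(describe_errno(22))
```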
Sarah Morgensen wrote on 2 Sep 2021 20:47
(address . 49921@debbugs.gnu.org)
86v93izufn.fsf@mgsn.dev
Sarah Morgensen <iskarian@mgsn.dev> writes:

> Hello Guix,
>
> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
> success tracking down the cause. It looks like the error is the same as
> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
> [2], but I cannot tell what resolved the issue. I've attached the
> relevant part of the build log; the full log is available at [0].
>
> Any ideas?
>
> [0] https://ci.guix.gnu.org/build/949823/details
> [1] https://ci.guix.gnu.org/build/71004/details
> [2] https://ci.guix.gnu.org/build/19478/details
>
> starting phase `build'
> runtime: failed to create new OS thread (have 2 already; errno=22)
> fatal error: runtime.newosproc

I think this might be related to [0], although if it's true that CI uses
native builders for aarch64 now, I have no idea.

I've been able to reproduce this with both go-1.14 and go-1.16 when
building with --system=aarch64-linux from an amd64 system. I tried to
apply the patch in the thread I mentioned, but go-1.4 won't build at all
with QEMU.

[0] <https://github.com/golang/go/issues/20763> runtime: cannot run
cross compiled ARM binary on QEMU

--
Sarah
Sarah Morgensen wrote on 10 Sep 2021 02:51
control message for bug #50348
(address . control@debbugs.gnu.org)
8635qdmevt.fsf@mgsn.dev
block 50348 by 50227 49921 50495
thanks
Sarah Morgensen wrote on 10 Sep 2021 02:55
(address . control@debbugs.gnu.org)
861r5xmeq2.fsf@mgsn.dev
unblock 50348 by 50227 49921 50495
thanks

Oops, wrong bug #.
Sarah Morgensen wrote on 10 Sep 2021 02:56
control message for bug #50493
(address . control@debbugs.gnu.org)
86zgsll03k.fsf@mgsn.dev
block 50493 by 50495 49921 50227
thanks
Sarah Morgensen wrote on 10 Sep 2021 03:49
Re: bug#49921: go-1.16 build failing on aarch64: "fatal error: runtime.newosproc"
(address . 49921@debbugs.gnu.org)(name . Leo Famulari)(address . leo@famulari.name)
86v939kxnz.fsf@mgsn.dev
Sarah Morgensen <iskarian@mgsn.dev> writes:

> Sarah Morgensen <iskarian@mgsn.dev> writes:
>
>> Hello Guix,
>>
>> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
>> success tracking down the cause. It looks like the error is the same as
>> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
>> [2], but I cannot tell what resolved the issue. I've attached the
>> relevant part of the build log; the full log is available at [0].
>>
>> Any ideas?
>>
>> [0] https://ci.guix.gnu.org/build/949823/details
>> [1] https://ci.guix.gnu.org/build/71004/details
>> [2] https://ci.guix.gnu.org/build/19478/details
>>
>> starting phase `build'
>> runtime: failed to create new OS thread (have 2 already; errno=22)
>> fatal error: runtime.newosproc
>
> I think this might be related to [0], although if it's true that CI uses
> native builders for aarch64 now, I have no idea.
>
> I've been able to reproduce this with both go-1.14 and go-1.16 when
> building with --system=aarch64-linux from an amd64 system. I tried to
> apply the patch in the thread I mentioned, but go-1.4 won't build at all
> with QEMU.
>
> [0] <https://github.com/golang/go/issues/20763> runtime: cannot run
> cross compiled ARM binary on QEMU
>
> --
> Sarah

I've written this up into a patch (attached below); I don't think
there's much of a way to test this other than just letting CI build it.
It's a backport, so it shouldn't hurt even if it doesn't fix it.

--
Sarah
From a5824c2495f5a547499ab200cd5b270b38f571d6 Mon Sep 17 00:00:00 2001
Message-Id: <a5824c2495f5a547499ab200cd5b270b38f571d6.1631237116.git.iskarian@mgsn.dev>
From: Sarah Morgensen <iskarian@mgsn.dev>
Date: Thu, 9 Sep 2021 18:23:10 -0700
Subject: [PATCH core-updates] gnu: go-1.4: Fix running with qemu-aarch64.

Backport the fix for running go with qemu-aarch64.

* gnu/packages/patches/go-1.4-fix-running-with-qemu.patch: New file.
* gnu/local.mk (dist_patch_DATA): Register it.
* gnu/packages/golang.scm (go-1.4)[origin]: Apply patch.
---
This might fix #49921, but I can't test it as there are a number of issues
preventing go-1.4 from compiling on qemu-aarch64.

It builds fine on x86_64, so it didn't break anything there.

--
Sarah

gnu/local.mk | 1 +
gnu/packages/golang.scm | 4 +-
.../go-1.4-fix-running-with-qemu.patch | 38 +++++++++++++++++++
3 files changed, 42 insertions(+), 1 deletion(-)
create mode 100644 gnu/packages/patches/go-1.4-fix-running-with-qemu.patch

diff --git a/gnu/local.mk b/gnu/local.mk
index 20f0b8f081..39e17bb3bc 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1144,6 +1144,7 @@ dist_patch_DATA = \
%D%/packages/patches/gobject-introspection-absolute-shlib-path.patch \
%D%/packages/patches/gobject-introspection-cc.patch \
%D%/packages/patches/gobject-introspection-girepository.patch \
+ %D%/packages/patches/go-1.4-fix-running-with-qemu.patch \
%D%/packages/patches/go-fix-script-tests.patch \
%D%/packages/patches/go-skip-gc-test.patch \
%D%/packages/patches/gpm-glibc-2.26.patch \
diff --git a/gnu/packages/golang.scm b/gnu/packages/golang.scm
index d3ef39a2e6..33f3120a09 100644
--- a/gnu/packages/golang.scm
+++ b/gnu/packages/golang.scm
@@ -1035,7 +1035,9 @@ your Go binary to be later served from an http.FileSystem.")
name version ".tar.gz"))
(sha256
(base32
- "0liybk5z00hizsb5ypkbhqcawnwwa6mkwgvjjg4y3jm3ndg5pzzl"))))
+ "0liybk5z00hizsb5ypkbhqcawnwwa6mkwgvjjg4y3jm3ndg5pzzl"))
+ (patches
+ (search-patches "go-1.4-fix-running-with-qemu.patch"))))
(build-system gnu-build-system)
(outputs '("out"
"doc"
diff --git a/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch b/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch
new file mode 100644
index 0000000000..52914c71a5
--- /dev/null
+++ b/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch
@@ -0,0 +1,38 @@
+Backport from upstream: https://github.com/golang/go/commit/2673f9ed
+
+Original header:
+
+From 2673f9ed23348c634f6331ee589d489e4d9c7a9b Mon Sep 17 00:00:00 2001
+From: Austin Clements <austin@google.com>
+Date: Wed, 12 Jul 2017 10:12:50 -0600
+Subject: [PATCH] runtime: pass CLONE_SYSVSEM to clone
+
+SysV semaphore undo lists should be shared by threads, just like
+several other resources listed in cloneFlags. Currently we don't do
+this, but it probably doesn't affect anything because 1) probably
+nobody uses SysV semaphores from Go and 2) Go-created threads never
+exit until the process does. Beyond being the right thing to do,
+user-level QEMU requires this flag because it depends on glibc to
+create new threads and glibc uses this flag.
+
+Fixes #20763.
+
+---
+ src/runtime/os_linux.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/src/runtime/os_linux.c b/src/runtime/os_linux.c
+index 0d8ffc9..573be7c 100644
+--- a/src/runtime/os_linux.c
++++ b/src/runtime/os_linux.c
+@@ -150,6 +150,7 @@ runtime·newosproc(M *mp, void *stk)
+ | CLONE_FS /* share cwd, etc */
+ | CLONE_FILES /* share fd table */
+ | CLONE_SIGHAND /* share sig handler table */
++ | CLONE_SYSVSEM /* share SysV semaphore undo lists (see issue #20763) */
+ | CLONE_THREAD /* revisit - okay for now */
+ ;
+
+--
+2.31.1
+

base-commit: 22f7d4bce1e694b7ac38e62410d76a6d46d96c5d
--
2.33.0
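
The effect of the one-line backport above can be sketched as a bit-mask comparison. The flag values below are the ones defined in Linux's `<linux/sched.h>`. Two things are assumptions rather than facts from this thread: that `CLONE_VM` is part of the Go 1.4 flag set (it sits above the hunk shown in the patch), and that the emulator refuses a thread-creating `clone` unless every flag glibc passes is present, which is how the upstream commit message describes user-level QEMU. The names `REQUIRED_BY_EMULATOR` and `missing_thread_flags` are hypothetical.

```python
# Flag values from Linux's <linux/sched.h>.
CLONE_VM = 0x00000100
CLONE_FS = 0x00000200
CLONE_FILES = 0x00000400
CLONE_SIGHAND = 0x00000800
CLONE_THREAD = 0x00010000
CLONE_SYSVSEM = 0x00040000

# Assumption: the flags an emulator modelled on glibc's thread creation
# insists on seeing. This is an illustrative model, not QEMU's code.
REQUIRED_BY_EMULATOR = {
    "CLONE_VM": CLONE_VM,
    "CLONE_FS": CLONE_FS,
    "CLONE_FILES": CLONE_FILES,
    "CLONE_SIGHAND": CLONE_SIGHAND,
    "CLONE_THREAD": CLONE_THREAD,
    "CLONE_SYSVSEM": CLONE_SYSVSEM,
}


def missing_thread_flags(flags):
    """Return the names of required flags that are absent from `flags`."""
    return [name for name, bit in REQUIRED_BY_EMULATOR.items()
            if not flags & bit]


# Flags passed by runtime.newosproc before and after the backport
# (CLONE_VM assumed, see the note above).
before = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD
after = before | CLONE_SYSVSEM

print(hex(before), missing_thread_flags(before))
print(hex(after), missing_thread_flags(after))
```

Under this model the unpatched flag set is missing only `CLONE_SYSVSEM`, which is consistent with the thread creation failing under emulation while working on real hardware.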
Thiago Jung Bauermann wrote on 11 Sep 2021 00:06
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)(address . 49921@debbugs.gnu.org)
1787046.58CDggCRW2@popigai
Hello Sarah,

On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen
wrote:
> I think this might be related to [0], although if it's true that CI uses
> native builders for aarch64 now, I have no idea.

The CI has two native aarch64 builders (which also do armhf, despite bugs
43513 and 43591): dover and overdrive1.

It also uses half of the x86_64 builders (hydra-guix-*) for emulated builds
of aarch64 and armhf, as can be seen here:

https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143

--
Thanks,
Thiago
iskarian wrote on 11 Sep 2021 00:14
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)(address . 49921@debbugs.gnu.org)
2cce05f2762e6bb5beadc677a34f00c4@mgsn.dev
Hi Thiago,

(Re-sent due to missing Cc.)

September 10, 2021 3:06 PM, "Thiago Jung Bauermann" <bauermann@kolabnow.com> wrote:

> Hello Sarah,
>
> On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen
> wrote:
>
>> I think this might be related to [0], although if it's true that CI uses
>> native builders for aarch64 now, I have no idea.
>
> The CI has two native aarch64 builders (which also do armhf, despite bugs
> 43513 and 43591): dover and overdrive1.
>
> It also uses half of the x86_64 builders (hydra-guix-*) for emulated builds
> of aarch64 and armhf, as can be seen here:
>
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143

Thanks for the explanation. Is there a way to tell CI (or Guix itself) that certain packages
shouldn't be built under emulation?

--
Sarah
Thiago Jung Bauermann wrote on 11 Sep 2021 01:06
(address . iskarian@mgsn.dev)(address . 49921@debbugs.gnu.org)
12949187.JfoJOkFrob@popigai
Hello Sarah,

On Friday, 10 September 2021, at 19:14:06 -03, iskarian@mgsn.dev
wrote:
> Hi Thiago,
>
> (Re-sent due to missing Cc.)
>
> September 10, 2021 3:06 PM, "Thiago Jung Bauermann"
> <bauermann@kolabnow.com> wrote:
> > Hello Sarah,
> >
> > On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen
> > wrote:
> >
> >> I think this might be related to [0], although if it's true that CI
> >> uses native builders for aarch64 now, I have no idea.
> >
> > The CI has two native aarch64 builders (which also do armhf, despite
> > bugs 43513 and 43591): dover and overdrive1.
> >
> > It also uses half of the x86_64 builders (hydra-guix-*) for emulated
> > builds of aarch64 and armhf, as can be seen here:
> >
> > https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143
> Thanks for the explanation. Is there a way to tell CI (or Guix itself)
> that certain packages shouldn't be built under emulation?

I don’t know. That would certainly be useful, though.

To be honest, my experience in the past couple of months with emulated
builds for powerpc64le-linux wasn’t good, so I asked for it to be turned
off.

--
Thanks,
Thiago
Leo Famulari wrote on 14 Dec 2021 20:29
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)(address . 49921-done@debbugs.gnu.org)
YbjwrxIdbGj66MrV@jasmine.lan
On Fri, Aug 06, 2021 at 10:04:19PM -0700, Sarah Morgensen wrote:
> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
> success tracking down the cause. It looks like the error is the same as
> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
> [2], but I cannot tell what resolved the issue. I've attached the
> relevant part of the build log; the full log is available at [0].

I was able to build Go 1.16 on aarch64-linux, by building this
derivation "by hand" on the Berlin server. It was offloaded to real
aarch64 hardware.

/gnu/store/jgbp8bpi86is2y620wqais904lvjmvj8-go-1.16.11.drv

If I understand correctly, we have totally disabled the aarch64 emulated
builds. They were problematic and I think that we should not re-enable
them. So, I'm closing this bug. Please let us know if you think it
should be re-opened.
Closed