go-1.16 build failing on aarch64: "fatal error: runtime.newosproc"

Status: Done
Participants: Thiago Jung Bauermann, Sarah Morgensen, Leo Famulari
Owner: unassigned
Submitted by: Sarah Morgensen
Severity: normal
Sarah Morgensen wrote on 7 Aug 2021 07:04
(address . bug-guix@gnu.org)
861r75j164.fsf@mgsn.dev
Hello Guix,

I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
success tracking down the cause. It looks like the error is the same as
was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
[2], but I cannot tell what resolved the issue. I've attached the
relevant part of the build log; the full log is available at [0].

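Any ideas?

[0] https://ci.guix.gnu.org/build/949823/details
[1] https://ci.guix.gnu.org/build/71004/details
[2] https://ci.guix.gnu.org/build/19478/details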

starting phase `build'
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: runtime.newosproc

runtime stack:
runtime.throw(0x5342c6)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/panic.go:491 +0xa4
runtime.newosproc(0x10722000, 0x10732000)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/os_linux.c:170 +0xec
newm(0xa4080, 0x0)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:1157 +0xbc
runtime.newsysmon()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:169 +0x2c
runtime.onM(0x54c228)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:256 +0x74
runtime.mstart()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:818

goroutine 1 [running]:
runtime.switchtoM()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:193 +0x4 fp=0x1071c7c0 sp=0x1071c7bc
runtime.main()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.go:32 +0x4c fp=0x1071c7e4 sp=0x1071c7c0
runtime.goexit()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:1322 +0x4 fp=0x1071c7e4 sp=0x1071c7e4
Building Go cmd/dist using /gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003. ()
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: runtime.newosproc

runtime stack:
runtime.throw(0x5342c6)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/panic.go:491 +0xa4
runtime.newosproc(0x10722000, 0x10732000)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/os_linux.c:170 +0xec
newm(0xa4080, 0x0)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:1157 +0xbc
runtime.newsysmon()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:169 +0x2c
runtime.onM(0x54c228)
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:256 +0x74
runtime.mstart()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.c:818

goroutine 1 [running]:
runtime.switchtoM()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:193 +0x4 fp=0x1071c7c0 sp=0x1071c7bc
runtime.main()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/proc.go:32 +0x4c fp=0x1071c7e4 sp=0x1071c7c0
runtime.goexit()
/gnu/store/ax3zhpkysy3nl5ipw3qb9yh2g04a0f1s-go-1.4-bootstrap-20171003/src/runtime/asm_arm.s:1322 +0x4 fp=0x1071c7e4 sp=0x1071c7e4
command "sh" "make.bash" "--no-banner" failed with status 2
builder for `/gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv' failed with exit code 1
@ build-failed /gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv - 1 builder for `/gnu/store/dfqww60vr4gykvz3fz4mj9sgk0x4jypz-go-1.16.7.drv' failed with exit code 1
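
The runtime message in the log carries two numbers worth decoding: how many threads already existed and the errno returned by the failed thread creation. A minimal Python sketch of that decoding follows; `parse_thread_failure` and `PATTERN` are hypothetical names introduced for illustration, and only the log line itself comes from the report.

```python
import errno
import os
import re

# Line copied from the build log above.
LOG_LINE = "runtime: failed to create new OS thread (have 2 already; errno=22)"

# Hypothetical pattern: capture the thread count and the errno value.
PATTERN = re.compile(r"have (\d+) already; errno=(\d+)")


def parse_thread_failure(line):
    """Return (threads_already_running, errno_number), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


threads, code = parse_thread_failure(LOG_LINE)
# errno.errorcode maps the number to its symbolic name on this platform.
name = errno.errorcode.get(code, "unknown")
print(f"{threads} threads already running; errno {code} = {name} ({os.strerror(code)})")
```

On Linux, errno 22 is EINVAL, which suggests the thread-creation call was rejected for its arguments rather than for lack of resources.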
Sarah Morgensen wrote on 2 Sep 2021 20:47
(address . 49921@debbugs.gnu.org)
86v93izufn.fsf@mgsn.dev
Sarah Morgensen <iskarian@mgsn.dev> writes:

> Hello Guix,
>
> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
> success tracking down the cause. It looks like the error is the same as
> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
> [2], but I cannot tell what resolved the issue. I've attached the
> relevant part of the build log; the full log is available at [0].
>
> Any ideas?
>
> [0] https://ci.guix.gnu.org/build/949823/details
> [1] https://ci.guix.gnu.org/build/71004/details
> [2] https://ci.guix.gnu.org/build/19478/details
>
> starting phase `build'
> runtime: failed to create new OS thread (have 2 already; errno=22)
> fatal error: runtime.newosproc

I think this might be related to [0], although if it's true that CI uses
native builders for aarch64 now, I have no idea.

I've been able to reproduce this with both go-1.14 and go-1.16 when
building with --system=aarch64-linux from an amd64 system. I tried to
apply the patch in the thread I mentioned, but go-1.4 won't build at all
with QEMU.

[0] <https://github.com/golang/go/issues/20763> runtime: cannot run
cross compiled ARM binary on QEMU

--
Sarah
Sarah Morgensen wrote on 10 Sep 2021 02:51
control message for bug #50348
(address . control@debbugs.gnu.org)
8635qdmevt.fsf@mgsn.dev
block 50348 by 50227 49921 50495
thanks
Sarah Morgensen wrote on 10 Sep 2021 02:55
(address . control@debbugs.gnu.org)
861r5xmeq2.fsf@mgsn.dev
unblock 50348 by 50227 49921 50495
thanks

Oops, wrong bug #.
Sarah Morgensen wrote on 10 Sep 2021 02:56
control message for bug #50493
(address . control@debbugs.gnu.org)
86zgsll03k.fsf@mgsn.dev
block 50493 by 50495 49921 50227
thanks
Sarah Morgensen wrote on 10 Sep 2021 03:49
Re: bug#49921: go-1.16 build failing on aarch64: "fatal error: runtime.newosproc"
(address . 49921@debbugs.gnu.org)(name . Leo Famulari)(address . leo@famulari.name)
86v939kxnz.fsf@mgsn.dev
Sarah Morgensen <iskarian@mgsn.dev> writes:

> Sarah Morgensen <iskarian@mgsn.dev> writes:
>
>> Hello Guix,
>>
>> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
>> success tracking down the cause. It looks like the error is the same as
>> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
>> [2], but I cannot tell what resolved the issue. I've attached the
>> relevant part of the build log; the full log is available at [0].
>>
>> Any ideas?
>>
>> [0] https://ci.guix.gnu.org/build/949823/details
>> [1] https://ci.guix.gnu.org/build/71004/details
>> [2] https://ci.guix.gnu.org/build/19478/details
>>
>> starting phase `build'
>> runtime: failed to create new OS thread (have 2 already; errno=22)
>> fatal error: runtime.newosproc
>
> I think this might be related to [0], although if it's true that CI uses
> native builders for aarch64 now, I have no idea.
>
> I've been able to reproduce this with both go-1.14 and go-1.16 when
> building with --system=aarch64-linux from an amd64 system. I tried to
> apply the patch in the thread I mentioned, but go-1.4 won't build at all
> with QEMU.
>
> [0] <https://github.com/golang/go/issues/20763> runtime: cannot run
> cross compiled ARM binary on QEMU
>
> --
> Sarah

I've written this up into a patch (attached below); I don't think
there's much of a way to test this other than just letting CI build it.
It's a backport, so it shouldn't hurt even if it doesn't fix it.

--
Sarah
From a5824c2495f5a547499ab200cd5b270b38f571d6 Mon Sep 17 00:00:00 2001
Message-Id: <a5824c2495f5a547499ab200cd5b270b38f571d6.1631237116.git.iskarian@mgsn.dev>
From: Sarah Morgensen <iskarian@mgsn.dev>
Date: Thu, 9 Sep 2021 18:23:10 -0700
Subject: [PATCH core-updates] gnu: go-1.4: Fix running with qemu-aarch64.

Backport the fix for running go with qemu-aarch64.

* gnu/packages/patches/go-1.4-fix-running-with-qemu.patch: New file.
* gnu/local.mk (dist_patch_DATA): Register it.
* gnu/packages/golang.scm (go-1.4)[origin]: Apply patch.
---
This might fix #49921, but I can't test it as there are a number of issues
preventing go-1.4 from compiling on qemu-aarch64.

It builds fine on x86_64, so it didn't break anything there.

--
Sarah

gnu/local.mk | 1 +
gnu/packages/golang.scm | 4 +-
.../go-1.4-fix-running-with-qemu.patch | 38 +++++++++++++++++++
3 files changed, 42 insertions(+), 1 deletion(-)
create mode 100644 gnu/packages/patches/go-1.4-fix-running-with-qemu.patch

diff --git a/gnu/local.mk b/gnu/local.mk
index 20f0b8f081..39e17bb3bc 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1144,6 +1144,7 @@ dist_patch_DATA = \
%D%/packages/patches/gobject-introspection-absolute-shlib-path.patch \
%D%/packages/patches/gobject-introspection-cc.patch \
%D%/packages/patches/gobject-introspection-girepository.patch \
+ %D%/packages/patches/go-1.4-fix-running-with-qemu.patch \
%D%/packages/patches/go-fix-script-tests.patch \
%D%/packages/patches/go-skip-gc-test.patch \
%D%/packages/patches/gpm-glibc-2.26.patch \
diff --git a/gnu/packages/golang.scm b/gnu/packages/golang.scm
index d3ef39a2e6..33f3120a09 100644
--- a/gnu/packages/golang.scm
+++ b/gnu/packages/golang.scm
@@ -1035,7 +1035,9 @@ your Go binary to be later served from an http.FileSystem.")
name version ".tar.gz"))
(sha256
(base32
- "0liybk5z00hizsb5ypkbhqcawnwwa6mkwgvjjg4y3jm3ndg5pzzl"))))
+ "0liybk5z00hizsb5ypkbhqcawnwwa6mkwgvjjg4y3jm3ndg5pzzl"))
+ (patches
+ (search-patches "go-1.4-fix-running-with-qemu.patch"))))
(build-system gnu-build-system)
(outputs '("out"
"doc"
diff --git a/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch b/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch
new file mode 100644
index 0000000000..52914c71a5
--- /dev/null
+++ b/gnu/packages/patches/go-1.4-fix-running-with-qemu.patch
@@ -0,0 +1,38 @@
+Backport from upstream: https://github.com/golang/go/commit/2673f9ed
+
+Original header:
+
+From 2673f9ed23348c634f6331ee589d489e4d9c7a9b Mon Sep 17 00:00:00 2001
+From: Austin Clements <austin@google.com>
+Date: Wed, 12 Jul 2017 10:12:50 -0600
+Subject: [PATCH] runtime: pass CLONE_SYSVSEM to clone
+
+SysV semaphore undo lists should be shared by threads, just like
+several other resources listed in cloneFlags. Currently we don't do
+this, but it probably doesn't affect anything because 1) probably
+nobody uses SysV semaphores from Go and 2) Go-created threads never
+exit until the process does. Beyond being the right thing to do,
+user-level QEMU requires this flag because it depends on glibc to
+create new threads and glibc uses this flag.
+
+Fixes #20763.
+
+---
+ src/runtime/os_linux.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/src/runtime/os_linux.c b/src/runtime/os_linux.c
+index 0d8ffc9..573be7c 100644
+--- a/src/runtime/os_linux.c
++++ b/src/runtime/os_linux.c
+@@ -150,6 +150,7 @@ runtime·newosproc(M *mp, void *stk)
+ | CLONE_FS /* share cwd, etc */
+ | CLONE_FILES /* share fd table */
+ | CLONE_SIGHAND /* share sig handler table */
++ | CLONE_SYSVSEM /* share SysV semaphore undo lists (see issue #20763) */
+ | CLONE_THREAD /* revisit - okay for now */
+ ;
+
+--
+2.31.1
+

base-commit: 22f7d4bce1e694b7ac38e62410d76a6d46d96c5d
--
2.33.0
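
The upstream commit message quoted in the patch says user-level QEMU needs CLONE_SYSVSEM because it creates threads through glibc, which passes that flag. A minimal Python sketch of how a missing flag could surface as errno=22 follows. The flag values are the Linux ones from <linux/sched.h>; `emulated_clone` and `REQUIRED_THREAD_FLAGS` are hypothetical names, and the check is an assumed simplification of the emulator's behaviour, not its actual code. CLONE_VM is not visible in the patch hunk and is included on the assumption that it precedes the lines shown.

```python
# Flag values as defined in the Linux UAPI header <linux/sched.h>.
CLONE_VM = 0x00000100
CLONE_FS = 0x00000200
CLONE_FILES = 0x00000400
CLONE_SIGHAND = 0x00000800
CLONE_THREAD = 0x00010000
CLONE_SYSVSEM = 0x00040000

EINVAL = 22

# Flags listed in runtime·newosproc before the backport (CLONE_VM assumed).
GO14_FLAGS = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD

# The backport adds exactly one flag.
PATCHED_FLAGS = GO14_FLAGS | CLONE_SYSVSEM

# Assumption: the emulator only accepts the full set of flags that glibc
# uses when it creates a thread, which includes CLONE_SYSVSEM.
REQUIRED_THREAD_FLAGS = PATCHED_FLAGS


def emulated_clone(flags):
    """Hypothetical model: reject thread creation unless every required flag is set."""
    if (flags & REQUIRED_THREAD_FLAGS) != REQUIRED_THREAD_FLAGS:
        return -EINVAL
    return 0


print("unpatched:", emulated_clone(GO14_FLAGS))
print("patched:  ", emulated_clone(PATCHED_FLAGS))
```

Under this model the unpatched flag set is rejected with EINVAL, matching the errno in the build log, while the patched set is accepted. A native kernel accepts both, which would be consistent with the build succeeding on real aarch64 hardware.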
Thiago Jung Bauermann wrote on 11 Sep 2021 00:06
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)(address . 49921@debbugs.gnu.org)
1787046.58CDggCRW2@popigai
Hello Sarah,

On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen wrote:
> I think this might be related to [0], although if it's true that CI uses
> native builders for aarch64 now, I have no idea.

The CI has two native aarch64 builders (which also do armhf, despite bugs
43513 and 43591): dover and overdrive1.

It also uses half of the x86_64 builders (hydra-guix-*) for emulated builds
of aarch64 and armhf, as can be seen here:

https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143

--
Thanks,
Thiago
iskarian wrote on 11 Sep 2021 00:14
(name . Thiago Jung Bauermann)(address . bauermann@kolabnow.com)(address . 49921@debbugs.gnu.org)
2cce05f2762e6bb5beadc677a34f00c4@mgsn.dev
Hi Thiago,

(Re-sent due to missing Cc.)

September 10, 2021 3:06 PM, "Thiago Jung Bauermann" <bauermann@kolabnow.com> wrote:

> Hello Sarah,
>
> On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen wrote:
>
>> I think this might be related to [0], although if it's true that CI uses
>> native builders for aarch64 now, I have no idea.
>
> The CI has two native aarch64 builders (which also do armhf, despite bugs
> 43513 and 43591): dover and overdrive1.
>
> It also uses half of the x86_64 builders (hydra-guix-*) for emulated builds
> of aarch64 and armhf, as can be seen here:
>
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143

Thanks for the explanation. Is there a way to tell CI (or Guix itself) that certain packages
shouldn't be built under emulation?

--
Sarah
Thiago Jung Bauermann wrote on 11 Sep 2021 01:06
(address . iskarian@mgsn.dev)(address . 49921@debbugs.gnu.org)
12949187.JfoJOkFrob@popigai
Hello Sarah,

On Friday, 10 September 2021, at 19:14:06 -03, iskarian@mgsn.dev wrote:
> Hi Thiago,
>
> (Re-sent due to missing Cc.)
>
> September 10, 2021 3:06 PM, "Thiago Jung Bauermann" <bauermann@kolabnow.com> wrote:
> > Hello Sarah,
> >
> > On Thursday, 2 September 2021, at 15:47:08 -03, Sarah Morgensen wrote:
> >
> >> I think this might be related to [0], although if it's true that CI uses
> >> native builders for aarch64 now, I have no idea.
> >
> > The CI has two native aarch64 builders (which also do armhf, despite
> > bugs 43513 and 43591): dover and overdrive1.
> >
> > It also uses half of the x86_64 builders (hydra-guix-*) for emulated
> > builds of aarch64 and armhf, as can be seen here:
> >
> > https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin-nodes.scm#n143
> Thanks for the explanation. Is there a way to tell CI (or Guix itself)
> that certain packages shouldn't be built under emulation?

I don’t know. That would certainly be useful, though.

To be honest, my experience in the past couple of months with emulated
builds for powerpc64le-linux wasn’t good, so I asked for it to be turned
off.

--
Thanks,
Thiago
Leo Famulari wrote on 14 Dec 2021 20:29
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)(address . 49921-done@debbugs.gnu.org)
YbjwrxIdbGj66MrV@jasmine.lan
On Fri, Aug 06, 2021 at 10:04:19PM -0700, Sarah Morgensen wrote:
> I just noticed go-1.16 is failing on aarch64 [0]. I am not having any
> success tracking down the cause. It looks like the error is the same as
> was happening for go-1.14 circa 11 Mar [1], which was fixed by 9 Apr
> [2], but I cannot tell what resolved the issue. I've attached the
> relevant part of the build log; the full log is available at [0].

I was able to build Go 1.16 on aarch64-linux, by building this
derivation "by hand" on the Berlin server. It was offloaded to real
aarch64 hardware.

/gnu/store/jgbp8bpi86is2y620wqais904lvjmvj8-go-1.16.11.drv

If I understand correctly, we have totally disabled the aarch64 emulated
builds. They were problematic and I think that we should not re-enable
them. So, I'm closing this bug. Please let us know if you think it
should be re-opened.
Closed