Segfault with eigen in R

Status: Done
Participants: Kyle Meyer, Ludovic Courtès, Ricardo Wurmus
Owner: unassigned
Submitted by: Kyle Meyer
Severity: normal

Kyle Meyer wrote 9 years ago
(address . bug-guix@gnu.org)
874mgpj3p5.fsf@kyleam.com
Hello,

With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
segfault when eigen is called with a matrix over some size. I can
trigger the error with the following code [1]:

> M <- 50
> N <- 500
> eigen(crossprod(matrix(rnorm(M * N), M, N)))

*** caught segfault ***
address 0xfb0, cause 'memory not mapped'

Traceback:
1: eigen(crossprod(matrix(rnorm(M * N), M, N)))

Can others reproduce this?

Thanks.


[1] This is a down-sized version of the snippet from an ATLAS bug report
in 2011 for a similar error with R 2.14.


--
Kyle
Ricardo Wurmus wrote 9 years ago
(name . Kyle Meyer)(address . kyle@kyleam.com)(address . 21909@debbugs.gnu.org)
87pozd3ath.fsf@elephly.net
Kyle Meyer <kyle@kyleam.com> writes:

> With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
> segfault when eigen is called with a matrix over some size. I can
> trigger the error with the following code [1]:
>
> > M <- 50
> > N <- 500
> > eigen(crossprod(matrix(rnorm(M * N), M, N)))
>
> *** caught segfault ***
> address 0xfb0, cause 'memory not mapped'
>
> Traceback:
> 1: eigen(crossprod(matrix(rnorm(M * N), M, N)))
>
> Can others reproduce this?

I can reproduce this running R 3.2.2 on GuixSD on an x86_64 machine.

~~ Ricardo
Kyle Meyer wrote 9 years ago
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 21909@debbugs.gnu.org)
87twoisu2v.fsf@kyleam.com
Ricardo Wurmus <rekado@elephly.net> writes:

> Kyle Meyer <kyle@kyleam.com> writes:
>
>> With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
>> segfault when eigen is called with a matrix over some size. I can
>> trigger the error with the following code [1]:
>>
>> > M <- 50
>> > N <- 500
>> > eigen(crossprod(matrix(rnorm(M * N), M, N)))
>>
>> *** caught segfault ***
>> address 0xfb0, cause 'memory not mapped'
>>
>> Traceback:
>> 1: eigen(crossprod(matrix(rnorm(M * N), M, N)))
>>
>> Can others reproduce this?
>
> I can reproduce this running R 3.2.2 on GuixSD on an x86_64 machine.

I haven't had any luck resolving this aside from just using R's internal
BLAS by removing "--with-blas=openblas" and "--with-lapack" from the
configure flags.
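
For reference, a minimal sketch of what that workaround amounts to at the flag level. The helper name `drop_external_blas_flags` and the sample flag list are made up for illustration; the real change would be made in the `r` package definition in statistics.scm, which is not shown in this thread.

```shell
# Hypothetical helper: print the given configure flags, one per line,
# minus the two flags that select an external BLAS/LAPACK.
drop_external_blas_flags() {
    for flag in "$@"; do
        case "$flag" in
            --with-blas|--with-blas=*|--with-lapack|--with-lapack=*)
                ;;  # dropped, so R falls back to its internal BLAS/LAPACK
            *)
                printf '%s\n' "$flag"
                ;;
        esac
    done
}

# Sample flag list; --enable-R-shlib only stands in for "other flags".
drop_external_blas_flags --enable-R-shlib --with-blas=openblas --with-lapack
```

With the sample list above, only `--enable-R-shlib` is left.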

--
Kyle
Ludovic Courtès wrote 9 years ago
(name . Kyle Meyer)(address . kyle@kyleam.com)
87egfmfhvj.fsf@gnu.org
I can reproduce the bug with:

guix environment --pure --ad-hoc r -- R

and then typing “1” to get a core dump, which gives this:

Core was generated by `/gnu/store/zci2lb9jlc9hlck3x3hc04ab3y86fzf9-r-3.2.2/lib/R/bin/exec/R'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f34616450e0 in dgemv_t_SANDYBRIDGE () from /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblas.so.0
[Current thread is 1 (Thread 0x7f34651307c0 (LWP 3399))]
(gdb) bt
#0 0x00007f34616450e0 in dgemv_t_SANDYBRIDGE () from /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblas.so.0
#1 0x00000000000000a2 in ?? ()

[...]

(gdb) disassemble
Dump of assembler code for function dgemv_t_SANDYBRIDGE:

[...]

0x00007f34616450cf <+207>: jle 0x7f3461645140 <dgemv_t_SANDYBRIDGE+320>
0x00007f34616450d1 <+209>: nopl 0x0(%rax,%rax,1)
0x00007f34616450d6 <+214>: nopw %cs:0x0(%rax,%rax,1)
=> 0x00007f34616450e0 <+224>: movsd (%r9),%xmm0

[...]

(gdb) p $r9
$1 = 4016
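
A quick cross-check of the numbers in this session: the value gdb prints for `$r9` (4016) and the faulting address in R's message (0xfb0) are the same number, so the `movsd (%r9),%xmm0` load appears to be reading from that small, unmapped address. The helper name `addr_to_decimal` is made up for this illustration.

```shell
# Convert a hexadecimal address (as printed by R) to decimal,
# the form gdb used when printing $r9.
addr_to_decimal() {
    printf '%d\n' "$1"
}

addr_to_decimal 0xfb0   # prints 4016
```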

My CPU seems to be a Sandy Bridge:

$ cat /proc/cpuinfo | grep ^model | head -2
model : 42
model name : Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz
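
The `model` number is only meaningful together with `cpu family`. For Intel, family 6 with model 42 corresponds to client Sandy Bridge parts, which would be consistent with the `dgemv_t_SANDYBRIDGE` kernel in the backtrace. A small sketch for printing both fields at once; the function name `cpu_family_model` is an assumption made for this example.

```shell
# Read /proc/cpuinfo-style text on stdin and print "family/model"
# for the first CPU listed, e.g. "6/42".
cpu_family_model() {
    awk -F: '
        /^cpu family/    && family == "" { gsub(/[ \t]/, "", $2); family = $2 }
        /^model[ \t]*:/  && model  == "" { gsub(/[ \t]/, "", $2); model  = $2 }
        END { print family "/" model }
    '
}

# Only meaningful on Linux systems that expose these fields.
if [ -r /proc/cpuinfo ]; then
    cpu_family_model < /proc/cpuinfo
fi
```

The second pattern requires the colon right after `model`, so the `model name` line is not picked up by mistake.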

Might be useful to report it upstream?

Thanks,
Ludo’.
Kyle Meyer wrote 9 years ago
(name . Ludovic Courtès)(address . ludo@gnu.org)
87twohr0wc.fsf@kyleam.com
ludo@gnu.org (Ludovic Courtès) writes:

> I can reproduce the bug with:
>
> guix environment --pure --ad-hoc r -- R

[...]

> Might be useful to report it upstream?

Yes. In preparing to do so, I figured that I should reproduce the issue
with a build outside of Guix. However, when I tried with a manual build
on an Arch Linux system, the snippet ran fine. This was with gcc
version 5.2.0, so I switched the gfortran input of openblas over to
gfortran-5, which seems to fix the issue.

While the issue should still be reported upstream, would it be OK to
update the gfortran input to gfortran-5?

--
Kyle
Ludovic Courtès wrote 9 years ago
(name . Kyle Meyer)(address . kyle@kyleam.com)
87wptdj8ku.fsf@gnu.org
Kyle Meyer <kyle@kyleam.com> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> I can reproduce the bug with:
>>
>> guix environment --pure --ad-hoc r -- R
>
> [...]
>
>> Might be useful to report it upstream?
>
> Yes. In preparing to do so, I figured that I should reproduce the issue
> with a build outside of Guix. However, when I tried with a manual build
> on an Arch Linux system, the snippet ran fine. This was with gcc
> version 5.2.0, so I switched the gfortran input of openblas over to
> gfortran-5, which seems to fix the issue.

Interesting.

> While the issue should still be reported upstream, would it be OK to
> update the gfortran input to gfortran-5?

The problem is that this leads to an R linked against GCC 4.9’s libgcc_s
and other run-time support libraries, and also against those of GCC 5.2,
via OpenBLAS. I think we’d rather avoid it.

An additional data point:

$ guix environment --pure --ad-hoc r valgrind -- R -d valgrind
/gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/bin/R: line 8: uname: command not found
==3198== Memcheck, a memory error detector
==3198== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3198== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3198== Command: /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/bin/exec/R
==3198==

R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> M <- 50
> N <- 500
> eigen(crossprod(matrix(rnorm(M * N), M, N)))
==3198== Invalid read of size 8
==3198== at 0x8E400E0: dgemv_t_SANDYBRIDGE (in /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblasp-r0.2.15.so)
==3198== by 0x183AED48: dlatrd_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x18461F92: dsytrd_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x184B9540: dsyevr_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x1B742D5E: La_rs (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/modules/lapack.so)
==3198== by 0x1B745B96: mod_do_lapack (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/modules/lapack.so)
==3198== by 0x4F35635: bcEval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F432DF: Rf_eval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F48F4B: Rf_applyClosure (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F4345E: Rf_eval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F6B0D9: Rf_ReplIteration (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F6B430: R_ReplConsole (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== Address 0xfb0 is not stack'd, malloc'd or (recently) free'd
==3198==

*** caught segfault ***
address 0xfb0, cause 'memory not mapped'

Here this suggests an R issue more than an OpenBLAS problem.

Ludo’.
Kyle Meyer wrote 9 years ago
(name . Ludovic Courtès)(address . ludo@gnu.org)
87r3j7qdc1.fsf@kyleam.com
I've opened an issue in the OpenBLAS repo [1].

https://github.com/xianyi/OpenBLAS/issues/703

I'm trying to answer their questions, but, as is apparent in that
thread, I'm not really familiar with debugging these sorts of problems.
Since others are able to reproduce the error, any help over there would
be much appreciated.

[1] The last post suggested it may be an R issue rather than an OpenBLAS
one, but I wanted to be more sure of that before going to R
developers, especially given their comment about the --with-lapack
flag:

"Please do bear in mind that using --with-lapack is 'definitely not
recommended': it is provided only because it is necessary on some
platforms and because some users want to experiment with claimed
performance improvements. Reporting problems where it is used
unnecessarily will simply irritate the R helpers."

https://cran.r-project.org/doc/manuals/r-release/R-admin.html#LAPACK

--
Kyle
Ricardo Wurmus wrote 9 years ago
(name . Kyle Meyer)(address . kyle@kyleam.com)
idjoaaxyxfu.fsf@bimsb-sys02.mdc-berlin.net
Kyle Meyer <kyle@kyleam.com> writes:

> I've opened an issue in the OpenBLAS repo [1].
>
> https://github.com/xianyi/OpenBLAS/issues/703
>
> I'm trying to answer their questions, but, as is apparent in that
> thread, I'm not really familiar with debugging these sorts of problems.
> Since others are able to reproduce the error, any help over there would
> be very appreciated.
>
> [1] The last post suggested it may be an R issue rather than an OpenBLAS
> one, but I wanted to be more sure of that before going to R
> developers, especially given their comment about the --with-lapack
> flag:
>
> "Please do bear in mind that using --with-lapack is 'definitely not
> recommended': it is provided only because it is necessary on some
> platforms and because some users want to experiment with claimed
> performance improvements. Reporting problems where it is used
> unnecessarily will simply irritate the R helpers."
>
> https://cran.r-project.org/doc/manuals/r-release/R-admin.html#LAPACK

We have since removed OpenBLAS from the inputs of R and just use the
internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
any more.

~~ Ricardo
Ricardo Wurmus wrote 9 years ago
close
(address . control@debbugs.gnu.org)
idjmvqhyxdw.fsf@bimsb-sys02.mdc-berlin.net
close 21909
thanks
Ludovic Courtès wrote 9 years ago
Re: bug#21909: Segfault with eigen in R
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)
87egbsegdb.fsf@gnu.org
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> We have since removed OpenBLAS from the inputs of R and just use the
> internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
> any more.

Nice. So we can close?

Ludo'.
Ricardo Wurmus wrote 9 years ago
(name . Ludovic Courtès)(address . ludo@gnu.org)
87y49z9t9n.fsf@mdc-berlin.de
Ludovic Courtès <ludo@gnu.org> writes:

> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:
>
>> We have since removed OpenBLAS from the inputs of R and just use the
>> internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
>> any more.
>
> Nice. So we can close?

I had already closed it in a subsequent email to
control@debbugs.gnu.org.

~~ Ricardo
This issue is archived.
