python-matplotlib fails to build on i686-linux

  • Done
  • quality assurance status badge
Details
3 participants
  • Diego Nicola Barbato
  • Leo Famulari
  • Maxim Cournoyer
Owner
unassigned
Submitted by
Diego Nicola Barbato
Severity
normal
D
D
Diego Nicola Barbato wrote on 3 Apr 2020 17:20
(address . bug-guix@gnu.org)
87mu7s1tqv.fsf@GlaDOS.home
Hi Guix,

The package python-matplotlib fails to build during the check phase on
i686-linux. The test failure appears to be deterministic:

Toggle snippet (24 lines)
=================================== FAILURES ===================================
_______________________ test_slider_horizontal_vertical ________________________

def test_slider_horizontal_vertical():
fig, ax = plt.subplots()
slider = widgets.Slider(ax=ax, label='', valmin=0, valmax=24,
valinit=12, orientation='horizontal')
slider.set_val(10)
assert slider.val == 10
# check the dimension of the slider patch in axes units
box = slider.poly.get_extents().transformed(ax.transAxes.inverted())
> assert_allclose(box.bounds, [0, 0, 10/24, 1])
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=0
E
E Mismatch: 25%
E Max absolute difference: 1.11022302e-16
E Max relative difference: 2.66453526e-16
E x: array([ 0.000000e+00, -2.310706e-18, 4.166667e-01, 1.000000e+00])
E y: array([0. , 0. , 0.416667, 1. ])

/gnu/store/8g8yfikj63wf0y3hwvpk00hqj5wpfs7v-python-matplotlib-3.1.2/lib/python3.7/site-packages/matplotlib/tests/test_widgets.py:333: AssertionError

Im currently on commit 151f3d4.

Regards,

Diego
L
L
Leo Famulari wrote on 3 Apr 2020 18:18
(name . Diego Nicola Barbato)(address . dnbarbato@posteo.de)(address . 40406@debbugs.gnu.org)
20200403161802.GA3560@jasmine.lan
On Fri, Apr 03, 2020 at 05:20:08PM +0200, Diego Nicola Barbato wrote:
Toggle quote (3 lines)
> The package python-matplotlib fails to build during the check phase on
> i686-linux. The test failure appears to be deterministic:

I wonder if this scientific computing stuff should be tried on i686 at
all. Should we limit it to contemporary architectures?
D
D
Diego Nicola Barbato wrote on 2 Jun 2020 18:18
(name . Leo Famulari)(address . leo@famulari.name)(address . 40406@debbugs.gnu.org)
878sh5a1bf.fsf@GlaDOS.home
Hi,

Leo Famulari <leo@famulari.name> writes:

Toggle quote (4 lines)
> On Fri, Apr 03, 2020 at 05:20:08PM +0200, Diego Nicola Barbato wrote:
>> The package python-matplotlib fails to build during the check phase on
>> i686-linux. The test failure appears to be deterministic:

I've taken a closer look at the failing test (on x86_64-linux, commit:
ebbf915) by keeping the build tree ...

Toggle snippet (3 lines)
guix build --no-grafts -s i686-linux --keep-failed python-matplotlib

... and running python inside of it.

Toggle snippet (6 lines)
cd /tmp/guix-build-python-matplotlib-3.1.2.drv-0/matplotlib-3.1.2/
. ../environment-variables
export PYTHONPATH="${PYTHONPATH}:build/lib.linux-i686-3.8"
python3

I retraced the steps of the failing test ...

Toggle snippet (10 lines)
import matplotlib.pyplot as plt
import matplotlib.widgets as widgets
from numpy.testing import assert_allclose
fig, ax = plt.subplots()
slider = widgets.Slider(ax=ax, label='', valmin=0, valmax=24, valinit=12, orientation='horizontal')
slider.set_val(10)
box = slider.poly.get_extents().transformed(ax.transAxes.inverted())
assert_allclose(box.bounds, [0, 0, 10/24, 1])

... and was able to reproduce the failing assertion ...

Toggle snippet (10 lines)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0

Mismatch: 25%
Max absolute difference: 1.11022302e-16
Max relative difference: 2.66453526e-16
x: array([ 0.000000e+00, -1.046255e-17, 4.166667e-01, 1.000000e+00])
y: array([0. , 0. , 0.416667, 1. ])

... , although the differing value wasn't exactly the same as the one
reported by the test. As expected, the assertion did not fail when I
followed the same steps on x86_64.

Toggle snippet (3 lines)
guix environment --ad-hoc python python-matplotlib -- python3

A closer look at the intermediate results ...

Toggle snippet (7 lines)
print(slider.poly.get_extents().bounds,
ax.transAxes.get_matrix(),
ax.transAxes.inverted().get_matrix(),
box.bounds,
sep='\n')

... revealed that only the values of box.bounds differ. The dimensions
of the slider are the same (as are the values of the transformation
matrices) on i686-linux ...

Toggle snippet (10 lines)
(80.0, 52.8, 206.66666666666663, 369.59999999999997)
[[496. 0. 80. ]
[ 0. 369.6 52.8]
[ 0. 0. 1. ]]
[[ 0.00201613 0. -0.16129032]
[ 0. 0.00270563 -0.14285714]
[ 0. 0. 1. ]]
(0.0, -1.0462550964485118e-17, 0.4166666666666666, 1.0)

... and on x86_64-linux.

Toggle snippet (10 lines)
(80.0, 52.8, 206.66666666666663, 369.59999999999997)
[[496. 0. 80. ]
[ 0. 369.6 52.8]
[ 0. 0. 1. ]]
[[ 0.00201613 0. -0.16129032]
[ 0. 0.00270563 -0.14285714]
[ 0. 0. 1. ]]
(0.0, 0.0, 0.41666666666666663, 1.0000000000000002)

Apparently there is nothing wrong with the slider. Instead matrix
multiplication, which is used under the hood for transformations, seems
to sometimes produce incorrect results on i686-linux. I have reported
this as a separate bug (https://debbugs.gnu.org/41665).

Toggle quote (3 lines)
> I wonder if this scientific computing stuff should be tried on i686 at
> all. Should we limit it to contemporary architectures?

I was opposed to this at first (after all, if upstream supports Numpy on
i686, why shouldn't we?), but after seeing that even simple things like
matrix multiplication can produce incorrect results I'm in favour of
limiting it to contemporary architectures. What do others think?

Regards,

Diego
L
L
Leo Famulari wrote on 2 Jun 2020 19:39
(name . Diego Nicola Barbato)(address . dnbarbato@posteo.de)(address . 40406@debbugs.gnu.org)
20200602173916.GB19317@jasmine.lan
On Tue, Jun 02, 2020 at 06:18:28PM +0200, Diego Nicola Barbato wrote:
Toggle quote (5 lines)
> I was opposed to this at first (after all, if upstream supports Numpy on
> i686, why shouldn't we?), but after seeing that even simple things like
> matrix multiplication can produce incorrect results I'm in favour of
> limiting it to contemporary architectures. What do others think?

I'll leave it up to you and others who are actually using these
programs. I do think it's unlikely the upstream developers use or test
the software on i686, or on anything besides x86_64.
D
D
Diego Nicola Barbato wrote on 8 Jun 2020 16:22
(address . 40406@debbugs.gnu.org)(name . Leo Famulari)(address . leo@famulari.name)
87zh9d8wn7.fsf@GlaDOS.home
Hey,

Diego Nicola Barbato <dnbarbato@posteo.de> writes:

[...]

Toggle quote (5 lines)
> Apparently there is nothing wrong with the slider. Instead matrix
> multiplication, which is used under the hood for transformations, seems
> to sometimes produce incorrect results on i686-linux. I have reported
> this as a separate bug (https://debbugs.gnu.org/41665).

I got this wrong: This issue isn't caused by the Numpy bug, since
Matplotlib doesn't use Numpy for transformations. Both bugs are caused
by the excess precision of the x87 FPU's floating point registers.

I've attached a patch which makes sure that the C and C++ extensions are
compiled with -ffloat-store. This doesn't get rid of all possible
rounding errors but it's enough for the slider test to pass.

[...]

Regards,

Diego
From 7c60615e29a1aab7922139183d191cc8cedd5c7f Mon Sep 17 00:00:00 2001
From: Diego Nicola Barbato <dnbarbato@posteo.de>
Date: Mon, 8 Jun 2020 02:31:17 +0200
Subject: [PATCH] gnu: python-matplotlib: Fix rounding errors on x86 CPUs.

Reported by Diego Nicola Barbato <dnbarbato@posteo.de>.

* gnu/packages/python-xyz.scm (python-matplotlib)[arguments]: Set the
environment variable CFLAGS to -ffloat-store.
---
gnu/packages/python-xyz.scm | 5 +++++
1 file changed, 5 insertions(+)

Toggle diff (25 lines)
diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index 94e63d1c74..f0b96c6fb0 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -81,6 +81,7 @@
;;; Copyright © 2020 Josh Holland <josh@inv.alid.pw>
;;; Copyright © 2020 Yuval Kogman <nothingmuch@woobling.org>
;;; Copyright © 2020 Michael Rohleder <mike@rohleder.de>
+;;; Copyright © 2020 Diego N. Barbato <dnbarbato@posteo.de>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -4744,6 +4745,10 @@ convert between colorspaces like sRGB, XYZ, CIEL*a*b*, CIECAM02, CAM02-UCS, etc.
;; has not effect.
(setenv "LD_LIBRARY_PATH" (string-append cairo "/lib"))
(setenv "HOME" (getcwd))
+ ;; Fix rounding errors when using the x87 FPU.
+ ,@(if (string-prefix? "i686" (%current-system))
+ '((setenv "CFLAGS" "-ffloat-store"))
+ '())
(call-with-output-file "setup.cfg"
(lambda (port)
(format port "[directories]~%
--
2.26.2
M
M
Maxim Cournoyer wrote on 15 Nov 2020 04:14
(name . Diego Nicola Barbato)(address . dnbarbato@posteo.de)
877dqnfh1b.fsf@gmail.com
Hello Diego,

Diego Nicola Barbato <dnbarbato@posteo.de> writes:

Toggle quote (19 lines)
> Hey,
>
> Diego Nicola Barbato <dnbarbato@posteo.de> writes:
>
> [...]
>
>> Apparently there is nothing wrong with the slider. Instead matrix
>> multiplication, which is used under the hood for transformations, seems
>> to sometimes produce incorrect results on i686-linux. I have reported
>> this as a separate bug (https://debbugs.gnu.org/41665).
>
> I got this wrong: This issue isn't caused by the Numpy bug, since
> Matplotlib doesn't use Numpy for transformations. Both bugs are caused
> by the excess precision of the x87 FPU's floating point registers.
>
> I've attached a patch which makes sure that the C and C++ extensions are
> compiled with -ffloat-store. This doesn't get rid of all possible
> rounding errors but it's enough for the slider test to pass.

Thanks for the investigationd and workaround! I've tested it and it
seems to work well. I've pushed a slightly modified version to master
as commit 81643c4cf3e61f5a98b92a72a92c230f5e7ca905.

Thank you!

Closing.

Maxim
Closed
?