manylinux: disable -ffast-math #1820

Merged
jamadden merged 2 commits into gevent:master
Oct 5, 2021

ikonst
Contributor

@ikonst ikonst commented Oct 5, 2021

This code runs in under 1 second:

import scipy.stats
import numpy as np

scipy.stats.skellam.sf(np.arange(100), 3.752321119795838e-06, 0.000001)

but if gevent is imported first:

import gevent
import scipy.stats
import numpy as np

scipy.stats.skellam.sf(np.arange(100), 3.752321119795838e-06, 0.000001)

then the same call takes many seconds to complete.

I don't fully understand the mechanism, i.e. how this affects the execution of scipy's Fortran code. Breaking at random points with gdb in the second case, the scipy code doesn't appear to be calling into libm. I've narrowed the cause down to gevent being compiled with -ffast-math (implied by -Ofast, introduced in 59478eb and first released in 20.6), which changes process-wide floating-point flags (the FTZ and DAZ bits of the x86 MXCSR register). The change takes effect as soon as gevent's shared libraries are loaded, because the set_fast_math startup code that GCC links in is registered to run as a library constructor.

I verified this by running the following function after import gevent, which "fixes" the problem:

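#define MXCSR_DAZ (1 << 6)   /* denormals-are-zero */
#define MXCSR_FTZ (1 << 15)  /* flush-to-zero */

/* Clear the DAZ and FTZ bits that set_fast_math sets (x86 SSE only). */
void set_slow_math(void)
{
  unsigned int mxcsr = __builtin_ia32_stmxcsr ();
  // inverse of https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/i386/crtfastmath.c#L94-L96
  mxcsr &= ~(MXCSR_DAZ | MXCSR_FTZ);
  __builtin_ia32_ldmxcsr (mxcsr);
}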
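
For anyone who wants to check whether their own process is affected without reaching for gdb, here is a minimal sketch in pure Python. It assumes a typical x86-64 CPython build, where float arithmetic goes through SSE and is therefore subject to the MXCSR flags; the helper name subnormals_flushed is made up for illustration and is not part of gevent or scipy.

```python
import sys


def subnormals_flushed() -> bool:
    """Return True if halving the smallest normal double yields exactly 0.0.

    With gradual underflow (the IEEE 754 default) the result is a nonzero
    subnormal. With flush-to-zero in effect, the result is 0.0.
    """
    smallest_normal = sys.float_info.min  # about 2.2e-308
    # The division happens at runtime, so it sees the current FP flags.
    return smallest_normal / 2.0 == 0.0


if __name__ == "__main__":
    print("subnormals flushed to zero:", subnormals_flushed())
```

Calling it once before and once after importing a suspect extension module should show whether loading that module changed the flags; per the description above, one would expect the answer to flip after import gevent on an affected build.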

@jamadden
Member

jamadden commented Oct 5, 2021

Thanks for this PR. Can you rebase on current gevent master, please? That should give the CI a chance to run and pass.

@jamadden jamadden merged commit 69f3613 into gevent:master Oct 5, 2021
@jamadden
Member

jamadden commented Oct 5, 2021

Thanks!

@ikonst ikonst deleted the patch-1 branch October 5, 2021 22:33
@ikonst
Contributor Author

ikonst commented Oct 6, 2021

p.s. here's a gcc bug for this gotcha: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522

Without having studied what the scipy code does, I suspect it was affected because some iterative algorithm reaches its exit condition more slowly when denormalized numbers are flushed to zero (perhaps never truly converging, and only stopping at some maximum iteration count).
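
To make that suspicion concrete, here is a toy sketch in Python. It is not scipy's algorithm: the ftz helper only emulates flush-to-zero in software, and the series, the tolerance and the name iterations_to_converge are all invented for illustration. It sums a geometric series of very small terms and stops when the next term is strictly below a relative tolerance.

```python
import sys

TINY = sys.float_info.min  # smallest positive normal double


def ftz(x: float) -> float:
    """Emulate flush-to-zero: any subnormal result becomes 0.0."""
    return 0.0 if abs(x) < TINY else x


def iterations_to_converge(first_term, ratio, tol, flush, max_iter=10_000):
    """Sum a geometric series until |term| < tol * |total|.

    Returns the number of iterations used, or max_iter if the exit
    condition was never met.
    """
    rnd = ftz if flush else (lambda x: x)
    term = first_term
    total = 0.0
    for n in range(1, max_iter + 1):
        total = rnd(total + term)
        term = rnd(term * ratio)
        # With flushing, tol * |total| underflows to 0.0, and no magnitude
        # is strictly less than 0.0, so this test can never succeed.
        if abs(term) < rnd(tol * abs(total)):
            return n
    return max_iter


normal = iterations_to_converge(1e-300, 0.5, 1e-12, flush=False)
flushed = iterations_to_converge(1e-300, 0.5, 1e-12, flush=True)
print(normal, flushed)
```

With gradual underflow the threshold is a small but nonzero subnormal and the loop exits after a few dozen iterations. With emulated flushing the threshold collapses to zero and the loop only stops at max_iter, which is the kind of slowdown described above.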

hmenke added a commit to hmenke/pairinteraction that referenced this pull request May 27, 2022
This also disables the use of -Ofast.  Not only is -Ofast detrimental to
floating-point accuracy but can in fact affect other code that is dynamically
linked!  That's really bad as it has the potential to break things like SciPy at
runtime.  See also these references:

- https://simonbyrne.github.io/notes/fastmath/
- gevent/gevent#1820
hmenke added a commit to pairinteraction/pairinteraction that referenced this pull request May 27, 2022
@ikonst ikonst mentioned this pull request Sep 7, 2022
3 participants