MAINT: py3.10 in more jobs and bump some 3.8 to 3.9 #14832

Merged (13 commits, Nov 10, 2021)

Conversation

@tupui (Member) commented Oct 9, 2021

Update some Python versions in the CI jobs to 3.9 and 3.10. Considering only a few of the job variables, the matrix would look like the following:

Mac

  • 3.8, 3.10

Linux

  • 3.10 nightly
  • 3.8 dbg
  • 3.9 doc
  • 3.9 benchmark
  • 3.8 full + 64bit_blas + pre numpy + coverage
  • 3.9 fast asv refguide
  • 3.9 fast sdist
  • 3.8 fast gcc
  • 3.9 lint
  • 3.8 32 bit full

Windows

  • 3.8 fast 32 bit
  • 3.8 full no pythran
  • 3.9 full ilp
  • 3.10 full no pythran

@tupui added the CI label (items related to the CI tools such as CircleCI, GitHub Actions or Azure) on Oct 9, 2021
@tupui (Member Author) commented Oct 9, 2021

Azure does not yet have Python 3.10. But maybe we could move some jobs from Azure to GitHub Actions? Or do we have a credit reason, or other incentives, to stay with Azure?

@tupui (Member Author) commented Oct 9, 2021

I also see an issue with gmpy2. Why is it needed? I did not see any usage in the codebase. Also, the last stable release is quite old and there is a recent RC, so it may be worth checking whether we actually need this package.

[EDIT] OK, it's there so that mpmath runs faster. I will try to use a more recent version.
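For reference, a quick way to confirm whether gmpy2 is actually picked up is to ask mpmath which backend it is using; a minimal check (this assumes the `mpmath.libmp.BACKEND` attribute, which reports `'gmpy'` when gmpy2 is available and `'python'` otherwise):

```python
# Minimal check (an assumption, not part of this PR): mpmath reports its
# big-number backend; 'gmpy' means gmpy2 was found, 'python' means the
# pure-Python fallback is in use and mpmath-heavy tests will run slower.
import mpmath

print(mpmath.libmp.BACKEND)
```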

@tylerjereddy (Contributor)

Or do we have a credit reason, or other incentives, to stay with Azure?

Not really--the last time we had a CI "crisis" (when Travis CI put the credit limit into production), we mostly just tried to "load balance" such that we didn't run into massive rate-limiting problems with a single service. I think Azure gives us 10 parallel jobs, so we just try to balance with that in mind (Actions may be similar?).

@andyfaff (Contributor) commented Oct 10, 2021

It would be good for us to reduce our CI footprint as well. I saw a blog post that examined the CI usage of various Python open source projects, and we're very near the top. They converted it to a cash amount, and the figures are scary.

Once the new build system is in place our demands will go down, but it's also worth examining test run time again. There are a few stats tests that are quite demanding of time.

@tupui (Member Author) commented Oct 10, 2021

Thank you both for your input. This PR only removes a job, by moving the coverage run into another job (this is done in another open PR on coverage). Beyond that, I have not been around long enough to know what else could be removed; if you have ideas I will be happy to change things.

What is missing in general is more caching from one commit to the next. I have a so-far unsuccessful PR to add caching to CircleCI, but for some reason I am having trouble with our build script.

Another possibility would be to use labels to trigger CI. We could have a small selection of workflows that run all the time and, depending on the PR, manually trigger more runs by adding a label. What do you think?

@tylerjereddy (Contributor)

Once the new build system is in place our demands will go down, but it's also worth examining test run time again.

I thought Ralf took a good solid look at that recently--I know I did a test run to confirm that we were not getting much benefit from multiple cores for the test suite.

I guess the meson stuff will speed up builds--we may not always get tons of benefit from parallel builds depending on the CI service, though just cutting in half or so with ~2 cores available would still be huge.

I'd be nervous about substantially cutting down the CI entries since we already get stuff slipping through to the wheels repo.
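For context, a typical way to try multiple cores on the test suite is pytest-xdist; a minimal sketch (assuming pytest-xdist is installed, and using `--pyargs` to run the installed package rather than the project's own runtests.py entry point), equivalent to `pytest -n 2 --pyargs scipy` on the command line:

```python
# Minimal sketch: run the installed SciPy test suite on two workers with
# pytest-xdist (assumes pytest-xdist is installed; not SciPy's own runner).
import pytest

pytest.main(["-n", "2", "--pyargs", "scipy"])
```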

@andyfaff (Contributor)

I've looked at a few CI test runs:

122.91s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_stats.py::TestKSTwoSamples::test_gh12999
86.72s call     build/testenv/lib/python3.8/site-packages/scipy/sparse/linalg/dsolve/tests/test_linsolve.py::TestSpsolveTriangular::test_random
79.40s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_continuous_basic.py::test_moments[ksone-arg62-True-True-True-False]
44.15s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_fit.py::test_cont_fit[MLE-genhyperbolic-arg32]
39.72s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_stats.py::TestMGCStat::test_twosamp
38.69s call     build/testenv/lib/python3.8/site-packages/scipy/sparse/linalg/tests/test_propack.py::test_examples[True-complex16]
38.07s call     build/testenv/lib/python3.8/site-packages/scipy/sparse/linalg/tests/test_propack.py::test_examples[True-complex8]
37.43s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_continuous_basic.py::test_cont_basic[500-200-ksone-arg57]
29.62s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_distributions.py::TestLevyStable::test_pdf_nolan_samples
28.72s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_continuous_basic.py::test_cont_basic[500-200-kstwo-arg58]

and https://dev.azure.com/scipy-org/SciPy/_build/results?buildId=14254&view=logs&j=203f29d8-a99b-512b-bad4-9c985ff1792a&t=0e9bc6fa-f840-5c44-330b-bc382f4fd289

413.01s call     stats/tests/test_stats.py::TestKSTwoSamples::test_gh12999
53.60s call     stats/tests/test_stats.py::TestKSTwoSamples::test_gh11184_bigger
48.95s call     stats/tests/test_stats.py::TestMGCStat::test_twosamp
44.41s call     stats/tests/test_mstats_basic.py::TestCorr::test_kendall_p_exact_large
37.29s call     _lib/tests/test_import_cycles.py::test_modules_importable
31.94s call     stats/tests/test_continuous_basic.py::test_cont_basic[500-200-kstwo-arg58]
30.89s call     stats/tests/test_fit.py::test_cont_fit[MLE-genhyperbolic-arg32]
30.76s call     special/tests/test_cdflib.py::TestCDFlib::test_chndtrix
24.94s call     stats/tests/test_continuous_basic.py::test_moments[vonmises-arg107-False-False-True-False]
22.55s call     stats/tests/test_continuous_basic.py::test_moments[kstwo-arg63-True-True-True-False]
== 44356 passed, 2710 skipped, 216 xfailed, 19 xpassed in 2011.29s (0:33:31) ==

Whilst there may be an anomaly for test_gh12999 (e.g. something happens just before that test runs), it's a bit suspicious. Taking almost 7 minutes to run a test that is not marked as slow is excessive; even 2 minutes is too long. Furthermore, that specific test doesn't appear to be seeded; it failed straight away on my desktop when I copied the code into a REPL.

Some of the MGCStat tests are marked slow, but at what point does it make sense to run a ~50 s individual test during a CI run?
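For illustration, a minimal sketch of the usual fixes mentioned above, seeding the RNG so the result is reproducible and marking the case as slow; the test name and parameters here are hypothetical, not the actual test_gh12999 code:

```python
import numpy as np
import pytest
from scipy import stats


@pytest.mark.slow  # opt the long-running case out of the default fast CI runs
def test_ks_2samp_reproducible():
    # Hypothetical test: a fixed seed makes the samples identical on every
    # run, so pass/fail no longer depends on the particular random draw.
    rng = np.random.default_rng(12345)
    x = rng.normal(size=500)
    y = rng.normal(loc=0.5, size=500)
    res = stats.ks_2samp(x, y)
    assert res.pvalue < 0.05
```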

@tupui (Member Author) commented Oct 11, 2021

I've looked at a few CI test runs:

122.91s call     build/testenv/lib/python3.8/site-packages/scipy/stats/tests/test_stats.py::TestKSTwoSamples::test_gh12999
86.72s call     build/testenv/lib/python3.8/site-packages/scipy/sparse/linalg/dsolve/tests/test_linsolve.py::TestSpsolveTriangular::test_random
...
413.01s call     stats/tests/test_stats.py::TestKSTwoSamples::test_gh12999
53.60s call     stats/tests/test_stats.py::TestKSTwoSamples::test_gh11184_bigger

Whilst there may be an anomaly for test_gh12999 (e.g. something happens just before that test runs), it's a bit suspicious. Taking almost 7 minutes to run a test that is not marked as slow is excessive; even 2 minutes is too long. Furthermore, that specific test doesn't appear to be seeded; it failed straight away on my desktop when I copied the code into a REPL.
Some of the MGCStat tests are marked slow, but at what point does it make sense to run a ~50 s individual test during a CI run?

Wooo... I never looked at these, but indeed we would get more benefit from reducing these instead! I am also questioning the very long tests. Maybe they check known results, but CI tests are not supposed to be rocket science, so I am pretty sure there are other ways to test these. It's like the xfails: do we actually review them from time to time? We have a lot of them, and if they fail most of the time that is also wasted CI time.

Maybe that's for another PR though; I am neutral here.

@tupui closed this Oct 30, 2021
@tupui reopened this Oct 30, 2021
@tupui (Member Author) commented Oct 30, 2021

We should update the Windows image as well. We get this deprecation warning now:

##[warning]The windows-2016 environment will be deprecated on November 15, 2021, and removed on March 15, 2022. Migrate to windows-latest instead. For more details see https://github.com/actions/virtual-environments/issues/4312

@tupui marked this pull request as ready for review on October 30, 2021, 12:03
@tupui (Member Author) commented Oct 30, 2021

I guess this will change with Meson coming soon (I've seen the blog post on the CI test suite). So I am not sure how this PR is relevant. Feel free to close it if something better is coming soon 😄

@rgommers (Member)

It would be good for us to reduce our CI footprint as well. I saw a blog post that examined the CI usage of various Python open source projects, and we're very near the top. They converted it to a cash amount, and the figures are scary.

That blog post was wrong by two orders of magnitude I believe. That said, it's still true that we should use fewer resources.

I thought Ralf took a good solid look at that recently--I know I did a test run to confirm that we were not getting much benefit from multiple cores for the test suite.

I still have an open PR for adding threadpoolctl, which helps quite a bit. The problem is silencing warnings during the collection stage; I haven't found a good way to get that to work just yet.
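For context, the core idea of such a threadpoolctl change is small; a hedged sketch (not the actual PR) is to cap the BLAS thread pools so parallel test workers do not oversubscribe the available cores:

```python
# Sketch of the idea behind the threadpoolctl PR (not the actual change):
# limit BLAS thread pools so parallel test workers don't oversubscribe cores.
from threadpoolctl import threadpool_limits

threadpool_limits(limits=1, user_api="blas")
```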

I guess the meson stuff will speed up builds--we may not always get tons of benefit from parallel builds depending on the CI service, though just cutting in half or so with ~2 cores available would still be huge.

Yes, the new CI jobs should be significantly faster, due to both build improvements and caching.

I guess this will change with Meson coming soon (I've seen the blog post on the CI test suite). So I am not sure how this PR is relevant.

I would say it's still very helpful; I would like to see this PR merged. It would be great if the pre-release job stopped failing so this could all be green.

@rgommers (Member) commented Oct 30, 2021

I've seen the blog post on the CI test suite

There are parts that are finished, and an overall matrix which will help in choosing configs and for documenting our setup (right now it's just reading through all yaml files ...), but it's not like we plan to redo the whole thing from scratch.

EDIT: this is the blog post in question: https://labs.quansight.org/blog/2021/10/re-engineering-cicd-pipelines-for-scipy/

@rgommers (Member)

Wooo... I never looked at these, but indeed we would get more benefit from reducing these instead! I am also questioning the very long tests

It looks like some of these got slower, by a lot; it's unclear why. But that is indeed unacceptable. I'd mark them as xslow so they don't get run unless we set an env var.
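For illustration, the general shape of such a marker is an environment-variable-gated skip; a hedged sketch (SciPy's actual marker and variable names may differ):

```python
import os

import pytest

# Hedged sketch of an "extra slow" marker: the test is skipped unless an
# environment variable opts in (SciPy's real marker/variable may differ).
xslow = pytest.mark.skipif(
    os.environ.get("SCIPY_XSLOW", "0") != "1",
    reason="very slow test; set SCIPY_XSLOW=1 to run it",
)


@xslow
def test_expensive_case():
    ...
```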

@charris (Member) commented Oct 30, 2021

We should update the Windows image as well. We get this deprecation warning now.

vmImage: 'windows-latest' will use VS 2019. I made this change for NumPy and numpy-wheels.

@tupui (Member Author) commented Oct 31, 2021

I would say it's still very helpful; I would like to see this PR merged. It would be great if the pre-release job stopped failing so this could all be green.

OK, thanks for confirming. I will see about making it green. I will also see if I can mark other slow tests. And since NumPy has an updated Windows image, I will do the same here.

@tupui (Member Author) commented Nov 2, 2021

I think the failures in source_distribution are due to an older version of NumPy, which raises a RuntimeWarning when NaNs are compared. Not sure what to do. Silence it? Use np.less_equal?

For refguide_asv_check, check_grad is not returning the same output. Not sure if it's just a tolerance thing here. cc @czgdp1807
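For context on that second failure: check_grad returns the 2-norm of the difference between an analytical gradient and a finite-difference approximation, so the trailing digits of its printed value can shift with platform or library versions, which would make an exact doctest comparison fragile. A minimal example with hypothetical functions:

```python
import numpy as np
from scipy.optimize import check_grad


def func(x):
    # Hypothetical objective used only for illustration.
    return x[0] ** 2 + x[1] ** 2


def grad(x):
    # Its analytical gradient.
    return np.array([2 * x[0], 2 * x[1]])


# check_grad returns the 2-norm of (analytical - finite-difference) gradient;
# the last digits depend on floating-point details, which is one plausible
# reason a doctest printing this value could differ between runs.
print(check_grad(func, grad, np.array([1.5, -1.5])))
```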

@rgommers (Member) commented Nov 2, 2021

I think the failures in source_distribution are due to an older version of NumPy, which raises a RuntimeWarning when NaNs are compared. Not sure what to do. Silence it? Use np.less_equal?

Extracted a failure, since it's not visible in the latest CI run right now:

________________ test_support_gh13294_regression[kappa4-args70] ________________
[gw0] linux -- Python 3.9.7 /opt/hostedtoolcache/Python/3.9.7/x64/bin/python
/opt/hostedtoolcache/Python/3.9.7/x64/lib/python3.9/site-packages/scipy/stats/tests/test_distributions.py:6023: in test_support_gh13294_regression
    a0, b0 = dist.support(*args)
        args       = (nan, 0)
        dist       = <scipy.stats._continuous_distns.kappa4_gen object at 0x7f345d2cf790>
        distname   = 'kappa4'
/opt/hostedtoolcache/Python/3.9.7/x64/lib/python3.9/site-packages/scipy/stats/_distn_infrastructure.py:1489: in support
    _a, _b = self._get_support(*args)
        args       = [array(nan), array(0)]
        arrs       = [array(nan), array(0), array(0), array(1)]
        cond       = False
        kwargs     = {}
        loc        = array(0)
        scale      = array(1)
        self       = <scipy.stats._continuous_distns.kappa4_gen object at 0x7f345d2cf790>
/opt/hostedtoolcache/Python/3.9.7/x64/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:5754: in _get_support
    condlist = [np.logical_and(h > 0, k > 0),
E   RuntimeWarning: invalid value encountered in greater
        h          = array(nan)
        k          = array(0)
        self       = <scipy.stats._continuous_distns.kappa4_gen object at 0x7f345d2cf790>

This looks odd indeed. The CI job uses: Windows, Python 3.8, GCC 4.8, NumPy 1.19.3 (thanks Pip for the install log pollution ... after many hundreds of lines we finally see it chose Downloading numpy-1.19.3-cp39-cp39-manylinux2010_x86_64.whl).

It's unclear to me why that is failing. I'd say yes, let's ignore that for now if needed. If you touched the config for the source_distribution job, just change it back to see if the problem goes away?
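If silencing is the route taken, a minimal sketch of what a local fix could look like around the comparison that warns (a sketch only, not the actual SciPy change):

```python
import numpy as np

h = np.array(np.nan)
k = np.array(0)

# Older NumPy versions emit "RuntimeWarning: invalid value encountered in
# greater" for NaN comparisons; np.errstate(invalid='ignore') silences it
# only around the comparison in question.
with np.errstate(invalid="ignore"):
    cond = np.logical_and(h > 0, k > 0)
```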

@tupui (Member Author) commented Nov 2, 2021

@rgommers thanks. Indeed, and I just figured out why. The only change I made here was bumping this job from Python 3.8 to 3.9, and the NumPy version from pyproject.toml is 1.19.3, so maybe there is a conflict there. I will update NumPy. EDIT: it fixed it 😃

For Windows, 3.10 is failing on the scipy.cluster._hierarchy.c compilation with the same error as on the previous Windows VM:

  scipy\cluster\_hierarchy.c(31714): error C2105: '++' needs l-value
  scipy\cluster\_hierarchy.c(31716): error C2105: '--' needs l-value
  scipy\cluster\_hierarchy.c(32025): error C2105: '++' needs l-value
  scipy\cluster\_hierarchy.c(32027): error C2105: '--' needs l-value
  scipy\cluster\_hierarchy.c(32275): error C2105: '++' needs l-value
  scipy\cluster\_hierarchy.c(32277): error C2105: '--' needs l-value

@czgdp1807 (Member) commented Nov 2, 2021

For refguide_asv_check, check_grad is not returning the same output. Not sure if it's just a tolerance thing here. cc @czgdp1807

I will look into this later and will keep it as a priority (though a bit lower, because some NumPy work and the SciPy module renaming are underway).

@rgommers (Member) commented Nov 6, 2021

This is starting to look good. I merged the xslow test change and the distutils deprecation warning filtering in commit ec469cb and commit a1c0950. They're useful to get in now, in case the CI overhaul takes a lot longer.
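For reference, the gist of that kind of warning filter; a sketch only (not the exact merged change), which ignores the Python 3.10 distutils deprecation message so it does not trip warnings-as-errors runs:

```python
import warnings

# Sketch only (not the exact change merged in ec469cb/a1c0950): ignore the
# Python 3.10 distutils deprecation warning during test runs.
warnings.filterwarnings(
    "ignore",
    message="The distutils package is deprecated",
    category=DeprecationWarning,
)
```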

@tupui (Member Author) commented Nov 6, 2021

Thanks @rgommers. It seems there is another warning issue, and then just the doctest failure, and we are good.

@tupui (Member Author) commented Nov 8, 2021

All is green @rgommers 🎉 Anything else to do?

@tupui added this to the 1.8.0 milestone on Nov 10, 2021
@rgommers (Member) left a comment

This now LGTM. It's a relatively small PR at this point, and the mix of 3.8/3.9/3.10 jobs seems good. Let's get this in. Thanks @tupui!
