Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix certificate error for IERS data download on Windows #10434

Merged
merged 12 commits into from
Apr 27, 2021

Conversation

noc0lour
Copy link
Contributor

@noc0lour noc0lour commented Jun 3, 2020

Across different machines Windows often has intermittent problems of fetching data from datacenter.iers.org due to the CA certificate missing on Windows. (visiting the webpage via internet explorer will temporarily fetch the certificate, but that's rather clumsy and unreliable).

Description

Use certifi certificate store for reliable CA root certificate fetching on all platforms.

Fixes #9640

Close #10448

@noc0lour
Copy link
Contributor Author

noc0lour commented Jun 3, 2020

It seems that accessing URLs in travis is not always stable and sometimes times out. Other than that the amount of tests working in travis is increased. In the Windows test even the ftp test for downloading the IERS dataset works (XPASS), weirdly it still fails on Linux.

@pllim
Copy link
Member

pllim commented Jun 3, 2020

Yes, they are flaky. And the more jobs try to access them concurrently, the flakier they get as some servers try to block them from spamming. Not sure what we can do about it.

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2020

I don't really feel competent to review this, but am unclear about one thing: where should this belong in our dependencies? Since IERS files are required for quite a bit of essential use of astropy, is certifi effectively a required dependency on windows? Or just something to make travis work? It will certainly have to be documented in docs/install.rst.

@noc0lour
Copy link
Contributor Author

noc0lour commented Jun 3, 2020

@mhvk This is specifically to Windows, the travis test case on Windows also exhibits this behaviou, but it's not specific to travis.

To get the CA certificates in urllib/Python for Windows:
Essentially either users & systems will have to install required CA certificates on windows by means of using the certmanager or visiting datacenter.iers.org via internet explorer to auto-fetch is from a windows (update?) server. The next best thing is using certifi which is providing an independent CA store.

I've lost the original issue on bugs.python.org where this was discussed in more detail, but the answer on this https://bugs.python.org/issue36011 confirms the problem with the certificate storage on Windows and ways to solve it.

I'm happy to iterate on adding documentation to install/changelog as needed.

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2020

OK, so on Windows certifi effectively becomes a required dependency. This probably is not much of a problem, since many users likely have it anyway (on my Debian system, requests has pulled it in), but it needs some thought on how to have OS-specific dependencies. cc @astrofrog
(and of course ties in with the more general discussion about what to do about the exploding number of dependencies in all - #10342.

@saimn
Copy link
Contributor

saimn commented Jun 4, 2020

With the current version of the PR, this makes certifi a dependency not only for windows, and it should be listed in install_requires. If it is not too cumbersome to make it optional, window-only, then it is possible to use markers to make it a dependency only for windows:
https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-platform-specific-dependencies

@noc0lour
Copy link
Contributor Author

noc0lour commented Jun 4, 2020

In the current approach I deliberately included it for Linux and Windows as otherwise it would require to write OS specific codepaths later and is not really elegant nor readable.
Altthough if it is agreed upon to be the best way forward I can implement that as the patch here is rather short anyway.

On Linux the system-installed certifi package (at least on Fedora & Gentoo) actually points to the system CA certificates. Although in a virtualenv it points tho the locally installed CA certificates.

.travis.yml Outdated Show resolved Hide resolved
@pllim

This comment has been minimized.

@pllim
Copy link
Member

pllim commented Jun 4, 2020

Just so we have something we can compare against, I did a remote data run on Windows over at #10448 . Its log does show urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)> among some other errors that are unrelated to certificates.

EDIT: To be more specific...

UNEXPECTED EXCEPTION: AstropyWarning('failed to download ftp://anonymous:mail%40astropy.org@gdc.cddis.eosdis.nasa.gov/pub/products/iers/finals2000A.all and https://datacenter.iers.org/data/9/finals2000A.all, using local IERS-B: <urlopen error Unable to open any source! Exceptions were {\'ftp://anonymous:mail%40astropy.org@gdc.cddis.eosdis.nasa.gov/pub/products/iers/finals2000A.all\': URLError("ftp error: error_temp(\'425 Security: Bad IP connecting.\')"), \'https://datacenter.iers.org/data/9/finals2000A.all\': URLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)\'))}>')
E               urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)>

This one is actually from astropy.readthedocs.io :

E           ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my plan:

Todo for this PR:

  • Address existing review comments.
    • Add text to docs/install.rst
    • Remove remote data Windows job from PR CI. Instead create one as cron job.
  • Wait for next steps based on my investigation above.

How does this sound? Thanks!

setup.cfg Show resolved Hide resolved
@pllim
Copy link
Member

pllim commented Jun 5, 2020

I'll ponder the need for change log and milestone later. Given that this is a bug, it can be backported but then again, it's been there forever and our Windows user base is rather small. 🤷

UPDATE: I decided against backport due to introduction of new dependency for Windows.

@pllim
Copy link
Member

pllim commented Jun 6, 2020

So, reporting back on my work at #10448 that is built upon the fix here:

  • It is definitely possible to make certifi solution for Windows only. (You can ignore the OSX failures over there; they are not related to certs and I don't think there is a need to fix those.)
  • However, even on Windows, once you got that certificate, you don't really need certifi anymore, so I don't think it should be mandatory dependency even for Windows. And there is also no guarantee that this will not become a problem for other OSes in the future.
  • So, I am thinking maybe to make this solution something that user can toggle on and off, maybe in astropy.utils.conf and it will be off by default and we explicitly turn it on when we run remote data tests for Windows.

What do y'all think?

@mhvk
Copy link
Contributor

mhvk commented Jun 6, 2020

@pllim - thanks so much for the investigation. I feel somewhat incompetent here, so don't really know what to suggest - and will defer to @noc0lour for what will help windows users most - except that whatever we do should be documented clearly!

@noc0lour
Copy link
Contributor Author

noc0lour commented Jun 6, 2020

@pllim I fully agree that for now the fix is only necessary on Windows and also not always. I will implement the fix to target only Windows OS via the requirements specification in setup and checking for the OS/availability of certifi in the utils code.

As this is a annoying road block for new users (which with a entry in a FAQ etc can be resolved quickly), I would rather opt to default a possible configuration option to on or make it non-optional for Windows entirely. As certifi is used by requests we will not have an issue of it going away/getting outdated anytime soon.

I made this patch primarily not to fix tests or anything but to make it easier to get started on Windows (no error on the first run) for new users.

@pllim
Copy link
Member

pllim commented Jun 6, 2020

Thanks for your inputs, @noc0lour ! Looking forward to your next commit(s).

@noc0lour noc0lour force-pushed the fix_certificate_error_windows branch 2 times, most recently from e0e2053 to 9833363 Compare June 15, 2020 08:04
@noc0lour
Copy link
Contributor Author

Implemented the certifi solution to be only used on the windows platform checked with os.name == 'nt'. If we have a more elegant solution for these imports I'm happy to adopt it.

@pllim pllim added this to the v4.2 milestone Jun 16, 2020
Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for following up and your continued patience! I think about this a bit more and...

  • I am still not convinced that requiring it for Windows blindly is the best way forward, especially given the note about multiprocessing and the note about OpenSSL. The problem this PR is trying to solve does not seem to affect a lot of people, but the fix affects all Windows users going forward. As you have mentioned, a FAQ might be sufficient for beginner users even if this fix is not the default behavior? I still think that making it a non-default configurable item the safest route, as that won't change any existing behavior at all. Would be nice to have more opinions about this.
  • Depending on the decision above, the change log would be under API change or new feature, but regardless, I have set the milestone.
  • Additional text about certifi needs to be in docs/install.rst.
  • Some tests will fail for this new cron job unless their fixes are copied over from EXP: Testing on Windows with remote data #10448 for astropy/coordinates/tests/test_solar_system.py and docs/coordinates/spectralcoord.rst.

astropy/utils/data.py Outdated Show resolved Hide resolved
@pllim
Copy link
Member

pllim commented Sep 18, 2020

I wasn't the best reviewer for this PR, resulting in it getting stale. I do think we need to find a way forward with this one way or another. @embray , any ideas?

@noc0lour
Copy link
Contributor Author

@pllim at least from my side I just did not have the time (yet) to follow up with a distinct implementation to put things into configuration etc. Current state could be used by anyone interested to implement the configuration to go into the implemented codepath.

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that the utils/tests/test_misc.py failure I see on Windows complaining about SSL cert is gone with this patch! 🎉

Found some typos and more minor comments, but I think we're really close now. Thanks for your patience! 😸

CHANGES.rst Outdated Show resolved Hide resolved
astropy/utils/data.py Show resolved Hide resolved
astropy/utils/data.py Show resolved Hide resolved
astropy/utils/data.py Show resolved Hide resolved
astropy/utils/data.py Show resolved Hide resolved
astropy/utils/data.py Outdated Show resolved Hide resolved
astropy/utils/misc.py Show resolved Hide resolved
docs/install.rst Outdated Show resolved Hide resolved
docs/utils/data.rst Outdated Show resolved Hide resolved
docs/utils/data.rst Show resolved Hide resolved
@pllim
Copy link
Member

pllim commented Apr 19, 2021

@embray , any chance we can wrap this up before end of April 2021?

@embray
Copy link
Member

embray commented Apr 22, 2021

@pllim will do!

@embray embray force-pushed the fix_certificate_error_windows branch from 909eb2b to 892afe3 Compare April 22, 2021 10:49
embray added a commit to noc0lour/astropy that referenced this pull request Apr 22, 2021
@embray
Copy link
Member

embray commented Apr 22, 2021

Somehow an unrelated commit got in here when I rebased...

noc0lour and others added 11 commits April 22, 2021 13:01
Initialize a SSL context with the CA file taken from certifi which
collects the Mozilla CA certificates.
this is consistent with how the 'requests' package handles root CA
certificates: https://requests.readthedocs.io/en/master/user/advanced/#ca-certificates
…ly when

installed but is not required

adds an option allow_insecure to allow downloads over unverified SSL connections
to succeed with a warning, as well as an ssl_context argument which allows
additional optional settings for configuring SSL connections
timeout to 5 seconds

however, when it does fail it appears to be raising a URLError, not socket.timeout
automatically re-tried with allow_insecure=True
@embray embray force-pushed the fix_certificate_error_windows branch from a84237c to d77bf02 Compare April 22, 2021 11:02
@embray
Copy link
Member

embray commented Apr 22, 2021

Ok. Everything should be fixed now. @pllim I addressed all your comments. Also fixed the changelog entry to the new format.

@embray
Copy link
Member

embray commented Apr 22, 2021

Bleh, there are some test failures because I changed a v to a V in the error message :)

@embray embray force-pushed the fix_certificate_error_windows branch from d77bf02 to 3feff1f Compare April 22, 2021 11:27
@@ -1723,7 +1745,7 @@ def info(self):
def read(self, length=None):
return self.reader.read(length)

monkeypatch.setattr(urllib.request, "build_opener", mockurl_builder)
monkeypatch.setattr(astropy.utils.data, "_build_urlopener", mockurl_builder)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember seeing monkeypatching of astropy.utils.data over at astroquery, so this is a heads up to astroquery people: @keflavich , @bsipocz , @ceb8

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I tested this again and seems fine on Windows. Thanks, all!

@pllim pllim merged commit b27c921 into astropy:main Apr 27, 2021
@embray
Copy link
Member

embray commented Apr 29, 2021

Thank you 🙏

@pllim
Copy link
Member

pllim commented May 5, 2021

For future reference, this broke astroquery's test monkeypatching downstream, see astropy/astroquery#2059 . FYI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URLError in test_iers_a_dl_mirror (4.0rc1 - Windows 10)
8 participants