Add pandas backend support for Table read/write #8381
Codecov Report
@@ Coverage Diff @@
## master #8381 +/- ##
==========================================
+ Coverage 86.8% 86.81% +<.01%
==========================================
Files 385 386 +1
Lines 58104 58166 +62
Branches 1060 1060
==========================================
+ Hits 50440 50495 +55
- Misses 7049 7056 +7
Partials 615 615
Continue to review full report at Codecov.
This would be so cool! Do we have to worry about minversion for
@pllim - I'm glad you are psyched! Can you and/or @saimn review? About the minversion, I looked back in the docs to pandas 0.12 from Jan 2014, and from what I can see it has the same functions and methods used here, with a similar enough API. In particular, there are a few hardcoded args in the connect wrapper that appear to be available as of pandas 0.12.
If we could just run one of the test jobs with this min version, and another one with the latest, that would cover us pretty well I think.
OK, good that you suggested checking older versions. I had to make a few changes and got this working back to pandas 0.14. Before that it fails, so we have a concrete min version now.
@@ -34,8 +34,8 @@ env:
- PYTEST_VERSION=3.10
- MAIN_CMD='python setup.py'
- CONDA_DEPENDENCIES='Cython jinja2'
- CONDA_ALL_DEPENDENCIES='Cython jinja2 scipy h5py matplotlib pyyaml pandas pytz beautifulsoup4 ipython mpmath bleach bottleneck'
- DEV_PIP_DEP='asdf>=2.3 Cython jinja2 scipy h5py matplotlib pyyaml scikit-image pandas pytz beautifulsoup4 ipython mpmath bleach bottleneck'
- CONDA_ALL_DEPENDENCIES='Cython jinja2 scipy h5py matplotlib pyyaml pandas pytz html5lib beautifulsoup4 ipython mpmath bleach bottleneck'
I suppose we need to list html5lib as an optional dependency in docs/install.rst, and while editing that file, please mention that 0.14 is the minimum pandas version required.
Just some nitpicky nitpicks. LGTM. Thanks!
"""
Test round-trip through pandas write/read for supported formats.

:param fmt: format name, e.g. csv, html, json
Hmm, any reason to use epydoc formatting in the docstring here? This does not matter because the test function docstring is not exposed to users, still... why?
That's just the PyCharm default that got in there, not really on purpose.
Maybe @cdeil, as a PyCharm power user, can chime in and tell us how to change the default to numpydoc?
Thank you!
Cool! My PyCharm experiment is looking better each day!
@Gabriel-p, do you want to take this for a spin with your use case?
Co-Authored-By: taldcroft <taldcroft@gmail.com>
Gah, should have made that last nit-pick commit with no-skip.
@pllim is this implemented in the 3.1.2 release?
@Gabriel-p - No, just merged to master yesterday. It will be in the 3.2 release.
This is great!
@pllim still appears to be a no go in my case. Downloaded the master branch and tried:
@Gabriel-p - try with
It doesn't recognize that as a valid format.
@Gabriel-p - You need to use the Table interface:
That's it, loaded the table in a few seconds. It is a bit confusing though, having to import
@Gabriel-p - about where to come at this from (IO or Table), that is a matter of perspective. Astropy has worked toward the idea that you mostly think about the data type (e.g. table or ND image) and then should not worry so much about the specific file format (ASCII table, FITS table, HDF5 table). So we have the unified I/O interface (https://astropy.readthedocs.io/en/latest/io/unified.html) that starts from the data type, in this case Table. So most of the time in your data I/O with tables you should be using
Thanks, but this is still not clear to me (perhaps this is not the place to be discussing this; if so, let me know). When are we then supposed to use
You should almost always use The only time I ever use
Great! I was under the impression (perhaps because I've been using
This makes it easy to use pandas read_* functions or DataFrame.to_* methods to read/write tables via pandas I/O. E.g. see #8379. This is a relatively bare-bones start but it does the basics of round-tripping a simple table. There are other more obscure pandas formats like feather or parquet that could be supported, but these all require additional packages so my initial plan is to just cover the common formats.
To do:
With #8255 the underlying pandas function/method docstrings should also show up. @astrofrog - any chance of reviewing that?
Example: