bugfix: df data synthesis with size=None, fix CI #410

Merged
cosmicBboy merged 9 commits into master from bugfix/data-synth-index on Feb 12, 2021

Conversation

cosmicBboy
Collaborator

Fixes #399: this PR fixes a bug in the dataframe synthesis logic in the strategies module where a generated dataframe and its index could end up with different lengths; see here for an example.
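For readers unfamiliar with this class of bug, here is a minimal sketch of a length mismatch and one way to avoid it. It is not pandera's actual strategy code: `synthesize_frame`, its `seed` parameter, and the 0 to 10 size range are hypothetical, chosen only to illustrate drawing the length once when `size=None`.

```python
import random

import pandas as pd


def synthesize_frame(size=None, seed=0):
    """Hypothetical helper: build a one-column frame whose index matches its data."""
    rng = random.Random(seed)
    # When no size is requested, draw the length once and reuse it for both
    # the column values and the index so the two can never disagree.
    n = rng.randint(0, 10) if size is None else size
    values = [rng.random() for _ in range(n)]
    return pd.DataFrame({"col": values}, index=pd.RangeIndex(n))


# Drawing the two lengths independently is what goes wrong: pandas rejects
# a column of 3 values paired with an index of 2 labels.
try:
    pd.DataFrame({"col": [0.1, 0.2, 0.3]}, index=pd.RangeIndex(2))
except ValueError as exc:
    mismatch_error = exc
else:
    mismatch_error = None
```

The point of the sketch is only that the data and the index should share a single source of truth for their length.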

It also fixes a CI issue (#409) where the latest version of pandas was installed in the virtual environment regardless of whether 0.25.3 or the latest version was specified in the nox test suite. In addition, it makes the following changes to the nox test suite:

  • optional use of mamba in local CI
  • use mamba in the github action
  • specify external=True in the conda install command so that the installation process uses the underlying cache (the native nox.session.conda_install would re-install all dependencies from scratch)
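The install changes listed above could look roughly like the sketch below. It is not the code from this PR's noxfile.py: `conda_install` and its `use_mamba` flag are hypothetical names, and the sketch assumes a conda-backed nox session that exposes `session.run` and `session.virtualenv.location`.

```python
def conda_install(session, *packages, use_mamba=False):
    """Hypothetical helper: install packages into the session's conda env.

    Runs the conda (or mamba) executable found on PATH. external=True tells
    nox that the program intentionally lives outside the session virtualenv.
    """
    executable = "mamba" if use_mamba else "conda"
    session.run(
        executable,
        "install",
        "--yes",
        # Target the environment nox created for this session (assumed attribute).
        "--prefix",
        session.virtualenv.location,
        *packages,
        external=True,
    )
```

In a noxfile this would be called from a session function, for example `conda_install(session, "pandas==0.25.3", use_mamba=True)`.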

there was a bug in the nox CI setup that overrode the pandas version
needed for a particular test session with the latest version. This
manifested in the github actions CI, where e.g. pandas==0.25.3 was
not actually being tested
@codecov

codecov bot commented Feb 12, 2021

Codecov Report

Merging #410 (f9dd6c5) into master (fc9597d) will increase coverage by 0.19%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #410      +/-   ##
==========================================
+ Coverage   99.00%   99.20%   +0.19%     
==========================================
  Files          21       21              
  Lines        2503     2502       -1     
==========================================
+ Hits         2478     2482       +4     
+ Misses         25       20       -5     
Impacted Files           Coverage Δ
pandera/strategies.py    100.00% <100.00%> (ø)
pandera/dtypes.py        100.00% <0.00%> (+2.92%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fc9597d...2e78b8f.

@cosmicBboy cosmicBboy merged commit 132d29d into master Feb 12, 2021
@cosmicBboy
Collaborator Author

@jeffzi FYI I had to make some updates to noxfile.py here: https://github.com/pandera-dev/pandera/pull/410/files#diff-f7a16a65f061822bcc73b8296f4dc837353d379d8d9cc5307982cb6941442835

The pandas version installed by _install_pandas was being clobbered by the install_extras call, which would always install the latest version. I also added the option of using mamba for faster conda dependency installation, and with this change the conda package cache is now used instead of packages being installed independently for each session.
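One way to keep a later extras install from clobbering a pinned version is to apply the pin to the requirement list before anything is installed. The sketch below shows that idea under assumptions: `pin_requirements` is a hypothetical helper, not a function from this repository's noxfile.py, and its name parsing is deliberately simple.

```python
import re


def pin_requirements(requirements, pins):
    """Hypothetical helper: replace requirements with exact pins where given.

    requirements: strings such as "pandas" or "numpy>=1.9"
    pins: mapping of package name to an exact version, e.g. {"pandas": "0.25.3"}
    """
    pinned = []
    for requirement in requirements:
        # The package name is whatever precedes the first specifier character.
        name = re.split(r"[=<>!~\[; ]", requirement, maxsplit=1)[0].strip().lower()
        if name in pins:
            pinned.append(f"{name}=={pins[name]}")
        else:
            pinned.append(requirement)
    return pinned
```

Passing the result to a single install command means the session never sees an unpinned `pandas` requirement that could pull in the latest release.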

@cosmicBboy cosmicBboy deleted the bugfix/data-synth-index branch February 13, 2021 16:14
@jeffzi
Collaborator

jeffzi commented Feb 18, 2021

Glad to see those improvements. I hesitated between tox and nox but it looks like nox is the right tool for a very flexible CI.

and now conda package cache is being used with this change instead of being installed independently for each session.

On the CI, the .nox folder is cached. I wasn't 100% sure that was effective, tbh. https://github.com/pandera-dev/pandera/blob/f99b163c5cc3190f2f6bd16c6fcf04015e331418/.github/workflows/ci-tests.yml#L25-L30

Successfully merging this pull request may close these issues.

Parse Hypothesis strageies for schema inference, validation, and synthesis