Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MNIST dataset & drop torchvision dep. from tests #986

Merged
merged 52 commits into from
Mar 30, 2020
Merged

Add MNIST dataset & drop torchvision dep. from tests #986

merged 52 commits into from
Mar 30, 2020

Conversation

awaelchli
Copy link
Contributor

@awaelchli awaelchli commented Feb 29, 2020

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

What does this PR do?

Fixes #942 and follow up to #990.

  • Added a custom MNIST class that does not rely on torchvision.
  • Added download links to MNIST on S3
  • Removed torchvision dependency from tests
  • PL examples still use the real torchvision

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks
Copy link

pep8speaks commented Feb 29, 2020

Hello @awaelchli! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-03-27 22:46:44 UTC

@awaelchli awaelchli changed the title Drop torchvision dep. from tests Drop torchvision dep. from tests [WIP] Feb 29, 2020
@awaelchli
Copy link
Contributor Author

why is it trying to run the imagenet example? It's not part of the tests, or is it?

Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious, do we need all these functions/methods, I would rather use the minimal version so we do not maintain it in future...
Personally I would make simple MNIST datasets and include the to_tensor in it (__getitem__) so we do not need any transformers in tests (anyway we do just a few fist epochs)
Also, we can still use torchvision in examples just adjust tests (split tests and examples, I will do :])

@awaelchli
Copy link
Contributor Author

just curious, do we need all these functions/methods, I would rather use the minimal version so we do not maintain it in future...

ok, I will try to compress everything. Most of the code is for downloading and extracting the dataset files.

I didn't want to remove any tests, that's why I tried to override the imports. Maybe you go ahead and do the splitting of examples and tests first, then I will adjust my changes to that.

@Borda
Copy link
Member

Borda commented Feb 29, 2020

ok, I will try to compress everything. Most of the code is for downloading and extracting the dataset files.

Well I didn't check it in detail...

I didn't want to remove any tests, that's why I tried to override the imports. Maybe you go ahead and do the splitting of examples and tests first, then I will adjust my changes to that.

You are right, we shall not change any test, just swap dataset source lok

PS: sorry for miss-click (accidentally closing, working on phone)

@Borda Borda closed this Feb 29, 2020
@Borda Borda reopened this Feb 29, 2020
@Borda Borda changed the title Drop torchvision dep. from tests [WIP] Add MNIST dataset & drop torchvision dep. from tests [WIP] Feb 29, 2020
@Borda Borda added ci Continuous Integration feature Is an improvement or enhancement labels Feb 29, 2020
@awaelchli
Copy link
Contributor Author

@Borda I made it a lot simpler. It is not necessary to do all the data extraction that torchvision does. The MNIST class is now very simple as you wished. And I got rid of the transforms, so we can even drop the PIL requirement now. No PIL to Tensor conversions needed.

Now we just need to decouple the tests from examples :)

This was referenced Mar 1, 2020
@Borda
Copy link
Member

Borda commented Mar 1, 2020

@awaelchli I have prepared the split, check #990

@awaelchli
Copy link
Contributor Author

@Borda Great! I think yours should be merged first, because I can't make the tests pass otherwise.

@awaelchli awaelchli changed the title Add MNIST dataset & drop torchvision dep. from tests [WIP] [blocked by #990] Add MNIST dataset & drop torchvision dep. from tests Mar 3, 2020
@awaelchli awaelchli mentioned this pull request Mar 3, 2020
Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GREAT job! Get this done as soon as possible...

.gitignore Outdated Show resolved Hide resolved
tests/datasets/mnist.py Outdated Show resolved Hide resolved
tests/datasets/mnist.py Outdated Show resolved Hide resolved
@Borda Borda added the ready PRs ready to be merged label Mar 25, 2020
CHANGELOG.md Outdated Show resolved Hide resolved
tests/base/datasets.py Outdated Show resolved Hide resolved
tests/base/datasets.py Outdated Show resolved Hide resolved
Adrian Wälchli and others added 3 commits March 26, 2020 01:03
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
@awaelchli
Copy link
Contributor Author

awaelchli commented Mar 26, 2020

The test failed because of this?

Expected nothing
Got:
    TRAINS new version available: upgrade to v0.14.1 is recommended!

@awaelchli
Copy link
Contributor Author

@bmartinn is there a way we can suppress such messages during test?

@Borda
Copy link
Member

Borda commented Mar 26, 2020

@bmartinn is there a way we can suppress such messages during test?

We know about it... clearml/clearml#119

@bmartinn
Copy link
Contributor

Hi @awaelchli , as @Borda mentioned the issue is known, and a fix was already merged into the master branch.
If CI tests are failing, the easiest temporarily fix (just for this PR) is to upgrade the trains package in the requirements-extra.txt to the latest stable release.
There is already a pending PR on that as well ...

@awaelchli
Copy link
Contributor Author

@Borda I see KeyBoardInterrupt in CI. Did you have that before?

@Borda
Copy link
Member

Borda commented Mar 27, 2020

I see KeyBoardInterrupt in CI. Did you have that before?

not really...

@Borda Borda requested review from jeffling, jeremyjordan, justusschock, tullie and a team March 30, 2020 16:57
@Borda
Copy link
Member

Borda commented Mar 30, 2020

@williamFalcon @PyTorchLightning/core-contributors lets get this done...

@williamFalcon williamFalcon merged commit b7de42f into Lightning-AI:master Mar 30, 2020
@awaelchli awaelchli deleted the mnistcopy branch March 30, 2020 22:43
alexeykarnachev pushed a commit to alexeykarnachev/pytorch-lightning that referenced this pull request Apr 3, 2020
* added custom mnist without torchvision dep

* move files so it does not conflict with mnist gitignore

* mock torchvision for tests

* fix line too long

* fix line too long

* fix "module level import not at top of file" warning

* move mock imports to __init__.py

* simplify MNIST a lot and download directly the .pt files

* further simplify and clean up mnist

* revert import overrides

* make as before

* drop  PIL requirement

* move mnist.py to datasets subfolder

* use logging instead of print

* choose same name as in torchvision

* remove torchvision and pillow also from yml file

* refactor if train

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* capitalized class attr

* moved mnist to models

* re-added datsets ignore

* better name for file variable

* Update mnist.py

* move dataset classes to datasets.py

* new line

* update

* update

* fix automerge

* move to base folder

* adapt testingmnist to new mnist base class

* remove temporal fix

* fix datatype

* remove old testingmnist

* readable

* fix import

* fix whitespace

* docstring

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/base/datasets.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* changelog

* added types

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* exist->isfile

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* index -> idx

* temporary fix for trains error

* better changelog message

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
@Borda Borda modified the milestones: v0.7., v0.7.x Apr 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous Integration feature Is an improvement or enhancement ready PRs ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

create own MNIST dataset
5 participants