support num_sanity_val_steps=-1 #2246
Conversation
8b50ecf to d375e2a
Codecov Report
```diff
@@           Coverage Diff           @@
##           master   #2246    +/-  ##
======================================
- Coverage      92%     92%     -0%
======================================
  Files          74      74
  Lines        6316    6322     +6
======================================
+ Hits         5786    5791     +5
- Misses        530     531     +1
```
Hello @awaelchli! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-07-23 10:48:02 UTC

@williamFalcon is
This pull request is now in conflict... :(
LGTM 🚀
it depends if you shuffle them... @williamFalcon?
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
@williamFalcon Ok, I hope I understood you correctly. I updated the PR. Here are some examples of how it currently is:

```python
# runs exactly 5 sanity val steps
Trainer(num_sanity_val_steps=5)

# runs exactly 5 sanity val steps
Trainer(num_sanity_val_steps=5, limit_val_batches=3)

# runs 0 sanity val steps
Trainer(num_sanity_val_steps=5, limit_val_batches=0)

# runs 0 sanity val steps (but runs fast_dev check batches as usual)
Trainer(num_sanity_val_steps=5, limit_val_batches=3, fast_dev_run=True)
```

We can also substitute 5 with -1, of course.
almost perfect! fast_dev_run always runs 1 of each and completes.
yes exactly, I did not change that. I just did not conceptually count it as "sanity check" because in the code it follows a different path (in fact, fast_dev_run sets num_sanity_val_steps=0 in Trainer init), so:

```python
# runs 0 sanity steps, 1 val step and 1 train step
Trainer(num_sanity_val_steps=5, fast_dev_run=True)
```
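The flag resolution described above can be sketched as a small standalone function. This is a hypothetical illustration of the semantics discussed in the thread, not the PR's actual code; `resolve_sanity_steps` is a made-up name:

```python
# Sketch (not the PR's exact implementation) of how num_sanity_val_steps=-1
# could be normalized: -1 means "run the full validation set", which is
# convenient to represent internally as float("inf").

def resolve_sanity_steps(num_sanity_val_steps: int, fast_dev_run: bool = False) -> float:
    """Map the user-facing flag to an internal sanity-step count."""
    if fast_dev_run:
        # fast_dev_run skips the sanity check entirely
        # (it runs 1 train and 1 val batch on its own path)
        return 0
    if num_sanity_val_steps == -1:
        return float("inf")  # iterate every batch of every val dataloader
    return num_sanity_val_steps

assert resolve_sanity_steps(5) == 5
assert resolve_sanity_steps(-1) == float("inf")
assert resolve_sanity_steps(5, fast_dev_run=True) == 0
```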
@williamFalcon kindly requesting your review :)
@awaelchli this looks great. Can you rebase so we can merge?
```diff
@@ -302,7 +302,7 @@ def init_test_tqdm(self) -> tqdm:

     def on_sanity_check_start(self, trainer, pl_module):
         super().on_sanity_check_start(trainer, pl_module)
         self.val_progress_bar = self.init_sanity_tqdm()
-        self.val_progress_bar.total = trainer.num_sanity_val_steps * len(trainer.val_dataloaders)
+        self.val_progress_bar.total = convert_inf(trainer.num_sanity_val_steps * len(trainer.val_dataloaders))
```
what is this?
tqdm does not understand `float("inf")`, so we have to convert it to None when `num_sanity_val_steps` is inf or the dataloader has infinite length.
right but where is this imported?
oh i see... just kind of stuck at the bottom lol
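For context, a helper like the `convert_inf` used in the diff above could look like the following. This is a sketch of the idea, not necessarily the PR's exact implementation: tqdm expects `total=None` for an unknown-length bar and cannot render `total=inf`:

```python
import math

def convert_inf(x):
    """Return None when x is infinite, since tqdm cannot render total=inf.

    tqdm interprets total=None as "length unknown" and falls back to a
    counter-only progress display.
    """
    if x is None or math.isinf(x):
        return None
    return x

assert convert_inf(10) == 10
assert convert_inf(float("inf")) is None
```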
@williamFalcon rebase done.
pytorch_lightning/trainer/trainer.py (Outdated)

```python
    @property
    def disable_validation(self) -> bool:
        """ Check if validation is disabled during training. """
        disable_validation = not (self.is_overridden('validation_step') and self.limit_val_batches > 0) \
```
since everything here is negated, would it be easier to write `enable_validation` and negate it here? (vice versa to this state)
done
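The reviewer's suggestion above can be sketched as follows. This is an illustrative stand-in, not the PR's final code; `TrainerSketch` and its constructor arguments are hypothetical, mimicking the Trainer's `is_overridden` and `limit_val_batches`:

```python
# Sketch of expressing the positive condition (enable_validation) and
# deriving disable_validation by negation, per the review suggestion.

class TrainerSketch:
    def __init__(self, has_validation_step: bool, limit_val_batches: float):
        self._has_validation_step = has_validation_step
        self.limit_val_batches = limit_val_batches

    def is_overridden(self, name: str) -> bool:
        # Stand-in for the real Trainer.is_overridden check
        return name == "validation_step" and self._has_validation_step

    @property
    def enable_validation(self) -> bool:
        """Validation runs when validation_step exists and val batches are allowed."""
        return self.is_overridden("validation_step") and self.limit_val_batches > 0

    @property
    def disable_validation(self) -> bool:
        """Check if validation is disabled during training."""
        return not self.enable_validation

t = TrainerSketch(has_validation_step=True, limit_val_batches=0)
assert t.disable_validation
```

Reading the positive property first tends to be easier to follow than a chain of negated conditions.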
@awaelchli why does this drop coverage so much? maybe the logical part was wrong?

@williamFalcon it is because the coverage upload fails with an HTTP error.
What does this PR do?

Fixes #1715

`Trainer(num_sanity_val_steps=-1)` is now possible and will run all val dataloaders in full.

TODO:
- Is `limit_val_batches` supposed to influence sanity val steps? ok, clarified by william

Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃