
ref: bug fix with logging val epoch end + monitor #3812

Merged
williamFalcon merged 23 commits into master from fw2_12 on Oct 3, 2020

Conversation

Contributor

@williamFalcon williamFalcon commented Oct 3, 2020

ref: bug fix with logging val epoch end + monitor

  • Also organized the flow vs log vs old_log tests so that cleaning up the tests later is easy.
  • In addition, adds a check for schedulers that need a monitor key (e.g. ReduceLROnPlateau) to make sure the user provides one. This is the last step in removing the magic 'val_loss' keyword; a sketch of the expected usage follows below.

Fixes #3811
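For context, here is a minimal sketch of the usage this PR now enforces for schedulers that need a monitored metric. The LightningModule, its layer sizes, the optimizer, and the 'val_loss' metric name are illustrative assumptions; only the shape of the returned dict (optimizer / lr_scheduler / monitor) comes from the error message in this PR.

```python
import torch
import pytorch_lightning as pl
from torch.optim.lr_scheduler import ReduceLROnPlateau


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.layer(x), y)
        # Log the metric that the scheduler will monitor.
        self.log('val_loss', loss)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        scheduler = ReduceLROnPlateau(optimizer, mode='min')
        # With this PR, ReduceLROnPlateau requires an explicit 'monitor' key;
        # the implicit 'val_loss' magic keyword is gone.
        return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'val_loss'}
```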

@mergify mergify bot requested a review from a team October 3, 2020 02:33

pep8speaks commented Oct 3, 2020

Hello @williamFalcon! Thanks for updating this PR.

Line 175:39: W504 line break after binary operator

Line 416:121: E501 line too long (121 > 120 characters)

Line 136:5: E306 expected 1 blank line before a nested definition, found 0

Comment last updated at 2020-10-03 15:44:45 UTC

@ananthsub
Contributor

There are test failures related to the order in which the metrics get logged. Does swapping the order here override what's sent to the logger?

@williamFalcon
Contributor Author

There are test failures related to the order in which the metrics get logged. Does swapping the order here override what's sent to the logger?

The order doesn't affect the logger. It affects the case where you log a metric in training and then use it during validation, e.g. for monitoring. In that case you want to be sure you have the latest value (not the value from a batch earlier), and in some cases the metrics may not even be available yet for the val loop.
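As a hedged illustration of the scenario described above (not code from this PR; the module, metric name, and callback configuration are assumptions), a training metric is logged from the LightningModule and later consumed by a monitor:

```python
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.layer(x), y)
        # Logged during training; anything monitoring this key later (e.g. a
        # checkpoint callback) must see the latest value, not the one from a
        # batch earlier.
        self.log('train_loss', loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# A checkpoint callback monitoring the metric logged above; exact Trainer
# wiring may differ slightly across Lightning versions.
checkpoint = ModelCheckpoint(monitor='train_loss', mode='min')
trainer = pl.Trainer(max_epochs=1, callbacks=[checkpoint])
```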

@williamFalcon williamFalcon merged commit d9bc95f into master Oct 3, 2020
m = "ReduceLROnPlateau requires returning a dict from configure_optimizers with the keyword " \
"monitor=. For example:" \
"return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'your_loss'}"
raise MisconfigurationException(m)
Contributor
You can do `raise MisconfigurationException(m) from e` to preserve the stack trace better.
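A minimal sketch of the suggested pattern (the try/except and the placeholder dict are assumptions for illustration; only the `from e` idiom is the point):

```python
from pytorch_lightning.utilities.exceptions import MisconfigurationException

scheduler_config = {'lr_scheduler': None}  # hypothetical config missing 'monitor'

try:
    monitor = scheduler_config['monitor']
except KeyError as e:
    m = (
        "ReduceLROnPlateau requires returning a dict from configure_optimizers with the keyword "
        "monitor=. For example: "
        "return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'your_loss'}"
    )
    # 'from e' chains the original KeyError, so the traceback shows both the
    # underlying error and the friendlier misconfiguration message.
    raise MisconfigurationException(m) from e
```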

@@ -94,13 +95,18 @@ def init_optimizers(
' a list of `torch.optim.lr_scheduler`'
' * multiple outputs, dictionaries as described with an optional `frequency` key (int)')

-    def configure_schedulers(self, schedulers: list):
+    def configure_schedulers(self, schedulers: list, monitor: str = None):
Contributor
monitor: Optional[str] = None
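A hedged sketch of the suggested annotation (the class name and docstring are illustrative; only the `Optional[str]` signature is the point):

```python
from typing import Optional


class TrainerOptimizersMixin:  # illustrative container for the method
    def configure_schedulers(self, schedulers: list, monitor: Optional[str] = None):
        """`Optional[str]` states explicitly that `monitor` may be None,
        which `monitor: str = None` only implies."""
        ...
```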

@williamFalcon williamFalcon deleted the fw2_12 branch October 4, 2020 17:37
@Borda Borda added the refactor label Oct 4, 2020
@Borda Borda added this to the 0.10.0 milestone Oct 7, 2020

Successfully merging this pull request may close these issues.

ModelCheckpoint not picking up metrics logged from lightning module
4 participants