[Bug-Fix]: properties `current_epoch` and `global_step` are always the same between model and trainer #3785
Conversation
update to master central repo
Update code from remote
update master
Use `raise .. from ..` to explicitly chain exceptions (Lightning-AI#3750)
[pytest]: test function to check common properties shared between lightning model and trainer
```diff
@@ -122,6 +116,14 @@ def __init__(self, *args, **kwargs):
     def example_input_array(self) -> Any:
         return self._example_input_array

+    @property
+    def current_epoch(self):
+        return self.trainer.current_epoch if self.trainer else 0
```
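For context, here is a minimal, self-contained sketch of the delegation pattern this diff introduces; `TrainerStub` and `ModuleStub` are illustrative stand-ins, not the real Lightning classes:

```python
class TrainerStub:
    """Stand-in for the real Trainer, which owns the progress counters."""

    def __init__(self):
        self.current_epoch = 0
        self.global_step = 0


class ModuleStub:
    """Illustrative module that mirrors the trainer's counters via properties."""

    def __init__(self):
        self.trainer = None  # attached later, when training starts

    @property
    def current_epoch(self):
        # Read through to the trainer; fall back to 0 before one is attached.
        return self.trainer.current_epoch if self.trainer else 0

    @property
    def global_step(self):
        return self.trainer.global_step if self.trainer else 0
```

Because the properties read through to the trainer on every access, the two objects can never drift apart, which is the point of this PR.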
Should these be `-1` if `self.trainer` isn't set? That could indicate that we are currently outside of training.
This could be a nice addition, but isn't `trainer` already `None` when it isn't set, and couldn't that serve as the indication? I have set it to `0` because previously it was `self.current_epoch = 0` in `__init__`.
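For comparison, the `-1` variant suggested above would look roughly like this (a hypothetical alternative, not what was merged):

```python
@property
def current_epoch(self):
    # -1 as a sentinel meaning "no trainer attached / outside of training",
    # instead of the 0 that matches the old `self.current_epoch = 0` default.
    return self.trainer.current_epoch if self.trainer is not None else -1
```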
type check, better clarity Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Can you please re-add the comments for those properties?
Mind sharing the motivation for this addition, since it is just a one-to-one mapping to the trainer attribute?
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Should I mention that I was wondering whether that information is really needed for the user to know? The user will only care about whether they can access those properties at any point in time and expect them to be right.
I am suggesting that I would prefer to have them in the docs, but the core contributors will decide whether or not to add it.
I agree; I think adding them has the advantage that the user gains clarity.
@ydcjeff @ananthsub one of the logs is failing. Is there a way to rerun them?
Don't worry about the failed ones; these are unrelated to this PR.
👍 nice
Seems that some tests are failing...
```python
def test_resume_from_checkpoint(tmpdir):
    """ Test that properties like `current_epoch` and `global_step`
    in model and trainer are always the same. """
```
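Filled out, such a consistency check might look like the sketch below; `BoringTestModel` is a hypothetical stand-in for whatever model the test suite actually uses:

```python
import pytorch_lightning as pl


def test_model_trainer_properties_in_sync(tmpdir):
    """After fitting, the module's mirrored properties should equal
    the trainer's own counters."""
    model = BoringTestModel()  # hypothetical test model
    trainer = pl.Trainer(default_root_dir=str(tmpdir), max_epochs=2)
    trainer.fit(model)

    assert model.current_epoch == trainer.current_epoch
    assert model.global_step == trainer.global_step
```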
I think we should extend this test somehow to check all the things that are required to be restored. Maybe not here, though.
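One way to extend in that direction is sketched below: train briefly, checkpoint, resume, and check that the counters are restored and still in sync (again using the hypothetical `BoringTestModel`, plus `Trainer.save_checkpoint` and the `resume_from_checkpoint` argument that existed at the time):

```python
import os

import pytorch_lightning as pl


def test_counters_restored_on_resume(tmpdir):
    """Sketch: verify `current_epoch` / `global_step` survive a resume."""
    model = BoringTestModel()  # hypothetical test model
    trainer = pl.Trainer(default_root_dir=str(tmpdir), max_epochs=1)
    trainer.fit(model)

    ckpt_path = os.path.join(str(tmpdir), "last.ckpt")
    trainer.save_checkpoint(ckpt_path)

    resumed = pl.Trainer(
        default_root_dir=str(tmpdir),
        max_epochs=2,
        resume_from_checkpoint=ckpt_path,
    )
    resumed.fit(model)

    # Model and trainer must agree after the resumed run as well.
    assert model.current_epoch == resumed.current_epoch
    assert model.global_step == resumed.global_step
```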
What does this PR do?
Ensures that the properties `current_epoch` and `global_step` shared between the LightningModule and the Trainer always remain the same.

Before submitting
Did you have fun?
Yes, I had fun learning from Adrian 🙃