Metric aggregation testing #3517
Hello @SkafteNicki! Thanks for updating this PR.
Comment last updated at 2020-10-01 11:32:20 UTC
Codecov Report
@@           Coverage Diff           @@
##           master   #3517    +/-   ##
=======================================
+ Coverage      87%     90%     +3%
=======================================
  Files         110     110
  Lines        8858    8818     -40
=======================================
+ Hits         7731    7921    +190
+ Misses       1127     897    -230
@Borda, @awaelchli, @justusschock, I would like your input on this.
@SkafteNicki hey, I am working on this as well, and I think there are some metrics that wouldn't work in that order. I think it is better to change the order to pre_ddp (input_convert) -> ddp_reduce/ddp_gather -> post_ddp (forward/output_convert), i.e. we need to gather statistics before computing the metrics themselves, rather than computing the metrics on individual GPUs and then figuring out how metrics from different batches are combined. For a given metric, the combination won't necessarily be a mean, a weighted mean, or something along those lines. I have been making this change for the metrics myself, but I was breaking a lot of things yesterday.
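To make the ordering argument concrete, here is a minimal sketch in plain Python with two simulated processes and no real DDP. The function names `pre_ddp`, `ddp_reduce`, and `post_ddp` only mirror the stage names above; their bodies are assumptions for illustration, not the Lightning API.

```python
# Hypothetical sketch: stage names mirror the comment above, not the Lightning API.

def pre_ddp(preds, target):
    """Per-process step: reduce raw inputs to statistics (num_correct, num_total)."""
    correct = sum(int(p == t) for p, t in zip(preds, target))
    return correct, len(target)

def ddp_reduce(stats):
    """Stand-in for a sum all-reduce across processes."""
    return sum(c for c, _ in stats), sum(n for _, n in stats)

def post_ddp(correct, total):
    """Compute the final metric (accuracy) from the synced statistics."""
    return correct / total

# Two simulated processes with unequal batch sizes.
proc_inputs = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
    ([0, 1], [1, 0]),              # 0 of 2 correct
]

stats = [pre_ddp(preds, target) for preds, target in proc_inputs]

# Gather statistics first, then compute the metric once.
synced_accuracy = post_ddp(*ddp_reduce(stats))

# Compute the metric per process first, then average the results.
mean_of_local = sum(post_ddp(c, n) for c, n in stats) / len(stats)

print(synced_accuracy, mean_of_local)  # 0.5 0.375
```

The two numbers differ because the processes see different numbers of samples. For accuracy a weighted mean would repair this, but that kind of fix does not carry over to every metric, which is the point of syncing statistics before the final computation.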
nice, LGTM on a high level. I'm not a metrics guy, so no comment on correctness :)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
LGTM. Just left some comments.
…teNicki/pytorch-lightning into metrics/aggregation_testing
This pull request is now in conflict... :(
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
What does this PR do?
With PR #3245 we changed how DDP sync is done and also introduced aggregation over multiple batches. This PR adds tests for all metrics so that we are sure they work in the following cases:
Should fix #3230
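As a rough illustration of what a multi-batch aggregation test can check, the sketch below compares a metric accumulated batch by batch against the same metric computed once on all the data. `RunningMSE` and `mse` are hypothetical helpers written for this example; they are not the metric classes or test utilities touched by this PR.

```python
import math

class RunningMSE:
    """Hypothetical metric that accumulates statistics across batches."""

    def __init__(self):
        self.sum_squared_error = 0.0
        self.num_samples = 0

    def update(self, preds, target):
        # Accumulate statistics only; the final value is computed in compute().
        self.sum_squared_error += sum((p - t) ** 2 for p, t in zip(preds, target))
        self.num_samples += len(target)

    def compute(self):
        return self.sum_squared_error / self.num_samples

def mse(preds, target):
    """Reference implementation evaluated on the full data at once."""
    return sum((p - t) ** 2 for p, t in zip(preds, target)) / len(target)

batches = [
    ([1.0, 2.0], [1.0, 4.0]),
    ([0.0], [3.0]),
    ([2.0, 2.0, 2.0], [2.0, 1.0, 0.0]),
]

metric = RunningMSE()
for preds, target in batches:
    metric.update(preds, target)
aggregated = metric.compute()

# Concatenate all batches and compute the reference value in one pass.
all_preds = [p for preds, _ in batches for p in preds]
all_target = [t for _, target in batches for t in target]
reference = mse(all_preds, all_target)

assert math.isclose(aggregated, reference)
```

The batches are deliberately of unequal size, since that is where averaging per-batch values and aggregating statistics tend to disagree.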
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃