add support for sync_bn #2801
Conversation
Hello @ananyahjha93! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-08-05 16:14:37 UTC

This pull request is now in conflict... :(
@Borda @williamFalcon Removed the apex option as a backend. The tests for apex were initially passing because its sync_bn was falling back on PyTorch's default implementation. When I reinstalled apex and got it to call its own sync_bn, there were quite a few issues with tensors syncing between GPUs.

@justusschock ^^^
The DDP script tests are not working at the moment, so I have added an example in pl_examples/basic_examples that verifies that sync batch norm works correctly.
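For anyone curious, here is a minimal sketch of the idea behind such a verification (the function name and tolerance are my own, not the actual pl_examples script): with sync BN enabled, each GPU's shard must be normalized with the statistics of the global batch, which can be checked by all-gathering the shards.

```python
import torch
import torch.distributed as dist


def check_sync_bn(bn: torch.nn.Module, local_x: torch.Tensor) -> None:
    """Hypothetical check: with sync BN, normalization must use the mean/var
    of the global batch, not just this GPU's shard. Assumes 4D input
    (N, C, H, W), bn in training mode, and default affine init
    (weight=1, bias=0)."""
    out = bn(local_x)

    # Rebuild the global batch by gathering every process's shard.
    world_size = dist.get_world_size()
    gathered = [torch.zeros_like(local_x) for _ in range(world_size)]
    dist.all_gather(gathered, local_x)
    global_x = torch.cat(gathered, dim=0)

    # Normalize the local shard with the global statistics and compare.
    mean = global_x.mean(dim=(0, 2, 3), keepdim=True)
    var = global_x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    expected = (local_x - mean) / torch.sqrt(var + bn.eps)
    assert torch.allclose(out, expected, atol=1e-4)
```

A plain (non-sync) BatchNorm layer would fail this check whenever the shards have different statistics, since it normalizes with per-GPU statistics only.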
Sorry, why not add it as a test using ddp_spawn?
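For context, a rough sketch of what such a ddp_spawn test could look like; the toy model and the sync_batchnorm flag name are assumptions on my part, not taken from this PR's diff:

```python
import pytest
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class BNModel(pl.LightningModule):
    """Hypothetical toy model containing a BatchNorm layer."""

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(32, 16),
            torch.nn.BatchNorm1d(16),
        )

    def training_step(self, batch, batch_idx):
        (x,) = batch
        return self.net(x).sum()

    def train_dataloader(self):
        return DataLoader(TensorDataset(torch.randn(64, 32)), batch_size=8)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


@pytest.mark.skipif(torch.cuda.device_count() < 2, reason="requires at least 2 GPUs")
def test_sync_bn_ddp_spawn(tmpdir):
    trainer = pl.Trainer(
        default_root_dir=tmpdir,
        gpus=2,
        distributed_backend="ddp_spawn",  # spawn-based DDP runs inside a normal test process
        sync_batchnorm=True,              # assumed flag name; the PR discussion calls it sync_bn
        max_epochs=1,
    )
    trainer.fit(BNModel())
```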
Codecov Report
@@            Coverage Diff            @@
##           master   #2801     +/-   ##
=========================================
- Coverage      89%     59%     -31%
=========================================
  Files          78      78
  Lines        7109    6925     -184
=========================================
- Hits         6349    4069    -2280
- Misses        760    2856    +2096
@nateraw
What does this PR do?
Adds support for global batch norm using sync_bn and allows customizing the behaviour by overriding the configure_sync_bn() function in LightningModule (sketched below).
Fixes #2589, #2509
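For illustration, a minimal sketch of what overriding the hook could look like, assuming the configure_sync_bn() name from the description above and PyTorch's built-in torch.nn.SyncBatchNorm converter; the model body is hypothetical:

```python
import torch
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Sequential(
            torch.nn.Conv2d(3, 16, kernel_size=3),
            torch.nn.BatchNorm2d(16),
            torch.nn.ReLU(),
        )

    def configure_sync_bn(self, model):
        # Default-style behaviour: swap every BatchNorm layer for its
        # synchronized counterpart so statistics are reduced across GPUs.
        # Override this hook to convert only part of the model or to plug
        # in a different sync implementation.
        return torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
```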
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃