
[Example] Distributed Replay Buffer Prototype Example Implementation #615

Merged: 32 commits into pytorch:main on Nov 3, 2022

Conversation

adityagoel4512 (Contributor) commented Oct 27, 2022

Description

Prototype example of a distributed replay buffer implementation using LazyMemmapStorage and TensorDictReplayBuffer, with nodes communicating via torch.rpc. The implementation allows for 1 Trainer Node, 1 Replay Buffer Node, and N >= 1 Data Collector Nodes.
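
A condensed sketch of that topology is below (illustrative only, not the code added in this PR; import paths, worker names, and the `make_buffer` helper are assumptions):

```python
# Minimal sketch: one trainer, one replay buffer node, and N collectors
# communicating over torch.distributed.rpc. Names and paths are assumptions.
import torch
import torch.distributed.rpc as rpc
from tensordict import TensorDict
from torchrl.data import LazyMemmapStorage, TensorDictReplayBuffer


def make_buffer(capacity: int = 10_000) -> TensorDictReplayBuffer:
    # Created on the replay buffer node; collectors extend it, the trainer samples it.
    return TensorDictReplayBuffer(storage=LazyMemmapStorage(capacity))


# On the trainer node (rank 0), after rpc.init_rpc("Trainer", rank=0, world_size=...):
#     buffer_rref = rpc.remote("ReplayBuffer", make_buffer)
# A collector pushes a batch of transitions:
#     batch = TensorDict({"obs": torch.randn(32, 4)}, batch_size=[32])
#     buffer_rref.rpc_sync().extend(batch)
# The trainer pulls a sample for a gradient step:
#     sample = buffer_rref.rpc_sync().sample(16)
```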

Motivation and Context

This investigative example illustrates some patterns with which we can implement distributed RL algorithms with replay buffers using the torchrl framework. It also helps us understand the abstractions and ideas that may be missing from the framework, or that may need adapting, to make writing distributed RL algorithms natural and performant.

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • [x] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [x] I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

@facebook-github-bot added the CLA Signed label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) on Oct 27, 2022
adityagoel4512 (Contributor, Author) commented Oct 27, 2022

@vmoens I've made ReplayBufferNode subclass TensorDictReplayBuffer now so it fits in with the rest of the object hierarchy and can be used entirely like any other Replay Buffer.
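
For readers skimming the thread, a minimal sketch of what that subclassing pattern can look like (hypothetical signature; the actual ReplayBufferNode lives in the example script added here):

```python
# Hypothetical sketch of a ReplayBufferNode subclassing TensorDictReplayBuffer;
# the real class in the example may differ.
from torchrl.data import LazyMemmapStorage, TensorDictReplayBuffer


class ReplayBufferNode(TensorDictReplayBuffer):
    """A replay buffer meant to be hosted on its own RPC worker.

    Because it subclasses TensorDictReplayBuffer, remote callers can use the
    ordinary extend()/sample() interface through an RRef proxy.
    """

    def __init__(self, capacity: int = 10_000) -> None:
        super().__init__(storage=LazyMemmapStorage(capacity))
```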

@adityagoel4512 adityagoel4512 marked this pull request as ready for review October 27, 2022 15:47
@vmoens vmoens changed the title Distributed Replay Buffer Prototype Example Implementation [Example] Distributed Replay Buffer Prototype Example Implementation Oct 27, 2022
vmoens (Contributor) commented Oct 27, 2022

this is soooo cool
Would you have time to have a look at why, when a memmap tensor is sent from one end, the other worker receives a regular tensor? Do you think we can easily make the other process receive a memmap tensor instead?

For instance, this receives memmap tensors from one process and those tensors are actually of memmap type.

But torch.distributed does not serialize things the same way multiprocessing does, I guess...
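
One way to probe what actually arrives on the remote side (a hedged sketch; the worker name and the `report_type` helper are illustrative):

```python
import torch.distributed.rpc as rpc


def report_type(t) -> str:
    # Runs on the remote worker; reports the concrete class of the
    # deserialized argument so we can see whether the memmap type
    # survived RPC serialization or arrived as a plain torch.Tensor.
    return type(t).__name__


# On the sending worker (after rpc.init_rpc on both ends):
#     received = rpc.rpc_sync("ReplayBuffer", report_type, args=(memmap_tensor,))
#     print(received)  # e.g. "Tensor" rather than "MemmapTensor"
```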

@vmoens added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels on Oct 28, 2022
vmoens (Contributor) commented Oct 28, 2022

Note for future contributions: it's best if you don't develop on your main branch but instead branch off on your forked repo :)

adityagoel4512 (Contributor, Author) commented

Note for future contributions: it's best if you don't develop on your main branch but instead branch off on your forked repo :)

Sure thing - would you prefer I create a new branch for this PR as well?

codecov bot commented Oct 28, 2022

Codecov Report

Merging #615 (1a73dcd) into main (e96fd37) will decrease coverage by 0.08%.
The diff coverage is 64.94%.

@@            Coverage Diff             @@
##             main     #615      +/-   ##
==========================================
- Coverage   87.82%   87.73%   -0.09%     
==========================================
  Files         125      126       +1     
  Lines       24280    24371      +91     
==========================================
+ Hits        21324    21382      +58     
- Misses       2956     2989      +33     
Flag Coverage Δ
habitat-gpu 23.95% <21.42%> (-0.01%) ⬇️
linux-cpu 85.03% <64.94%> (-0.08%) ⬇️
linux-gpu 86.35% <64.94%> (-0.09%) ⬇️
linux-outdeps-gpu 75.65% <64.94%> (-0.05%) ⬇️
linux-stable-cpu 84.92% <64.94%> (-0.07%) ⬇️
linux-stable-gpu 86.23% <64.94%> (-0.09%) ⬇️
macos-cpu 84.81% <64.94%> (-0.07%) ⬇️
olddeps-gpu 76.50% <64.94%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
test/test_rb_distributed.py 49.20% <49.20%> (ø)
test/test_rb.py 97.05% <80.00%> (ø)
torchrl/data/replay_buffers/utils.py 91.42% <92.85%> (+0.95%) ⬆️
torchrl/data/replay_buffers/rb_prototype.py 87.76% <100.00%> (+1.36%) ⬆️


vmoens (Contributor) left a comment

Great work!
Should we comment the example, or write a separate markdown file in the example directory to explain what this is about?

Personally I'd be in favour of commenting the code using this syntax which will allow us to port that to the doc later on.

Do you think we can test the distributed replay buffer in the CI? It would be nice to cover that there.

adityagoel4512 (Contributor, Author) commented
Great work! Should we comment the example, or write a separate markdown file in the example directory to explain what this is about?

Personally I'd be in favour of commenting the code using this syntax which will allow us to port that to the doc later on.

Do you think we can test the distributed replay buffer in the CI? It would be nice to cover that there.

Ah yes, I'll add some comments in the example. I'll investigate how best to test the distributed buffer now.
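
As a starting point, a rough sketch of how such a test could spin up RPC workers from a single pytest process (worker names, the port, and the rank-0 body are assumptions; this is not the test that eventually landed):

```python
import os

import torch.distributed.rpc as rpc
import torch.multiprocessing as mp


def _worker(rank: int, world_size: int) -> None:
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    name = "Trainer" if rank == 0 else "ReplayBuffer"
    rpc.init_rpc(name, rank=rank, world_size=world_size)
    # Rank 0 would create a remote ReplayBufferNode here and assert that
    # extend()/sample() round-trip the expected data.
    rpc.shutdown()


def test_distributed_rb_smoke() -> None:
    world_size = 2
    mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=True)
```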

vmoens (Contributor) left a comment

Wonderful! Thanks a million for this!

@vmoens vmoens merged commit b4b27fe into pytorch:main Nov 3, 2022
Labels

  • CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
  • documentation (Improvements or additions to documentation)
  • enhancement (New feature or request)