-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Benchmark storage types #633
Conversation
# Conflicts: # test/test_memmap.py # torchrl/data/tensordict/memmap.py
Optimization
…tion about speed up using MemmapTensor
I'd like to wait for #615 to be resolved before attempting to merge this PR |
Codecov Report
@@ Coverage Diff @@
## main #633 +/- ##
=======================================
Coverage 87.76% 87.76%
=======================================
Files 126 126
Lines 24470 24470
=======================================
Hits 21476 21476
Misses 2994 2994
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
…ypes and update benchmark to use RemoteTensorDictReplayBuffer
#615 is merged now! |
00c3963
to
ada0fcd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks, it's marvellous
Can you just add a small comment at the beginning of the script explaining how to run it?
Sample latency benchmarking (using RPC) | ||
====================================== | ||
A rough benchmark of sample latency using different storage types over the network using `torch.rpc`. | ||
This code is based on examples/distributed/distributed_replay_buffer.py. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a simple example on how to run that script (since you need to run it twice with different ranks)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have added a small example - hopefully this works
| :class:`LazyTensorStorage` | 1.83x | | ||
+-------------------------------+-----------+ | ||
| :class:`LazyMemmapStorage` | 3.44x | | ||
+-------------------------------+-----------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonderful!
Description
Adds documentation of collection speed difference when using different storage types in a prototype distributed replay buffer as found in #615. This updates docs and introduces and benchmarks top level directory with the source code for the benchmark.
Motivation and Context
This addresses the need to document and investigate the sampling latency improvements from the memmap storage over tensor storage and list storage types in distributed RL settings. It validates that it does indeed lead to sampling latency improvements and documents these to encourage users to choose an appropriate storage type.
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an
x
in all the boxes that apply.If you are unsure about any of these, don't hesitate to ask. We are here to help!