[Feature] Benchmark storage types #633

Merged
vmoens merged 37 commits into pytorch:main on Nov 4, 2022

Conversation

adityagoel4512
Contributor

Description

Adds documentation of the difference in collection speed across storage types in the prototype distributed replay buffer from #615. This updates the docs and introduces a top-level benchmarks directory containing the source code for the benchmark.

Motivation and Context

This addresses the need to investigate and document the sampling latency improvements of memmap storage over tensor storage and list storage in distributed RL settings. The benchmark confirms that memmap storage does reduce sampling latency, and the docs record the results so that users can choose an appropriate storage type.
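
As a rough illustration of what "sampling latency" means here, the sketch below times repeated sample calls against a toy in-memory storage. It is not the benchmark from this PR: it runs locally without `torch.rpc`, and the names `mean_sample_latency`, `storage`, and `batch_size` are hypothetical.

```python
import random
import time


def mean_sample_latency(sample_fn, n_calls=1000):
    """Return the mean wall-clock time in seconds per call of sample_fn."""
    start = time.perf_counter()
    for _ in range(n_calls):
        sample_fn()
    return (time.perf_counter() - start) / n_calls


# Toy list-backed storage standing in for a replay buffer backend (assumption).
storage = list(range(10_000))
batch_size = 32

latency = mean_sample_latency(lambda: random.sample(storage, batch_size))
print(f"mean latency: {latency * 1e6:.1f} us per sample call")
```

Timing the same loop against a second backend and comparing the two means gives a relative figure for the storage types.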

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [x] I have read the CONTRIBUTION guide (required)
  • [x] My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • [x] I have updated the documentation accordingly.

@facebook-github-bot facebook-github-bot added the CLA Signed label Nov 1, 2022
@adityagoel4512
Contributor Author

I'd like to wait for #615 to be resolved before attempting to merge this PR

@codecov

codecov bot commented Nov 1, 2022

Codecov Report

Merging #633 (246364c) into main (b4b27fe) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 246364c differs from pull request most recent head 790f07b. Consider uploading reports for the commit 790f07b to get more accurate results

@@           Coverage Diff           @@
##             main     #633   +/-   ##
=======================================
  Coverage   87.76%   87.76%           
=======================================
  Files         126      126           
  Lines       24470    24470           
=======================================
  Hits        21476    21476           
  Misses       2994     2994           
Flag Coverage Δ
habitat-gpu 23.95% <ø> (ø)
linux-cpu 85.07% <100.00%> (?)
linux-gpu 86.38% <100.00%> (ø)
linux-outdeps-gpu 75.73% <100.00%> (ø)
linux-stable-cpu 84.96% <100.00%> (ø)
linux-stable-gpu 86.26% <100.00%> (ø)
macos-cpu 84.85% <100.00%> (?)
olddeps-gpu 76.57% <100.00%> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
test/test_rb.py 97.05% <100.00%> (ø)
torchrl/data/replay_buffers/rb_prototype.py 87.76% <100.00%> (ø)

@vmoens vmoens changed the title Benchmark storage types [Feature] Benchmark storage types Nov 1, 2022
@adityagoel4512 adityagoel4512 marked this pull request as ready for review November 3, 2022 18:32
@adityagoel4512
Contributor Author

#615 is merged now!

@vmoens vmoens force-pushed the main branch 2 times, most recently from 00c3963 to ada0fcd November 4, 2022 13:31
Contributor

@vmoens vmoens left a comment

LGTM thanks, it's marvellous
Can you just add a small comment at the beginning of the script explaining how to run it?

Sample latency benchmarking (using RPC)
=======================================
A rough benchmark of sample latency using different storage types over the network using `torch.rpc`.
This code is based on examples/distributed/distributed_replay_buffer.py.
Contributor

Can you provide a simple example on how to run that script (since you need to run it twice with different ranks)?

Contributor Author

@adityagoel4512 adityagoel4512 Nov 4, 2022

Have added a small example - hopefully this works
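
For readers following the thread, here is a minimal sketch of starting one process per rank so that both workers run at the same time. It does not reproduce the example added to the script: the `WORKER` one-liner is a hypothetical stand-in for the benchmark script, and passing the rank as a positional argument is an assumption, not the script's actual interface.

```python
import subprocess
import sys

# Hypothetical stand-in for the benchmark script: it only reports its rank.
WORKER = "import sys; print('rank', sys.argv[1], 'started')"


def launch_ranks(ranks=(0, 1)):
    """Start one process per rank, then collect each process's output."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(rank)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for rank in ranks
    ]
    # All processes are already running before we wait on any of them.
    return [proc.communicate()[0].strip() for proc in procs]


print(launch_ranks())
```

Starting every process before waiting matters for an RPC setup, because each rank typically blocks until its peer has joined.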

| :class:`LazyTensorStorage`    | 1.83x     |
+-------------------------------+-----------+
| :class:`LazyMemmapStorage`    | 3.44x     |
+-------------------------------+-----------+
Contributor

wonderful!
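
Figures such as 1.83x in the table excerpt above read as relative speedups. Assuming they are the ratio of a baseline storage's latency to the measured storage's latency (the excerpt does not show the baseline row), the conversion can be sketched as follows. The function name and the latencies are illustrative placeholders, not measurements from this PR.

```python
def format_speedup(baseline_latency, latency):
    """Format how many times faster `latency` is than `baseline_latency`."""
    if latency <= 0:
        raise ValueError("latency must be positive")
    return f"{baseline_latency / latency:.2f}x"


# Placeholder latencies in seconds, chosen only to make the arithmetic obvious.
print(format_speedup(2.0, 1.0))  # 2.00x
print(format_speedup(3.0, 1.5))  # 2.00x
```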

@vmoens vmoens added the documentation label Nov 4, 2022
@vmoens vmoens merged commit 49af5da into pytorch:main Nov 4, 2022
Labels
CLA Signed, documentation
4 participants