[Feature] Benchmark storage types #633

adityagoel4512 · 2022-11-01T13:44:24Z

Description

Adds documentation of the collection speed difference between storage types in the prototype distributed replay buffer, as found in #615. This updates the docs and introduces a top-level benchmarks directory containing the source code for the benchmark.

Motivation and Context

This addresses the need to investigate and document the sampling latency improvements of memmap storage over tensor storage and list storage in distributed RL settings. It validates that memmap storage does indeed reduce sampling latency, and documents the results to encourage users to choose an appropriate storage type.

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)
[x] My change requires a change to the documentation.
[ ] I have updated the tests accordingly (required for a bug fix or a new feature).
[x] I have updated the documentation accordingly.

adityagoel4512 · 2022-11-01T13:45:10Z

I'd like to wait for #615 to be resolved before attempting to merge this PR

codecov · 2022-11-01T15:01:33Z

Codecov Report

Merging #633 (246364c) into main (b4b27fe) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 246364c differs from pull request most recent head 790f07b. Consider uploading reports for the commit 790f07b to get more accurate results

@@           Coverage Diff           @@
##             main     #633   +/-   ##
=======================================
  Coverage   87.76%   87.76%           
=======================================
  Files         126      126           
  Lines       24470    24470           
=======================================
  Hits        21476    21476           
  Misses       2994     2994

Flag	Coverage Δ
habitat-gpu	`23.95% <ø> (ø)`
linux-cpu	`85.07% <100.00%> (?)`
linux-gpu	`86.38% <100.00%> (ø)`
linux-outdeps-gpu	`75.73% <100.00%> (ø)`
linux-stable-cpu	`84.96% <100.00%> (ø)`
linux-stable-gpu	`86.26% <100.00%> (ø)`
macos-cpu	`84.85% <100.00%> (?)`
olddeps-gpu	`76.57% <100.00%> (ø)`

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
test/test_rb.py	`97.05% <100.00%> (ø)`
torchrl/data/replay_buffers/rb_prototype.py	`87.76% <100.00%> (ø)`

adityagoel4512 · 2022-11-04T12:35:15Z

#615 is merged now!

vmoens

LGTM thanks, it's marvellous
Can you just add a small comment at the beginning of the script explaining how to run it?

vmoens · 2022-11-04T14:13:18Z

benchmarks/storage/benchmark_sample_latency_over_rpc.py

+Sample latency benchmarking (using RPC)
+======================================
+A rough benchmark of sample latency using different storage types over the network using `torch.rpc`.
+This code is based on examples/distributed/distributed_replay_buffer.py.


Can you provide a simple example of how to run that script (since you need to run it twice with different ranks)?

Have added a small example - hopefully this works
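
For readers following the thread, here is a minimal sketch of what "run it twice with different ranks" could look like when driven from a single Python process. The `--rank` flag name is an assumption made for illustration (check the script's own docstring for its actual arguments), and `rank_commands` and `launch` are hypothetical helpers, not part of TorchRL.

```python
import subprocess
import sys
from typing import List


def rank_commands(script: str, world_size: int = 2) -> List[List[str]]:
    """Build one argv list per rank for a script that is started once per rank.

    The ``--rank`` flag is an assumed name; adjust it to the script's real CLI.
    """
    return [
        [sys.executable, script, f"--rank={rank}"]
        for rank in range(world_size)
    ]


def launch(script: str, world_size: int = 2) -> List[int]:
    """Start every rank as a separate process and return the exit codes."""
    # All processes are started before waiting on any of them, because
    # RPC workers typically block until their peers have joined.
    procs = [subprocess.Popen(cmd) for cmd in rank_commands(script, world_size)]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    for cmd in rank_commands("benchmark_sample_latency_over_rpc.py"):
        print(" ".join(cmd))
```

Running the two printed commands in separate shells is equivalent to calling `launch`; the helper only saves opening a second terminal.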

vmoens · 2022-11-04T14:14:06Z

docs/source/reference/data.rst

+| :class:`LazyTensorStorage`    | 1.83x     |
+-------------------------------+-----------+
+| :class:`LazyMemmapStorage`    | 3.44x     |
+-------------------------------+-----------+
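
As a rough illustration of how multipliers like the ones in this table can be derived, the sketch below times a sampling callable and expresses a candidate as a ratio against a baseline. It assumes the table's values are latency ratios relative to a baseline storage whose row is not shown in this excerpt. The two samplers in the demo are stand-ins built on a plain Python list, not the TorchRL storages, and `mean_latency` and `speedup` are hypothetical helper names.

```python
import random
import statistics
import time
from typing import Callable


def mean_latency(sample_fn: Callable[[], object], repeats: int = 100) -> float:
    """Return the mean wall-clock time, in seconds, of calling sample_fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        sample_fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def speedup(baseline_latency: float, candidate_latency: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_latency <= 0:
        raise ValueError("candidate_latency must be positive")
    return baseline_latency / candidate_latency


if __name__ == "__main__":
    # Stand-in samplers (assumption): two ways of drawing 256 items from a list.
    data = list(range(10_000))
    baseline = mean_latency(lambda: [random.choice(data) for _ in range(256)])
    candidate = mean_latency(lambda: random.sample(data, 256))
    print(f"{speedup(baseline, candidate):.2f}x")
```

A value above 1 means the candidate sampled faster than the baseline on that run. Results from a loop like this vary between machines and runs, so they are best read as rough indications.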


Adi Goel and others added 27 commits October 27, 2022 12:44

Distributed replay buffer prototype

1f8d25a

Fixes comment issue

dffce19

Makes ReplayBufferNode subclass TensorDictReplayBuffer

1c1cca9

aha

e7735af

Merge branch 'pytorch:main' into main

421bbfc

amend

4bddc59

Merge branch 'index_memmap' of github.com:pytorch/rl

b25a2ec

bf

0ee8fb0

Merge branch 'index_memmap' of github.com:pytorch/rl

f329012

Fixes print statements and removes redundant Collector arg

7d4c6e5

Fixes print statements and removes redundant Collector arg

f90c53e

Merge branch 'main' of github.com:adityagoel4512/rl

bb77423

amend

f7afa74

Merge branch 'index_memmap' of github.com:pytorch/rl

1a2909d

amend

9a0d7b1

Merge branch 'index_memmap' of github.com:pytorch/rl

08abc27

Timing larger tensordict transfers over torch rpc using multiple storage types

ed87326

init

2dfcd00

amend

3ff9380

amend

24a0438

Merge branch 'memmap_improve' into benchmark-storage-types

641ee77

# Conflicts: # test/test_memmap.py # torchrl/data/tensordict/memmap.py

update_ and make sure content is read

e6132c1

amend

8189be6

amend

ac18ed3

Merge pull request #1 from vmoens/benchmark-storage-types

b15c98f

Optimization

Fixes list storage arg

02eb003

Moves benchmark to new top-level directory and adds note in documentation about speed up using MemmapTensor

5281c4e

facebook-github-bot added the CLA Signed label Nov 1, 2022

Resolves conflict and runs linter

b56fc02

Adi Goel added 2 commits November 1, 2022 14:03

Removes analysis.ipynb

094caea

Removes accidental edit to tensordict.py

b0c859a

vmoens changed the title ~~Benchmark storage types~~ [Feature] Benchmark storage types Nov 1, 2022

Adi Goel added 3 commits November 3, 2022 18:23

Merge branch 'main' of github.com:pytorch/rl into benchmark-storage-types and update benchmark to use RemoteTensorDictReplayBuffer

97cb74e

Updates data.rst text

9e2e279

Removes redundant variable

d8e7cfe

adityagoel4512 marked this pull request as ready for review November 3, 2022 18:32

Adi Goel and others added 3 commits November 3, 2022 18:48

Removes hack to get list read to work

246cf32

replace assert_allclose with assert_close (pytorch#644)

7faf1e8

Merge branch 'main' of github.com:pytorch/rl into benchmark-storage-types

246364c

vmoens force-pushed the main branch 2 times, most recently from 00c3963 to ada0fcd November 4, 2022 13:31

vmoens approved these changes Nov 4, 2022

vmoens added the documentation label Nov 4, 2022

Adds small note illustrating example usage

790f07b

vmoens merged commit 49af5da into pytorch:main Nov 4, 2022
