[Refactor] Graduate Replay Buffer prototype #794

KamilPiechowiak · 2023-01-04T15:48:01Z

Description

This change replaces the ReplayBuffer implementations in replay_buffers.py with the composable implementations from rb_prototype.py.
This is a breaking change: the ReplayBuffer constructor no longer accepts a size parameter, which is now passed only to the underlying Storage. This change also makes the return type of sample consistent across ReplayBuffer and its derived classes, which is a breaking change as well. That is why many files needed to be updated.

Motivation and Context

Why is this change required? What problem does it solve?
This change makes it possible to create new types of replay buffers by composing samplers and writers.
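
To illustrate what composing a storage, a sampler and a writer can look like, here is a minimal pure-Python sketch. All class names below (ListStorageSketch, RoundRobinWriterSketch, UniformSamplerSketch, ComposableBufferSketch) are hypothetical and are not the torchrl classes; the sketch only assumes the split of responsibilities described above, with the capacity given to the storage rather than to the buffer.

```python
import random
from typing import Any, List


class ListStorageSketch:
    """Holds the data. The capacity lives here, not in the buffer."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self._items: List[Any] = []

    def __len__(self) -> int:
        return len(self._items)

    def set(self, index: int, item: Any) -> None:
        # Append while the storage is still filling up, overwrite afterwards.
        if index == len(self._items):
            self._items.append(item)
        else:
            self._items[index] = item

    def get(self, index: int) -> Any:
        return self._items[index]


class RoundRobinWriterSketch:
    """Decides where new items go: the oldest slot is overwritten when full."""

    def __init__(self) -> None:
        self._cursor = 0

    def add(self, storage: ListStorageSketch, item: Any) -> int:
        index = self._cursor
        storage.set(index, item)
        self._cursor = (self._cursor + 1) % storage.max_size
        return index


class UniformSamplerSketch:
    """Decides which items are returned: uniform sampling with replacement."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def sample(self, storage: ListStorageSketch, batch_size: int) -> List[int]:
        return [self._rng.randrange(len(storage)) for _ in range(batch_size)]


class ComposableBufferSketch:
    """Thin shell that delegates to the three components."""

    def __init__(self, storage, sampler, writer) -> None:
        self._storage = storage
        self._sampler = sampler
        self._writer = writer

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, item: Any) -> int:
        return self._writer.add(self._storage, item)

    def sample(self, batch_size: int) -> List[Any]:
        indices = self._sampler.sample(self._storage, batch_size)
        return [self._storage.get(i) for i in indices]


buffer = ComposableBufferSketch(
    storage=ListStorageSketch(max_size=3),
    sampler=UniformSamplerSketch(seed=0),
    writer=RoundRobinWriterSketch(),
)
for step in range(5):
    buffer.add({"step": step})
batch = buffer.sample(batch_size=4)
```

In this sketch, a different sampling or writing policy is obtained by swapping one component while the buffer class itself stays unchanged.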

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.

vmoens · 2023-01-05T12:13:16Z

README.md

@@ -335,7 +334,7 @@ The associated [`SafeModule` class](torchrl/modules/tensordict_module/common.py)
    ```python
    from torchrl.objectives import DQNLoss
    loss_module = DQNLoss(value_network=value_network, gamma=0.99)
-    tensordict = replay_buffer.sample(batch_size)
+    tensordict = replay_buffer.sample(batch_size)[0]


What's your view on this?
I feel that the [0] index is bothersome. With TensorDictReplayBuffer we read the second output from sample (info) and write the content in the tensordict.
Maybe we should just have a def sample(self, ..., return_info=False) flag that can be turned on when needed.
Wdyt?

I also don't like the [0] index. However, the version of TensorDictReplayBuffer proposed in rb_prototype.py also returns 2 values from sample(). While migrating, I decided to keep this logic. If we change it, we break consistency with the parent class ReplayBuffer. I don't know if there are cases when users would like to use both ReplayBuffer and TensorDictReplayBuffer interchangeably in a single piece of code. If this is not the case, we can stop returning info (or return it only if the flag is set).

OK, I see two ways of going about this:

1. Branch off your branch and give sample() a single output, merge that into your branch after review, and ship the whole thing. Advantage: we don't BC-break nightly twice.

2. Ship this PR as it is and address this in a follow-up.

Any pref?

I don't know if there are cases when users would like to use both ReplayBuffer and TensorDictReplayBuffer

It's better if they're consistent, even if they're not interchangeable

Sorry for the late reply (one needs to refresh the page to see new comments).
So which solution do we choose? You said that they should be consistent (both TensorDictReplayBuffer and ReplayBuffer returning 2 values by default), but also that we should change sample() in TensorDictReplayBuffer to return 1 value, which breaks consistency. Or shall we also remove info from the value returned by sample() in ReplayBuffer and return it conditionally there too? With this option we would always return one value, and two values only if return_info=True.

I think that the last approach (1 value by default in base and derived classes) is best.
If you agree, I'll make a new PR with this change.

yeah, one value by default unless return_info = True
Thanks for taking care of this!
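
The behaviour agreed on here (one value by default, two values only when return_info=True, in both the base and the derived class) can be sketched as follows. This is a minimal illustration with hypothetical classes (SketchReplayBuffer, DictSketchBuffer) built on plain Python lists and dicts; it is not the torchrl implementation, and the "index" key in info is an assumption made for the example.

```python
import random
from typing import Any, Dict, List


class SketchReplayBuffer:
    """Hypothetical base buffer: sample() returns data only by default."""

    def __init__(self, seed: int = 0) -> None:
        self._items: List[Any] = []
        self._rng = random.Random(seed)

    def add(self, item: Any) -> None:
        self._items.append(item)

    def sample(self, batch_size: int, return_info: bool = False):
        indices = [self._rng.randrange(len(self._items)) for _ in range(batch_size)]
        data = [self._items[i] for i in indices]
        if not return_info:
            return data  # single return value, no [0] indexing needed
        info: Dict[str, Any] = {"index": indices}
        return data, info


class DictSketchBuffer(SketchReplayBuffer):
    """Hypothetical derived buffer: copies info into each sampled dict,
    but keeps the same return convention as the parent class."""

    def sample(self, batch_size: int, return_info: bool = False):
        data, info = super().sample(batch_size, return_info=True)
        data = [dict(item, index=i) for item, i in zip(data, info["index"])]
        if return_info:
            return data, info
        return data


rb = DictSketchBuffer(seed=0)
for step in range(10):
    rb.add({"step": step})

batch = rb.sample(4)                           # one value by default
batch2, info = rb.sample(4, return_info=True)  # two values on request
```

With this convention the README snippet can keep calling replay_buffer.sample(batch_size) without indexing the result.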

I've updated this PR with the required changes.

Change returned values of ReplayBuffer sample method

vmoens

Amazing! Can you look at the 2 comments I left?
Brilliant work, I love it.

torchrl/data/replay_buffers/replay_buffers.py

vmoens · 2023-01-06T14:28:53Z

the lint seems to be failing, can you fix that?

KamilPiechowiak · 2023-01-06T14:39:40Z

the lint seems to be failing, can you fix that?

Linting fails for two reasons:

The 'Transform' type annotations are written as strings, and the linter doesn't like that. I did it this way to avoid circular references from envs.transforms. It is possible to remove this linter error by defining a global variable Transform = 'Transform' and using it in the annotations, but that doesn't look good either. Maybe you have other ideas on how to solve it?
ReplayBuffer is imported in trainers.py. It is not used there, but another module imports it from trainers.py. I'll find that import and change it to import from torchrl.data instead.

vmoens · 2023-01-06T14:41:51Z

The 'Transform' type annotations are written as strings, and the linter doesn't like that. I did it this way to avoid circular references from envs.transforms. It is possible to remove this linter error by defining a global variable Transform = 'Transform' and using it in the annotations, but that doesn't look good either. Maybe you have other ideas on how to solve it?

If that's the problem, you can just put a # noqa: F821 comment at the end of the lines where the lint complains.
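
As an alternative to suppressing the warning, a common pattern for string annotations that would otherwise cause a circular import is to guard the import with typing.TYPE_CHECKING. The sketch below is only an illustration: BufferWithTransform is a hypothetical stand-in rather than the torchrl class, and the import path is taken from the envs.transforms module named in this thread. The guarded import is never executed at runtime, so the snippet runs without torchrl installed, and linters based on pyflakes should then see the name as defined.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime, so it
    # cannot create an import cycle. Path assumed from the discussion.
    from torchrl.envs.transforms import Transform


class BufferWithTransform:
    """Hypothetical stand-in for a replay buffer that holds a transform."""

    def __init__(self, transform: "Optional[Transform]" = None) -> None:
        self.transform = transform

    def append_transform(self, transform: "Transform") -> None:
        # The annotation stays a string at runtime; it is never evaluated
        # unless something calls typing.get_type_hints on this method.
        self.transform = transform


buffer = BufferWithTransform()
buffer.append_transform(object())  # any object works at runtime
```

The trade-off is an extra import block in each module that uses the annotation, whereas the noqa comment is a one-line change per offending line.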

KamilPiechowiak · 2023-01-06T16:44:49Z

Reopening. Merge conflicts resolved.

Kamil Piechowiak added 2 commits January 4, 2023 15:06

Graduate Replay Buffer prototype

0945d1e

Merge Replay Buffer graduation

57e1d31

facebook-github-bot added the CLA Signed label Jan 4, 2023

vmoens added the enhancement and bc breaking labels Jan 5, 2023

vmoens reviewed Jan 5, 2023

vmoens changed the title ~~Graduate Replay Buffer prototype~~ [Refactor] Graduate Replay Buffer prototype Jan 5, 2023

Kamil Piechowiak and others added 3 commits January 5, 2023 18:30

change returned values of ReplayBuffer sample method

206ac97

Merge pull request #1 from KamilPiechowiak/sample_return

b4f719e

Change returned values of ReplayBuffer sample method

Fix sample return values indexing in rpc

f721ff6

vmoens approved these changes Jan 6, 2023

torchrl/data/replay_buffers/replay_buffers.py (resolved)

torchrl/data/replay_buffers/replay_buffers.py (resolved)

add docstrings to sample method in replay buffers

de0a1e6

fix linter errors in replay_buffer.py and trainers.py

d553063

KamilPiechowiak closed this Jan 6, 2023

KamilPiechowiak force-pushed the main branch from d553063 to e8ba8d3 January 6, 2023 16:28

sync with remote

7b45803

KamilPiechowiak reopened this Jan 6, 2023

vmoens approved these changes Jan 7, 2023

vmoens merged commit 569161e into pytorch:main Jan 7, 2023
