[Feature] Add PrioritizedSliceSampler #1875

Cadene · 2024-02-06T01:32:10Z

Description

Add PrioritizedSliceSampler, a subclass of both PrioritizedSampler and SliceSampler. It samples slices according to priority weights.

In contrast to PrioritizedSampler, which selects individual steps according to their priority weights, PrioritizedSliceSampler selects whole slices, each with its own priority weight. The implementation only samples the start index of each slice, and it excludes the indices near the end of each episode, since they cannot start a slice of sufficient length.
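A minimal sketch of the start-index idea described above, in plain Python rather than the actual TorchRL code. The names `valid_start_mask` and `sample_slices` are hypothetical, and the sketch assumes that each trajectory is stored as one contiguous run of steps identified by a per-step trajectory ID.

```python
import random


def valid_start_mask(traj_ids, slice_len):
    """Mark the steps from which a full slice of `slice_len` steps fits in one trajectory.

    Assumes each trajectory occupies one contiguous run of `traj_ids`.
    """
    n = len(traj_ids)
    mask = []
    for i in range(n):
        end = i + slice_len - 1  # last step the slice would cover
        mask.append(end < n and traj_ids[end] == traj_ids[i])
    return mask


def sample_slices(traj_ids, priorities, slice_len, num_slices, rng=random):
    """Draw slice start indices in proportion to their priority (hypothetical helper)."""
    mask = valid_start_mask(traj_ids, slice_len)
    # Steps too close to the end of an episode get zero weight, so they are never drawn.
    weights = [p if ok else 0.0 for p, ok in zip(priorities, mask)]
    starts = rng.choices(range(len(traj_ids)), weights=weights, k=num_slices)
    return [list(range(s, s + slice_len)) for s in starts]


# Three trajectories of lengths 4, 2 and 3; the second is too short for a slice of 3.
traj_ids = [0, 0, 0, 0, 1, 1, 2, 2, 2]
mask = valid_start_mask(traj_ids, slice_len=3)
slices = sample_slices(traj_ids, [1.0] * len(traj_ids), 3, 5, rng=random.Random(0))
```

In this toy buffer only indices 0, 1 and 6 can start a slice, so every sampled slice stays inside a single trajectory.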

Unlike SliceSampler, which first samples trajectories and then picks a random slice within each one, PrioritizedSliceSampler samples slices directly. This simplifies the implementation, but it also means that, under the default uniform priority weights, slices from longer trajectories may be sampled more often than slices from shorter ones. The initial priority weights can nonetheless be set manually to encode any prior over the sampling. Also, as priority weights are updated during training, the sampler should adapt, which should mitigate the oversampling of longer trajectories over time.
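To make the length bias concrete, here is a small sketch under one stated assumption: with uniform priorities, every valid start index is equally likely. Both helper names are hypothetical and are not part of TorchRL.

```python
def trajectory_sampling_probs(traj_lengths, slice_len):
    """Probability that a uniformly drawn valid start index falls in each trajectory."""
    valid_starts = [max(length - slice_len + 1, 0) for length in traj_lengths]
    total = sum(valid_starts)
    if total == 0:
        raise ValueError("no trajectory is long enough for a slice")
    return [v / total for v in valid_starts]


def balancing_weights(traj_lengths, slice_len):
    """Per-start initial weights that give every long-enough trajectory the same total weight."""
    weights = []
    for length in traj_lengths:
        n_valid = max(length - slice_len + 1, 0)
        weights.append(1.0 / n_valid if n_valid else 0.0)
    return weights


# A short and a long trajectory, with slices of 5 steps:
# 6 valid starts in the first, 96 in the second.
probs = trajectory_sampling_probs([10, 100], slice_len=5)
init_weights = balancing_weights([10, 100], slice_len=5)
```

Here the long trajectory holds 96 of the 102 valid start indices, so it would receive about 94% of the samples, whereas sampling trajectories first would give each 50%. Assigning each start the weight returned by `balancing_weights` is one possible manual prior that restores equal mass per trajectory.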

Motivation and Context

This type of sampling is used in the literature:
https://github.com/fyhMer/fowm/blob/main/src/algorithm/helper.py#L504-L510
https://github.com/fyhMer/fowm/blob/main/src/algorithm/tdmpc.py#L334-L337

Addresses this feature request: #1876

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Documentation (update in the documentation)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.

pytorch-bot · 2024-02-06T01:32:13Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1875

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (16 Unrelated Failures)

As of commit 6e68587 with merge base b34e2d2:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

Continuous Benchmark (PR) / CPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Continuous Benchmark (PR) / GPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Examples Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t 7f4f3e3cf529ee1643e23d9fd3351cd756a8d2cdf791052e054dc819ea3181f6 /exec failed with exit code 1
Habitat Tests on Linux / tests (3.9, 11.6) / linux-job (gh)
RuntimeError: Command docker exec -t 54a6533ed5e32ec424a94bc1a38a529d584fa4a330914342a7d807312ee1f538 /exec failed with exit code 139
Unit-tests on Windows / unittests-gpu / windows-job (gh)
##[error]The operation was canceled.

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Unit-tests on Linux / tests-cpu (3.10) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-cpu (3.11) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-cpu (3.8) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-cpu (3.9) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-gpu (3.8, 12.1) / linux-job (gh)
##[error]The operation was canceled.
Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-optdeps (3.9, 12.1) / linux-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Linux / tests-stable-gpu (3.8, 11.8) / linux-job (gh)
##[error]The operation was canceled.
Unit-tests on MacOS CPU / tests (3.11) / macos-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on MacOS CPU / tests (3.8) / macos-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]
Unit-tests on Windows / unittests-cpu / windows-job (gh)
test/test_cost.py::TestPPO::test_ppo_diff[device0-None-False-KLPENPPOLoss]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torchrl/data/replay_buffers/samplers.py

test/test_rb.py

vmoens · 2024-02-06T07:59:54Z

Amazing! Will do a proper review shortly. Any chance we can get the bugfix in a separate PR to put it in the minor release?

…pler

vmoens

Wonderful! Thanks, it's a super useful feature.
I left a couple of comments; most are just suggestions, so don't feel like you need to address them all.

If we can just move the bugfix to a separate PR, I'll merge that one straight away and then we can move to this.

torchrl/data/replay_buffers/samplers.py

vmoens · 2024-02-06T10:31:09Z

torchrl/data/replay_buffers/samplers.py

+        starts = torch.from_numpy(starts).to(device=lengths.device)
+        index = self._tensor_slices_from_startend(seq_length, starts)


note to self: we should avoid returning numpy arrays and stick to torch

cc @albertbou92
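As an illustration of what "sticking to torch" could look like for this step, the sketch below draws start indices with `torch.multinomial` so that no NumPy array is involved. This is an alternative technique shown for illustration only, not the sampling path used in the PR, and `sample_starts` is a hypothetical helper.

```python
import torch


def sample_starts(priorities: torch.Tensor, valid: torch.Tensor, num_slices: int) -> torch.Tensor:
    """Draw start indices in proportion to priority, restricted to valid starts."""
    # Zero out the weight of indices that cannot start a full slice.
    weights = priorities * valid.to(priorities.dtype)
    # torch.multinomial returns int64 indices on the same device as `weights`.
    return torch.multinomial(weights, num_slices, replacement=True)


priorities = torch.tensor([1.0, 2.0, 0.5, 1.0, 3.0])
valid = torch.tensor([True, True, False, True, False])
starts = sample_starts(priorities, valid, num_slices=8)
```

Because the result is already a `torch.long` tensor on the right device, no `torch.from_numpy(...).to(device=...)` round trip is needed in this sketch.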

torchrl/data/replay_buffers/samplers.py

vmoens · 2024-02-06T10:35:09Z

torchrl/data/replay_buffers/samplers.py

+                    terminated_key: terminated,
+                }
+            )
+        return index.to(torch.long), info


we should check if there isn't a way to enforce this dtype earlier and avoid casting

FYI I borrowed this logic from SliceSampler

rl/torchrl/data/replay_buffers/samplers.py

Line 932 in 144f547

return index.to(torch.long), {}

I checked on my side, and index is already a torch.long, so the cast is a no-op
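The no-op behaviour mentioned above can be checked in isolation. This is a standalone sketch with made-up tensors, not the sampler code: `Tensor.to` returns the tensor itself when the dtype and device already match.

```python
import torch

index = torch.arange(6)        # integer ranges default to torch.int64, i.e. torch.long
same = index.to(torch.long)    # dtype and device already match, so no copy is made
is_noop = same is index

float_index = torch.tensor([0.0, 1.0, 2.0])
cast = float_index.to(torch.long)   # a real conversion allocates a new tensor
is_copy = cast is not float_index
```

So the cast only costs something when the index arrives with a different dtype, which is the case the reviewer suggests ruling out earlier.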

torchrl/data/replay_buffers/samplers.py

…ioritized_slice_sampler

Co-authored-by: Vincent Moens <vincentmoens@gmail.com>

Cadene · 2024-02-07T13:57:18Z

> Amazing! Will do a proper review shortly. Any chance we can get the bugfix in a separate PR to put it in the minor release?

@vmoens Done ;) #1884

…pler

Add PrioritizedSliceSampler + few tests

ac4fa9d

facebook-github-bot added the CLA Signed label Feb 6, 2024

Improve docstring + remove comments

74a4bee


Merge remote-tracking branch 'origin/main' into prioritized_slice_sam…

ffba3dc

…pler

vmoens added the enhancement label Feb 6, 2024

vmoens approved these changes Feb 6, 2024

vmoens mentioned this pull request Feb 6, 2024

[Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible #1881

Merged

Cadene and others added 4 commits February 7, 2024 14:25

Add more tests + Fix + Add docstrings + Improved import in __init__

58e1995

Merge remote-tracking branch 'fork/prioritized_slice_sampler' into pr…

84962c6

…ioritized_slice_sampler

Revert bugfix traj_terminated

109dbb5

Update torchrl/data/replay_buffers/samplers.py

a9d1388

Co-authored-by: Vincent Moens <vincentmoens@gmail.com>

Cadene changed the title ~~Add PrioritizedSliceSampler + Small fix "traj_terminated"~~ Add PrioritizedSliceSampler Feb 7, 2024

vmoens changed the title ~~Add PrioritizedSliceSampler~~ [Feature] Add PrioritizedSliceSampler Feb 7, 2024

vmoens added 4 commits February 7, 2024 16:35

amend

8463e09

Merge remote-tracking branch 'origin/main' into prioritized_slice_sam…

56de68a

…pler

amend

b647345

Merge remote-tracking branch 'origin/main' into prioritized_slice_sam…

6e68587

…pler

vmoens merged commit 4d52d5f into pytorch:main Feb 7, 2024
52 of 68 checks passed
