[Refactor] Update all instances of exploration `Wrapper` to `Module` #2298

kurtamohler · 2024-07-19T20:29:34Z

Description

Update instances of

AdditiveGaussianWrapper --> AdditiveGaussianModule
OrnsteinUhlenbeckProcessWrapper --> OrnsteinUhlenbeckProcessModule

everywhere in the code base, except in test/test_exploration.py, which should still test both the wrappers and modules until we finally remove the wrappers in the future.

Motivation and Context

close #2295

I have raised an issue to propose this change (required for new features and bug fixes)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.

pytorch-bot · 2024-07-19T20:29:36Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2298

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Pending, 3 Unrelated Failures

As of commit 085bef2 with merge base bdc9784:

NEW FAILURES - The following jobs have failed:

Continuous Benchmark (PR) / CPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Continuous Benchmark (PR) / GPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t ba7908e7c0e8344ef0340e0037bf3eac7930d57ae660aeae20bf2a7649045a38 /exec failed with exit code 139

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job (gh) (trunk failure)
AttributeError: module 'torch' has no attribute 'compiler'
Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job (gh) (trunk failure)
AttributeError: module 'torch' has no attribute 'compiler'
Unit-tests on Windows / unittests-cpu / windows-job (gh) (trunk failure)
test/test_transforms.py::TestActionDiscretizer::test_trans_parallel_env_check[False]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kurtamohler · 2024-07-19T20:36:29Z

sota-implementations/ddpg/ddpg.py

@@ -108,7 +108,7 @@ def main(cfg: "DictConfig"):  # noqa: F821
    for _, tensordict in enumerate(collector):
        sampling_time = time.time() - sampling_start
        # Update exploration policy
-        exploration_policy.step(tensordict.numel())
+        exploration_policy[1].step(tensordict.numel())


I'm not completely sure if this is the best way to do this. Does there happen to be some alternative to TensorDictSequential which does essentially the same thing but also provides a step function?

Well, it looks like the same thing was done when EGreedyWrapper was updated to EGreedyModule, so I guess it's alright:

rl/sota-implementations/bandits/dqn.py, lines 89 to 91 in bdc9784:

    policy = TensorDictSequential(
        actor,
        EGreedyModule(

rl/sota-implementations/bandits/dqn.py, line 125 in bdc9784:

    policy[1].step()

vmoens

LGTM thanks for this
As a follow-up, we could consider refactoring the update methods; happy to read your thoughts about this.

vmoens · 2024-07-22T07:56:14Z

sota-implementations/ddpg/ddpg.py

@@ -108,7 +108,7 @@ def main(cfg: "DictConfig"):  # noqa: F821
    for _, tensordict in enumerate(collector):
        sampling_time = time.time() - sampling_start
        # Update exploration policy
-        exploration_policy.step(tensordict.numel())
+        exploration_policy[1].step(tensordict.numel())


Yep either that or

    def update_exploration(module):
        if isinstance(module, ExplorationModule):
            module.set()

    policy.apply(update_exploration)

We could make sure that all exploration modules have the same parent class and use that update function across examples.
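
To make the suggestion concrete, here is a minimal sketch of a shared parent class plus one update function. Everything in it is an assumption for illustration: `ExplorationModule` is a hypothetical base class (not an existing torchrl class), the linear annealing is a stand-in for whatever each module really does, and a plain `nn.Sequential` stands in for `TensorDictSequential`. The function calls `step()`, matching the `step` calls in the diffs of this PR, rather than the `set()` written in the snippet above.

```python
import torch
from torch import nn


class ExplorationModule(nn.Module):
    """Hypothetical shared parent for exploration modules (not a torchrl class)."""

    def __init__(self, sigma_init=1.0, sigma_end=0.1, annealing_num_steps=1000):
        super().__init__()
        self.sigma_init = sigma_init
        self.sigma_end = sigma_end
        self.annealing_num_steps = annealing_num_steps
        self.sigma = sigma_init

    def step(self, frames=1):
        # Linearly anneal sigma towards sigma_end, never going below it.
        decrement = frames * (self.sigma_init - self.sigma_end) / self.annealing_num_steps
        self.sigma = max(self.sigma_end, self.sigma - decrement)

    def forward(self, action):
        # Add Gaussian noise scaled by the current sigma.
        return action + self.sigma * torch.randn_like(action)


def update_exploration(module):
    # nn.Module.apply calls this on every submodule (and on the policy itself),
    # so the position of the exploration module in the policy does not matter.
    if isinstance(module, ExplorationModule):
        module.step()


# nn.Sequential is only a stand-in for TensorDictSequential in this sketch.
policy = nn.Sequential(nn.Linear(4, 2), ExplorationModule(annealing_num_steps=10))
policy.apply(update_exploration)
```

Compared with `policy[1].step()`, this does not hard-code the index of the exploration module, at the cost of visiting every submodule on each update.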

vmoens · 2024-07-22T07:57:15Z

sota-implementations/multiagent/maddpg_iddpg.py

@@ -200,7 +203,7 @@ def train(cfg: "DictConfig"):  # noqa: F821
                optim.zero_grad()
                target_net_updater.step()

-        policy_explore.step(frames=current_frames)  # Update exploration annealing
+        policy_explore[1].step(frames=current_frames)  # Update exploration annealing


In the example I gave above, update_exploration (or step_exploration) should be turned into a class so that we can pass it current_frames.
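
One possible shape for that class is sketched below, under stated assumptions: `StepExploration` and `ExplorationModule` are hypothetical names, the frame counter is only there to make the effect visible, and `nn.Sequential` again stands in for `TensorDictSequential`. The callable stores the number of frames and forwards it to every exploration module it meets.

```python
from torch import nn


class ExplorationModule(nn.Module):
    """Hypothetical shared parent for exploration modules (not a torchrl class)."""

    def __init__(self):
        super().__init__()
        self.frames_seen = 0

    def step(self, frames=1):
        # A real module would anneal its noise here; this sketch only counts frames.
        self.frames_seen += frames

    def forward(self, x):
        return x


class StepExploration:
    """Hypothetical callable that carries the frame count into Module.apply."""

    def __init__(self, frames=1):
        self.frames = frames

    def __call__(self, module):
        if isinstance(module, ExplorationModule):
            module.step(frames=self.frames)


# nn.Sequential is only a stand-in for TensorDictSequential in this sketch.
policy_explore = nn.Sequential(nn.Linear(4, 2), ExplorationModule())
current_frames = 128
policy_explore.apply(StepExploration(frames=current_frames))
```

A `functools.partial` over a two-argument function would work as well; a class simply leaves room for more state later.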

kurtamohler requested a review from vmoens July 19, 2024 20:29

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 19, 2024

Update all instances of exploration *Wrapper to *Module

b40430b

kurtamohler force-pushed the update-exploration-modules-0 branch from 9c4ccbd to b40430b Compare July 19, 2024 21:08

vmoens changed the title ~~Update all instances of exploration *Wrapper to *Module~~ [Refactor] Update all instances of exploration *Wrapper to *Module Jul 22, 2024

empty

085bef2

vmoens added the Refactoring Refactoring of an existing feature label Jul 22, 2024

vmoens approved these changes Jul 22, 2024

vmoens merged commit 87f66e8 into pytorch:main Jul 22, 2024
49 of 55 checks passed
