[Feature] Batched actions wrapper (#2018)
Showing 9 changed files with 435 additions and 7 deletions.
@@ -0,0 +1,62 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""Example of a dummy multi-step agent.

A multi-step actor predicts a macro (or an action sequence) and executes it regardless of the observations
coming in the meantime.
The core component of this example is the `MultiStepActorWrapper` class.
`MultiStepActorWrapper` handles the calls to the actor when the macro has run out of actions or
when the environment has been reset (which is indicated by the InitTracker transform).
"""

import torch.nn
from tensordict.nn import TensorDictModule as Mod, TensorDictSequential as Seq
from torchrl.envs import (
    CatFrames,
    Compose,
    GymEnv,
    InitTracker,
    SerialEnv,
    TransformedEnv,
)
from torchrl.modules.tensordict_module.actors import MultiStepActorWrapper

time_steps = 6
n_obs = 4
n_action = 2
batch = 5


# Reshapes the flat CatFrames output into a stack of frames:
# (..., time_steps * n_obs) -> (..., time_steps, n_obs)
def reshape_cat(data: torch.Tensor):
    return data.unflatten(-1, (time_steps, n_obs))


# an actor that reads `time_steps` frames and outputs one action per frame
# (actions are conditioned on the observations of the past `time_steps` steps)
actor_base = Seq(
    Mod(reshape_cat, in_keys=["obs_cat"], out_keys=["obs_cat_reshape"]),
    Mod(
        torch.nn.Linear(n_obs, n_action),
        in_keys=["obs_cat_reshape"],
        out_keys=["action"],
    ),
)
# Wrap the actor to dispatch the actions one step at a time
actor = MultiStepActorWrapper(actor_base, n_steps=time_steps)

env = TransformedEnv(
    SerialEnv(batch, lambda: GymEnv("CartPole-v1")),
    Compose(
        InitTracker(),
        CatFrames(N=time_steps, in_keys=["observation"], out_keys=["obs_cat"], dim=-1),
    ),
)

print(env.rollout(100, policy=actor, break_when_any_done=False))
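
The dispatch rule described in the docstring above (query the actor only when the macro is exhausted or the environment was just reset) can be illustrated without TorchRL. The sketch below is not the `MultiStepActorWrapper` implementation; `MacroDispatcher`, `toy_policy`, and the `policy_calls` counter are hypothetical names introduced only for illustration, and the `is_init` flag stands in for the signal that `InitTracker` provides.

```python
from collections import deque
from typing import Callable, Deque, List


class MacroDispatcher:
    """Hypothetical stand-in for the dispatch logic (not the TorchRL class)."""

    def __init__(self, policy: Callable[[int], List[int]], n_steps: int):
        self.policy = policy
        self.n_steps = n_steps
        self.queue: Deque[int] = deque()
        self.policy_calls = 0  # counts how often the inner policy is queried

    def __call__(self, observation: int, is_init: bool) -> int:
        # Re-plan only after a reset or once the current macro is exhausted;
        # observations seen in between are ignored.
        if is_init or not self.queue:
            macro = list(self.policy(observation))
            if len(macro) != self.n_steps:
                raise ValueError("policy must return exactly n_steps actions")
            self.queue = deque(macro)
            self.policy_calls += 1
        return self.queue.popleft()


def toy_policy(observation: int) -> List[int]:
    # Dummy macro: repeat the observation that triggered the plan three times.
    return [observation] * 3


dispatcher = MacroDispatcher(toy_policy, n_steps=3)
observations = [1, 2, 3, 4, 5, 6, 7]
# The environment is reset before step 0 and again before step 4.
is_init_flags = [True, False, False, False, True, False, False]
actions = [dispatcher(o, f) for o, f in zip(observations, is_init_flags)]
print(actions, dispatcher.policy_calls)
```

Each action equals the observation that triggered the plan it came from, so the output makes visible which observations were actually consumed: the reset before step 4 discards the two actions left over from the macro planned at step 3.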
File renamed without changes.
File renamed without changes.
76b296d
Possible performance regression was detected for benchmark 'CPU Benchmark Results'.
Benchmark result of this commit is worse than the previous benchmark result, exceeding the threshold of 2.

Benchmark | Current | Previous | Ratio |
---|---|---|---|
.benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 332.38019103358624 iter/sec (stddev: 0.011652557506936787) | 755.4771793311247 iter/sec (stddev: 0.00007278786761598838) | 2.27 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @vmoens
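
For reference, the reported ratio can be reproduced from the two throughput figures in the comment. This is a minimal sketch that assumes the ratio is the previous throughput divided by the current one (which matches the reported 2.27) and that the alert fires when the ratio exceeds the threshold; the variable names are illustrative and not taken from the benchmarking tool.

```python
# Throughput figures copied from the benchmark comment (iterations per second).
current_ips = 332.38019103358624
previous_ips = 755.4771793311247
threshold = 2.0  # alert threshold stated in the comment

# Assumption: for iter/sec metrics, higher is better, so a slowdown
# shows up as previous / current > 1.
ratio = previous_ips / current_ips
is_regression = ratio > threshold

print(f"ratio={ratio:.2f}, regression={is_regression}")
```

Note that the standard deviation of the current run is far larger than that of the previous run, so the measured ratio may partly reflect noise on the benchmark machine rather than a real slowdown.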