[Feature] Allow multiple (nested) action, reward, done keys in env, `vec_env` and `collectors` (pytorch#1462)

Signed-off-by: Matteo Bettini <matbet@meta.com>
Co-authored-by: vmoens <vincentmoens@gmail.com>
matteobettini and vmoens authored Aug 30, 2023
1 parent 16ce926 commit f8777a6
Showing 17 changed files with 1,411 additions and 566 deletions.
6 changes: 3 additions & 3 deletions benchmarks/test_envs_benchmark.py
@@ -118,9 +118,9 @@ def test_step_mdp_speed(
     benchmark(
         step_mdp,
         td,
-        action_key=action_key,
-        reward_key=reward_key,
-        done_key=done_key,
+        action_keys=action_key,
+        reward_keys=reward_key,
+        done_keys=done_key,
         keep_other=keep_other,
         exclude_reward=exclude_reward,
         exclude_done=exclude_done,
8 changes: 4 additions & 4 deletions docs/source/reference/envs.rst
@@ -38,10 +38,10 @@ Each env will have the following attributes:
 - :obj:`env.done_spec`: a :class:`~torchrl.data.TensorSpec` object representing
   the done-flag spec.
 - :obj:`env.input_spec`: a :class:`~torchrl.data.CompositeSpec` object containing
-  all the input keys (:obj:`"_action_spec"` and :obj:`"_state_spec"`).
+  all the input keys (:obj:`"full_action_spec"` and :obj:`"full_state_spec"`).
   It is locked and should not be modified directly.
 - :obj:`env.output_spec`: a :class:`~torchrl.data.CompositeSpec` object containing
-  all the output keys (:obj:`"_observation_spec"`, :obj:`"_reward_spec"` and :obj:`"_done_spec"`).
+  all the output keys (:obj:`"full_observation_spec"`, :obj:`"full_reward_spec"` and :obj:`"full_done_spec"`).
   It is locked and should not be modified directly.

 Importantly, the environment spec shapes should contain the batch size, e.g.
@@ -340,8 +340,8 @@ single agent standards.
 spec if the accessed spec is Composite. Therefore, if in the example above
 we run `env.reward_spec` after env creation, we would get the same output as `torch.stack(reward_specs)}`.
 To get the full composite spec with the "agents" key, you can run
-`env.output_spec["_reward_spec"]`. The same is valid for action and done specs.
-Note that `env.reward_spec == env.output_spec["_reward_spec"][env.reward_key]`.
+`env.output_spec["full_reward_spec"]`. The same is valid for action and done specs.
+Note that `env.reward_spec == env.output_spec["full_reward_spec"][env.reward_key]`.


 Transforms
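
The relation stated in the updated docs note, `env.reward_spec == env.output_spec["full_reward_spec"][env.reward_key]`, can be illustrated with a small plain-Python sketch. This is not TorchRL code: nested dictionaries stand in for the composite specs, strings stand in for leaf specs, and the helper `get_nested` and the `("agents", "reward")` key are assumptions made only for illustration.

```python
# Plain-Python stand-in for the spec layout described in the docs note.
# Nothing here calls TorchRL; dicts play the role of composite specs.
from functools import reduce


def get_nested(mapping, key):
    """Look up a key given either as a string or as a tuple of strings."""
    path = (key,) if isinstance(key, str) else tuple(key)
    return reduce(lambda node, part: node[part], path, mapping)


# Hypothetical multi-agent layout: the reward leaf sits under an "agents" entry,
# while the done leaf sits at the top level.
output_spec = {
    "full_reward_spec": {"agents": {"reward": "leaf-reward-spec"}},
    "full_done_spec": {"done": "leaf-done-spec"},
}
reward_key = ("agents", "reward")  # assumed nested reward key

# The "full" entry keeps the whole nested structure, including "agents".
full_reward_spec = output_spec["full_reward_spec"]

# The leaf accessor is the full spec indexed by the (possibly nested) key.
reward_spec = get_nested(full_reward_spec, reward_key)
done_spec = get_nested(output_spec["full_done_spec"], "done")
```

Under these assumptions, the leaf accessor always equals the full composite entry indexed by the key, whether that key is a plain string or a nested tuple, which is the property the renamed `full_*_spec` entries are documented to preserve.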