[Docs] Fix multi-agent tutorial #1599
Conversation
LGTM, see my comment and feel free to merge if you feel it's the right solution
I don't really have a preference
 for tensordict_data in collector:
     tensordict_data.set(
-        ("next", "done"),
+        ("next", "agents", "done"),
         tensordict_data.get(("next", "done"))
         .unsqueeze(-1)
-        .expand(tensordict_data.get(("next", env.reward_key)).shape),
-    )  # We need to expand the done to match the reward shape (this is expected by the value estimator)
+        .expand(tensordict_data.get_item_shape(("next", env.reward_key))),
+    )
+    tensordict_data.set(
+        ("next", "agents", "terminated"),
+        tensordict_data.get(("next", "terminated"))
+        .unsqueeze(-1)
+        .expand(tensordict_data.get_item_shape(("next", env.reward_key))),
+    )
+    # We need to expand the done and terminated to match the reward shape (this is expected by the value estimator)
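For context, a minimal sketch of why the expand is needed; the batch and agent sizes below are made-up illustrative values, not taken from the tutorial. The value estimator expects the done/terminated entries to match the shape of the per-agent reward, so the per-env flag is unsqueezed and broadcast:

import torch

batch, n_agents = 6, 3
done = torch.zeros(batch, 1, dtype=torch.bool)   # per-env flag: [batch, 1]
reward = torch.zeros(batch, n_agents, 1)         # per-agent reward: [batch, n_agents, 1]

# Unsqueeze and expand so the flag matches the reward shape entry-wise,
# which is what the value estimator expects.
agents_done = done.unsqueeze(-1).expand(reward.shape)
print(agents_done.shape)  # torch.Size([6, 3, 1])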
What is the best way to approach this?
A common criticism of our tutorials is that they are too complex. I wonder what will be judged as the more complex thing here: (1) boilerplate code such as this, which IMO has little instructive value, or (2) an ad-hoc transform that may be a bit too custom to be part of the core lib.
Yeah, I agree. What I can do is:
- leave as is
- make the transform in the tutorial
- make the transform in core
- We could also consider allowing rewards and dones with different shapes and expanding in the value functionals
What is your preference?
Expanding in the value functionals seems like a stretch; these classes are already very complex and hard to maintain.
I think these operations are actually twofold: (1) rename without deleting and (2) unsqueeze + expand.
We have a transform that does rename, one that does unsqueeze. We just need one that does expand (or expand_as).
The quickest way forward is this:
Merge as is, open an issue with a feature request for the expand_as transform. Wdyt?
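For illustration, here is a rough sketch of what such an expand_as-style transform could look like. ExpandAsTransform and its ref_key argument are hypothetical names, and the sketch assumes overriding Transform._call is enough for this use; spec handling and the inverse pass are omitted.

from tensordict import TensorDictBase
from torchrl.envs.transforms import Transform


class ExpandAsTransform(Transform):
    # Hypothetical sketch: expand each in_key to the shape of a reference entry
    # (e.g. the reward) and write the result under the matching out_key.
    def __init__(self, in_keys, out_keys, ref_key):
        super().__init__(in_keys=in_keys, out_keys=out_keys)
        self.ref_key = ref_key  # hypothetical argument, e.g. ("next", env.reward_key)

    def _call(self, tensordict: TensorDictBase) -> TensorDictBase:
        ref_shape = tensordict.get(self.ref_key).shape
        for in_key, out_key in zip(self.in_keys, self.out_keys):
            # rename without deleting + unsqueeze + expand, as described above
            value = tensordict.get(in_key)
            tensordict.set(out_key, value.unsqueeze(-1).expand(ref_shape))
        return tensordict

Applied to the collected batch, something like ExpandAsTransform(in_keys=[("next", "done"), ("next", "terminated")], out_keys=[("next", "agents", "done"), ("next", "agents", "terminated")], ref_key=("next", env.reward_key)) could stand in for the two tensordict_data.set(...) blocks in the tutorial.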
Signed-off-by: Matteo Bettini <matbet@meta.com>
Fix the multi-agent tutorial after the recent introduction of the "terminated" key