[BugFix] DDPG select also critic input for actor loss #1563

matteobettini · 2023-09-21T15:48:03Z

Previously, we selected only the actor's input and output keys when computing the actor loss (which involves a forward pass through the critic).

This is limiting because the critic can have extra inputs (for example, a fully observable state or a centralized state in MARL).

This PR fixes that by extending the selected keys to include the critic's inputs.
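To make the failure mode concrete, here is a minimal sketch that uses plain dicts in place of tensordicts. The `select`, `actor`, and `critic` functions, the key names, and the numbers are all hypothetical stand-ins chosen for illustration; they are not the TorchRL implementation.

```python
def select(data, keys):
    """Hypothetical helper: shallow copy of `data` restricted to `keys`."""
    return {k: data[k] for k in keys}


def actor(data):
    # Toy actor: reads "observation", writes "action".
    data["action"] = 2.0 * data["observation"]
    return data


def critic(data):
    # Toy critic: needs an extra "state" entry that the actor never sees.
    data["state_action_value"] = (
        data["observation"] + data["action"] + data["state"]
    )
    return data


actor_in_keys = ["observation"]
actor_out_keys = ["action"]
critic_in_keys = ["observation", "action", "state"]

batch = {"observation": 1.0, "state": 10.0, "reward": 0.5}

# Narrow selection: only the actor's inputs are kept, so the critic
# cannot find "state" and the forward pass fails.
try:
    critic(actor(select(batch, actor_in_keys)))
    old_selection_works = True
except KeyError:
    old_selection_works = False

# Wider selection: also keep the critic inputs the actor does not provide.
critic_only_keys = [
    k for k in critic_in_keys if k not in actor_in_keys + actor_out_keys
]
wide = critic(actor(select(batch, actor_in_keys + critic_only_keys)))
actor_loss = -wide["state_action_value"]
```

In this toy setup the narrow selection raises a `KeyError` inside the critic, while the wider selection lets the critic see `"state"` and produce a value for the actor loss.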

Signed-off-by: Matteo Bettini <matbet@meta.com>

vmoens

Makes sense, but not sure we cover all use cases.
Can we add a test in TestDDPG?

Can we also change line 243:

        self._in_keys = list(set(keys))

into

        self._in_keys = sorted(set(keys), key=str)

?
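One likely motivation for this suggestion, sketched under the assumption that `keys` can mix plain strings and nested tuple keys: iteration order of a set of strings is not guaranteed to be stable across runs, and a plain `sorted` cannot compare `str` with `tuple`, whereas `key=str` gives a deterministic order. The key names below are made up for illustration.

```python
# Hypothetical mix of flat and nested keys, with one duplicate.
keys = ["observation", ("next", "reward"), "state", ("next", "done"), "observation"]

# A plain sort fails because str and tuple are not mutually orderable.
try:
    sorted(set(keys))
    plain_sort_works = True
except TypeError:
    plain_sort_works = False

# Sorting on the string representation deduplicates and orders deterministically.
in_keys = sorted(set(keys), key=str)
```

Tuple keys sort first here only because their string form starts with `(`; the point is reproducibility, not a meaningful ordering.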

matteobettini · 2023-09-22T09:46:28Z

We should be there; now I just need to write tests.

vmoens · 2023-09-22T10:00:46Z

torchrl/objectives/ddpg.py

+            unravel_key(("next", self.tensor_keys.reward)),
+            unravel_key(("next", self.tensor_keys.done)),


We should check, but since it's a TensorDictModuleBase, the in_keys are unraveled by default:
https://github.com/pytorch-labs/tensordict/blob/accd8a4a31ec749f52e75a87a875424652069163/tensordict/nn/common.py#L474-L495

So I think we can spare the effort of doing that here.

matteobettini

But even if they are already unraveled, we are creating a new tuple ("next", already_unraveled_key), which could be
("next", "done") or ("next", ("nested", "done")).

That is why I am only unraveling the ones where we are prepending "next".

vmoens

Look at the link: in theory, they will be unraveled after you set them.

matteobettini

Funky stuff! I'll make sure to test it.

vmoens

Ah, maybe not for properties; this one's harder, actually.
Let's keep it as it is.
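The nesting issue discussed in this thread can be sketched without tensordict. The `flatten_key` function below is a hypothetical stand-in written for this example; it only covers nested-tuple flattening and is not the library's `unravel_key` implementation.

```python
def flatten_key(key):
    """Illustrative stand-in for key unraveling: flattens nested tuples of strings."""
    if isinstance(key, str):
        return key
    flat = []
    for part in key:
        sub = flatten_key(part)
        if isinstance(sub, str):
            flat.append(sub)
        else:
            flat.extend(sub)
    return tuple(flat)


# A done key that is itself nested, even though it is already flat on its own.
done_key = ("nested", "done")

# Prepending "next" builds a new tuple that is nested again.
naive = ("next", done_key)

# Flattening after prepending restores a flat key.
flat = flatten_key(naive)
```

With a plain string key such as `"done"`, prepending `"next"` already yields a flat tuple, so the extra flattening only matters when the configured key is nested.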

init

72d6d4b

facebook-github-bot added the CLA Signed label Sep 21, 2023

matteobettini added the bug label and removed the CLA Signed label Sep 21, 2023

facebook-github-bot added the CLA Signed label Sep 21, 2023

vmoens reviewed Sep 22, 2023

matteobettini added 4 commits September 22, 2023 10:36

init

8f280af

init

5a8b9cb

init

e45f301

amend

4cdcea3

vmoens reviewed Sep 22, 2023

vmoens approved these changes Sep 22, 2023

vmoens merged commit b8e0fb5 into pytorch:main Sep 22, 2023

matteobettini mentioned this pull request Sep 22, 2023

[Tests] DDPG extra critic input tests #1568

Merged

vmoens pushed a commit to hyerra/rl that referenced this pull request Oct 10, 2023

[BugFix] DDPG select also critic input for actor loss (pytorch#1563)

6d442a6
