
[BugFix] DDPG select also critic input for actor loss #1563

Merged
merged 5 commits into pytorch:main on Sep 22, 2023

Conversation

matteobettini
Contributor

Before, we were only selecting the actor input and output keys when computing the actor loss (which involves a forward pass through the critic).

This is limiting, as the critic can have extra inputs (for example, a fully observable state or a centralized state in MARL).

This PR fixes that by extending the selected keys to include the critic's input keys.
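As a hypothetical sketch (the key names and variable names below are illustrative, not TorchRL's actual internals), the change amounts to taking the union of the actor's and the critic's input keys when selecting the subset of the tensordict used for the actor loss:

```python
# Illustrative sketch of the fix; key names are made up for the example.
actor_in_keys = ["observation"]
critic_in_keys = ["observation", "state"]  # e.g. a centralized state in MARL

# Before: only the actor's keys were selected, so "state" was dropped
# before the critic's forward pass inside the actor loss.
selected_before = set(actor_in_keys)

# After: the selection is extended with the critic's input keys.
selected_after = set(actor_in_keys) | set(critic_in_keys)

print(sorted(selected_after, key=str))  # ['observation', 'state']
```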

Signed-off-by: Matteo Bettini <matbet@meta.com>
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 21, 2023
@matteobettini matteobettini added bug Something isn't working and removed CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. labels Sep 21, 2023
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 21, 2023
Contributor

@vmoens vmoens left a comment


Makes sense, but not sure we cover all use cases.
Can we add a test in TestDDPG?

Can we also change line 243:

        self._in_keys = list(set(keys))

into

        self._in_keys = sorted(set(keys), key=str)

?
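The `key=str` argument matters here because in_keys can mix plain strings with nested-key tuples, which are not comparable to each other, so a plain `sorted()` raises a `TypeError`; sorting by string representation gives a deterministic order instead. A small standalone illustration (the key values are made up):

```python
# in_keys can mix plain strings and nested-key tuples, which Python
# cannot compare to each other directly.
keys = ["observation", ("next", "reward"), "action", ("next", "done")]

try:
    sorted(set(keys))  # comparing str with tuple raises
except TypeError:
    pass

# Sorting by the string representation is well defined and deterministic.
in_keys = sorted(set(keys), key=str)
print(in_keys)
# [('next', 'done'), ('next', 'reward'), 'action', 'observation']
```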

torchrl/objectives/ddpg.py (outdated review thread, resolved)
Signed-off-by: Matteo Bettini <matbet@meta.com>
@matteobettini
Contributor Author

We should be there now; I just need to write tests.

Comment on lines +240 to +241
unravel_key(("next", self.tensor_keys.reward)),
unravel_key(("next", self.tensor_keys.done)),
Contributor


We should check, but since it's a TensorDictModuleBase, the in_keys are unravelled by default:
https://github.com/pytorch-labs/tensordict/blob/accd8a4a31ec749f52e75a87a875424652069163/tensordict/nn/common.py#L474-L495

So I think we can spare the effort of doing that here

Contributor Author


But even if they are already unravelled, we are creating a new tuple ("next", already_unravelled_key), which could be ("next", "done") or ("next", ("nested", "done")).
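To illustrate the point with a minimal stand-in for tensordict's `unravel_key` (this is a simplified reimplementation for the example, not the library code): even when the base key is already flat, prepending "next" produces a nested tuple that needs unravelling again.

```python
# Simplified stand-in for tensordict's unravel_key, for illustration only.
def unravel_key(key):
    """Flatten a possibly nested key tuple into a flat tuple of strings."""
    if isinstance(key, str):
        return key
    flat = []
    for part in key:
        sub = unravel_key(part)
        flat.extend([sub] if isinstance(sub, str) else list(sub))
    return tuple(flat)

done_key = ("nested", "done")  # an already-unravelled nested key
wrapped = ("next", done_key)   # re-nested: ("next", ("nested", "done"))
print(unravel_key(wrapped))    # ('next', 'nested', 'done')
```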

Contributor Author


That is why I am only unravelling the keys where we are prepending "next".

Contributor


Look at the link: in theory, they will be unravelled after you set them.

Contributor Author


Funky stuff! I'll make sure to test it.

Contributor


Ah, maybe not for properties; this one's harder, actually.
Let's keep it as it is.

@vmoens vmoens merged commit b8e0fb5 into pytorch:main Sep 22, 2023
vmoens pushed a commit to hyerra/rl that referenced this pull request Oct 10, 2023
Signed-off-by: Matteo Bettini <matbet@meta.com>