[BugFix] Multiagent "auto" entropy fix in SAC #1494

matteobettini · 2023-09-06T09:48:23Z

In SAC, when the entropy target is set to "auto", it is computed using the shape of the action.
In multi-agent settings this shape included the number of agents, which should not be the case.

This PR fixes that.
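To make the problem concrete, here is a minimal sketch with made-up numbers. It assumes the usual SAC heuristic of setting the target entropy to minus the number of action elements; the shapes and variable names are hypothetical and not taken from the torchrl code.

```python
import math

# Hypothetical multi-agent action shape: 4 agents, each with a 2-dimensional action.
n_agents, action_dim = 4, 2
action_shape = (n_agents, action_dim)

# Behaviour described above: the agent dimension is counted in the product.
target_entropy_with_agents = -float(math.prod(action_shape))

# Intended behaviour: only the per-agent action dimensions are counted.
target_entropy_per_agent = -float(math.prod(action_shape[1:]))

print(target_entropy_with_agents, target_entropy_per_agent)  # -8.0 -2.0
```

With the agent dimension included, the target scales with the number of agents even though each agent's action space is unchanged.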

Signed-off-by: Matteo Bettini <matbet@meta.com>

vmoens

Shall we also address this in other algos?
TD3, CQL, REDQ, DTs?

torchrl/objectives/sac.py

Co-authored-by: Vincent Moens <vincentmoens@gmail.com>

Signed-off-by: Matteo Bettini <matbet@meta.com>

matteobettini · 2023-09-06T13:32:45Z

Shall we also address this in other algos? TD3, CQL, REDQ, DTs?

done

vmoens

We could run a quick check in the tests by passing composite actions to the losses and looking at the target entropy.
Also: what I understand is that the target entropy is the number of agents.
Why is it that for a single agent the target entropy is the size of the action (e.g., an action of size 100 has a very low target entropy), but for batched actions (not always MA, sometimes just composite actions) the target entropy is the number of actions?
How do we compute the entropy? As a sum of the entropies?

I don't see the math behind these changes.

matteobettini · 2023-09-07T14:43:45Z

Also: what I understand is that the target entropy is the number of agents. Why is it that for a single agent the target entropy is the size of the action (e.g., an action of size 100 has a very low target entropy), but for batched actions (not always MA, sometimes just composite actions) the target entropy is the number of actions? How do we compute the entropy? As a sum of the entropies?

I don't see the math behind these changes.

The purpose of this change is to exclude the agent dimension from the target entropy calculation.

There are no changes in the math here.
The target entropy is always computed from the number of action elements.
If I have an action with shape [3], it is 3; if I have a multidimensional action with shape [3, 4], it is 12.

Therefore, we need to remove the batch size before computing this product; otherwise the batch size will influence it.
In this case I am removing the batch size of the composite spec containing the action from the action shape itself (which removes the number of agents).
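The rule described above can be sketched in plain Python. This is an illustration only, not the code merged in this PR: `auto_target_entropy` and its `batch_shape` argument are hypothetical names, and the negative sign follows the common SAC convention of targeting minus the number of action elements.

```python
import math
from typing import Sequence


def auto_target_entropy(
    action_shape: Sequence[int], batch_shape: Sequence[int] = ()
) -> float:
    """Return -(number of action elements), ignoring leading batch dimensions.

    `batch_shape` stands in for the batch size of the spec that contains the
    action (for example, the number of agents in a multi-agent setting).
    """
    action_shape = tuple(action_shape)
    batch_shape = tuple(batch_shape)
    n_batch = len(batch_shape)
    # The batch dimensions must be a prefix of the action shape.
    if action_shape[:n_batch] != batch_shape:
        raise ValueError(
            f"batch_shape {batch_shape} is not a prefix of action_shape {action_shape}"
        )
    # Only the remaining (per-agent) dimensions contribute to the product.
    return -float(math.prod(action_shape[n_batch:]))


single = auto_target_entropy((3,))                          # shape [3]
multi_dim = auto_target_entropy((3, 4))                     # shape [3, 4]
multi_agent = auto_target_entropy((5, 3, 4), batch_shape=(5,))  # 5 agents
```

Here the five-agent case yields the same target as the single-agent `[3, 4]` action, because the agent dimension is stripped before taking the product.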

vmoens · 2023-09-07T14:46:08Z

Oh OK, my mistake, I understood something else!

Signed-off-by: Matteo Bettini <matbet@meta.com> Co-authored-by: Vincent Moens <vincentmoens@gmail.com>

amend

be1ffda

Signed-off-by: Matteo Bettini <matbet@meta.com>

facebook-github-bot added the CLA Signed label Sep 6, 2023

vmoens added the bug and Refactoring labels Sep 6, 2023

vmoens reviewed Sep 6, 2023

torchrl/objectives/sac.py (outdated)

matteobettini and others added 3 commits September 6, 2023 14:22

Update torchrl/objectives/sac.py

8511b89

Co-authored-by: Vincent Moens <vincentmoens@gmail.com>

review

9c8c749

Signed-off-by: Matteo Bettini <matbet@meta.com>

review

ce21ce2

Signed-off-by: Matteo Bettini <matbet@meta.com>

vmoens reviewed Sep 7, 2023

vmoens approved these changes Sep 7, 2023

vmoens merged commit 4c50f1e into pytorch:main Sep 7, 2023

matteobettini deleted the fix_sac_multiagent_auto_entropy branch September 7, 2023 14:53

vmoens added a commit to hyerra/rl that referenced this pull request Oct 10, 2023

[BugFix] Multiagent "auto" entropy fix in SAC (pytorch#1494)

3cc870a

Signed-off-by: Matteo Bettini <matbet@meta.com> Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
