[Feature] Add Stack transform #2567

Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2567
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures, 9 Unrelated Failures as of commit 8e044a1 with merge base 1cffffe.
NEW FAILURES: the following jobs have failed.
BROKEN TRUNK: the following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Looks like there is a minor bug if I try to use this on
Thanks for this, long-awaited feature!
Just left a couple of comments on the default dim and the test set.
Force-pushed from 23f7e1b to f443812.
Thanks, this is superb.
I'd like to discuss the inverse transform:
Would it make sense in the inverse to get an entry (from the `input_spec`) and unbind it?
Like: you have a single action with a leading dim of 2, and map it to `("agent0", "action")`, `("agent1", "action")`. The spec seen from the outside is the stack of the 2 specs (as it is for stuff processed in forward).
Would that make sense?
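A minimal sketch of that unbind idea on a plain TensorDict (the key names and shapes here are hypothetical, not the transform's actual implementation):

```python
import torch
from tensordict import TensorDict

# A stacked action with a leading dim of 2, as seen from the outside.
td = TensorDict({"action": torch.randn(2, 3)}, batch_size=[])

# During the inverse pass, the stacked entry could be unbound along its
# leading dim and routed to per-agent keys.
agent0_action, agent1_action = td["action"].unbind(0)
td_inv = TensorDict(
    {
        ("agent0", "action"): agent0_action,
        ("agent1", "action"): agent1_action,
    },
    batch_size=[],
)
```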
Force-pushed from f443812 to 81d5cfe.
Yes, I think that does make sense, and that is exactly what happens. For instance, if I add a line to print the action in the script below:

```python
from torchrl.envs import Stack, TransformedEnv
from torchrl.envs import UnityMLAgentsEnv

base_env = UnityMLAgentsEnv(registered_name='3DBall')
try:
    t = Stack(
        in_keys=[("group_0", f"agent_{idx}") for idx in range(12)],
        out_key="agents",
    )
    env = TransformedEnv(base_env, t)
    action = env.full_action_spec.rand()
    print("-------------------------")
    print(action)
    print("-------------------------")
    env.step(action)
finally:
    base_env.close()
```

the output shows that the inverse of Stack is able to correctly unbind the stacked actions into the format that the unity env expects. BTW, this functionality is being tested in the environment unit test I added. But there is one thing from the above example that I'd like to fix.

EDIT: I found a solution that I think is good.
Force-pushed from a6e56be to cec32d2.
Ah got it, sorry, I was expecting something like `in_keys` or such. Maybe let's make clear in the docstrings that `in_keys` can be part of the input (usually it's reserved for output keys). Maybe this? If that's what you need, you can mention it in the doc of the transform.
I'm sorry, I'm not sure what you mean by this.
Yeah, usually anything you iterate over during the inverse pass is passed through the `inv_keys`, and anything you process during forward is in the `in_keys`.
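For reference, a small sketch of that convention as it appears in an existing transform (`DoubleToFloat`); the key names here are just illustrative:

```python
from torchrl.envs import DoubleToFloat

# Keys processed during the forward pass (env -> agent) go in `in_keys`;
# keys processed during the inverse pass (agent -> env) go in `in_keys_inv`.
t = DoubleToFloat(
    in_keys=["observation"],
    in_keys_inv=["action"],
)
```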
Ok that sounds good!
What I think you're saying is that most of the existing transforms that have an inverse allow the user to explicitly set the inverse keys. I guess we also might as well allow the user to override the default.
Yep, that's what I'm saying. So I'd rather ask users to indicate the `in_keys`.
Got it, that makes sense. I'll make that change.
Actually, I guess there are still some decisions to make about the specific behavior of [...]

We'll probably have to enforce that all of the keys specified in [...] We can still infer the [...]

To me, it seems that a natural representation of [...] where [...] Instead, maybe [...] (which is just equal to [...]) But the docs would need to make it clear that [...] where [...]

What do you think? Is there a better decision for the behavior of [...]?
@kurtamohler what if we simply restrict the transform to stack X keys into 1 during forward and split 1 entry into Y during inverse? If users want to unbind multiple keys, they should just have multiple transforms.
Ok that sounds good to me!
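A sketch of what that decision looks like in practice: one `Stack` per stacked output, composed together. The keys below are hypothetical, and the `Stack` signature follows the example earlier in this thread:

```python
from torchrl.envs import Compose, Stack

# Each Stack handles exactly one group of keys to stack; building two
# stacked entries therefore takes two transforms.
t = Compose(
    Stack(
        in_keys=[("agent_0", "observation"), ("agent_1", "observation")],
        out_key="observation_stack",
    ),
    Stack(
        in_keys=[("agent_0", "reward"), ("agent_1", "reward")],
        out_key="reward_stack",
    ),
)
# `t` can then be passed to TransformedEnv(base_env, t).
```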
Force-pushed from cec32d2 to a74d3a4.
Very minor comments but otherwise LGTM
This transform is useful for environments that have multiple agents with identical specs under different keys. The specs and tensordicts for the agents can be stacked together under a shared key, in order to run MARL algorithms that expect the tensors for observations, rewards, etc. to contain batched data for all the agents.
I would rather have a general statement about the transform first - some will use it just to stack tensors together, not in MARL settings (e.g. multiple images in one).
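For instance, a non-MARL use could look like this (hypothetical camera keys; the `Stack` signature follows the example earlier in this thread):

```python
from torchrl.envs import Stack

# Stack two camera feeds into a single batched image tensor.
t = Stack(
    in_keys=[("camera_0", "pixels"), ("camera_1", "pixels")],
    out_key="pixels",
)
```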
maybe add a note saying that multiple stacks will require multiple transforms?
Added, let me know if that's good
Force-pushed from a74d3a4 to 8e044a1.
Description
Adds a transform that stacks tensors and specs from different keys of a tensordict into a common key.
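Roughly, this is the operation performed on the tensors (a manual sketch with hypothetical keys; the transform additionally handles stacking the corresponding specs):

```python
import torch
from tensordict import TensorDict

td = TensorDict(
    {
        ("agent_0", "observation"): torch.randn(3),
        ("agent_1", "observation"): torch.randn(3),
    },
    batch_size=[],
)

# Stack the per-agent tensors into one entry under a common key.
stacked = torch.stack(
    [td["agent_0", "observation"], td["agent_1", "observation"]], dim=0
)
td.set(("agents", "observation"), stacked)  # shape: [2, 3]
```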
Motivation and Context
close #2566