-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix] Avoid collision of "step_count" key from transform and collector #868
Conversation
@@ -522,8 +522,6 @@ def reset( | |||
f"Env {self} was done after reset on specified '_reset' dimensions. This is (currently) not allowed." | |||
) | |||
if tensordict is not None: | |||
if "_reset" in tensordict.keys(): | |||
tensordict.del_("_reset") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matteobettini do you recall why we need to remove _reset from the td?
The problem is that we call reset()
in TransformedEnv
and we need "_reset"
to be passed to the transforms reset()
methods. Deleting it make them consider that "_reset"
is all True
by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea there is to remove it because if users do not notice it is there and call reset again with the same tensordict it will still only reset those indexes and they might not notice. So i though of making it a consumable that once reaches the inner env is removed.
The problem is that right now the transform reset is called after the env reset. So this consumable logic consumes the _reset flag before reacheing the transforms. There is no workaroud to this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same consumable logic is implemented also in the _BatchedEnv
s so if we remove it we should remove it from there too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and we might want to make the deletion in collectors)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be wrong (please correct me if I am) but isn't is actually useful to keep track of that?
It is also dangerous to keep the "action" entry or anything else FWIW. The real nasty thing is to control step_mdp
such that its default behaviour does not pass useless info to the next step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, at the time i made it seemed less error prone to consume it, but we can treat it as we treat actions. And now I realize that transforms reset is called after the env one and they may need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we integrate it, maybe we can remove the underscore. Or leave it (in which case the collector will discard it by default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably leave it as the private meaning is nice. Collectors will discard it automatically yes but they will also have to explicitly discard it here https://github.com/pytorch/rl/blob/main/torchrl/collectors/collectors.py#L620 i think, otherwise it will be carried to the next step. (Unless we implement the discarding logic in stp_mdp)
This PR will affect mostly traj_ids in the long term right? Aren't we planning to remove the collector step count? |
yep that should be the plan. |
Yep, then maybe also the "mask" key should go in the nested collector key |
Good point |
Description
StepCount
and the collectors both assign a"step_count"
key that it would be wise to use only once.This PR solves this issue.
cc @riiswa