[BugFix] Vectorized priority update in replay buffers #1598
Conversation
LGTM, see my few minor comments
priority = torch.tensor(
    [self._get_priority_item(td) for td in data],
    dtype=torch.float,
    device=data.device,
)
I guess we assume that the stack dim is 0, but it might not be (?)
I think we can assume the priorities can be stacked, no? At the end of the day it's supposed to be one priority per item. Maybe I'm missing something.
This was the previous treatment, so I am not super sure what was going on or why things were this way.
My guess is that it assumes 0 as the stack dim because expand stacks on 0, and also because that is the dim of the indices and the priority.
Are you suggesting we completely remove this and always go vectorized? I am down to try.
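A minimal sketch of what always going vectorized could look like, assuming the stacked tensordict carries its per-item error under a "td_error" entry; the key name and the mean reduction are illustrative assumptions, not the buffer's actual API:

import torch
from tensordict import TensorDict

def vectorized_priority(data: TensorDict, priority_key: str = "td_error") -> torch.Tensor:
    # read all priorities in one call instead of looping item by item
    priority = data.get(priority_key)
    # collapse any trailing dims so there is exactly one priority per item
    if priority.ndim > 1:
        priority = priority.reshape(priority.shape[0], -1).mean(-1)
    return priority.to(torch.float)

batch = TensorDict({"td_error": torch.rand(8, 4)}, batch_size=[8])
print(vectorized_priority(batch).shape)  # torch.Size([8]): one priority per item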
Maybe it was this way because the priority can have different shapes along the stack dim? That is the only explanation I can come up with.
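A small illustration of that guess; the shapes and the mean reduction are assumptions made up for the example. Priorities with different shapes cannot be stacked into one tensor, which would force exactly this kind of per-item loop:

import torch

per_item_errors = [torch.rand(3), torch.rand(5), torch.rand(2)]
# torch.stack(per_item_errors) would raise here: the shapes differ,
# so each item's error has to be reduced to a scalar individually
priority = torch.tensor([err.mean().item() for err in per_item_errors])
print(priority)  # one scalar priority per item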
Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
…rioritised_buffer
LGTM thanks!
Signed-off-by: Matteo Bettini <matbet@meta.com> Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
This is a patch for #1574.
It fixes the core problem highlighted in that issue, but there are still points that will need attention in a future refactoring of this class, as I am not sure it is compatible with all the cases it aims to support. I am happy to elaborate on this if needed.
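For context, a hedged sketch of the vectorized update the title refers to, assuming priorities live in a flat tensor indexed by buffer position (the real sampler keeps them in a sum-tree, so this is only an analogy):

import torch

storage_priorities = torch.ones(100)           # one slot per buffer position
index = torch.tensor([3, 17, 42])              # positions of the sampled items
new_priority = torch.tensor([0.5, 2.0, 1.25])  # one priority per sampled item
storage_priorities[index] = new_priority       # single vectorized write, no Python loop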