[BUG] Transpose bug in reward2go when the last dim is not 1 #2086
Labels
bug
Describe the bug
When calling reward2go with a shape whose last dimension is not 1, the results are incorrect due to reshaping rather than transposing.
To Reproduce
Minimal example:
Output:
Expected behavior
System info
Reason and Possible fixes
In the last step of reward2go, the cumsum is reshaped. It should instead be transposed, to undo an earlier transpose.
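The difference between reshaping and transposing can be illustrated with a small sketch. This is not the TorchRL implementation: it uses plain Python lists as stand-ins for tensors, and the helper names (transpose, reshape, reverse_cumsum) are hypothetical. It assumes the earlier transpose swaps the time axis with the last axis of a [T, F] input.

```python
# Illustrative only: plain-Python stand-ins for tensor ops, not TorchRL code.

def transpose(rows):
    """Swap the two axes of a rectangular list of lists."""
    return [list(col) for col in zip(*rows)]

def reshape(rows, n_rows, n_cols):
    """Row-major reshape: flatten, then re-chunk into n_rows x n_cols."""
    flat = [x for row in rows for x in row]
    assert len(flat) == n_rows * n_cols
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]

def reverse_cumsum(row, gamma):
    """Discounted reward-to-go: G_t = r_t + gamma * G_{t+1}."""
    out, running = [], 0.0
    for r in reversed(row):
        running = r + gamma * running
        out.append(running)
    return out[::-1]

# Rewards with shape [T=3, F=2]: time along axis 0, last dim of size 2.
rewards = [[1.0, 10.0],
           [2.0, 20.0],
           [3.0, 30.0]]
gamma = 1.0  # no discounting, so the sums are easy to check by hand

# Assumed earlier step: move time to the last axis, giving shape [F=2, T=3].
by_feature = transpose(rewards)
r2g = [reverse_cumsum(row, gamma) for row in by_feature]

# Reshaping back to [T, F] keeps the row-major order and mixes the columns.
wrong = reshape(r2g, 3, 2)   # [[6.0, 5.0], [3.0, 60.0], [50.0, 30.0]]

# Transposing back undoes the earlier transpose.
right = transpose(r2g)       # [[6.0, 60.0], [5.0, 50.0], [3.0, 30.0]]
```

In this sketch, when the last dimension is 1 the [1, T] intermediate flattens to the same order either way, so reshape and transpose agree, which would explain why the bug only shows up for other sizes.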
I'm happy to contribute a PR with a fix and an extra test.
Checklist