Open
Description
Is your feature request related to a problem? Please describe
Jumanji specs have a generate_value
method which essentially returns zeros of the correct pytree/shape. It would be nice to be able to sample values (with an optional mask) from the action spec. This would give us random policies for free for all environments.
Describe the solution you'd like
Add a sample
method to the specs similar to Gym.
Remarks
We need to decide whether it is redundant to have both sample
and generate_value
.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
Misc
- Check for duplicate requests.