[Example] A2C simplified example #1076
LGTM thanks!
examples/a2c/a2c.py
Outdated
with torch.no_grad():
    test_env.eval()
    actor.eval()
    # Generate a complete episode
    td_test = test_env.rollout(
        policy=actor,
        max_steps=10_000_000,
        auto_reset=True,
        auto_cast_to_device=True,
        break_when_any_done=True,
    ).clone()
    logger.log_scalar(
        "reward_testing",
        td_test["next"]["reward"].sum().item(),
        collected_frames,
    )
    actor.train()
Nit: If we use the `Recorder` class we can do all of this, I guess, but it's a bit of a black box, so I'm ok with the explicit calls.
Maybe let's use `td_test["next", "reward"]` when we can (same for all the key indexing in the script) :)
I saw that in Recorder you need to specify a number of steps, and I wanted to record a single test episode, independently of the number of steps. Maybe Recorder could accept a number of episodes to record instead of a number of steps?
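To make the suggestion concrete, here is a minimal sketch of what episode-based recording could look like. It is plain Python and does not use the TorchRL API: `ToyEnv`, `record_episodes`, `num_episodes`, and `max_steps_per_episode` are all hypothetical names chosen for illustration, and the actual `Recorder` may be structured quite differently.

```python
class ToyEnv:
    """Hypothetical stand-in environment (not a TorchRL class).

    Each episode lasts `episode_length` steps and pays a reward of 1.0 per step.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self._t += 1
        done = self._t >= self.episode_length
        return float(self._t), 1.0, done  # (observation, reward, done)


def record_episodes(env, policy, num_episodes, max_steps_per_episode=10_000):
    """Roll out `num_episodes` episodes and return the summed reward of each.

    An episode ends when the env reports `done`, or when the safety cap
    `max_steps_per_episode` is reached, whichever comes first.
    """
    returns = []
    for _ in range(num_episodes):
        obs = env.reset()
        total = 0.0
        for _ in range(max_steps_per_episode):
            obs, reward, done = env.step(policy(obs))
            total += reward
            if done:
                break
        returns.append(total)
    return returns


# Record three episodes with a constant (hypothetical) policy.
episode_returns = record_episodes(ToyEnv(episode_length=5), lambda obs: 0, num_episodes=3)
print(episode_returns)
```

The step cap is kept as a safety net so that an environment that never terminates cannot block evaluation, while the number of episodes becomes the primary budget.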
Description
Add a simplified version of the A2C code example, including some plot results.
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax close #15213 if this solves the issue #15213
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!