[BugFix] Fix TD3 target net #1186
@BY571 I made some more changes and I have a good training curve now.
Looking good! Also getting the "old" performance now with the correct delayed values :)
Do you have permission to make a review? That would be great!
total_frames: 1000000
frames_per_batch: 1000
Not a big issue, but I might put this back to 1000, as it's a common episode length for MuJoCo envs.
Not sure.
This is the number of frames in each batch, not the truncation length.
Truncation should be handled by max_frames_per_traj.
Oh, yes! I was thinking of envs like HalfCheetah. But you are right: for Hopper and others, episodes can be shorter than 1000 frames, so a batch can contain multiple episodes!
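To make the distinction in this thread concrete, here is a minimal pure-Python sketch of how a batch size and a truncation length interact. It is not TorchRL's collector: `simulate_collector` is a hypothetical helper, and the episode lengths (1000 for a HalfCheetah-like env, 300 for a Hopper-like env) are illustrative assumptions, not measured values.

```python
from itertools import cycle


def simulate_collector(natural_lengths, frames_per_batch, max_frames_per_traj, total_frames):
    """Hypothetical toy collector (not the TorchRL API).

    Returns one list per batch; each list holds the lengths of the
    trajectory segments that ended up in that batch.
    """
    lengths = cycle(natural_lengths)
    # An episode ends at natural termination or at truncation, whichever comes first.
    traj_len = min(next(lengths), max_frames_per_traj)
    batches, batch = [], []
    collected = frames_in_batch = steps_in_traj = segment = 0
    while collected < total_frames:
        collected += 1
        frames_in_batch += 1
        steps_in_traj += 1
        segment += 1
        traj_done = steps_in_traj == traj_len
        batch_full = frames_in_batch == frames_per_batch
        if traj_done or batch_full:
            # Close the current segment at an episode end or a batch boundary.
            batch.append(segment)
            segment = 0
        if traj_done:
            steps_in_traj = 0
            traj_len = min(next(lengths), max_frames_per_traj)
        if batch_full:
            batches.append(batch)
            batch, frames_in_batch = [], 0
    # Frames left over in an incomplete final batch are dropped in this sketch.
    return batches


# HalfCheetah-like: episodes always run 1000 steps, so one episode fills one batch.
print(simulate_collector([1000], frames_per_batch=1000, max_frames_per_traj=1000, total_frames=2000))
# Hopper-like: shorter episodes (assumed 300 steps), so a batch spans several episodes.
print(simulate_collector([300], frames_per_batch=1000, max_frames_per_traj=1000, total_frames=1000))
```

In this sketch, changing `frames_per_batch` only moves the batch boundaries, while `max_frames_per_traj` is the value that caps how long a single trajectory can run.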
Looks good to me now!
No description provided.