[rllib] Basic port of baselines/deepq to rllib #709
Merged build finished. Test PASSed.
This is a straightforward adaptation of the baselines DQN implementation to conform to the RLlib API. Files to pay attention to are `rllib/dqn/dqn.py` and `rllib/dqn/example.py`; the rest were mostly copied with linter fixes only. I also fixed up the licensing here by appending the OpenAI MIT license to the top-level LICENSE file.
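For reference, here is a minimal sketch of the shape such an adaptation takes; this is not the code from `rllib/dqn/dqn.py`, and the class name `DQNAgent`, the `_optimize` hook, and the result dict keys are illustrative assumptions. It does show how `train_freq` and `batch_size` enter the loop, which matters for the parallelization ideas below.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO buffer of (obs, action, reward, next_obs, done) tuples."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        return random.sample(self.storage, batch_size)


class DQNAgent:  # hypothetical name; the real class is in rllib/dqn/dqn.py
    """Sketch of a DQN agent exposing a train() entry point."""

    def __init__(self, env, train_freq=4, batch_size=32, buffer_size=50000):
        self.env = env  # assumes the classic Gym API: step() -> (obs, r, done, info)
        self.train_freq = train_freq
        self.batch_size = batch_size
        self.buffer = ReplayBuffer(buffer_size)
        self.num_steps = 0

    def train(self, steps_per_iteration=1000):
        """One training iteration: interleaved rollouts and optimization."""
        obs = self.env.reset()
        for _ in range(steps_per_iteration):
            action = self.env.action_space.sample()  # stand-in for eps-greedy Q policy
            next_obs, reward, done, _ = self.env.step(action)
            self.buffer.add((obs, action, reward, next_obs, done))
            obs = self.env.reset() if done else next_obs
            self.num_steps += 1
            # Optimize once every train_freq env steps, as in baselines DQN.
            if (self.num_steps % self.train_freq == 0
                    and len(self.buffer.storage) >= self.batch_size):
                self._optimize(self.buffer.sample(self.batch_size))
        return {"timesteps_total": self.num_steps}

    def _optimize(self, batch):
        pass  # placeholder for the TD-error gradient step on the Q network
```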
I have a couple of ideas on how to parallelize this with Ray in a followup PR:

- Parallelize rollouts across workers (see the sketch after this list). This requires `train_freq` to be large enough to allow sufficient parallelism between training steps. Increasing `train_freq` will probably also require an equivalent increase of `batch_size`.
- Parallelize the optimization step, which requires the `batch_size` parameter to be increased. We might also consider multiple steps of optimization over replay buffer samples, similar to policy gradient.

There is also literature on parallelizing DQN in other ways, but that might be out of scope for now.
On a GPU instance the Pong example spends about equal time in training and rollouts, so parallelizing either could be valuable.
cc @pcmoritz @royf