A Conv-based NLH poker RL agent and environment
This is a work in progress; version 5.1 is currently in development.
Descriptions:

mod_agents: implementations of different hard-coded poker agents

mod_automated_training: code for finding the optimal learning rate and setting up self-play

mod_comp_test: benchmarks for testing the trained agent

mod_DQN_Conv: the Conv agent, implemented with a memory object and different learning algorithms such as Proximal Policy Optimization (PPO, which works best), TRPO, Vanilla Policy Gradient, Deep Q-Learning, and others (see the PPO loss sketch after this list)

mod_fe: feature engineering of the input signal for faster training (see the card-encoding sketch after this list)

mod_memory: implementation of the memory object and memory generators (see the replay-buffer sketch after this list)

mod_poker_5: the environment

mod_poker_decide: implementation of a no-action environment used for pre-training the weights of the Conv net

mod_step_function: testing the agent against a human player
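
As a rough illustration of the PPO option in mod_DQN_Conv, here is a minimal sketch of the clipped surrogate policy loss, assuming a PyTorch backend; the function name, tensor arguments, and clip value are illustrative, not the module's actual API:

```python
import torch

def ppo_policy_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space
    ratio = torch.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic (element-wise minimum) objective, negated for gradient descent
    return -torch.min(unclipped, clipped).mean()
```

The negation turns the surrogate objective into a loss that a standard optimizer can minimize.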
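For mod_fe, a common way to present cards to a Conv net is a binary suit-by-rank plane. The sketch below assumes that representation and two-character card strings like 'As'; the actual feature engineering in the repo may differ:

```python
import numpy as np

RANKS = "23456789TJQKA"   # column index 0..12
SUITS = "cdhs"            # clubs, diamonds, hearts, spades

def encode_cards(cards):
    # One binary plane: rows are suits, columns are ranks
    plane = np.zeros((4, 13), dtype=np.float32)
    for card in cards:    # e.g. 'As' = ace of spades
        rank, suit = card[0], card[1]
        plane[SUITS.index(suit), RANKS.index(rank)] = 1.0
    return plane

# Example: hole cards ace of spades, ten of diamonds
hole_plane = encode_cards(["As", "Td"])
```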
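The memory object in mod_memory likely behaves like a standard experience-replay buffer; the minimal sketch below uses the capacity and batch size reported in the results section, with hypothetical class and method names:

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size FIFO buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=10000):
        # deque drops the oldest transition once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=512):
        # Uniform random minibatch for decorrelated updates
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```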
Results: after more than 2e6 steps with linearly decayed epsilon (see the annealing sketch below), lr=1e-6, batch_size=512, mem_size=10000:
DQLAgent won 94.0 against Call_Any
DQLAgent won 95.5 against Raise_Any
DQLAgent won 81.0 against Random
DQLAgent won 85.9 against Simple_Rational84
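
For reference, "linearly decayed epsilon" usually means an annealing schedule like the one below; the 2e6-step horizon matches the run above, while the start and end values are illustrative assumptions:

```python
def linear_epsilon(step, total_steps=2_000_000, eps_start=1.0, eps_end=0.05):
    # Fraction of training completed, capped at 1.0 after total_steps
    frac = min(step / total_steps, 1.0)
    # Interpolate from eps_start down to eps_end
    return eps_start + frac * (eps_end - eps_start)
```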