Naive Implementations of Deep Reinforcement Learning

About

These are naive implementations of deep reinforcement learning algorithms. The purpose of this repo is to help me understand the algorithms and the code. The code is not optimized for performance. If you want to use the code for your research, please refer to the original papers and the official implementations. I verify the code with Gymnasium (the maintained successor to OpenAI Gym). The environments I use most are LunarLander-v3, CartPole-v1, and Pendulum-v1.

Environment Preparation (torch users)

conda create -n rltorch pytorch torchvision torchaudio pytorch-cuda=12.1 gymnasium pyglet pygame gymnasium-box2d colorama pylint yapf tqdm 'tensorboardx>=2.5.0' 'tensorboard>2.0' pillow matplotlib scipy seaborn ipykernel -c conda-forge -c pytorch -c nvidia

Run

This project does not provide trained model weights.

You can start training a model inside the conda environment with:

(rltorch) > python -m <project name>.main

For example (DDPG):

(rltorch) > python -m DDPG.main

Algorithms

DQN	DDQN	DDPG
PPO	PPG	C51
AWR	AC	TD3
SACv1	SACv2	MPO

Improved with Gumbel Distribution Regression from XQL

XAWR	XDDPG
XTD3	XSAC
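
For readers unfamiliar with XQL, the Gumbel regression objective it introduces can be sketched in a few lines. This is a minimal, dependency-free illustration and not necessarily how the X* folders in this repo implement it: the function name `gumbel_regression_loss`, the `beta` temperature default, and the `clip` bound are assumptions made for this example. The loss is exp(z) - z - 1 with z = (target - prediction) / beta, which penalizes under-estimation more heavily than over-estimation.

```python
import math


def gumbel_regression_loss(predictions, targets, beta=1.0, clip=5.0):
    """Mean Gumbel (LINEX-style) regression loss over paired values.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from this repository.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    total = 0.0
    for pred, target in zip(predictions, targets):
        # Scaled residual; beta acts as a temperature.
        z = (target - pred) / beta
        # Assumption: clip z so that exp(z) cannot overflow or dominate.
        z = max(-clip, min(clip, z))
        # exp(z) - z - 1 is zero at z == 0 and grows faster for z > 0.
        total += math.exp(z) - z - 1.0
    return total / len(predictions)


if __name__ == "__main__":
    # Under-estimating the target costs more than over-estimating it.
    print(gumbel_regression_loss([0.0], [1.0]))
    print(gumbel_regression_loss([1.0], [0.0]))
```

In a PyTorch training loop the same expression would typically be written with tensor operations so that gradients flow through the prediction; the pure-Python form above is only meant to show the shape of the objective.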

Alwaysproblem/newbiesRL

Folders and files

Latest commit

History

Repository files navigation

Naive Implementations of Deep Reinforcement Learning

About

Table of Contents

Environment Preparation (torch users)

Run

Algorithms

Improved with Gumbel Distribution Regression from XQL

Reference

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages