a13xe/PolicyGradientAlgorithms

Policy Gradient Algorithms

  • VPG (Vanilla Policy Gradient)
  • PPO (Proximal Policy Optimization)
  • TRPO (Trust Region Policy Optimization)

Installation

pip install matplotlib gym==0.25.2 tensorflow keras-rl2 pyglet protobuf==3.20.*
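
After installing, it can help to confirm that the pinned versions were actually picked up, since the command above pins gym to 0.25.2 and protobuf to the 3.20 series. The snippet below is a minimal sketch using only the standard library. The `EXPECTED` table and the `check_install` helper are hypothetical names introduced for illustration and are not part of this repository.

```python
from importlib import metadata

# Pins copied from the pip command above. None means no version was pinned.
# A pin ending in "." is treated as a prefix match (e.g. protobuf==3.20.*).
EXPECTED = {
    "matplotlib": None,
    "gym": "0.25.2",
    "tensorflow": None,
    "keras-rl2": None,
    "pyglet": None,
    "protobuf": "3.20.",
}


def matches_pin(version, pin):
    """Return True if an installed version satisfies the (simplified) pin."""
    if pin is None:
        return True
    if pin.endswith("."):
        # Prefix pin: accept "3.20" itself as well as "3.20.x".
        return version == pin.rstrip(".") or version.startswith(pin)
    return version == pin


def check_install(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, pin in expected.items():
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            report[name] = (None, False)
            continue
        report[name] = (version, matches_pin(version, pin))
    return report


if __name__ == "__main__":
    for name, (version, ok) in check_install().items():
        status = "ok" if ok else "missing or version mismatch"
        print(f"{name}: {version} ({status})")
```

Running the file in the same environment used for `pip install` prints one line per package. Any line reporting a mismatch suggests re-running the install command.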

Training results

[Figure: graph_500ep_]