Skip to content

Latest commit

 

History

History
 
 

pg

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

PG (Vanilla Policy Gradient)

PG is the most basic reinforcement learning algorithm that learns a policy by taking a gradient of action log probabilities and weighting them by the return. This algorithm is also known as REINFORCE.

Installation

conda create -n rllib-pg python=3.10
conda activate rllib-pg
pip install -r requirements.txt
pip install -e '.[development]'

Usage

PG Example