This section covers some steroids for policy gradient methods, along with a cool general trick called
- Lecture on NPG and TRPO by J. Schulman - video
- Alternative lecture on TRPO and open problems by... J. Schulman - video
- Our videos: lecture, seminar (in Russian)
- Original articles - TRPO, NPG
Go to seminar_TRPO_<framework>.ipynb and follow the instructions in the notebook.
While you already know algorithms that work with continuous action spaces, it can't hurt to learn something more specialized.