- Lecture slides
- Our videos: lecture, seminar (in Russian)
- [main] lecture by David Silver - url
- Alternative lecture by Pieter Abbeel (English): part 1, part 2
- Alternative lecture by John Schulman (English): video
- Definitive guide to policy/value iteration from Sutton: start from page 81 here.
- Planning by dynamic programming (D. Silver) - video
- Planning via tree search: videos 2-6 from CS188
- Our lecture:
- Monte Carlo tree search
- Integrating learning and planning (D. Silver) - video
- Approximating the MCTS optimal actions - 5vision solution for deephack.RL, code by Mikhail Pavlov - repo
The main assignment is the seminar1_VI.ipynb notebook in this week's folder.
If you're interested in model-based RL at scale, go through the planning section of the materials and then proceed with the seminar2_MCTS.ipynb notebook.