week2_value_based

Materials

  • Lecture slides
  • Our videos: lecture, seminar (Russian)
  • [main] lecture by David Silver - url
  • Alternative lecture by Pieter Abbeel (English): part 1, part 2
  • Alternative lecture by John Schulman (English): video
  • Definitive guide to policy/value iteration from Sutton: start from page 81 here.
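
As a quick orientation for the value iteration covered in the materials above, here is a minimal sketch on a tabular MDP. The array layout (`P[s, a, s']` for transition probabilities, `R[s, a]` for expected rewards), the function name `value_iteration`, and the two-state toy MDP are all assumptions made for illustration; they are not taken from the course notebooks.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative sketch, not the course's code).

    P: array of shape (S, A, S), P[s, a, s2] = probability of s -> s2 under a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    Returns the state values V and a greedy policy (one action index per state).
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup:
        # Q[s, a] = R[s, a] + gamma * sum_s2 P[s, a, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=1)

# Hypothetical two-state MDP, deterministic transitions:
# state 0: action 0 stays (reward 0), action 1 moves to state 1 (reward 1)
# state 1: action 0 stays (reward 2), action 1 moves to state 0 (reward 0)
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
R = np.array([[0.0, 1.0],
              [2.0, 0.0]])

V, policy = value_iteration(P, R, gamma=0.5)
print(V, policy)
```

In this toy MDP the best plan is to move to state 1 and stay there, so with `gamma=0.5` the values approach 3 and 4. The dense `(S, A, S)` array is only practical for small state spaces; larger problems usually need a sparse or dictionary-based transition model.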

Materials: planning

  • Planning by dynamic programming (D. Silver) - video
  • Planning via tree search: videos 2-6 from CS188
  • Our lecture:
  • Monte Carlo tree search
    • Udacity video on Monte Carlo tree search (first part of a chain) - video
    • Reminder: UCB-1 - slides
    • Monte Carlo tree search step-by-step by J. Levine - video
    • Guide to MCTS (Monte Carlo tree search) - post
    • Another guide to MCTS - url
  • Integrating learning and planning (D. Silver) - video
  • Approximating the MCTS optimal actions - 5vision solution for deephack.RL, code by Mikhail Pavlov - repo
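
For the UCB-1 reminder in the list above, the selection rule can be sketched in a few lines. This is the standard UCB-1 score (mean value plus an exploration bonus), written as a standalone helper. The function names `ucb1_scores` and `select_arm`, the per-arm `counts`/`value_sums` bookkeeping, and the default constant `c = sqrt(2)` are assumptions for illustration, not the interface used in seminar2_MCTS.ipynb.

```python
import math

def ucb1_scores(counts, value_sums, c=math.sqrt(2)):
    """UCB-1 score per arm: mean value + c * sqrt(ln(N) / n).

    counts[i]     = number of times arm (or child node) i was visited
    value_sums[i] = sum of rewards observed for arm i
    Unvisited arms get an infinite score so each is tried at least once.
    """
    total = sum(counts)
    scores = []
    for n, s in zip(counts, value_sums):
        if n == 0:
            scores.append(math.inf)
        else:
            scores.append(s / n + c * math.sqrt(math.log(total) / n))
    return scores

def select_arm(counts, value_sums, c=math.sqrt(2)):
    """Index of the arm with the highest UCB-1 score (first one on ties)."""
    scores = ucb1_scores(counts, value_sums, c)
    return max(range(len(scores)), key=scores.__getitem__)

# An unvisited arm is always picked first.
print(select_arm([0, 5], [0.0, 5.0]))
# With equal visit counts, the arm with the higher mean wins.
print(select_arm([10, 10], [9.0, 1.0]))
# A rarely visited arm can win through its exploration bonus.
print(select_arm([100, 1], [60.0, 0.5]))
```

In MCTS the same rule is applied at each tree node, with `counts` and `value_sums` taken from that node's children; a larger `c` favors exploration over exploitation.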

Homework description:

The main assignment is the seminar1_VI.ipynb notebook in this week's folder.

If you're interested in model-based RL at scale, go through the Materials: planning section and proceed with the seminar2_MCTS.ipynb notebook.