Lecture slides - here
- Russian materials:
- English materials:
- Lecture by David Silver (English) - video part I, video part II
- Alternative lecture by Pieter Abbeel (English) - video
- Alternative lecture by John Schulman (English) - video
- Blog post on Q-learning vs SARSA - url
- N-step temporal difference from Sutton's book - suttonbook chapter 7
- Eligibility traces from Sutton's book - suttonbook chapter 12
- Blog post on eligibility traces - url
Just as usual, start with seminar_qlearning.ipynb and then proceed to homework.ipynb.
(optional) If you're running on a local machine (e.g. your PC) with Python 2, you can also try seminar_py2. It has some neat RL problems with cool visualizations.
This assignment borrows code from the awesome cs188. It works on Python 2 only. If you want to stick to Python 3, consider seminar_alternative, or install Python 2 for this homework alone and remove it afterwards.
This homework also requires a physical display (e.g. a laptop monitor). It won't work on a Binder VM or a headless server. Please run it on a laptop or consider ./seminar_alternative
- You need to implement the Q-learning algorithm. Go to the seminar_main/ folder and open the file qlearningAgent.py.
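As a rough orientation for the implementation step above, here is a minimal sketch of tabular Q-learning with an epsilon-greedy policy. It is not the cs188 agent interface: the class name `TabularQLearner`, its method names, and the `get_legal_actions` callback are hypothetical and chosen only for illustration, so the method names and signatures in the assignment file may differ. The update rule is the standard one, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random
from collections import defaultdict


class TabularQLearner(object):
    """Hypothetical stand-alone learner; NOT the cs188 agent API."""

    def __init__(self, get_legal_actions, alpha=0.5, epsilon=0.1, discount=0.9):
        # get_legal_actions: assumed callback, state -> list of actions
        # (an empty list marks a terminal state).
        self.get_legal_actions = get_legal_actions
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration probability
        self.discount = discount    # gamma
        self.q = defaultdict(float)  # (state, action) -> Q-value, 0.0 if unseen

    def get_value(self, state):
        """V(s) = max_a Q(s, a); 0.0 for terminal states."""
        actions = self.get_legal_actions(state)
        if not actions:
            return 0.0
        return max(self.q[(state, a)] for a in actions)

    def get_policy(self, state):
        """Greedy action, ties broken at random; None for terminal states."""
        actions = self.get_legal_actions(state)
        if not actions:
            return None
        best = self.get_value(state)
        return random.choice([a for a in actions if self.q[(state, a)] == best])

    def get_action(self, state):
        """Epsilon-greedy: explore with probability epsilon, else act greedily."""
        actions = self.get_legal_actions(state)
        if not actions:
            return None
        if random.random() < self.epsilon:
            return random.choice(actions)
        return self.get_policy(state)

    def update(self, state, action, next_state, reward):
        """Move Q(s, a) towards the one-step target r + gamma * V(s')."""
        target = reward + self.discount * self.get_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

A toy usage example with a two-state chain, where "end" is terminal:

```python
agent = TabularQLearner(lambda s: [] if s == "end" else ["go"], epsilon=0.0)
agent.update("start", "go", "end", 1.0)
print(agent.q[("start", "go")])  # 0.5 with alpha=0.5
```

Taking the maximum over next actions in the target, regardless of which action the agent actually takes next, is what makes this Q-learning rather than SARSA.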
Once you're done, run these commands:
python crawler.py # Crawler with qlearning
python pacman.py -p <your agent> -x <number of train samples> -n <total number of samples> -l <grid env>
python pacman.py -p PacmanQAgent -x 5000 -n 5010 -l smallGrid # example
- Make sure you can tune the agent to beat ./run_crawler.sh
- On Windows, just run python crawler.py from cmd in the project directory
- Other ./run* files are mostly for your amusement.
- ./run_pacman.sh will need more epochs to converge, see comments
- On Windows, just type python pacman.py -p PacmanQAgent -x 2000 -n 2010 -l smallGrid in cmd from the assignment directory

(YSDA/HSE) Please submit only the qlearningAgents.py file and include a brief text report as comments in it.