s/spring19/master/g
dniku authored and yhn112 committed Jan 24, 2020
1 parent 93e37b7 commit 2844b21
Showing 22 changed files with 29 additions and 29 deletions.
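
The commit title reads as a sed-style substitution. A minimal Python sketch of the same idea follows, deliberately narrower than a blanket `s/spring19/master/g`: it only rewrites the branch segment inside the repository URLs that appear in this diff. The helper name `retarget_branch` and the pattern are hypothetical, not something taken from the repository.

```python
import re

# Hypothetical pattern (an assumption, not from the repository): the branch name
# directly after the repository path, with an optional "blob/" segment.
# IGNORECASE covers both "practical_rl" and "Practical_RL" spellings in the diff.
BRANCH_IN_URL = re.compile(
    r"(yandexdataschool/practical_rl/(?:blob/)?)spring19\b",
    re.IGNORECASE,
)


def retarget_branch(text: str, new_branch: str = "master") -> str:
    """Return text with repository URLs pointing at new_branch instead of spring19."""
    # A function replacement keeps new_branch literal (no backslash escapes).
    return BRANCH_IN_URL.sub(lambda m: m.group(1) + new_branch, text)


if __name__ == "__main__":
    line = (
        "# !wget -q https://raw.githubusercontent.com/yandexdataschool/"
        "Practical_RL/spring19/week02_value_based/mdp.py"
    )
    print(retarget_branch(line))
```

Anchoring on the repository path leaves any other mention of `spring19` untouched, which a plain global substitution would not.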
4 changes: 2 additions & 2 deletions README.md
@@ -1,5 +1,5 @@

# Practical_RL [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/practical_rl/spring19)
# Practical_RL [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/practical_rl/master)
An open course on reinforcement learning in the wild.
Taught on-campus at [HSE](https://cs.hse.ru) and [YSDA](https://yandexdataschool.com/) and maintained to be friendly to online students (both english and russian).

@@ -22,7 +22,7 @@ Taught on-campus at [HSE](https://cs.hse.ru) and [YSDA](https://yandexdataschool
* Virtual course environment:
* [Installing dependencies](https://github.com/yandexdataschool/Practical_RL/issues/1) on your local machine (recommended).
* [__google colab__](https://colab.research.google.com/) - set open -> github -> yandexdataschool/pracical_rl -> {branch name} and select any notebook you want.
* Alternatives: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/practical_rl/spring19) and [Azure Notebooks](https://notebooks.azure.com/).
* Alternatives: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/yandexdataschool/practical_rl/master) and [Azure Notebooks](https://notebooks.azure.com/).


# Additional materials
2 changes: 1 addition & 1 deletion week01_intro/README.md
@@ -23,7 +23,7 @@


## Practice assignment
Instant dive in: [__seminar_gym_interface__](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week01_intro/seminar_gym_interface.ipynb), [__crossentropy_method__](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week01_intro/crossentropy_method.ipynb)
Instant dive in: [__seminar_gym_interface__](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week01_intro/seminar_gym_interface.ipynb), [__crossentropy_method__](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week01_intro/crossentropy_method.ipynb)

* Open `gym_interface.ipynb` and follow instructions from there
* After you're done there, proceed to `crossentropy_method.ipynb`
2 changes: 1 addition & 1 deletion week02_value_based/README.md
@@ -9,6 +9,6 @@

## Homework description:

The main assignment is `seminar_vi.ipynb`[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week02_value_based/seminar_vi.ipynb) notebook in this week's folder. It has no requirements besides the most basic data science libraries (e.g. numpy) so you should be able to run it locally.
The main assignment is `seminar_vi.ipynb`[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week02_value_based/seminar_vi.ipynb) notebook in this week's folder. It has no requirements besides the most basic data science libraries (e.g. numpy) so you should be able to run it locally.

__Note:__ if you have any difficulty using graphviz, just set `has_graphviz=False`.
2 changes: 1 addition & 1 deletion week02_value_based/seminar_vi.ipynb
@@ -34,7 +34,7 @@
"outputs": [],
"source": [
"# If you Colab, uncomment this please\n",
"# !wget -q https://raw.githubusercontent.com/yandexdataschool/Practical_RL/spring19/week02_value_based/mdp.py\n",
"# !wget -q https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/week02_value_based/mdp.py\n",
"\n",
"transition_probs = {\n",
" 's0': {\n",
6 changes: 3 additions & 3 deletions week03_model_free/README.md
@@ -19,12 +19,12 @@

Just as usual, start with
- `seminar_qlearning.ipynb`
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week03_model_free/seminar_qlearning.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week03_model_free/seminar_qlearning.ipynb)

and then proceed to

- `homework.ipynb`
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week03_model_free/homework.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week03_model_free/homework.ipynb)

Please pay attention for uncommenting first lines in code if you use Colab.

@@ -49,7 +49,7 @@ python pacman.py -p PacmanQAgent -x 5000 -n 5010 -l smallGrid # example
* Make sure you can tune agent to beat ./run_crawler.sh
* on windows, just run `python crawler.py` from cmd in the project directory
* other ./run* files are mostly for your amusement.
* ./run_pacman.sh will need more epochs to converge, see [comments](https://github.com/yandexdataschool/Practical_RL/blob/spring19/week03_model_free/seminar_py2/run_pacman.sh)
* ./run_pacman.sh will need more epochs to converge, see [comments](https://github.com/yandexdataschool/Practical_RL/blob/master/week03_model_free/seminar_py2/run_pacman.sh)
* on windows, just copy the type `python pacman.py -p PacmanQAgent -x 2000 -n 2010 -l smallGrid` in cmd from assignemnt dir
(YSDA/HSE) Please submit only qlearningAgents.py file and include a brief text report as comments in it.

2 changes: 1 addition & 1 deletion week04_[recap]_deep_learning/README.md
@@ -30,7 +30,7 @@ __Note:__ This week's materials cover the basics of neural nets and deep learnin


### Practice
__[Colab url (pytorch)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week04_%5Brecap%5D_deep_learning/seminar_pytorch.ipynb)__
__[Colab url (pytorch)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week04_%5Brecap%5D_deep_learning/seminar_pytorch.ipynb)__
From now on, we'll have two tracks: theano and tensorflow. We'll also add pytorch seminars as soon as they're ready.

Please pick seminar_theano.ipynb, seminar_tensorflow.ipynb or seminar_pytorch.ipynb.
6 changes: 3 additions & 3 deletions week04_approx_rl/README.md
@@ -34,9 +34,9 @@

## Practice

* Seminar: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week04_approx_rl/seminar_pytorch.ipynb)
* Homework (main): [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week04_approx_rl/homework_pytorch_main.ipynb#scrollTo=KVvvo7k_ap8w)
* Homework (debug): [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week04_approx_rl/homework_pytorch_debug.ipynb#scrollTo=KVvvo7k_ap8w)
* Seminar: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week04_approx_rl/seminar_pytorch.ipynb)
* Homework (main): [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week04_approx_rl/homework_pytorch_main.ipynb#scrollTo=KVvvo7k_ap8w)
* Homework (debug): [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week04_approx_rl/homework_pytorch_debug.ipynb#scrollTo=KVvvo7k_ap8w)



2 changes: 1 addition & 1 deletion week04_approx_rl/homework_pytorch_debug.ipynb
@@ -40,7 +40,7 @@
"\n",
"# os.system('python -m pip install -U pygame --user')\n",
"\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/spring19/week04_approx_rl/'\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/week04_approx_rl/'\n",
"\n",
"# os.system('wget ' + prefix + 'atari_wrappers.py')\n",
"# os.system('wget ' + prefix + 'utils.py')\n",
2 changes: 1 addition & 1 deletion week04_approx_rl/homework_pytorch_main.ipynb
@@ -40,7 +40,7 @@
"\n",
"# os.system('python -m pip install -U pygame --user')\n",
"\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/spring19/week04_approx_rl/'\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/week04_approx_rl/'\n",
"\n",
"# os.system('wget ' + prefix + 'atari_wrappers.py')\n",
"# os.system('wget ' + prefix + 'utils.py')\n",
2 changes: 1 addition & 1 deletion week04_approx_rl/seminar_pytorch.ipynb
@@ -77,7 +77,7 @@
"\n",
"Since we're working with a pre-extracted features (cart positions, angles and velocities), we don't need a complicated network yet. In fact, let's build something like this for starters:\n",
"\n",
"![img](https://raw.githubusercontent.com/yandexdataschool/Practical_RL/spring19/yet_another_week/_resource/qlearning_scheme.png)\n",
"![img](https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/yet_another_week/_resource/qlearning_scheme.png)\n",
"\n",
"For your first run, please only use linear layers (nn.Linear) and activations. Stuff like batch normalization or dropout may ruin everything if used haphazardly. \n",
"\n",
2 changes: 1 addition & 1 deletion week05_explore/README.md
@@ -1,4 +1,4 @@
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week05_explore/week5.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week05_explore/week5.ipynb)

### Slides - [here](https://yadi.sk/i/H0zVBROe3TWWHz)

4 changes: 2 additions & 2 deletions week06_policy_based/README.md
@@ -19,6 +19,6 @@
* Adversarial review of policy gradient - [blog](http://www.argmin.net/2018/02/20/reinforce/)


Run seminar notebook in colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week06_policy_based/reinforce_pytorch.ipynb)
Run seminar notebook in colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week06_policy_based/reinforce_pytorch.ipynb)

Run optional homework notebook in colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week06_policy_based/a2c-optional.ipynb)
Run optional homework notebook in colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week06_policy_based/a2c-optional.ipynb)
2 changes: 1 addition & 1 deletion week07_[recap]_rnn/README.md
@@ -13,7 +13,7 @@
* OpenAI research on sentiment analysis that sheds some light on what's inside LSTM language model.

# Homework description
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week07_%5Brecap%5D_rnn/seminar_pytorch.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week07_%5Brecap%5D_rnn/seminar_pytorch.ipynb)

This week's practice gets you acquainted with basics of recurrent neural networks. For simplicity, we'll train them on character language modelling task. Pick any one of `seminar_lasagne`, `seminar_lasagne_ingraph` or `seminar_tf`.

2 changes: 1 addition & 1 deletion week07_[recap]_rnn/seminar_pytorch.ipynb
@@ -242,7 +242,7 @@
"# Recurrent neural network\n",
"\n",
"We can rewrite recurrent neural network as a consecutive application of dense layer to input $x_t$ and previous rnn state $h_t$. This is exactly what we're gonna do now.\n",
"<img src=\"https://github.com/yandexdataschool/Practical_RL/blob/spring19/week07_%5Brecap%5D_rnn/rnn.png?raw=1\" width=480>\n",
"<img src=\"https://github.com/yandexdataschool/Practical_RL/blob/master/week07_%5Brecap%5D_rnn/rnn.png?raw=1\" width=480>\n",
"\n",
"Since we're training a language model, there should also be:\n",
"* An embedding layer that converts character id x_t to a vector.\n",
2 changes: 1 addition & 1 deletion week07_seq2seq/README.md
@@ -10,7 +10,7 @@

## Practice

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week07_seq2seq/practice_torch.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week07_seq2seq/practice_torch.ipynb)


As usual, go to practice_{your framework}.ipynb above and follow instructions from there. [pytorch](./practice_torch.ipynb), [tensorflow](./practice_tf.ipynb), [theano](./practice_theano.ipynb)
2 changes: 1 addition & 1 deletion week08_pomdp/README.md
@@ -23,6 +23,6 @@ _Links on all articles mentioned during the lecture could be found in "Reference

# Practice

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week08_pomdp/practice_pytorch.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week08_pomdp/practice_pytorch.ipynb)

The assignment is platform and framewerk independent, so choose the framework that suits you best, but pay attention on how many you will need to implement youself in case of nonstandart ones.
4 changes: 2 additions & 2 deletions week09_policy_II/README.md
@@ -10,9 +10,9 @@ This section covers some steroids for policy gradient methods, along with a cool
* Original articles - [TRPO](https://arxiv.org/abs/1502.05477), [NPG](https://papers.nips.cc/paper/2073-a-natural-policy-gradient.pdf)

## Practice
* Seminar: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week09_policy_II/seminar_TRPO_pytorch.ipynb)
* Seminar: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week09_policy_II/seminar_TRPO_pytorch.ipynb)

* Homework: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week09_policy_II/ppo.ipynb)
* Homework: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week09_policy_II/ppo.ipynb)

## More: Reinforcement learning in large/continuous action spaces
While you already know algorithms that will work with continuously many actions, it can't hurt to learn something more specialized.
2 changes: 1 addition & 1 deletion week09_policy_II/ppo.ipynb
@@ -16,7 +16,7 @@
"# os.system('pip install pyglet==1.2.4')\n",
"# os.system('pip install gym')\n",
"\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/spring19/week06_policy_II/'\n",
"# prefix = 'https://raw.githubusercontent.com/yandexdataschool/Practical_RL/master/week06_policy_II/'\n",
"\n",
"# os.system('wget ' + prefix + 'runners.py')\n",
"# os.system('wget ' + prefix + 'mujoco_wrappers.py')\n",
2 changes: 1 addition & 1 deletion week09_policy_II/seminar_TRPO_pytorch.ipynb
@@ -765,7 +765,7 @@
"\n",
"![img](https://s17.postimg.org/i90chxgvj/vine.png)\n",
"\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/spring19/week10_planning/seminar_MCTS.ipynb).\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/master/week10_planning/seminar_MCTS.ipynb).\n",
"\n",
"You can read more about in the [TRPO article](https://arxiv.org/abs/1502.05477) in section 5.2.\n",
"\n",
2 changes: 1 addition & 1 deletion week09_policy_II/seminar_TRPO_tensorflow.ipynb
@@ -718,7 +718,7 @@
"\n",
"![img](https://s17.postimg.org/i90chxgvj/vine.png)\n",
"\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/spring19/week10_planning/seminar_MCTS.ipynb).\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/master/week10_planning/seminar_MCTS.ipynb).\n",
"\n",
"You can read more about in the [TRPO article](https://arxiv.org/abs/1502.05477) in section 5.2.\n",
"\n",
2 changes: 1 addition & 1 deletion week09_policy_II/seminar_TRPO_theano.ipynb
@@ -718,7 +718,7 @@
"\n",
"![img](https://s17.postimg.org/i90chxgvj/vine.png)\n",
"\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/spring19/week10_planning/seminar_MCTS.ipynb).\n",
"In most gym environments, you can actually backtrack by using states. You can find a wrapper that saves/loads states in [the mcts seminar](https://github.com/yandexdataschool/Practical_RL/blob/master/week10_planning/seminar_MCTS.ipynb).\n",
"\n",
"You can read more about in the [TRPO article](https://arxiv.org/abs/1502.05477) in section 5.2.\n",
"\n",
2 changes: 1 addition & 1 deletion week10_planning/README.md
@@ -1,7 +1,7 @@
## Assignments

Just as usual, start with `seminar_MCTS.ipynb`
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/spring19/week10_planning/seminar_MCTS.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yandexdataschool/Practical_RL/blob/master/week10_planning/seminar_MCTS.ipynb)

## Materials: planning
