Commit

Revert "add context manager for recording, return s', change gym to gymnaasium in description"

This reverts commit 59c7fa6.
laktionov committed Jul 2, 2023
1 parent fe79654 commit 262f540
Showing 3 changed files with 11 additions and 15 deletions.
4 changes: 2 additions & 2 deletions week04_approx_rl/homework_pytorch_debug.ipynb
@@ -349,8 +349,8 @@
 "source": [
 "def play_and_record(initial_state, agent, env, exp_replay, n_steps=1):\n",
 " \"\"\"\n",
-" Play the game for exactly n_steps, record every (s,a,r,s', done) to replay buffer.\n",
-" Whenever game ends due to termination or truncation, add record with done=terminated and reset the game.\n",
+" Play the game for exactly n_steps, record every (s,a,r,s, done) to replay buffer.\n",
+" Whenever game ends, add record with done=True and reset the game.\n",
 " It is guaranteed that env has terminated=False when passed to this function.\n",
 "\n",
 " PLEASE DO NOT RESET ENV UNLESS IT IS \"DONE\"\n",
11 changes: 5 additions & 6 deletions week04_approx_rl/homework_pytorch_main.ipynb
@@ -1373,12 +1373,11 @@
 "# record sessions\n",
 "from gymnasium.wrappers import RecordVideo\n",
 "\n",
-"with make_env() as env, RecordVideo(\n",
-" env=make_env(), video_folder=\"./videos\", episode_trigger=lambda episode_number: True\n",
-") as env_monitor:\n",
-" sessions = [\n",
-" evaluate(env_monitor, agent, n_games=n_lives, greedy=True) for _ in range(10)\n",
-" ]\n"
+"with RecordVideo(env=make_env(), video_folder='./videos',\n",
+" episode_trigger = lambda episode_number: True) as env_monitor:\n",
+" sessions = [evaluate(env_monitor, agent, n_games=n_lives,\n",
+" greedy=True) for _ in range(10)]\n",
+"env.close()"
 ]
 },
 {
11 changes: 4 additions & 7 deletions week04_approx_rl/seminar_pytorch.ipynb
@@ -420,7 +420,7 @@
 "source": [
 "### Record videos\n",
 "\n",
-"As usual, we now use `gymnasium.wrappers.RecordVideo` to record a video of our agent playing the game. Unlike our previous attempts with state binarization, this time we expect our agent to act ~~(or fail)~~ more smoothly since there's no more binarization error at play.\n",
+"As usual, we now use `gym.wrappers.Monitor` to record a video of our agent playing the game. Unlike our previous attempts with state binarization, this time we expect our agent to act ~~(or fail)~~ more smoothly since there's no more binarization error at play.\n",
 "\n",
 "As you already did with tabular q-learning, we set epsilon=0 for final evaluation to prevent agent from exploring himself to death."
 ]
@@ -437,12 +437,9 @@
 "\n",
 "from gymnasium.wrappers import RecordVideo\n",
 "\n",
-"with gym.make(\"CartPole-v0\", render_mode=\"rgb_array\") as record_env, RecordVideo(\n",
-" record_env, video_folder=\"videos\"\n",
-") as env_monitor:\n",
-" sessions = [\n",
-" generate_session(env_monitor, epsilon=0, train=False) for _ in range(100)\n",
-" ]\n"
+"record_env = gym.make(\"CartPole-v0\", render_mode=\"rgb_array\")\n",
+"with RecordVideo(record_env, video_folder=\"videos\") as env_monitor:\n",
+" sessions = [generate_session(env_monitor, epsilon=0, train=False) for _ in range(100)]"
 ]
 },
 {
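
The two `play_and_record` docstrings in the first hunk differ in what `done` means when an episode ends: one stores `done=terminated` on termination or truncation, the other stores `done=True`. Below is a minimal sketch of the `done=terminated` variant, assuming the gymnasium-style five-value `step()` return. `ToyEnv`, the `policy` callable, the plain-list buffer, and the `(total_reward, final_state)` return value are hypothetical stand-ins for illustration, not the homework's actual classes or signature.

```python
class ToyEnv:
    """Hypothetical stand-in env using gymnasium's reset()/step() return shapes."""

    def __init__(self, horizon=3):
        self.horizon = horizon  # time limit: reaching it truncates the episode
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = action == 1  # action 1 ends the episode "for real"
        truncated = (not terminated) and self.t >= self.horizon
        return self.t, 1.0, terminated, truncated, {}


def play_and_record(initial_state, policy, env, buffer, n_steps=1):
    """Play exactly n_steps, storing (s, a, r, s', done) with done=terminated."""
    s = initial_state
    total_reward = 0.0
    for _ in range(n_steps):
        a = policy(s)
        next_s, r, terminated, truncated, _ = env.step(a)
        # Only true termination is stored as done; a truncated episode
        # is recorded with done=False even though the env is reset below.
        buffer.append((s, a, r, next_s, terminated))
        total_reward += r
        if terminated or truncated:
            s, _ = env.reset()
        else:
            s = next_s
    return total_reward, s


# Truncation only: the policy never picks the terminal action.
env = ToyEnv(horizon=3)
buffer = []
state, _ = env.reset()
total, final_state = play_and_record(state, lambda s: 0, env, buffer, n_steps=4)

# Termination: the policy picks the terminal action on the first step.
env2 = ToyEnv(horizon=3)
buffer2 = []
state2, _ = env2.reset()
play_and_record(state2, lambda s: 1, env2, buffer2, n_steps=1)
```

In this sketch the third transition of `buffer` hits the time limit and is stored with `done=False`, while the single transition in `buffer2` is stored with `done=True`. Under the `done=True` wording, both would be stored as done.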
