Traceback (most recent call last):
  File "/projects/ruhdorfer/msc2023_constantin/src/scripts/train_simple_overcooked.py", line 31, in <module>
    ego.learn(total_timesteps=1000)
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/dqn/dqn.py", line 269, in learn
    return super().learn(
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 311, in learn
    rollout = self.collect_rollouts(
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 543, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(actions)
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 163, in step
    return self.step_wait()
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 54, in step_wait
    obs, self.buf_rews[env_idx], self.buf_dones[env_idx], self.buf_infos[env_idx] = self.envs[env_idx].step(
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/monitor.py", line 95, in step
    observation, reward, done, info = self.env.step(action)
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/gym/wrappers/order_enforcing.py", line 11, in step
    observation, reward, done, info = self.env.step(action)
  File "/projects/ruhdorfer/PantheonRL/pantheonrl/common/multiagentenv.py", line 195, in step
    acts = self._get_actions(self._players, self._obs, action)
  File "/projects/ruhdorfer/PantheonRL/pantheonrl/common/multiagentenv.py", line 157, in _get_actions
    actions.append(agent.get_action(ob))
  File "/projects/ruhdorfer/PantheonRL/pantheonrl/common/agents.py", line 263, in get_action
    self.model._store_transition(
  File "/projects/ruhdorfer/msc2023_constantin/venv/lib/python3.10/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 455, in _store_transition
    for i, done in enumerate(dones):
TypeError: 'bool' object is not iterable
Hi,
I adapted the simple example to use OffPolicyAgent, just to test, but I keep getting the traceback above.
This seems to be due to the fact that SB3 expects multiple dones from env.step in stable_baselines3/common/off_policy_algorithm.py:544 (new_obs, rewards, dones, infos = env.step(actions)), whereas Overcooked only returns a single done in overcookedgym/overcooked.py:80.
Are off-policy algorithms not supported? Is there a good way of fixing this, i.e. by changing line 80?
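For illustration, the mismatch can be reproduced in isolation. This is only a minimal sketch, not SB3 or PantheonRL code: the helper count_done_envs is hypothetical and merely copies the loop shape shown in the last traceback frame (for i, done in enumerate(dones)). It shows that a scalar done flag fails in such a loop, while a length-1 batch of flags works. A plain list stands in here for the batched array a vectorized environment would return. Whether wrapping the flag is the right fix for PantheonRL is not something this sketch establishes.

```python
def count_done_envs(dones):
    """Hypothetical helper: iterate over per-environment done flags,
    mirroring the loop shape seen in the traceback."""
    return sum(1 for _i, done in enumerate(dones) if done)


# A single (non-vectorized) environment returns one scalar flag.
scalar_done = True

# Iterating over a bare bool raises the same TypeError as in the traceback.
try:
    count_done_envs(scalar_done)
    scalar_error = None
except TypeError as exc:
    scalar_error = str(exc)

# Wrapping the flag in a length-1 sequence gives the batched shape
# that a loop over per-environment flags expects.
batched_done = [scalar_done]
n_done = count_done_envs(batched_done)

print(scalar_error)
print(n_done)
```

The design point is only about shapes: code written against a vectorized environment assumes one done flag per environment, so a single environment has to present its flag as a batch of size one.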
Thank you!
Cheers, Constantin