Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 0.1.3 #486

Merged
merged 139 commits into from
Jun 15, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
139 commits
Select commit Hold shift + click to select a range
a535e3d
Add rough recurrent code for MAPPO.
DriesSmit Sep 27, 2021
0c6405b
Save progress.
DriesSmit Sep 27, 2021
71f4dc6
Save recurrent PPO progress.
DriesSmit Sep 27, 2021
cc42e98
Recurrent PPO is running.
DriesSmit Sep 28, 2021
1ae02b5
Small fixes.
DriesSmit Sep 28, 2021
ac46176
Recurrent MAPPO trains of the debugging environment!
DriesSmit Sep 29, 2021
1e4aaae
Small fix.
DriesSmit Sep 29, 2021
fa2b43c
Save changes.
DriesSmit Sep 30, 2021
a2bbc86
Add code to MAD4PG.
DriesSmit Oct 3, 2021
345fe1b
Ready to run 2 vs 2 xray_attention.
DriesSmit Oct 4, 2021
a25e977
Decrease queue size.
DriesSmit Oct 5, 2021
3c8a32a
Merge develop.
DriesSmit Oct 29, 2021
792ea3c
PPO seems to be training and running.
DriesSmit Oct 29, 2021
f1006c1
Add multiple trainers PPO example.
DriesSmit Oct 29, 2021
cbf9cda
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Nov 19, 2021
5b44c5b
Merge develop.
DriesSmit Dec 3, 2021
c33e32c
Fix PPO example.
DriesSmit Dec 3, 2021
ef77441
Fix embed_spec bug.
DriesSmit Dec 3, 2021
fb1c43c
Fix mypy issues.
DriesSmit Dec 3, 2021
58b68f9
Fix mypy issue.
DriesSmit Dec 3, 2021
153e0e5
Merge branch 'develop' into feature/recurrent-mappo
KaleabTessera Dec 13, 2021
e4a343e
Address some of the PR comments.
DriesSmit Dec 14, 2021
d7f8ba5
Add termination condition to MA-PPO.
DriesSmit Dec 14, 2021
05c1650
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Dec 14, 2021
473d40d
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Jan 6, 2022
f709231
Merge branch 'develop' into feature/recurrent-mappo
KaleabTessera Jan 6, 2022
aef20d2
Address PR comments.
DriesSmit Jan 12, 2022
92abb30
Add the capability for MAPPO to use continuous action spaces.
DriesSmit Jan 13, 2022
7797e91
Merge branch 'develop' into feature/recurrent-mappo
arnupretorius Jan 14, 2022
eba009c
Merge branch 'develop' into feature/recurrent-mappo
arnupretorius Jan 14, 2022
c89ce9f
fix: Small fixes.
DriesSmit Mar 9, 2022
276933c
fix: mypy.
DriesSmit Mar 9, 2022
ad2f300
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Mar 10, 2022
28c3217
temp obs gradient var
AsadJeewa Mar 22, 2022
d6595f7
temp obs gradient var
AsadJeewa Mar 22, 2022
f8810b1
Merge branch 'develop' into feat/maddpg_obs_optim
AsadJeewa Mar 22, 2022
c5ffe70
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Mar 23, 2022
1967146
fix: Test small fix.
DriesSmit Mar 23, 2022
1fcaf0c
fix: Change writer back.
DriesSmit Mar 23, 2022
75ba274
fix: Change back.
DriesSmit Mar 23, 2022
ee366e7
temp obs gradient var
AsadJeewa Mar 23, 2022
21b376b
fix: Distributional head inside networks.
DriesSmit Mar 23, 2022
0c49d7a
feat: configurable obs network ddpg d4pg
AsadJeewa Mar 23, 2022
78a4d7c
Merge branch 'develop' into feat/maddpg_obs_optim
AsadJeewa Mar 24, 2022
232a540
fix: Merge dev.
DriesSmit Mar 24, 2022
315a151
fix: Change static unroll function to a manual unroll function.
DriesSmit Mar 24, 2022
c0e3127
fix: Change setting in PPO sequence adder. Remove custom adder code.
DriesSmit Mar 25, 2022
9217172
fix: Small comment updates.
DriesSmit Mar 25, 2022
66757a2
bugfix: architecture type typo fix
RuanJohn Mar 28, 2022
b8ccfe0
fix: Baseline cost defualt.
DriesSmit Mar 28, 2022
acf592f
Merge branch 'feature/recurrent-mappo' of github.com:instadeepai/Mava…
DriesSmit Mar 28, 2022
f0186b6
fix: Update defualt sequence length.
DriesSmit Mar 28, 2022
091d73a
Fix entropy term in trainer for continuous action space environments.
DriesSmit Mar 29, 2022
7ce71c7
fix: Small fix to entropy loss.
DriesSmit Mar 29, 2022
64cb86a
fix: Small fix to variable spelling.
DriesSmit Mar 29, 2022
d797241
fix: Fix continuous action space PPO by creating custom clipped Gauss…
DriesSmit Mar 31, 2022
b34c707
fix: Small fixes.
DriesSmit Mar 31, 2022
ec31d19
fix: Replace clip to spec with tanh to spec.
DriesSmit Mar 31, 2022
527d8df
fix: Remove comment.
DriesSmit Mar 31, 2022
40a430e
fix: Remove comment.
DriesSmit Mar 31, 2022
c9a1902
debug: d4pg update_obs_once
AsadJeewa Mar 31, 2022
771fc5a
remove debug code
AsadJeewa Mar 31, 2022
50eaae7
remove debug code
AsadJeewa Mar 31, 2022
86147a2
feature: Update Gaussian head's settings control the possible distrib…
DriesSmit Apr 1, 2022
91fc6a0
feature: Small update.
DriesSmit Apr 1, 2022
3ca5feb
fix: Update black version.
DriesSmit Apr 1, 2022
3852b0a
chore: Ignore files generated in testing when formatting code.
KaleabTessera Apr 1, 2022
9da3b55
fix: Update black version.
DriesSmit Apr 1, 2022
c3f08a6
fix: Update black version.
DriesSmit Apr 1, 2022
a76921a
chore: Minor black changes.
KaleabTessera Apr 1, 2022
c488e03
Merge pull request #470 from instadeepai/bugfix/black-version-update
DriesSmit Apr 1, 2022
723e713
Merge branch 'develop' into feat/maddpg_obs_optim
AsadJeewa Apr 1, 2022
0adc026
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Apr 4, 2022
4184fa7
feat: Add capability to fix the sampler for MADDPG.
DriesSmit Apr 12, 2022
24dfe32
fix: Small fix to MAD4PG.
DriesSmit Apr 12, 2022
ae33e1c
fix: multiple agents and multiple trainers using separate tables and …
EdanToledo Apr 12, 2022
bdcdfab
fix: Small fixes to the net_spec_keys setup code for MADDPG.
DriesSmit Apr 12, 2022
5774a57
feat: add multiple trainer, multiple agent, multiple agent architectu…
EdanToledo Apr 12, 2022
1a84f2f
fix: allow net_spec keys to be passed into create_default_networks
EdanToledo Apr 12, 2022
df45468
fix: Remove redundant statement.
DriesSmit Apr 12, 2022
49f8534
fix: Remove redundant statement.
DriesSmit Apr 12, 2022
34c538c
feat: Small improvement.
DriesSmit Apr 12, 2022
18d420f
fix: linting
EdanToledo Apr 12, 2022
7d411dc
fix: PPO training for networks with Categorical heads.
DriesSmit Apr 13, 2022
25f3ea1
fix: Small fix to dataset shuffler.
DriesSmit Apr 13, 2022
dbb5797
fix: Remove print statement.
DriesSmit Apr 13, 2022
a257819
Small fixes to trainer variable client and Hyperparameter settings.
DriesSmit Apr 13, 2022
8e25670
feat: added multiple network fix
sash-a Apr 13, 2022
a88de00
feat: Small updates to hyperparameters. Moving system closer to devel…
DriesSmit Apr 13, 2022
072771b
merge: Merge changes.
DriesSmit Apr 13, 2022
10185f6
fix: add if statement to check if default net spec keys is given and …
EdanToledo Apr 13, 2022
80cf51d
fix: linting issue
EdanToledo Apr 13, 2022
2851235
fix: Small training fixes.
DriesSmit Apr 14, 2022
1674154
fix: Big bugfix in MAPPO trainer code setup.
DriesSmit Apr 14, 2022
b8d2738
Remove variable update inside mappo trainer _step code.
DriesSmit Apr 19, 2022
86858c2
feat: Added jax docker containers.
KaleabTessera Apr 13, 2022
8da8562
feat: Added jax containers to autopush.
KaleabTessera Apr 13, 2022
2f9d568
fix: Correct base docker image.
KaleabTessera Apr 13, 2022
2bfa19f
fix: Updated base image to work with jax.
KaleabTessera Apr 14, 2022
f307597
chore: Updated readme with new jax containers.
KaleabTessera Apr 14, 2022
3155748
chore: Updated python virtual env readme.
KaleabTessera Apr 14, 2022
7b77152
fix: removed jax from reverb requirements.
KaleabTessera Apr 19, 2022
ecb90a6
feat: Updated pettingzoo.
KaleabTessera Apr 19, 2022
fe5389d
Merge pull request #480 from instadeepai/bugfix/update-pz-docker-for-jax
KaleabTessera Apr 20, 2022
7aac022
Merge branch 'develop' into feature/recurrent-mappo
DriesSmit Apr 20, 2022
a7343c2
chore: remove debug code, add comment to ddpg
AsadJeewa Apr 20, 2022
16c0fe1
Merge branch 'develop' into feat/maddpg_obs_optim
AsadJeewa Apr 20, 2022
67a2d43
chore: remove debug variable
AsadJeewa Apr 20, 2022
60eb054
fix: Address PR comments.
DriesSmit Apr 21, 2022
5877138
Merge pull request #326 from instadeepai/feature/recurrent-mappo
DriesSmit Apr 21, 2022
34c85b3
Merge branch 'develop' into feat/maddpg_obs_optim
DriesSmit Apr 21, 2022
80dd1df
chore: remove redundant code
AsadJeewa Apr 21, 2022
7c04224
fix: Merge branch 'feat/maddpg_obs_optim' of https://github.com/insta…
AsadJeewa Apr 21, 2022
897a08b
chore: remove redundant code
AsadJeewa Apr 21, 2022
dc7b93c
Merge pull request #459 from instadeepai/feat/maddpg_obs_optim
AsadJeewa Apr 22, 2022
67d1a52
Merge branch 'develop' into feature/fix_sampler_maddpg
DriesSmit Apr 22, 2022
c4c2509
Merge branch 'develop' into feature/fix_sampler_madqn
DriesSmit Apr 22, 2022
1f7994a
fix: Update 'list_of_networks' variable name to be 'networks_used_by_…
DriesSmit Apr 22, 2022
019da42
fix: Small update to variable naming.
DriesSmit Apr 22, 2022
1aba8ed
Merge pull request #475 from instadeepai/feature/fix_sampler_maddpg
DriesSmit Apr 22, 2022
5832921
fix: Small update variable naming.
DriesSmit Apr 22, 2022
f494b83
fix: Small fix to PPO variable naming.
DriesSmit Apr 22, 2022
1eef1eb
Merge pull request #477 from instadeepai/feature/fix_sampler_madqn
DriesSmit Apr 22, 2022
fa57447
chore: Up the patch version of mava.
KaleabTessera Apr 22, 2022
4b07ae5
Merge pull request #485 from instadeepai/feature/release-0.1.3
KaleabTessera Apr 22, 2022
06773ee
fix: Fixed keys in state based arch.
KaleabTessera Jun 6, 2022
486d963
fix: Fixed keys in centralised arch.
KaleabTessera Jun 6, 2022
9c00fa1
feat: Update dependencies and remove lp dependency (it is already par…
KaleabTessera Jun 6, 2022
1e40dd7
chore: Removed independent lp install.
KaleabTessera Jun 6, 2022
f1c346e
chore :Improve consistency env_states -> s_t.
KaleabTessera Jun 7, 2022
4b163c0
fix: Make all tf tests use shared weights.
KaleabTessera Jun 7, 2022
776b612
fix: Fix networked arch.
KaleabTessera Jun 7, 2022
4db7532
chore(tests): speed up tests by using smaller networks.
KaleabTessera Jun 7, 2022
2cd12ba
chore: Better test skip reason.
KaleabTessera Jun 7, 2022
01b7140
fix: Update python version for melting pot images.
KaleabTessera Jun 7, 2022
05b69e4
fix: pin meltingpot to before py3.9 requirement.
KaleabTessera Jun 8, 2022
a6d01fa
chore: More explicit on shared weights in tests.
KaleabTessera Jun 9, 2022
deb6d6f
fix: networked arch only support shared weights.
KaleabTessera Jun 9, 2022
d2a4004
Merge pull request #552 from instadeepai/bugfix/fix-old-tf-architectures
KaleabTessera Jun 15, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
fix: Fix networked arch.
  • Loading branch information
KaleabTessera committed Jun 7, 2022
commit 776b6127e1761478d20631ba73773f3616c5b4b8
5 changes: 3 additions & 2 deletions mava/components/tf/architectures/networked.py
Original file line number Diff line number Diff line change
Expand Up @@ -114,12 +114,13 @@ def _get_critic_specs(

for agent_type, agents in agents_by_type.items():
for agent in agents:
critic_obs_shape = list(copy.copy(self._embed_specs[agent].shape))
net_key = self._agent_net_keys[agent]
critic_obs_shape = list(copy.copy(self._embed_specs[net_key].shape))
critic_act_shape = list(
copy.copy(self._agent_specs[agent].actions.shape)
)
critic_obs_shape.insert(0, len(self._network_spec[agent]))
critic_obs_specs[agent] = tf.TensorSpec(
critic_obs_specs[net_key] = tf.TensorSpec(
shape=critic_obs_shape,
dtype=tf.dtypes.float32,
)
Expand Down
20 changes: 19 additions & 1 deletion tests/systems/maddpg_system_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import functools

import launchpad as lp
import pytest
import sonnet as snt

import mava
Expand All @@ -44,7 +45,9 @@ def test_maddpg_on_debugging_env(self) -> None:

# networks
network_factory = lp_utils.partial_kwargs(
maddpg.make_default_networks, policy_networks_layer_sizes=(64, 64)
maddpg.make_default_networks,
policy_networks_layer_sizes=(32, 32),
critic_networks_layer_sizes=(64, 64),
)

# system
Expand Down Expand Up @@ -94,6 +97,7 @@ def test_recurrent_maddpg_on_debugging_env(self) -> None:
maddpg.make_default_networks,
architecture_type=ArchitectureType.recurrent,
policy_networks_layer_sizes=(32, 32),
critic_networks_layer_sizes=(64, 64),
)

# system
Expand Down Expand Up @@ -147,6 +151,7 @@ def test_centralised_maddpg_on_debugging_env(self) -> None:
network_factory = lp_utils.partial_kwargs(
maddpg.make_default_networks,
policy_networks_layer_sizes=(32, 32),
critic_networks_layer_sizes=(64, 64),
)

# system
Expand Down Expand Up @@ -184,6 +189,16 @@ def test_centralised_maddpg_on_debugging_env(self) -> None:
for _ in range(2):
trainer.step()

@pytest.mark.skip(
reason="""
Running tests with shared_weights=False pass when running indepedently
(other tests commented out), but fail when run with other tests and not
enough parallel cores (2 or less). This is likely a race condition,
hangling process from previous tests or related to network sampling
(TODO @Dries investigate if you have a chance). Only the test fails,
the examples run.
"""
)
def test_networked_maddpg_on_debugging_env(self) -> None:
"""Test networked maddpg."""
# environment
Expand All @@ -197,6 +212,7 @@ def test_networked_maddpg_on_debugging_env(self) -> None:
network_factory = lp_utils.partial_kwargs(
maddpg.make_default_networks,
policy_networks_layer_sizes=(32, 32),
critic_networks_layer_sizes=(64, 64),
)

# system
Expand All @@ -213,6 +229,7 @@ def test_networked_maddpg_on_debugging_env(self) -> None:
trainer_fn=maddpg.MADDPGNetworkedTrainer,
architecture=architectures.NetworkedQValueCritic,
connection_spec=fully_connected_network_spec,
shared_weights=False,
)
program = system.build()

Expand Down Expand Up @@ -249,6 +266,7 @@ def test_state_based_maddpg_on_debugging_env(self) -> None:
network_factory = lp_utils.partial_kwargs(
maddpg.make_default_networks,
policy_networks_layer_sizes=(32, 32),
critic_networks_layer_sizes=(64, 64),
)

# system
Expand Down