Mava includes several pre-built systems. All of them are implemented using Launchpad for distributed training, so computational resources can be scaled by changing a single variable.
The systems are grouped below by the action spaces they support. More systems will be added in future Mava updates.
The following systems are designed for continuous action spaces:
System | Paper | Code |
---|---|---|
Multi-Agent Deep Deterministic Policy Gradient (MADDPG) | Lowe et al., 2017 | |
Multi-Agent Distributed Distributional DDPG (MAD4PG) | Barth-Maron et al., 2018 | |
The following systems are built for discrete action spaces:
System | Paper | Code |
---|---|---|
Deep Q-Networks (DQN) | Horgan et al., 2018 | |
Differentiable Inter-Agent Learning (DIAL) | Foerster et al., 2016 | |
QMIX | Rashid et al., 2018 | |
One system works with either discrete or continuous action spaces:
System | Paper | Code |
---|---|---|
Multi-Agent Proximal Policy Optimization (MAPPO) | Yu et al., 2021; Schroeder et al., 2020 | |
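
The grouping in the tables above can be sketched as a small lookup, which may help when choosing a system for an environment. This is an illustrative helper only: `ActionSpace`, `SYSTEMS`, and `systems_for` are hypothetical names and are not part of Mava's API. The entries are transcribed from the tables in this section.

```python
from enum import Enum


class ActionSpace(Enum):
    """Action-space categories used to group the systems (hypothetical type)."""
    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


# Hypothetical lookup table transcribed from the tables above; not Mava code.
SYSTEMS = {
    "MADDPG": {ActionSpace.CONTINUOUS},
    "MAD4PG": {ActionSpace.CONTINUOUS},
    "DQN": {ActionSpace.DISCRETE},
    "DIAL": {ActionSpace.DISCRETE},
    "QMIX": {ActionSpace.DISCRETE},
    "MAPPO": {ActionSpace.CONTINUOUS, ActionSpace.DISCRETE},
}


def systems_for(action_space):
    """Return the names of listed systems that support the given action space, sorted."""
    return sorted(
        name for name, spaces in SYSTEMS.items() if action_space in spaces
    )


if __name__ == "__main__":
    for space in ActionSpace:
        print(space.value, "->", ", ".join(systems_for(space)))
```

MAPPO appears in both results because it is the one listed system that supports either kind of action space.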