Systems

Mava includes a number of pre-built systems, listed below. All of them are implemented using Launchpad for distributed training, which makes it easy to scale computational resources by changing a single variable.

Below we list the different systems in Mava based on the action spaces they use. More systems will be added in future Mava updates.

Continuous control

The following systems focus on this setting:

| System | Paper | Code |
| --- | --- | --- |
| Multi-Agent Deep Deterministic Policy Gradient (MADDPG) | Lowe et al., 2017 | TF |
| Multi-Agent Distributed Distributional DDPG (MAD4PG) | Barth-Maron et al., 2018 | TF |

Discrete control

We also include a number of systems built with discrete action spaces in mind, listed below:

| System | Paper | Code |
| --- | --- | --- |
| Deep Q-Networks (DQN) | Horgan et al., 2018 | TF |
| Differentiable Inter-Agent Learning (DIAL) | Foerster et al., 2016 | TF |
| QMIX | Rashid et al., 2018 | TF |

Mixed

We also have a system that works with either discrete or continuous action spaces:

| System | Paper | Code |
| --- | --- | --- |
| Multi-Agent Proximal Policy Optimization (MAPPO) | Yu et al., 2021, Schroeder et al., 2020 | TF |
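
As a quick way to pick a system for a given environment, the grouping above can be written down as a small lookup table. This is only an illustrative sketch: the `ActionSpace` enum, the `SYSTEMS` dictionary and the `systems_for` helper are hypothetical names introduced here and are not part of the Mava API.

```python
from enum import Enum
from typing import List


class ActionSpace(Enum):
    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


# Hypothetical lookup table transcribed from the lists above; not part of Mava.
SYSTEMS = {
    "MADDPG": {ActionSpace.CONTINUOUS},
    "MAD4PG": {ActionSpace.CONTINUOUS},
    "DQN": {ActionSpace.DISCRETE},
    "DIAL": {ActionSpace.DISCRETE},
    "QMIX": {ActionSpace.DISCRETE},
    # MAPPO is listed under "Mixed", so it appears for both kinds.
    "MAPPO": {ActionSpace.CONTINUOUS, ActionSpace.DISCRETE},
}


def systems_for(space: ActionSpace) -> List[str]:
    """Return the listed systems that support the given action space, sorted by name."""
    return sorted(name for name, spaces in SYSTEMS.items() if space in spaces)


if __name__ == "__main__":
    print(systems_for(ActionSpace.CONTINUOUS))
    print(systems_for(ActionSpace.DISCRETE))
```

For example, `systems_for(ActionSpace.CONTINUOUS)` returns the two continuous-control systems plus MAPPO, since MAPPO is listed as working with either kind of action space.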