[Algorithm] QMixer loss and multiagent models (#1378)

Signed-off-by: Matteo Bettini <matbet@meta.com>
pytorch · Jul 14, 2023 · 574dbf1 · github-actions · Jul 14, 2023
1 parent 9c95e1d
commit 574dbf1
Showing 15 changed files with 1,691 additions and 40 deletions.
diff --git a/docs/source/reference/modules.rst b/docs/source/reference/modules.rst
@@ -335,6 +335,20 @@ algorithms, such as DQN, DDPG or Dreamer.
     RSSMPrior
     RSSMPosterior
 
+Multi-agent-specific modules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These networks implement models that can be used in
+multi-agent contexts.
+
+.. autosummary::
+    :toctree: generated/
+    :template: rl_template_noinherit.rst
+
+    MultiAgentMLP
+    QMixer
+    VDNMixer
+
 
 Exploration
 -----------

diff --git a/docs/source/reference/objectives.rst b/docs/source/reference/objectives.rst
@@ -185,6 +185,21 @@ Dreamer
     DreamerModelLoss
     DreamerValueLoss
 
+Multi-agent objectives
+----------------------
+.. currentmodule:: torchrl.objectives.multiagent
+
+These objectives are specific to multi-agent algorithms.
+
+QMixer
+~~~~~~
+
+.. autosummary::
+    :toctree: generated/
+    :template: rl_template_noinherit.rst
+
+    QMixerLoss
+
 
 Returns
 -------

diff --git a/setup.py b/setup.py
@@ -235,6 +235,7 @@ def _main(argv):
             "checkpointing": [
                 "torchsnapshot",
             ],
+            "marl": ["vmas"],
         },
         zip_safe=False,
         classifiers=[
@@ -254,5 +255,4 @@ def _main(argv):
 
 
 if __name__ == "__main__":
-
     _main(sys.argv[1:])
Benchmark suite	Current: `574dbf1`	Previous: `9c95e1d`	Ratio (previous / current)
`benchmarks/test_objectives_benchmarks.py::test_reinforce_speed`	`103.36318234699311` iter/sec (`stddev: 0.0011253947426459006`)	`215.8205136415671` iter/sec (`stddev: 0.00021539867449864384`)	`2.09`
`benchmarks/test_objectives_benchmarks.py::test_iql_speed`	`20.80897970496085` iter/sec (`stddev: 0.0033384474096849206`)	`42.24455036474581` iter/sec (`stddev: 0.0013206046382742407`)	`2.03`
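
The Ratio column appears to be the previous throughput divided by the current throughput; that reading is an assumption inferred from the numbers, not something the table states. Under that assumption, a ratio above 1 means the current commit ran fewer iterations per second than the previous one. A minimal sketch that recomputes the two ratios from the figures above (the `slowdown_ratio` helper is hypothetical, introduced only for illustration):

```python
# Throughput figures (iter/sec) copied from the benchmark table above.
benchmarks = {
    "test_reinforce_speed": {
        "current": 103.36318234699311,
        "previous": 215.8205136415671,
    },
    "test_iql_speed": {
        "current": 20.80897970496085,
        "previous": 42.24455036474581,
    },
}


def slowdown_ratio(current, previous):
    """Assumed definition of the Ratio column: previous / current throughput."""
    return previous / current


# Round to two decimals, matching the precision shown in the table.
ratios = {
    name: round(slowdown_ratio(**figures), 2)
    for name, figures in benchmarks.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Both recomputed values agree with the Ratio column (2.09 and 2.03), which is consistent with the assumed definition.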