In this folder we provide a set of multi-agent example scripts using the VMAS simulator.
The MARL algorithms in these scripts run on three multi-robot tasks in VMAS. For more details on the experiment setup and the environments, please refer to the corresponding section of the appendix in the TorchRL paper.
First you need to install vmas and the dependencies of the scripts.
Install torchrl and tensordict following the instructions in their repositories.
Install vmas and dependencies:
pip install vmas
pip install wandb moviepy
pip install hydra-core
To run a script, execute the corresponding Python file after adjusting its config to your needs. The config is in the .yaml file with the same name as the script.
For example:
python mappo_ippo.py
You can also override config values from the command line, for example:
python mappo_ippo.py --m env.scenario_name=navigation
The scripts are set up to collect many frames. If your compute is limited, you can lower the "frames_per_batch" and "num_epochs" parameters to reduce compute requirements.
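As a rough guide to how these two parameters drive cost, the sketch below counts collected frames and gradient updates for a given setting. Only "frames_per_batch" and "num_epochs" come from this README; `n_iters`, `minibatch_size`, the helper `compute_budget`, and all numbers are hypothetical and chosen purely for illustration.

```python
def compute_budget(frames_per_batch, num_epochs, n_iters, minibatch_size):
    """Estimate the amount of work implied by a training configuration.

    n_iters and minibatch_size are assumed parameters, not necessarily
    the names used in the actual config files.
    """
    # Frames collected over the whole run.
    total_frames = frames_per_batch * n_iters
    # Full minibatches per pass over one batch (a partial one is ignored
    # here for simplicity).
    minibatches_per_epoch = frames_per_batch // minibatch_size
    # Each batch is reused for num_epochs passes.
    updates_per_iter = num_epochs * minibatches_per_epoch
    return {
        "total_frames": total_frames,
        "updates_per_iter": updates_per_iter,
        "total_updates": updates_per_iter * n_iters,
    }


# Illustrative values only.
large = compute_budget(frames_per_batch=60_000, num_epochs=45,
                       n_iters=500, minibatch_size=4096)
small = compute_budget(frames_per_batch=6_000, num_epochs=10,
                       n_iters=500, minibatch_size=1000)

print("large:", large)
print("small:", small)
```

Lowering "frames_per_batch" reduces both the simulation work and the number of minibatches per epoch, while lowering "num_epochs" reduces only the optimization work per collected batch.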
The scripts are self-contained. This means that all the code you will need to look at is contained in the script file. No helper functions are used.
Each script is structured in the following order:
- Configuration dictionary for the script
- Environment creation
- Modules creation
- Collector instantiation
- Replay buffer instantiation
- Loss module creation
- Training loop (with inner minibatch loops)
- Evaluation run (at the desired frequency)
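The order above can be sketched as a toy skeleton in plain Python. This is not the code of the actual scripts and uses no TorchRL or VMAS API: the environment, policy, loss, and every config key and value are hypothetical stand-ins, kept only to show how the stages fit together in an on-policy loop.

```python
import random

# 1. Configuration dictionary (all keys and values are hypothetical).
config = {"n_iters": 3, "frames_per_batch": 8, "num_epochs": 2,
          "minibatch_size": 4, "lr": 0.1, "eval_interval": 2, "seed": 0}
random.seed(config["seed"])

# 2. Environment creation: a toy stand-in returning a random reward.
def env_step(action):
    return {"action": action, "reward": random.random()}

# 3. Modules creation: a "policy" with a single scalar parameter.
policy = {"weight": 0.0}

def act(policy):
    return policy["weight"] + random.gauss(0.0, 1.0)

# 4. Collector: yields one batch of fresh frames per iteration,
#    always generated with the current policy (on-policy).
def collector(policy, n_iters, frames_per_batch):
    for _ in range(n_iters):
        yield [env_step(act(policy)) for _ in range(frames_per_batch)]

# 5. Replay buffer: holds only the most recent batch.
buffer = []

# 6. Loss module: toy squared error between the weight and the reward.
def loss_and_grad(policy, minibatch):
    errors = [policy["weight"] - frame["reward"] for frame in minibatch]
    loss = sum(e * e for e in errors) / len(errors)
    grad = 2 * sum(errors) / len(errors)
    return loss, grad

log = {"updates": 0, "evals": 0}

# 7. Training loop with inner epoch and minibatch loops.
batches = collector(policy, config["n_iters"], config["frames_per_batch"])
for i, batch in enumerate(batches):
    buffer.clear()
    buffer.extend(batch)
    for _ in range(config["num_epochs"]):
        random.shuffle(buffer)
        for start in range(0, len(buffer), config["minibatch_size"]):
            minibatch = buffer[start:start + config["minibatch_size"]]
            loss, grad = loss_and_grad(policy, minibatch)
            policy["weight"] -= config["lr"] * grad
            log["updates"] += 1

    # 8. Evaluation run at the desired frequency (no exploration noise).
    if i % config["eval_interval"] == 0:
        rewards = [env_step(policy["weight"])["reward"] for _ in range(4)]
        log["evals"] += 1
        print(f"iter {i}: mean eval reward {sum(rewards) / len(rewards):.3f}")
```

Because the buffer is cleared before each new batch is stored, every update uses only data gathered by the latest policy, which is what makes the loop on-policy.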
Logging is done by default to wandb. The logging backend can be changed in the config files to one of "wandb", "tensorboard", "csv", "mlflow".
All the scripts follow the same on-policy training structure so that results can be compared across different algorithms.