We provide two Q-learning-based systems that follow the independent learners and the centralised training with decentralised execution paradigms, respectively:
rec-IQL is a multi-agent version of DQN that uses double DQN and has a GRU memory module.
rec-QMIX is an implementation of QMIX in JAX that uses monotonic value function decomposition.
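
To illustrate what monotonic value function decomposition means, here is a minimal JAX sketch; it is not the code used by rec-QMIX. The function name `monotonic_mix` and the shapes are hypothetical. In QMIX the mixing weights are produced by hypernetworks conditioned on the global state. Here they are passed in as fixed random arrays to keep the example short. The key idea is that taking the absolute value of the mixing weights keeps the joint value non-decreasing in every agent's Q-value.

```python
import jax
import jax.numpy as jnp


def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """Mix per-agent Q-values into a joint Q_tot (hypothetical helper).

    agent_qs: (n_agents,) chosen-action Q-value of each agent.
    w1: (n_agents, hidden), b1: (hidden,), w2: (hidden,), b2: scalar.
    Assumption: in QMIX these parameters come from state-conditioned
    hypernetworks; here they are plain arrays for illustration.
    """
    # Absolute values make every mixing weight non-negative.
    hidden = jax.nn.elu(agent_qs @ jnp.abs(w1) + b1)
    return hidden @ jnp.abs(w2) + b2


# Random stand-in parameters (illustrative only).
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
n_agents, hidden_dim = 3, 8
w1 = jax.random.normal(k1, (n_agents, hidden_dim))
b1 = jax.random.normal(k2, (hidden_dim,))
w2 = jax.random.normal(k3, (hidden_dim,))
b2 = jax.random.normal(k4, ())

agent_qs = jnp.array([0.5, -1.0, 2.0])
q_tot = monotonic_mix(agent_qs, w1, b1, w2, b2)

# Gradient of Q_tot with respect to each agent's Q-value.
grads = jax.grad(monotonic_mix)(agent_qs, w1, b1, w2, b2)
```

Because the weights are non-negative and ELU is an increasing function, every entry of `grads` is non-negative. This means each agent can greedily maximise its own Q-value during decentralised execution without lowering the joint value.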