Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/Multiple trainers for MA-DDPG #253

Merged
merged 164 commits into from
Oct 25, 2021
Merged

Conversation

DriesSmit
Copy link
Contributor

@DriesSmit DriesSmit commented Jun 29, 2021

What?

Implements a scaled-up version of MADDPG where multiple trainers can now be used with multiple executors. A centralised variable server is also implemented that absorbs the responsibilities of the counter node, trainer checkpointing and trainer variable source. The trainers and executors now read and write to the centralised variable source directly. A multiple trainer example is included where 3 trainers and 2 executors are used to train 3 non-weight sharing agents on the debugging environment.

Why?

Multiple trainers allow for the parallelisation of the trainer's tasks. Just like is already done with executors. This also opens the door to hyperparameter tuning directly using Mava in future updates.

How?

Added a new Scaled MA-DDPG system that allows for the use of multiple trainers.

Extra

This PR uses changes proposed in updated-network-keys. Therefore that PR should be merged first. After that point, this PR can be moved out of the draft status.

Copy link
Contributor

@KaleabTessera KaleabTessera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this massive effort @DriesSmit 🙌 🏆 This is close to ready from my standpoint, I just had some comments and then we can do the benchmarks 🚀 🔥

mava/systems/tf/mad4pg/system.py Show resolved Hide resolved
mava/systems/tf/mad4pg/system.py Show resolved Hide resolved
mava/utils/adder_utils.py Outdated Show resolved Hide resolved
mava/utils/environments/render_utils.py Show resolved Hide resolved
mava/utils/sort_utils.py Show resolved Hide resolved
)

# Flush the writer.
self._writer.flush(self._max_in_flight_items)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for certain adders, we previously flushed after every write , e.g. sequence adder (

self._writer.flush(self._max_in_flight_items)
). Not we flush the writer once at the end (like we did for the transition adder). Not sure if this has any side-effects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great observation! I have not considered this. Flush seems to wait for all the data to be written. Please see here. So maybe its fine to do this only at the end of a trajectory step. Not sure. I am happy either way. I just don't want to make the writing even slower if we flush to often. But also if we need to flush after each write then we should change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I phrased my previous message incorrectly. The flush is after every create_item , not write.

Let's do a flush after every create_item. Acme did that recently across all writers (google-deepmind/acme@94711b1). Please check we do it for the case where it is not multiple trainers as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this back to how it was previously. I think doing it after every create_item broke my training.

mava/systems/tf/maddpg/system.py Show resolved Hide resolved
mava/systems/tf/maddpg/system.py Outdated Show resolved Hide resolved
mava/systems/tf/maddpg/system.py Outdated Show resolved Hide resolved
mava/systems/tf/variable_utils.py Show resolved Hide resolved
@DriesSmit
Copy link
Contributor Author

Trainer logging does not seem to be working for mava/examples/petting_zoo/sisl/multiwalker/recurrent/decentralised/run_maddpg.py. Fix this. Also compare training MAD4G mutliwalker executor and training speeds with develop.

@DriesSmit
Copy link
Contributor Author

Trainer logging does not seem to be working for mava/examples/petting_zoo/sisl/multiwalker/recurrent/decentralised/run_maddpg.py. Fix this. Also compare training MAD4G mutliwalker executor and training speeds with develop.

Fixed.

Copy link
Collaborator

@arnupretorius arnupretorius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @DriesSmit! 🙌 Just see my few comments. :)

mava/components/tf/architectures/__init__.py Outdated Show resolved Hide resolved
mava/components/tf/architectures/state_based.py Outdated Show resolved Hide resolved
mava/systems/tf/mad4pg/__init__.py Outdated Show resolved Hide resolved
mava/systems/tf/mad4pg/training.py Outdated Show resolved Hide resolved
mava/systems/tf/mad4pg/training.py Outdated Show resolved Hide resolved
mava/systems/tf/maddpg/training.py Outdated Show resolved Hide resolved
mava/wrappers/system_trainer_statistics.py Outdated Show resolved Hide resolved
mava/wrappers/system_trainer_statistics.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@arnupretorius arnupretorius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @DriesSmit!! 🔥

@arnupretorius arnupretorius merged commit 57ae3c8 into develop Oct 25, 2021
@arnupretorius arnupretorius deleted the feature/mava-scaling branch October 25, 2021 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants