This is the code used in the IEEE Transactions on Cybernetics paper proposing the Multiagent Object-Oriented approach. You are free to use all or part of the code presented here for any purpose, provided that the paper is properly cited and the original authors are properly credited. All files shared here come with no warranties.
Paper bib entry:
@ARTICLE{Silvaetal2018,
  author  = {Da Silva, Felipe Leno and Glatt, Ruben and Costa, Anna Helena Reali},
  title   = {{MOO-MDP: An Object-Oriented Representation for Cooperative Multiagent Reinforcement Learning}},
  journal = {IEEE Transactions on Cybernetics},
  year    = {2017},
  volume  = {PP},
  number  = {99},
  pages   = {1-13},
  doi     = {10.1109/TCYB.2017.2781130},
  issn    = {2168-2267}
}
Part of this project was built on BURLAP2 (http://burlap.cs.brown.edu/). I included the BURLAP version that was used to avoid incompatibility issues; changes to the code will be necessary if you use a newer BURLAP version.
This work is an extension of a conference paper (http://ieeexplore.ieee.org/document/7839556/). The journal version and the new code should be cited/used for most purposes.
The folder goldmine_and_gridworld contains the Java implementation (as an Eclipse project) and the BURLAP source files for the Goldmine and Gridworld domains.
The folder prey_predator contains the Python implementation for the Predator-Prey domain.
The folder experiment_results contains the .csv files with the results of our experiments and the MATLAB scripts to read those files and output graphs.
The commands mentioned below should be executed in MATLAB and generate the graphs shown in the paper. Some manual style adjustments were made to increase readability. The required .m files are inside the experiment_results folder.
Gridworld and Goldmine:
folderCSV = '<path to the goldmine or gridworld folder>';
initTrial = 1;
endTrial = 70; % 50 for gridworld
useMarkers = true;
generateGraphFromBurlapFile(folderCSV, initTrial, endTrial, useMarkers);
Predator-Prey:
folderOriginal = '<path to the experiment_results folder>';
folderCSV = [folderOriginal,'/prey-predator/'];
repetitions = 250;
initTrial = 1;
endTrial = repetitions;
useMarkers = false;
convert_preyPredator(folderCSV, repetitions, 3); % may take a long time to run
generateGraphFromBurlapFile(folderCSV, initTrial, endTrial, useMarkers);
The folder goldmine_and_gridworld stores the implementations for the Goldmine and Gridworld domains.
We used Eclipse to run the experiments; you can import the folder as a project in Eclipse, or import all files (including the BURLAP jar in the lib folder as a library) into your preferred IDE.
The experiments in our paper can be replicated by executing the main method of the ExperimentBRACIS2016 class (I recommend running the JVM with the parameters -Xms1024m -Xmx14024m).
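If you prefer to run it outside Eclipse, the invocation would look roughly like the line below. This is only a sketch: the classpath and the package of the class depend on your build layout, so adjust them accordingly (on Windows, use ; as the classpath separator).

java -Xms1024m -Xmx14024m -cp bin:lib/burlap.jar ExperimentBRACIS2016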
After executing this method, .csv files with the experiment results will be generated; they can be used to plot graphs in MATLAB by executing the generateGraphFromBurlapFile.m script (see the commands above).
To run experiments in the Predator-Prey domain, execute the experiment.py file. We used the PyCharm IDE with an Anaconda environment.
The Python code outputs .csv files in a different format; the convert_preyPredator.m script adapts the output to the format expected by the graph script (see the commands above).
We advise you to implement your own script to generate graphs, as the MATLAB file is not very well commented; a minimal sketch is given below.
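The sketch below is a minimal Python alternative. It assumes (and this is an assumption, not the actual file layout) that each trial's .csv file has a header row and one row per episode, with the value to plot in the last column; adjust the parsing to the real format of the generated files.

import csv, glob
import numpy as np
import matplotlib.pyplot as plt

def load_trials(folder_csv):
    # Read every .csv in the folder; one file per trial.
    trials = []
    for path in sorted(glob.glob(folder_csv + '/*.csv')):
        with open(path) as f:
            rows = list(csv.reader(f))[1:]  # skip the header row
            trials.append([float(r[-1]) for r in rows])
    # Truncate all trials to the shortest one so they can be averaged.
    n = min(len(t) for t in trials)
    return np.array([t[:n] for t in trials])

data = load_trials('<path to a folder with result .csv files>')
plt.plot(data.mean(axis=0))  # average learning curve over trials
plt.xlabel('Episode')
plt.ylabel('Average result')
plt.show()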
Our DOO-Q and DQL implementations are heavily optimized for execution speed, which means their memory consumption is huge. If you want to use them in applications or on a machine with limited memory, you will need to change our implementation.
A large amount of memory can be saved if the implementation of DOOQPolicy is changed to store entries in policyMemory only when two or more Q-values are tied as the best action (see the sketch below). However, if you do so, the experiments will run slower.
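To illustrate the idea, here is a hypothetical Python sketch (not the actual DOOQPolicy code; all names besides policyMemory are illustrative): a policy entry is cached only when the greedy action is ambiguous.

def update_policy_memory(policy_memory, state, q_values, tolerance=1e-9):
    # q_values: dict mapping each action to Q(state, action).
    best = max(q_values.values())
    tied = [a for a, q in q_values.items() if abs(q - best) <= tolerance]
    if len(tied) >= 2:
        # Ambiguous greedy action: remember the tie-break so it stays
        # consistent across visits to this state.
        policy_memory[state] = tied[0]
    else:
        # A single best action can be recomputed from the Q-values on
        # demand, so no entry is stored -- saving memory at the cost of
        # recomputation time.
        policy_memory.pop(state, None)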
If you have any questions, please send an email to the first author.