Merge branch 'master' into lanctot-ubuntu-2204

giogix2 · Mar 21, 2022 · 95de30b · 95de30b
2 parents 828b051 + 062416f
commit 95de30b
Show file tree

Hide file tree

Showing 215 changed files with 29,549 additions and 2,858 deletions.
diff --git a/.gitignore b/.gitignore
@@ -56,3 +56,5 @@ open_spiel/cmake-build-debug/
 
 # Swift generated build file
 Package.resolved
+# Visual Studio generated files
+open_spiel/.vs
diff --git a/README.md b/README.md
@@ -43,9 +43,10 @@ For an overview of OpenSpiel and example uses of the core API, please check out
 our tutorials:
 
 *   [Motivation, Core API, Brief Intro to Replictor Dynamics and Imperfect
-    Information Games](https://www.youtube.com/watch?v=YE0E0F39lac) by Marc
+    Information Games](https://www.youtube.com/watch?v=8NCPqtPwlFQ) by Marc
     Lanctot.
-    [(slides)](http://mlanctot.info/files/open_spiel_tutorial-mar2021-kuleuven.pdf).
+    [(slides)](http://mlanctot.info/files/OpenSpiel_Tutorial_KU_Leuven_2022.pdf)
+    [(colab)](https://colab.research.google.com/github/deepmind/open_spiel/blob/master/open_spiel/colabs/OpenSpielTutorial.ipynb)
 *   [Motivation, Core API, Implementing CFR and REINFORCE on Kuhn poker, Leduc
     poker, and Goofspiel](https://www.youtube.com/watch?v=o6JNHoGUXCo) by Edward
     Lockhart.

diff --git a/docs/algorithms.md b/docs/algorithms.md
@@ -13,6 +13,7 @@ Information Set Monte Carlo Tree Search (IS-MCTS) | Search       | [Cowley et al
 Minimax (and Alpha-Beta) Search                   | Search       | [Wikipedia1](https://en.wikipedia.org/wiki/Minimax#Minimax_algorithm_with_alternate_moves), [Wikipedia2](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning), Knuth and Moore '75                                                                                                                         | ![](_static/green_circ10.png "green circle")
 Monte Carlo Tree Search                           | Search       | [Wikipedia](https://en.wikipedia.org/wiki/Monte_Carlo_tree_search), [UCT paper](http://ggp.stanford.edu/readings/uct.pdf), [Coulom '06](https://hal.inria.fr/inria-00116992/document), [Cowling et al. survey](http://www.incompleteideas.net/609%20dropbox/other%20readings%20and%20resources/MCTS-survey.pdf) | ![](_static/green_circ10.png "green circle")
 Lemke-Howson (via <tt>nashpy</tt>)                | Opt.         | [Wikipedia](https://en.wikipedia.org/wiki/Lemke%E2%80%93Howson_algorithm), [Shoham &amp; Leyton-Brown '09](http://masfoundations.org/)                                                                                                                                                                          | ![](_static/green_circ10.png "green circle")
+ADIDAS                                            | Opt.         | [Gemp et al '22](https://arxiv.org/abs/2106.01285)                                                                                                                                                                                                                                                              | <font color="orange"><b>~</b></font>
 Sequence-form linear programming                  | Opt.         | [Koller, Megiddo, and von Stengel '94](http://theory.stanford.edu/~megiddo/pdf/stoc94.pdf), <br> [Shoham &amp; Leyton-Brown '09](http://masfoundations.org/)                                                                                                                                                    | ![](_static/green_circ10.png "green circle")
 Counterfactual Regret Minimization (CFR)          | Tabular      | [Zinkevich et al '08](https://poker.cs.ualberta.ca/publications/NIPS07-cfr.pdf), [Neller &amp; Lanctot '13](http://modelai.gettysburg.edu/2013/cfr/cfr.pdf)                                                                                                                                                     | ![](_static/green_circ10.png "green circle")
 CFR against a best responder (CFR-BR)             | Tabular      | [Johanson et al '12](https://poker.cs.ualberta.ca/publications/AAAI12-cfrbr.pdf)                                                                                                                                                                                                                                | ![](_static/green_circ10.png "green circle")

diff --git a/docs/developer_guide.md b/docs/developer_guide.md
@@ -108,13 +108,68 @@ When you add a new conditional dependency, you need to touch:
 -   the root CMakeLists.txt to add the option, with an OFF default
 -   add the option to `scripts/global_variables.sh`
 -   change `install.sh` to make sure the dependency is installed
--   use constructs like `if (${OPEN_SPIEL_OPEN_SPIEL_BUILD_WITH_HANABI})` in
-    CMake to optionally add the targets to build.
+-   use constructs like `if (${OPEN_SPIEL_BUILD_WITH_HANABI})` in CMake to
+    optionally add the targets to build.
 
 ## Debugging tools
 
 For complex games it may be tricky to get all the details right. Reading through
-the playthrough You can visualize small game trees using
-[open_spiel/python/examples/treeviz_example.py](https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/treeviz_example.py) or for large
-games there is an interactive viewer for OpenSpiel games called
+the playthrough (or visually inspecting random games via the example) is the
+first step in verifying the game mechanics. You can visualize small game trees
+using [open_spiel/python/examples/treeviz_example.py](https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/treeviz_example.py) or for
+large games there is an interactive viewer for OpenSpiel games called
 [SpielViz](https://github.com/michalsustr/spielviz).
+
+## Adding Game-Specific Functionality
+
+OpenSpiel focuses on maintaining a general API to an underlying suite of games,
+but sometimes it is convenient to work on specific games. In this section, we
+describe how to get (or set) game-specific information from/to the generic state
+objects, and how to expose these functions to python.
+
+Suppose, for example, we want to look at (or set) the private cards in a game of
+Leduc poker. We will use an example based on this
+[this commit](https://github.com/deepmind/open_spiel/commit/4cd1e5889e447d285eb3f16901ccab5c14e62187).
+
+1.  First, locate the game you want to access. The game implementations are in
+    the `games/` subdirectory and have two main files: e.g. `leduc_poker.h`
+    (header) and `leduc_poker.cc` (implementation).
+2.  For simple accessor methods that just return the information and feel free
+    have the full implementation to the game's header file (e.g.
+    `LeducState::GetPrivateCards`). You can also declare the function in the
+    header and provide the implementation in source file (e.g.
+    `LeducPoker::SetPrivateCards`).
+3.  That's it for the core game logic. To expose these methods to Python, add
+    them to the Python module (via pybind11). Some games already have
+    game-specific functionality, so if a files named `games_leduc_poker.h` and
+    `games_leduc_poker.cc` exist within `python/pybind11`, add to them (skip to
+    Step 5).
+4.  If the games-specific files do not exist for your game of interest, then:
+    *   Add the files. Copy one of the other ones, adapt the names, and remove
+        most of the bindings code.
+    *   Add the new files to the `PYTHON_BINDINGS` list in
+        `python/CMakeFiles.txt`.
+    *   Modify `pyspiel.cc`: include the header at the top, and call the init
+        function at the bottom.
+5.  Add the custom methods to the game-specific python bindings
+    (`games_leduc_poker.cc`, i.e. `LeducPoker::GetPrivateCards` and
+    `LeducPoker::SetPrivateCards`). For simple types, this should be relatively
+    straight-forward; you can see how by looking at the other game-specific
+    functions. For complex types, you may have to bind additional code (see e.g.
+    `games_backgammon.cc`). If it is unclear, do not hesitate to ask, but also
+    please check the
+    [pybind11 documentation](https://pybind11.readthedocs.io/en/stable/).
+6.  Add a simple test to `python/games_sim_test.py` to check that it worked. For
+    inspiration, see e.g. `test_leduc_get_and_set_private_cards`.
+
+## Language APIs
+
+There are currently four other language APIs that expose functionality from the
+C++ core.
+
+-   [Python](https://github.com/deepmind/open_spiel/tree/master/open_spiel/python).
+-   [Julia](https://github.com/deepmind/open_spiel/tree/master/open_spiel/julia)
+-   [Go](https://github.com/deepmind/open_spiel/tree/master/open_spiel/go)
+    (experimental)
+-   [Rust](https://github.com/deepmind/open_spiel/tree/master/open_spiel/rust)
+    (experimental)
diff --git a/docs/games.md b/docs/games.md
@@ -9,6 +9,7 @@ we verified against known values and/or reproduced results from papers.
 
 Status                                       | Game
 -------------------------------------------- | ----
+<font color="orange"><b>~</b></font>         | [Amazons](#amazons)
 ![](_static/green_circ10.png "green circle") | [Backgammon](#backgammon)
 <font color="orange"><b>~</b></font>         | [Battleship](#battleship)
 <font color="orange"><b>~</b></font>         | [Blackjack](#blackjack)
@@ -40,15 +41,18 @@ Status                                       | Game
 ![](_static/green_circ10.png "green circle") | [Liar's Dice](#liars-dice)
 <font color="orange"><b>~</b></font>         | [Markov Soccer](#markov-soccer)
 ![](_static/green_circ10.png "green circle") | [Matching Pennies (Three-player)](#matching-pennies-three-player)
-![](_static/green_circ10.png "green circle") | [Mean Field Game : garnet](#mean-field-game--garnet)
-![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling](#mean-field-game--crowd-modelling)
-![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling 2d](#mean-field-game--crowd-modelling-2d)
-![](_static/green_circ10.png "green circle") | [Mean Field Game : predator prey](#mean-field-game--predator-prey)
+![](_static/green_circ10.png "green circle") | [Mean Field Game : garnet](#mean_field_game_garnet)
+![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling](#mean_field_game_crowd_modelling)
+![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling 2d](#mean_field_game_crowd_modelling_2d)
+![](_static/green_circ10.png "green circle") | [Mean Field Game : linear quadratic](#mean-field-game--linear-quadratic)
+![](_static/green_circ10.png "green circle") | [Mean Field Game : predator prey](#mean_field_game_predator_prey)
 ![](_static/green_circ10.png "green circle") | [Mean Field Game : routing](#mean-field-game--routing)
+<font color="orange"><b>~</b></font>         | [Morpion Solitaire (4D)](#morpion-solitaire-4d)
 ![](_static/green_circ10.png "green circle") | [Negotiation](#negotiation)
 <font color="orange"><b>X</b></font>         | [Oh Hell](#oh-hell)
 ![](_static/green_circ10.png "green circle") | [Oshi-Zumo](#oshi-zumo)
 ![](_static/green_circ10.png "green circle") | [Oware](#oware)
+<font color="orange"><b>~</b></font>         | [Pathfinding](#pathfinding)
 ![](_static/green_circ10.png "green circle") | [Pentago](#pentago)
 <font color="orange"><b>~</b></font>         | [Phantom Tic-Tac-Toe](#phantom-tic-tac-toe)
 ![](_static/green_circ10.png "green circle") | [Pig](#pig)
@@ -64,10 +68,21 @@ Status                                       | Game
 ![](_static/green_circ10.png "green circle") | [Tiny Bridge](#tiny-bridge)
 ![](_static/green_circ10.png "green circle") | [Tiny Hanabi](#tiny-hanabi)
 ![](_static/green_circ10.png "green circle") | [Trade Comm](#trade-comm)
+<font color="orange"><b>~</b></font>         | [Ultimate Tic-Tac-Toe](#ultimate-tic-tac-toe)
 ![](_static/green_circ10.png "green circle") | [Y](#y)
 
 ## Details
 
+### Amazons
+
+*   Move pieces on a board trying to block opponents from moving.
+*   Pieces on a grid.
+*   Modern game.
+*   Deterministic.
+*   Perfect information.
+*   2 players.
+*   [Wikipedia](https://en.wikipedia.org/wiki/Game_of_the_Amazons)
+
 ### Backgammon
 
 *   Players move their pieces through the board based on the rolls of dice.
@@ -409,8 +424,8 @@ Status                                       | Game
 
 ### Mean Field Game : routing
 
-*   Representative player chooses at each nodes where they go. They has an
-    origin, a destination and a departure time and choose their route to
+*   Representative player chooses at each node where they go. They has an
+    origin, a destination and a departure time and chooses their route to
     minimize their travel time. Time spent on each link is a function of the
     distribution of players on the link when the player reaches the link.
 *   Network with choice of route.
@@ -421,6 +436,32 @@ Status                                       | Game
 *   [Cabannes et. al. '21, Solving N-player dynamic routing games with
     congestion: a mean field approach](https://arxiv.org/pdf/2110.11943.pdf).
 
+### Mean Field Game : Linear-Quadratic
+
+*   Players are uniformly distributed and are then incentivized to gather at the
+    same point (The lower the distanbce wrt. the distribution mean position, the
+    higher the reward). A mean-reverting term pushes the players towards the
+    distribution, a gaussian noise term perturbs them. The players' actions
+    alter their states linearly (alpha * a * dt) and the cost thereof is
+    quadratic (K * a^2 * dt), hence the name. There exists an exact, closed form
+    solution for the fully continuous version of this game.
+*   Research game.
+*   Mean-field (with a unique player).
+*   Explicit stochastic game (only for initial node).
+*   Perfect information.
+*   [Perrin & al. 2019 (https://arxiv.org/abs/2007.03458)]
+
+### Morpion Solitaire (4D)
+
+*   A single player game where player aims to maximize lines drawn on a grid,
+    under certain limitations.
+*   Uses tokens on a grid.
+*   Traditional game.
+*   Deterministic
+*   Perfect information.
+*   1 player.
+*   [Wikipedia](https://en.wikipedia.org/wiki/Join_Five)
+
 ### Negotiation
 
 *   Agents with different utilities must negotiate an allocation of resources.
@@ -463,6 +504,20 @@ Status                                       | Game
 *   2 players.
 *   [Wikipedia](https://en.wikipedia.org/wiki/Oware)
 
+### Pathfinding
+
+*   Agents must move to their desitnation.
+*   Agents on a grid. Single-agent game is the classic examples from Sutton &
+    Barto.
+*   Research game.
+*   Non-deterministic (in multiagent, collisions resolved by chance nodes).
+*   Perfect information.
+*   1-10 players.
+*   Similar games appeared in
+    [Austerweil et al. '15](http://miaoliu.scripts.mit.edu/SSS-16/wp-content/uploads/2016/01/paper.pdf),
+    [Greenwald & Hall '03](https://www.aaai.org/Papers/ICML/2003/ICML03-034.pdf),
+    and [Littman '01](https://jmvidal.cse.sc.edu/library/littman01a.pdf).
+
 ### Pentago
 
 *   Players place tokens on the board, then rotate part of the board to a new
@@ -516,7 +571,9 @@ Status                                       | Game
 *   Modern game.
 *   Deterministic.
 *   Perfect information.
-*   2-4 players.
+*   2-4 players. (Note, from Wikipedia: "Though it can be played with 3 players,
+    it's advised against. Since the 3rd player doesn't have player on the
+    opposite side, they have an advantage.")
 *   [Wikipedia](https://en.wikipedia.org/wiki/Quoridor)
 
 ### Reconnaissance Blind Chess
@@ -535,7 +592,7 @@ Status                                       | Game
 
 ### Routing game
 
-*   Players choose at each nodes where they go. They have an origin, a
+*   Players choose at each node where they go. They have an origin, a
     destination and a departure time and choose their route to minimize their
     travel time. Time spent on each link is a function of the number of players
     on the link when the player reaches the link.
@@ -633,6 +690,15 @@ Status                                       | Game
 *   2 players.
 *   A simple emergent communication game based on trading.
 
+### Ultimate Tic-Tac-Toe
+
+*   Players try and form a pattern in local boards and a meta-board.
+*   Uses tokens on a grid.
+*   Deterministic.
+*   Perfect information.
+*   2 players.
+*   [Wikipedia](https://en.wikipedia.org/wiki/Ultimate_tic-tac-toe)
+
 ### Y
 
 *   Players place tokens to try and connect sides of a triangular board.

diff --git a/docs/install.md b/docs/install.md
@@ -93,15 +93,16 @@ In a nutshell:
 ./open_spiel/scripts/build_and_run_tests.sh # Run this every-time you need to rebuild.
 ```
 
-1.  Install system packages (e.g. cmake) and download some dependencies. Only
-    needs to be run once or if you enable some new conditional dependencies (see
-    specific section below).
+1.  (Optional) Configure
+    [Conditional Dependencies](#configuring-conditional-dependencies).
+2.  Install system packages (e.g. cmake) and download some dependencies. Only
+    needs to be run once or if you enable some new conditional dependencies.
 
     ```bash
     ./install.sh
     ```
 
-2.  Install your Python dependencies, e.g. in Python 3 using
+3.  Install your Python dependencies, e.g. in Python 3 using
     [`virtualenv`](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/):
 
     ```bash
@@ -121,7 +122,7 @@ In a nutshell:
     pip3 install --upgrade setuptools testresources
     ```
 
-3.  This sections differs depending on the installation procedure:
+4.  This sections differs depending on the installation procedure:
 
     **Building and testing from source**
 
@@ -144,7 +145,7 @@ In a nutshell:
     source files. If you edit any C++ files, you will have to rerun the install
     command.
 
-4.  Only when building from source:
+5.  Only when building from source:
 
     ```bash
     # For the python modules in open_spiel.
@@ -229,10 +230,20 @@ python3 open_spiel/python/examples/mcts.py --game=tic_tac_toe --player1=human --
 
 ## Detailed steps
 
-### Configuration conditional dependencies
+### Configuring conditional dependencies
 
-See [open_spiel/scripts/global_variables.sh](https://github.com/deepmind/open_spiel/blob/master/open_spiel/scripts/global_variables.sh) to configure the
-conditional dependencies. See also the [Developer Guide](developer_guide.md).
+Conditional dependencies are configured using environment variables, e.g.
+
+```bash
+export OPEN_SPIEL_BUILD_WITH_HANABI=ON
+```
+
+`install.sh` may need to be rerun after enabling new conditional dependencies.
+
+See [open_spiel/scripts/global_variables.sh](https://github.com/deepmind/open_spiel/blob/master/open_spiel/scripts/global_variables.sh) for the full list
+of conditional dependencies.
+
+See also the [Developer Guide](developer_guide.md#conditional-dependencies).
 
 ### Installing system-wide dependencies