Skip to content

Commit

Permalink
Merge branch 'master' into lanctot-ubuntu-2204
Browse files Browse the repository at this point in the history
  • Loading branch information
lanctot authored Mar 21, 2022
2 parents 828b051 + 062416f commit 95de30b
Show file tree
Hide file tree
Showing 215 changed files with 29,549 additions and 2,858 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -56,3 +56,5 @@ open_spiel/cmake-build-debug/

# Swift generated build file
Package.resolved
# Visual Studio generated files
open_spiel/.vs
5 changes: 3 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,9 +43,10 @@ For an overview of OpenSpiel and example uses of the core API, please check out
our tutorials:

* [Motivation, Core API, Brief Intro to Replictor Dynamics and Imperfect
Information Games](https://www.youtube.com/watch?v=YE0E0F39lac) by Marc
Information Games](https://www.youtube.com/watch?v=8NCPqtPwlFQ) by Marc
Lanctot.
[(slides)](http://mlanctot.info/files/open_spiel_tutorial-mar2021-kuleuven.pdf).
[(slides)](http://mlanctot.info/files/OpenSpiel_Tutorial_KU_Leuven_2022.pdf)
[(colab)](https://colab.research.google.com/github/deepmind/open_spiel/blob/master/open_spiel/colabs/OpenSpielTutorial.ipynb)
* [Motivation, Core API, Implementing CFR and REINFORCE on Kuhn poker, Leduc
poker, and Goofspiel](https://www.youtube.com/watch?v=o6JNHoGUXCo) by Edward
Lockhart.
Expand Down
1 change: 1 addition & 0 deletions docs/algorithms.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ Information Set Monte Carlo Tree Search (IS-MCTS) | Search | [Cowley et al
Minimax (and Alpha-Beta) Search | Search | [Wikipedia1](https://en.wikipedia.org/wiki/Minimax#Minimax_algorithm_with_alternate_moves), [Wikipedia2](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning), Knuth and Moore '75 | ![](_static/green_circ10.png "green circle")
Monte Carlo Tree Search | Search | [Wikipedia](https://en.wikipedia.org/wiki/Monte_Carlo_tree_search), [UCT paper](http://ggp.stanford.edu/readings/uct.pdf), [Coulom '06](https://hal.inria.fr/inria-00116992/document), [Cowling et al. survey](http://www.incompleteideas.net/609%20dropbox/other%20readings%20and%20resources/MCTS-survey.pdf) | ![](_static/green_circ10.png "green circle")
Lemke-Howson (via <tt>nashpy</tt>) | Opt. | [Wikipedia](https://en.wikipedia.org/wiki/Lemke%E2%80%93Howson_algorithm), [Shoham &amp; Leyton-Brown '09](http://masfoundations.org/) | ![](_static/green_circ10.png "green circle")
ADIDAS | Opt. | [Gemp et al '22](https://arxiv.org/abs/2106.01285) | <font color="orange"><b>~</b></font>
Sequence-form linear programming | Opt. | [Koller, Megiddo, and von Stengel '94](http://theory.stanford.edu/~megiddo/pdf/stoc94.pdf), <br> [Shoham &amp; Leyton-Brown '09](http://masfoundations.org/) | ![](_static/green_circ10.png "green circle")
Counterfactual Regret Minimization (CFR) | Tabular | [Zinkevich et al '08](https://poker.cs.ualberta.ca/publications/NIPS07-cfr.pdf), [Neller &amp; Lanctot '13](http://modelai.gettysburg.edu/2013/cfr/cfr.pdf) | ![](_static/green_circ10.png "green circle")
CFR against a best responder (CFR-BR) | Tabular | [Johanson et al '12](https://poker.cs.ualberta.ca/publications/AAAI12-cfrbr.pdf) | ![](_static/green_circ10.png "green circle")
Expand Down
65 changes: 60 additions & 5 deletions docs/developer_guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -108,13 +108,68 @@ When you add a new conditional dependency, you need to touch:
- the root CMakeLists.txt to add the option, with an OFF default
- add the option to `scripts/global_variables.sh`
- change `install.sh` to make sure the dependency is installed
- use constructs like `if (${OPEN_SPIEL_OPEN_SPIEL_BUILD_WITH_HANABI})` in
CMake to optionally add the targets to build.
- use constructs like `if (${OPEN_SPIEL_BUILD_WITH_HANABI})` in CMake to
optionally add the targets to build.

## Debugging tools

For complex games it may be tricky to get all the details right. Reading through
the playthrough You can visualize small game trees using
[open_spiel/python/examples/treeviz_example.py](https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/treeviz_example.py) or for large
games there is an interactive viewer for OpenSpiel games called
the playthrough (or visually inspecting random games via the example) is the
first step in verifying the game mechanics. You can visualize small game trees
using [open_spiel/python/examples/treeviz_example.py](https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/treeviz_example.py) or for
large games there is an interactive viewer for OpenSpiel games called
[SpielViz](https://github.com/michalsustr/spielviz).

## Adding Game-Specific Functionality

OpenSpiel focuses on maintaining a general API to an underlying suite of games,
but sometimes it is convenient to work on specific games. In this section, we
describe how to get (or set) game-specific information from/to the generic state
objects, and how to expose these functions to python.

Suppose, for example, we want to look at (or set) the private cards in a game of
Leduc poker. We will use an example based on this
[this commit](https://github.com/deepmind/open_spiel/commit/4cd1e5889e447d285eb3f16901ccab5c14e62187).

1. First, locate the game you want to access. The game implementations are in
the `games/` subdirectory and have two main files: e.g. `leduc_poker.h`
(header) and `leduc_poker.cc` (implementation).
2. For simple accessor methods that just return the information and feel free
have the full implementation to the game's header file (e.g.
`LeducState::GetPrivateCards`). You can also declare the function in the
header and provide the implementation in source file (e.g.
`LeducPoker::SetPrivateCards`).
3. That's it for the core game logic. To expose these methods to Python, add
them to the Python module (via pybind11). Some games already have
game-specific functionality, so if a files named `games_leduc_poker.h` and
`games_leduc_poker.cc` exist within `python/pybind11`, add to them (skip to
Step 5).
4. If the games-specific files do not exist for your game of interest, then:
* Add the files. Copy one of the other ones, adapt the names, and remove
most of the bindings code.
* Add the new files to the `PYTHON_BINDINGS` list in
`python/CMakeFiles.txt`.
* Modify `pyspiel.cc`: include the header at the top, and call the init
function at the bottom.
5. Add the custom methods to the game-specific python bindings
(`games_leduc_poker.cc`, i.e. `LeducPoker::GetPrivateCards` and
`LeducPoker::SetPrivateCards`). For simple types, this should be relatively
straight-forward; you can see how by looking at the other game-specific
functions. For complex types, you may have to bind additional code (see e.g.
`games_backgammon.cc`). If it is unclear, do not hesitate to ask, but also
please check the
[pybind11 documentation](https://pybind11.readthedocs.io/en/stable/).
6. Add a simple test to `python/games_sim_test.py` to check that it worked. For
inspiration, see e.g. `test_leduc_get_and_set_private_cards`.

## Language APIs

There are currently four other language APIs that expose functionality from the
C++ core.

- [Python](https://github.com/deepmind/open_spiel/tree/master/open_spiel/python).
- [Julia](https://github.com/deepmind/open_spiel/tree/master/open_spiel/julia)
- [Go](https://github.com/deepmind/open_spiel/tree/master/open_spiel/go)
(experimental)
- [Rust](https://github.com/deepmind/open_spiel/tree/master/open_spiel/rust)
(experimental)
82 changes: 74 additions & 8 deletions docs/games.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ we verified against known values and/or reproduced results from papers.

Status | Game
-------------------------------------------- | ----
<font color="orange"><b>~</b></font> | [Amazons](#amazons)
![](_static/green_circ10.png "green circle") | [Backgammon](#backgammon)
<font color="orange"><b>~</b></font> | [Battleship](#battleship)
<font color="orange"><b>~</b></font> | [Blackjack](#blackjack)
Expand Down Expand Up @@ -40,15 +41,18 @@ Status | Game
![](_static/green_circ10.png "green circle") | [Liar's Dice](#liars-dice)
<font color="orange"><b>~</b></font> | [Markov Soccer](#markov-soccer)
![](_static/green_circ10.png "green circle") | [Matching Pennies (Three-player)](#matching-pennies-three-player)
![](_static/green_circ10.png "green circle") | [Mean Field Game : garnet](#mean-field-game--garnet)
![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling](#mean-field-game--crowd-modelling)
![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling 2d](#mean-field-game--crowd-modelling-2d)
![](_static/green_circ10.png "green circle") | [Mean Field Game : predator prey](#mean-field-game--predator-prey)
![](_static/green_circ10.png "green circle") | [Mean Field Game : garnet](#mean_field_game_garnet)
![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling](#mean_field_game_crowd_modelling)
![](_static/green_circ10.png "green circle") | [Mean Field Game : crowd modelling 2d](#mean_field_game_crowd_modelling_2d)
![](_static/green_circ10.png "green circle") | [Mean Field Game : linear quadratic](#mean-field-game--linear-quadratic)
![](_static/green_circ10.png "green circle") | [Mean Field Game : predator prey](#mean_field_game_predator_prey)
![](_static/green_circ10.png "green circle") | [Mean Field Game : routing](#mean-field-game--routing)
<font color="orange"><b>~</b></font> | [Morpion Solitaire (4D)](#morpion-solitaire-4d)
![](_static/green_circ10.png "green circle") | [Negotiation](#negotiation)
<font color="orange"><b>X</b></font> | [Oh Hell](#oh-hell)
![](_static/green_circ10.png "green circle") | [Oshi-Zumo](#oshi-zumo)
![](_static/green_circ10.png "green circle") | [Oware](#oware)
<font color="orange"><b>~</b></font> | [Pathfinding](#pathfinding)
![](_static/green_circ10.png "green circle") | [Pentago](#pentago)
<font color="orange"><b>~</b></font> | [Phantom Tic-Tac-Toe](#phantom-tic-tac-toe)
![](_static/green_circ10.png "green circle") | [Pig](#pig)
Expand All @@ -64,10 +68,21 @@ Status | Game
![](_static/green_circ10.png "green circle") | [Tiny Bridge](#tiny-bridge)
![](_static/green_circ10.png "green circle") | [Tiny Hanabi](#tiny-hanabi)
![](_static/green_circ10.png "green circle") | [Trade Comm](#trade-comm)
<font color="orange"><b>~</b></font> | [Ultimate Tic-Tac-Toe](#ultimate-tic-tac-toe)
![](_static/green_circ10.png "green circle") | [Y](#y)

## Details

### Amazons

* Move pieces on a board trying to block opponents from moving.
* Pieces on a grid.
* Modern game.
* Deterministic.
* Perfect information.
* 2 players.
* [Wikipedia](https://en.wikipedia.org/wiki/Game_of_the_Amazons)

### Backgammon

* Players move their pieces through the board based on the rolls of dice.
Expand Down Expand Up @@ -409,8 +424,8 @@ Status | Game

### Mean Field Game : routing

* Representative player chooses at each nodes where they go. They has an
origin, a destination and a departure time and choose their route to
* Representative player chooses at each node where they go. They has an
origin, a destination and a departure time and chooses their route to
minimize their travel time. Time spent on each link is a function of the
distribution of players on the link when the player reaches the link.
* Network with choice of route.
Expand All @@ -421,6 +436,32 @@ Status | Game
* [Cabannes et. al. '21, Solving N-player dynamic routing games with
congestion: a mean field approach](https://arxiv.org/pdf/2110.11943.pdf).

### Mean Field Game : Linear-Quadratic

* Players are uniformly distributed and are then incentivized to gather at the
same point (The lower the distanbce wrt. the distribution mean position, the
higher the reward). A mean-reverting term pushes the players towards the
distribution, a gaussian noise term perturbs them. The players' actions
alter their states linearly (alpha * a * dt) and the cost thereof is
quadratic (K * a^2 * dt), hence the name. There exists an exact, closed form
solution for the fully continuous version of this game.
* Research game.
* Mean-field (with a unique player).
* Explicit stochastic game (only for initial node).
* Perfect information.
* [Perrin & al. 2019 (https://arxiv.org/abs/2007.03458)]

### Morpion Solitaire (4D)

* A single player game where player aims to maximize lines drawn on a grid,
under certain limitations.
* Uses tokens on a grid.
* Traditional game.
* Deterministic
* Perfect information.
* 1 player.
* [Wikipedia](https://en.wikipedia.org/wiki/Join_Five)

### Negotiation

* Agents with different utilities must negotiate an allocation of resources.
Expand Down Expand Up @@ -463,6 +504,20 @@ Status | Game
* 2 players.
* [Wikipedia](https://en.wikipedia.org/wiki/Oware)

### Pathfinding

* Agents must move to their desitnation.
* Agents on a grid. Single-agent game is the classic examples from Sutton &
Barto.
* Research game.
* Non-deterministic (in multiagent, collisions resolved by chance nodes).
* Perfect information.
* 1-10 players.
* Similar games appeared in
[Austerweil et al. '15](http://miaoliu.scripts.mit.edu/SSS-16/wp-content/uploads/2016/01/paper.pdf),
[Greenwald & Hall '03](https://www.aaai.org/Papers/ICML/2003/ICML03-034.pdf),
and [Littman '01](https://jmvidal.cse.sc.edu/library/littman01a.pdf).

### Pentago

* Players place tokens on the board, then rotate part of the board to a new
Expand Down Expand Up @@ -516,7 +571,9 @@ Status | Game
* Modern game.
* Deterministic.
* Perfect information.
* 2-4 players.
* 2-4 players. (Note, from Wikipedia: "Though it can be played with 3 players,
it's advised against. Since the 3rd player doesn't have player on the
opposite side, they have an advantage.")
* [Wikipedia](https://en.wikipedia.org/wiki/Quoridor)

### Reconnaissance Blind Chess
Expand All @@ -535,7 +592,7 @@ Status | Game

### Routing game

* Players choose at each nodes where they go. They have an origin, a
* Players choose at each node where they go. They have an origin, a
destination and a departure time and choose their route to minimize their
travel time. Time spent on each link is a function of the number of players
on the link when the player reaches the link.
Expand Down Expand Up @@ -633,6 +690,15 @@ Status | Game
* 2 players.
* A simple emergent communication game based on trading.

### Ultimate Tic-Tac-Toe

* Players try and form a pattern in local boards and a meta-board.
* Uses tokens on a grid.
* Deterministic.
* Perfect information.
* 2 players.
* [Wikipedia](https://en.wikipedia.org/wiki/Ultimate_tic-tac-toe)

### Y

* Players place tokens to try and connect sides of a triangular board.
Expand Down
29 changes: 20 additions & 9 deletions docs/install.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,15 +93,16 @@ In a nutshell:
./open_spiel/scripts/build_and_run_tests.sh # Run this every-time you need to rebuild.
```

1. Install system packages (e.g. cmake) and download some dependencies. Only
needs to be run once or if you enable some new conditional dependencies (see
specific section below).
1. (Optional) Configure
[Conditional Dependencies](#configuring-conditional-dependencies).
2. Install system packages (e.g. cmake) and download some dependencies. Only
needs to be run once or if you enable some new conditional dependencies.

```bash
./install.sh
```

2. Install your Python dependencies, e.g. in Python 3 using
3. Install your Python dependencies, e.g. in Python 3 using
[`virtualenv`](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/):

```bash
Expand All @@ -121,7 +122,7 @@ In a nutshell:
pip3 install --upgrade setuptools testresources
```

3. This sections differs depending on the installation procedure:
4. This sections differs depending on the installation procedure:

**Building and testing from source**

Expand All @@ -144,7 +145,7 @@ In a nutshell:
source files. If you edit any C++ files, you will have to rerun the install
command.

4. Only when building from source:
5. Only when building from source:

```bash
# For the python modules in open_spiel.
Expand Down Expand Up @@ -229,10 +230,20 @@ python3 open_spiel/python/examples/mcts.py --game=tic_tac_toe --player1=human --
## Detailed steps
### Configuration conditional dependencies
### Configuring conditional dependencies
See [open_spiel/scripts/global_variables.sh](https://github.com/deepmind/open_spiel/blob/master/open_spiel/scripts/global_variables.sh) to configure the
conditional dependencies. See also the [Developer Guide](developer_guide.md).
Conditional dependencies are configured using environment variables, e.g.
```bash
export OPEN_SPIEL_BUILD_WITH_HANABI=ON
```
`install.sh` may need to be rerun after enabling new conditional dependencies.
See [open_spiel/scripts/global_variables.sh](https://github.com/deepmind/open_spiel/blob/master/open_spiel/scripts/global_variables.sh) for the full list
of conditional dependencies.
See also the [Developer Guide](developer_guide.md#conditional-dependencies).
### Installing system-wide dependencies
Expand Down
Loading

0 comments on commit 95de30b

Please sign in to comment.