This repository contains code for studying the adversarial robustness of KataGo.
Read about our research here: https://arxiv.org/abs/2211.00241.
View our website here: https://goattack.far.ai/.
To run our adversary with Sabaki, see this guide.
To clone this repository, run one of the following commands
# Via HTTPS
git clone --recurse-submodules https://github.com/AlignmentResearch/go_attack.git
# Via SSH
git clone --recurse-submodules git@github.com:AlignmentResearch/go_attack.git
You can run pip install -e .[dev]
inside the project root directory to install all necessary dependencies.
To run a pre-commit script before each commit, run pre-commit install
(pre-commit
should already have been installed in the previous step).
You may also want to run pre-commit install
from engines/KataGo-custom
to install that repository's respective commit hook.
Modifications to KataGo are not tracked in this repository and should instead be made to the AlignmentResearch/KataGo-custom repository. We use code from KataGo-custom in this repository via a Git submodule.
- engines/KataGo-custom tracks the
stable
branch of theKataGo-custom
repository. - engines/KataGo-raw tracks the
master
branch of https://github.com/lightvector/KataGo.
We run KataGo within Docker containers. More specifically:
- The C++ portion of KataGo runs in the container defined by compose/cpp/Dockerfile.
- The Python training portion of KataGo runs in the container defined at compose/python/Dockerfile.
The Dockerfiles contain instructions for how to build them.
After building a container, you run it with a command like
docker run --gpus all -v ~/go_attack:/go_attack -v DATA_DIR:/shared -it humancompatibleai/goattack:cpp
where DATA_DIR
is a directory, shared among all containers, in which to save the
results of training runs.
A KataGo executable can be found in the /engines/KataGo-custom/cpp
directory inside the C++ container.
In order to launch training runs, run several containers simultaneously:
- One or more 1-GPU C++ containers executing victim-play games to generate data. Example
command to run in each container:
/go_attack/kubernetes/victimplay.sh [--warmstart] EXPERIMENT-NAME /shared/
, where the optional--warmstart
flag should be set for warmstarted runs. - One 1-GPU Python container for training. Example command:
/go_attack/kubernetes/train.sh [--initial-weights WARMSTART-MODEL-DIR] EXPERIMENT-NAME /shared/ 1.0
where the optional--initial-weights WARMSTART-MODEL-DIR
flag should be set for warmstarted runs. - One Python container for shuffling data. Example command:
/go_attack/kubernetes/shuffle-and-export.sh [--preseed WARMSTART-SELFPLAY-DIR] EXPERIMENT-NAME /shared
where the optional--preseed
flag should be set for warmstarted runs. - One Python container for running the curriculum. Example command:
/go_attack/kubernetes/curriculum.sh EXPERIMENT-NAME /shared/ /go_attack/configs/examples/cyclic-adversary-curriculum.json -harden-below-visits 100
.- The victims listed in the curriculum
.json
file are assumed to exist in/shared/victims
. They can be symlinks.
- The victims listed in the curriculum
- Optionally, one 1-GPU C++ container for evaluating models. Example command:
/go_attack/kubernetes/evaluate-loop.sh /shared/victimplay/EXPERIMENT-NAME/ /shared/victimplay/EXPERIMENT-NAME/eval
.
See configs/examples for example experiment configurations and example values for the warmstart flags.
For these wrapper scripts in kubernetes/
, optional flags for the wrapper come
before any positional arguments, but optional flags for the underlying command
the wrapper calls go after any positional arguments. For example, in the command
/go_attack/kubernetes/shuffle-and-export.sh --preseed WARMSTART-SELFPLAY-DIR EXPERIMENT-NAME /shared -add-to-window 100000000
, --preseed
is a flag for the
wrapper whereas -add-to-window
is a flag to be passed to
/engines/KataGo-tensorflow/python/selfplay/shuffle_and_export_loop.sh
.
Within the compose
directory of this repo are a few docker-compose .yml
files
that automate the process of spinning up the various components of training.
Each .yml
file also has a corresponding .env
that configures more specific
parameters of the run (
e.g. what directory to write to,
how many threads to use,
batch size,
where to look for other config files
).
(Note: we stopped using these in October 2022, so they are no longer maintained.)
See AlignmentResearch/KataGoVisualizer.
In addition to the learned attacks, we also implement 5 baseline, hardcoded attacks:
- Edge attack, which plays random vertices in the outermost available ring of the board
- Random attack, which simply plays random legal moves
- Pass attack, which always passes at every turn
- Spiral attack, which deterministically plays the "largest" legal move in lexicographical order in polar coordinates (going counterclockwise starting from the outermost ring)
- Mirror Go, which plays the opponent's last move reflected about the y = x diagonal, or the y = -x diagonal if they play on y = x. If the mirrored vertex is taken, then the policy plays the "closest" legal vertex by L1 distance.
You can test these attacks by running baseline_attacks.py
with the appropriate --strategy
flag (edge
, random
, pass
, spiral
, or mirror
). Run python scripts/baseline_attacks.py --help
for more information about all the available flags.