A thread pool implementation of AlphaZero and AlphaStar.
It also includes data augmentation schemes for Gomoku and applies the self-supervised learning method SimSiam to the backbones of the policy-value networks.
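The repository's own augmentation code is not shown here; as a rough illustration, Gomoku positions admit the eight dihedral symmetries of the board, which can be applied jointly to the input feature planes and the policy target. The sketch below assumes illustrative names and tensor shapes, not the repository's actual API:

```python
import numpy as np

def symmetries(board, policy):
    """Yield the 8 dihedral symmetries of a square Gomoku position.

    board:  (C, N, N) feature planes
    policy: (N, N) move probabilities laid out on the board
    (Names and shapes are illustrative assumptions.)
    """
    for k in range(4):                        # 4 rotations of the spatial plane
        b = np.rot90(board, k, axes=(1, 2))
        p = np.rot90(policy, k)
        yield b, p
        # plus a horizontal flip of each rotation, giving 8 symmetries in total
        yield np.flip(b, axis=2), np.fliplr(p)
```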
Added one line of code and removed ~45% of the mutex lock acquisitions, bringing a 20%+ speed improvement.
Implemented a lightweight league training method suited to limited training hardware (such as a PC with a single GPU); a sketch of the idea follows the run commands below.
Added SimSiam training for the backbones of the policy-value networks.
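The SimSiam head itself is not reproduced in this README; a minimal sketch of how SimSiam can be attached to a policy-value backbone (two augmented views of the same position, a projector/predictor MLP, and a stop-gradient on the target branch) might look like the following. Module names and dimensions are illustrative assumptions:

```python
import torch.nn as nn
import torch.nn.functional as F

class SimSiamHead(nn.Module):
    """Illustrative SimSiam head on top of a shared policy-value backbone."""

    def __init__(self, backbone, feat_dim=256, proj_dim=256, pred_dim=64):
        super().__init__()
        self.backbone = backbone            # assumed to return a flat feature vector
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, proj_dim), nn.BatchNorm1d(proj_dim), nn.ReLU(inplace=True),
            nn.Linear(proj_dim, proj_dim))
        self.predictor = nn.Sequential(
            nn.Linear(proj_dim, pred_dim), nn.BatchNorm1d(pred_dim), nn.ReLU(inplace=True),
            nn.Linear(pred_dim, proj_dim))

    def forward(self, x1, x2):
        # x1, x2: two augmented views (e.g. two board symmetries) of the same position
        z1, z2 = self.projector(self.backbone(x1)), self.projector(self.backbone(x2))
        p1, p2 = self.predictor(z1), self.predictor(z2)
        # negative cosine similarity with stop-gradient on the target branch
        loss = -(F.cosine_similarity(p1, z2.detach(), dim=-1).mean()
                 + F.cosine_similarity(p2, z1.detach(), dim=-1).mean()) / 2
        return loss
```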
- Easy Free-style Gomoku with no specific limitations
- Tree/Root Parallelization with Virtual Loss and LibTorch (a sketch of the virtual-loss idea follows this list)
- Gomoku and MCTS are written in C++
- SWIG for Python C++ extension
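The actual tree search is implemented in C++ (see above); the Python sketch below only illustrates the virtual-loss technique used for tree/root parallelization: a node selected by one thread is temporarily penalized so that concurrent threads are steered toward other branches, and the penalty is reverted during backup. Constants and the PUCT formula are illustrative, not the repository's exact values:

```python
import math
import threading

VIRTUAL_LOSS = 3   # illustrative constant
C_PUCT = 5.0       # illustrative exploration constant

class Node:
    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.virtual_loss = 0        # pending penalty added by in-flight simulations
        self.children = {}           # action -> Node
        self.lock = threading.Lock()

    def q(self):
        n = self.visits + self.virtual_loss
        # each unit of virtual loss is counted as a lost game (value -1)
        return (self.value_sum - self.virtual_loss) / n if n else 0.0

def puct(parent, child):
    n_parent = parent.visits + parent.virtual_loss
    n_child = child.visits + child.virtual_loss
    return child.q() + C_PUCT * child.prior * math.sqrt(n_parent + 1) / (1 + n_child)

def select_child(node):
    # choose the most promising child, then discourage other threads from it
    with node.lock:
        action, child = max(node.children.items(), key=lambda kv: puct(node, kv[1]))
        child.virtual_loss += VIRTUAL_LOSS
    return action, child

def backup(path, value):
    # `path` is the list of children chosen by select_child, root to leaf
    for node in reversed(path):
        with node.lock:
            node.virtual_loss -= VIRTUAL_LOSS   # undo the temporary penalty
            node.visits += 1
            node.value_sum += value
        value = -value                          # flip to the other player's view
```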
Edit config.py for everything except the training paradigms.
- Python 3.7
- PyTorch 1.11.0
- LibTorch 1.11.0
- SWIG 4.0.1
- CMake 3.16+
- GCC 9.4.0+
- For others, refer to requirements.txt
# Add LibTorch/SWIG to environment variable $PATH
# Compile Python extension
# Note: before find_package(Torch REQUIRED), add CMAKE_PREFIX_PATH pointing to the torch installed in your conda environment.
mkdir build
cd build
cmake .. -DCMAKE_PREFIX_PATH=path/to/libtorch -DCMAKE_CUDA_COMPILER="/usr/local/cuda/bin/nvcc" -DCMAKE_BUILD_TYPE=Release
cmake --build .
# Run
cd ..
python run_agent.py train # train model via self-play
python run_agent.py league_train # train model via league training
python run_agent.py play # play with human
The agent moves first.
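The league training used by `league_train` is implemented in the repository; as a rough sketch, a lightweight league on a single GPU can be run as a main agent plus a small pool of frozen past checkpoints, with opponents sampled more often when the main agent still loses to them (a prioritized-fictitious-self-play-style weighting in the spirit of AlphaStar). All names and constants below are illustrative assumptions:

```python
import random

class LightweightLeague:
    """Illustrative sketch: a main agent plus frozen past checkpoints as opponents."""

    def __init__(self, pool_size=10):
        self.pool_size = pool_size
        self.pool = []                  # entries are [checkpoint_path, wins, games]

    def add_snapshot(self, checkpoint_path):
        self.pool.append([checkpoint_path, 0, 0])
        if len(self.pool) > self.pool_size:
            self.pool.pop(0)            # keep the pool small for a single-GPU budget

    def sample_opponent(self):
        if not self.pool:
            return None
        # prefer opponents the main agent still loses to (PFSP-style weighting)
        weights = [1.0 - (w / g if g else 0.5) + 0.1 for _, w, g in self.pool]
        return random.choices(self.pool, weights=weights, k=1)[0]

    def report(self, opponent, main_agent_won):
        opponent[2] += 1                # games played against this opponent
        if main_agent_won:
            opponent[1] += 1            # main agent's wins against this opponent
```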
- Mastering the Game of Go without Human Knowledge
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- Parallel Monte-Carlo Tree Search
- An Analysis of Virtual Loss in Parallel MCTS
- github.com/hijkzzz/alpha-zero-gomoku
- github.com/suragnair/alpha-zero-general
- Exploring Simple Siamese Representation Learning