Othello Bot Introduction


Othello Bot: Introduction

In my first year of undergrad, the final project from one of the intro CS classes was to program a bot to compete in an Othello (a.k.a. Reversi) tournament with all the other student team bots. In addition to the student bots, there were also exhibition bots submitted by TAs or upperclassmen, which didn't count towards the final standings, but competed and were ranked. There was a small community of dedicated competitors who continued to improve and enter their bots in the exhibition category.

My teammate and I were inspired by an upperclassman (who also competed) in our dorm to aim for one of the top spots. We implemented most of the features that could be accomplished in the 2 week assignment: minimax with alpha-beta pruning, move ordering, and a neural network to evaluate board positions. We ended up placing something like 3rd or 4th in the student bots behind most of most of the exhibition bots, which were, in general, a lot stronger than student bots. At the time, the head TA for the class, a senior, had entered his bot every year since he took the class his freshman year and had won each time, twice with undefeated records. The whole experience was really rewarding, so I decided to start on a new bot from the ground up for the next year.

New Bot: COIN

Bots for perfect information games like Othello generally follow similar general construction. Simulate moves playing moves for yourself and your opponent, and find the sequence of moves that leads to the best game state. With unlimited computing time to choose a move, one could completely search the decision tree of moves and find a move that guarantees guarantees a win or draw. Even with clever optimizations like alpha-beta pruning and move ordering, searching the entire tree for Othello is not possible until near the halfway point of the game (but even then, only for the most optimized bots). Thus, for the first half of the game, the bot has to be able to look at an unfinished board state and guess at what the result. In my first bot, this was done with a simple neural network (which I made before I knew much about ML, so it wasn't very optimal). In my new bot, I wanted to try some of the techniques used by the historically best othello bots (e.g. wzebra, ntest), such as learned pattern-based board evaluation and improvements to the standard alpha-beta algorithm.

Bitboard

In order to simulate moves and explore the game tree, some representation of the current board state is necessary. Othello is most commonly played on an 8x8 board, which fits nicely into a pair of 64-bit intergers (one for black and one for white), where the bits of one integer determines if there is a stone of the corresponding color. Using the individual bits of larger datatypes is a common pattern known as a bitboard. Bitboards are commonly used in board game bots because they can take advantage of instructions that are fast on CPU hardware and parallelize over all the bits to make operations extremely fast. Examples of the operations used are bit shifting operations, multiplications, bitwise logical operations, and, in the case of x86 processors, PEXT and PDEP instructions, which pull or push bits specified by a mask to and from a condensed form. For othello, the two main operations necessary to implement are a procedure to find moves available to a player, and a procedure to play a move. It is important for most search algorithms that these two operations are as fast as possible, since the more you can execute within the time limit, the farther ahead you can look into the game which helps find a better move.

Search Algorithm

I experimented with many different tree search algorithms (to list a couple, MTD-f and PVS. However, I ended up implementing the Alpha Zero algorithm, because around the time I was working on this, Deepmind's AlphaGo Zero was released and was trouncing the best Go players in the world with it. Additionally, other algorithms require a fair amount of custom manual tuning and data generation that is difficult to get right and hard to debug, while Alpha Zero has a very well defined process for generating and training on data (and remarkably does not require any existing examples of "good Othello gameplay"). Alpha Zero is a long topic, so it will be covered in the next post about the Othello bot. If you want to take a look at the code for the COIN othello bot based on the AlphaZero algorithm, you can find it in this GitHub repository.