Tag: alphazero

Questions about AlphaGo Development

I was asked some questions about the development of AlphaGo and figured my answers might be of wider interest, so here we go:

How low-level does the code need to be in order to get maximum performance, both in training and, possibly, in live games?

To maximize playing strength it is necessary to balance two goals: keeping the accelerators (GPUs or TPUs) fully utilized, and focusing the search on exploring and evaluating the most promising lines of play. Unfortunately there are several conflicting tendencies:

The more board positions we can evaluate the better our estimates of how good each ...

Discovering Matrix Multiplication Algorithms with AlphaTensor

Matrix multiplication is at the foundation of modern machine learning - whether transformers or convolutional networks, diffusion models or GANs, they all boil down to matrix multiplications, executed efficiently on GPUs and TPUs. So far the best known algorithms have been discovered manually by humans, often optimized for specific use cases.

The most famous is probably the Strassen algorithm to multiply two 2x2 matrices using only 7 instead of the naive 8 multiplications:

[Figure: illustration of the Strassen matmul algorithm]

Through clever addition and subtraction of the individual elements of the a and b matrices this algorithm is ...
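
To make the 7-versus-8 count concrete, here is a minimal sketch of the standard textbook form of Strassen's scheme for 2x2 matrices. The function name `strassen_2x2` and the nested-list representation are illustrative assumptions, and the labeling of the seven products may differ from the one used in the post's illustration.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices (nested lists) using 7 scalar multiplications.

    Textbook formulation of Strassen's algorithm; names are illustrative.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products: each is one multiplication of sums/differences.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products using additions and subtractions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(a, b):
    """Reference implementation using the usual 8 multiplications."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(strassen_2x2(a, b) == naive_2x2(a, b))
```

The saving looks small for scalars, but because the scheme never relies on commutativity of multiplication, the same formulas apply when the entries are themselves matrix blocks, which is where the reduction pays off recursively.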

MuZero talk - ICAPS 2020

I gave a detailed talk about MuZero at ICAPS 2020, at the workshop "Bridging the Gap Between AI Planning and Reinforcement Learning".

In addition to giving a general overview of the algorithm, I also went into more detail about reanalyse - the technique that allows MuZero to use its model-based search to repeatedly learn more from the same episode data.

I hope you find the talk useful! I've also uploaded my slides for easy reference....