Martin Müller
Wed 15 Aug 2018, 11:00 - 12:30
Informatics Forum (IF-4.31/4.33)

If you have a question about this talk, please contact: Diana Dalla Costa (ddallac)


This talk discusses two of our recent papers. Memory-Augmented Monte Carlo Tree Search (M-MCTS, AAAI 2018) is a new approach to generalization in online real-time search. M-MCTS adds a memory structure in which each entry contains information about a particular search state. This memory is used to generate approximate value estimates that combine estimates from similar states. Memory-based value approximation is proven to be better than vanilla Monte Carlo estimation with high probability under mild conditions. Experiments in the game of Go show that M-MCTS outperforms MCTS.
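
As a rough illustration of memory-based value approximation, the sketch below combines the stored value estimates of the states most similar to a query state. It is only a minimal sketch under stated assumptions, not the implementation from the paper: the feature vectors, the cosine similarity measure, the softmax weighting, and the names `memory_value`, `k` and `temperature` are all hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def memory_value(query, memory, k=2, temperature=0.1):
    """Approximate the value of `query` from the k most similar memory entries.

    `memory` is a list of (feature_vector, value_estimate) pairs.
    Both the similarity measure and the softmax weighting are assumptions
    made for illustration only.
    """
    if not memory:
        raise ValueError("memory is empty")
    # Score every entry by similarity to the query and keep the top k.
    scored = sorted(
        ((cosine(query, features), value) for features, value in memory),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Softmax over similarities; subtracting the maximum keeps exp() stable.
    best = scored[0][0]
    weights = [math.exp((sim - best) / temperature) for sim, _ in scored]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, scored)) / total


# Hypothetical two-entry memory with hand-picked feature vectors.
memory = [([1.0, 0.0], 0.8), ([0.0, 1.0], 0.2)]
estimate = memory_value([1.0, 1.0], memory, k=2)
```

In this toy setting a query that is equally similar to both entries receives the average of their values, while a lower `temperature` or a smaller `k` pushes the estimate toward the single closest entry.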

Three-Head Neural Network Architecture for Monte Carlo Tree Search (IJCAI 2018) extends the two-head architecture introduced by AlphaGo Zero. In addition to the policy and value heads, a third head is trained to predict action values (Q-values), which estimate the state value after making a move. We exploit minimax relations between parent and child nodes for efficient training. A three-head network for the game of Hex significantly improves MCTS performance. In the recent Computer Olympiad, our new program won all games on both 11x11 and 13x13 boards.
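
The minimax relations between parent and child nodes can be illustrated with a small consistency check. This is a hedged sketch and not the training loss from the paper. It assumes that every value is expressed from the perspective of the player to move in that state, so that the action value of a move is the negated value of the resulting child state and the parent value is the maximum action value. The function names and the squared-error penalty are hypothetical.

```python
def minimax_targets(child_values):
    """Derive action-value and state-value targets from child state values.

    `child_values` maps each action to the value of the resulting state,
    seen from the perspective of the player to move there (the opponent).
    Assumed relations: q(s, a) = -v(s') and v(s) = max over a of q(s, a).
    """
    q_targets = {action: -value for action, value in child_values.items()}
    v_target = max(q_targets.values())
    return q_targets, v_target


def consistency_penalty(pred_v, pred_q, child_values):
    """Squared-error penalty for predictions that violate the relations above."""
    q_targets, v_target = minimax_targets(child_values)
    q_error = sum((pred_q[a] - q_targets[a]) ** 2 for a in q_targets) / len(q_targets)
    return (pred_v - v_target) ** 2 + q_error


# Hypothetical position with two legal moves.
children = {"a": 0.5, "b": -0.25}
q_targets, v_target = minimax_targets(children)
penalty = consistency_penalty(0.75, {"a": -0.5, "b": 0.25}, children)
```

Here move "b" leads to a state that is bad for the opponent, so it has the higher action value and determines the parent value. A value-head prediction that disagrees with that maximum is penalized even when the action values themselves are consistent.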


Martin Müller is a Professor in the Department of Computing Science at the University of Alberta in Edmonton, Canada. He is interested in all aspects of modern heuristic search, studying the complex interactions between search, knowledge, simulations and learning. Application areas include game tree search, domain-independent planning, and combinatorial games. He has worked on computer Go for thirty years and is the leader of the open-source project Fuego. In 2009, Fuego became the first program to win a 9x9 Go game on even terms against a top-ranked professional player. With his students and colleagues, Müller has developed a series of successful planning systems based on macro learning and on Monte Carlo random walks. Recently, his paper with Chenjun Xiao and Jincheng Mei on Memory-Augmented Monte Carlo Tree Search won the outstanding paper award at AAAI-18.