In General Game Playing (GGP) the aim is to create intelligent programs that can automatically learn how to play many different abstract games at an expert level without any human intervention. This poses many research challenges in several areas of Artificial Intelligence, such as knowledge acquisition, heuristic search, and machine learning. A recent development in GGP is the successful application of simulation-based search methods such as Monte-Carlo Tree Search and Nested Monte-Carlo Search. These search methods overcome to some extent the knowledge-acquisition bottleneck. Still, all relevant search-control knowledge required for expert-level play must be discovered using machine-learning techniques during online play. This is a challenging task, for which more effective methods are required than currently exist. Therefore, this proposal aims to develop new and more effective methods for online learning of search control, which will lead to improved search algorithms. These methods, although developed within GGP, will have a much wider applicability because search is such a fundamental problem-solving paradigm. The two interrelated research questions of the project are:
1) How to develop effective methods for the online learning of search control for Monte-Carlo Tree Search?
2) How to use the developed search-control mechanisms in other simulation-based search methods?
The GoGeneral project funded by NWO aims to answer these research questions. The research project is supervised by me and Yngvi Björnsson, and is executed by a PhD researcher.
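To illustrate what online learning of search control can look like, the sketch below shows a MAST-style playout policy: every move keeps a running average of the returns of simulations in which it occurred, and playout moves are sampled with Gibbs (softmax) probabilities based on those averages. This is an illustrative sketch under assumed data structures (a `stats` dictionary mapping moves to `(total_return, count)` pairs), not the implementation of any program mentioned here.

```python
import math
import random


def mast_choose(moves, stats, tau=1.0, rng=random):
    """Sample a playout move with Gibbs/softmax probabilities
    proportional to exp(Q(move) / tau), where Q(move) is the move's
    global average return so far.  Unseen moves get an optimistic
    Q of 1.0 so that they still get explored."""
    def q(m):
        total, count = stats.get(m, (0.0, 0))
        return total / count if count else 1.0

    weights = [math.exp(q(m) / tau) for m in moves]
    # Roulette-wheel selection over the softmax weights.
    r = rng.uniform(0.0, sum(weights))
    upto = 0.0
    for m, w in zip(moves, weights):
        upto += w
        if r <= upto:
            return m
    return moves[-1]
```

With a low temperature `tau` the policy greedily prefers moves with high average returns; with a high `tau` it approaches uniform random play, so the temperature trades off exploiting learned knowledge against exploration during playouts.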
Two-person zero-sum games with perfect information, such as Chess and Checkers, have been addressed by many Artificial Intelligence (AI) researchers with great success, but the game of Go is a notable exception. Despite significant attention, the best computer Go programs still lag far behind the level of professional players. Recently, a new technique called Monte-Carlo Tree Search (MCTS) started a revolution and substantially increased the level of play of computer Go programs. Whereas the initial focus of MCTS research on Go has been on practical application, the current research proposal addresses the problem of understanding the nature, i.e. the underlying principles, of MCTS. A thorough understanding of the nature of MCTS will lead to better and more effective search algorithms. Hence, the two interrelated research questions are:
1) How to formulate models that increase the understanding of how Monte-Carlo Tree Search (MCTS) works?
2) How to use the developed MCTS models to create effective search algorithms?
The Go4Nature project funded by NWO researches the nature of Monte-Carlo Tree Search and uses the increased insight to improve search algorithms. The research project is supervised by me, and is executed by a Postdoc (underlying principles) and a PhD researcher (applications).
In the field of intelligent search techniques, the main topics of interest are Variable-Depth Search, Monte-Carlo Tree Search, and Proof-Number Search. In the past I have proposed several search enhancements and techniques, such as Enhanced Realization Probability Search, the Relative History Heuristic, MCTS-Solver, and PDS-PN. These ideas have been tested in several game domains, such as Lines of Action, Clobber, Surakarta, and Draughts.
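The selection step of MCTS is commonly implemented with the UCT rule, which balances a move's observed win rate against an exploration bonus for rarely visited moves. The following is a minimal sketch of that rule under assumed data structures (children represented as dictionaries with `wins` and `visits` fields); it is an illustration, not code from any of the programs named here.

```python
import math


def uct_select(children, exploration_c=1.414):
    """Pick the child maximizing the UCT value:
    win_rate + C * sqrt(ln(parent_visits) / child_visits).
    Here the parent's visit count is taken as the sum of the
    children's visits; a real implementation would store it."""
    parent_visits = sum(c["visits"] for c in children)

    def uct(c):
        if c["visits"] == 0:
            return float("inf")  # always try unvisited moves first
        exploit = c["wins"] / c["visits"]
        explore = exploration_c * math.sqrt(math.log(parent_visits) / c["visits"])
        return exploit + explore

    return max(children, key=uct)
```

The exploration constant `exploration_c` is the usual tuning knob: higher values spread simulations over more moves, lower values concentrate them on the currently best-looking move.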
Moreover, I test my ideas and the techniques of others in competitive tournament programs. I am known for my Lines of Action program MIA/MC-LOA, which won the Computer Olympiad six times, my Surakarta program SIA, which won the Olympiad five times, and my Clobber program MILA, which won the Olympiad once.
The DKE Games and AI group has a reputation for solving games. I have been involved in solving Fanorona, 6×6 LOA, and Go on boards of up to 30 intersections. These games have mainly been solved with "classic" techniques such as alpha-beta, proof-number search, and retrograde analysis. Currently, I am investigating to what extent MCTS integrated with proving capabilities, as done in MCTS-Solver, can help in solving games.
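The core idea behind combining MCTS with proving capabilities is that game-theoretic values, once established at terminal positions, can be propagated up the tree in negamax fashion: a node is a proven win if some child is a proven loss for the opponent, and a proven loss if every child is a proven win for the opponent. The sketch below illustrates just this propagation step under assumed data structures (a `proven` field per node); it is a simplified illustration of the idea, not the published MCTS-Solver algorithm.

```python
def backpropagate_proven(node):
    """Propagate proven game-theoretic values one level up.
    node['proven'] is None, 'win', or 'loss', always from the
    perspective of the player to move at that node."""
    children = node["children"]
    # Some child is a proven loss for the opponent: this node is a proven win.
    if any(c["proven"] == "loss" for c in children):
        node["proven"] = "win"
    # Every child is a proven win for the opponent: this node is a proven loss.
    elif all(c["proven"] == "win" for c in children):
        node["proven"] = "loss"
    # Otherwise the node stays unproven and ordinary MCTS statistics apply.
    return node["proven"]
```

Unproven nodes keep their regular visit and win statistics, so the search degrades gracefully to plain MCTS wherever no proof is available yet.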
I am interested in learning the weights of search enhancements and evaluation-function features. Together with other researchers, I have investigated techniques such as Neural Networks, TD-Learning, RSPSA, the Cross-Entropy Method, and Gradual Focus.
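For a linear evaluation function V(s) = Σᵢ wᵢ·fᵢ(s), a single TD(0) weight update takes the form w ← w + α·(r + γ·V(s') − V(s))·f(s), since the gradient of V with respect to w is simply the feature vector. The sketch below shows one such update step; the function name and the plain-list representation of weights and features are assumptions made for illustration.

```python
def td0_update(weights, features_s, features_next,
               reward=0.0, alpha=0.1, gamma=1.0):
    """One TD(0) step on a linear evaluation function.
    The TD error is r + gamma * V(s') - V(s); each weight moves
    in the direction of its feature value times that error."""
    v_s = sum(w * f for w, f in zip(weights, features_s))
    v_next = sum(w * f for w, f in zip(weights, features_next))
    td_error = reward + gamma * v_next - v_s
    return [w + alpha * td_error * f for w, f in zip(weights, features_s)]
```

Applied repeatedly over positions from self-play games, updates of this form nudge the feature weights toward values whose evaluations are consistent with the observed game outcomes.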