As a guest user you are not logged in or recognized by your IP address. You have
access to the Front Matter, Abstracts, Author Index, Subject Index and the full
text of Open Access publications.
We develop a new framework for the game of Go to target a high score, and thus a perfect play. We integrate this framework into the Monte Carlo tree search – policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training on 9×9 Go produces a superhuman Go player, thus proving that this framework is stable and robust. We show that this player can be used to effectively play with both positional and score handicap. We develop a family of agents that can target high scores against any opponent, recover from very severe disadvantage against weak opponents, and avoid suboptimal moves.
This website uses cookies
We use cookies to provide you with the best possible experience. They also allow us to analyze user behavior in order to constantly improve the website for you. Info about the privacy policy of IOS Press.
This website uses cookies
We use cookies to provide you with the best possible experience. They also allow us to analyze user behavior in order to constantly improve the website for you. Info about the privacy policy of IOS Press.