Abstract:
Computer Go programs have surpassed top-level human players by using deep learning and reinforcement learning techniques. In contrast, "Entertainment Go AI" and "Coaching Go AI" are also interesting directions that have not been well investigated. Several studies have addressed entertaining beginner or intermediate players. Position control and producing various strategies are important tasks, and some methods for them have been proposed and evaluated using a traditional Monte-Carlo tree search program. In this paper, we adapt these methods to LeelaZero, a program based on AlphaGo Zero. There are some critical differences between the previous program and the new one. For example, the new program does not use random simulations to the ends of games, so the previous method for producing various strategies cannot be used. We summarize the differences and some expected problems, and propose several approaches to solve them. The modified LeelaZero was shown to play gently against weaker players (it won 48% of its games against the program Ray). In experiments with human subjects, the average number of unnatural moves per game was 1.22, compared with 2.29 for a simple method that does not consider naturalness. We also evaluated the proposed method for training "center-oriented" and "edge/corner-oriented" players, and confirmed that human players could identify the produced strategy (center or edge/corner) with a probability of 71.88%.
Published in: 2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)
Date of Conference: 21-23 November 2019
Date Added to IEEE Xplore: 16 January 2020