
The grand challenge of computer Go: Monte Carlo tree search and extensions

Published: 01 March 2012

Abstract

The ancient oriental game of Go has long been considered a grand challenge for artificial intelligence. For decades, computer Go has defied the classical methods in game tree search that worked so successfully for chess and checkers. However, recent play in computer Go has been transformed by a new paradigm for tree search based on Monte-Carlo methods. Programs based on Monte-Carlo tree search now play at human-master levels and are beginning to challenge top professional players. In this paper, we describe the leading algorithms for Monte-Carlo tree search and explain how they have advanced the state of the art in computer Go.


Published in

Communications of the ACM, Volume 55, Issue 3
March 2012
106 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2093548

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
