Abstract
Monte-Carlo Tree Search (MCTS) is a successful approach for improving the performance of game-playing programs. This paper presents the Accelerated UCT algorithm, which overcomes a weakness of MCTS caused by deceptive structures that often appear in game tree search. The algorithm uses a new backup operator that assigns higher weights to recently visited actions and lower weights to actions that have not been visited for a long time. Results in Othello, Havannah, and Go show that Accelerated UCT is not only more effective than previous approaches but also improves the strength of Fuego, one of the best computer Go programs.
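The abstract does not give the backup operator's formula, so the following is only a rough sketch of the general idea of recency-weighted backup, not the paper's definition. It assumes each child keeps a weight that is multiplied by a decay factor whenever its parent is visited and increased by one when that child is the one selected; the parent's value is then the weight-averaged value of its children. The names `Node`, `record_visit`, `backup`, and the `decay` parameter are hypothetical, and the sign flip between players in a two-player tree is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    value: float = 0.0    # current value estimate, e.g. a win rate in [0, 1]
    weight: float = 0.0   # recency weight (assumed form, not from the paper)
    children: list["Node"] = field(default_factory=list)


def record_visit(parent: Node, chosen: int, decay: float = 0.9) -> None:
    """Decay every child's weight, then add 1 to the child just selected."""
    for i, child in enumerate(parent.children):
        child.weight = child.weight * decay + (1.0 if i == chosen else 0.0)


def backup(parent: Node) -> float:
    """Set the parent's value to the recency-weighted mean of its children."""
    total = sum(c.weight for c in parent.children)
    if total > 0:
        parent.value = sum(c.weight * c.value for c in parent.children) / total
    return parent.value


# Usage: the search first explores a weak move, then switches to a strong one.
root = Node(children=[Node(value=0.2), Node(value=0.8)])
for chosen in (0, 0, 0, 1, 1, 1):
    record_visit(root, chosen, decay=0.5)
recency_value = backup(root)
```

With equal visit counts, a plain visit-count average would give 0.5 here; the recency-weighted value ends up closer to 0.8 because the weight of the move that was abandoned early has decayed. A smaller `decay` forgets old visits faster, and `decay = 1.0` reduces to the plain visit-count average.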
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hashimoto, J., Kishimoto, A., Yoshizoe, K., Ikeda, K. (2012). Accelerated UCT and Its Application to Two-Player Games. In: van den Herik, H.J., Plaat, A. (eds) Advances in Computer Games. ACG 2011. Lecture Notes in Computer Science, vol 7168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31866-5_1
DOI: https://doi.org/10.1007/978-3-642-31866-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31865-8
Online ISBN: 978-3-642-31866-5
eBook Packages: Computer Science (R0)