Monte-Carlo Simulation Balancing in Practice

Huang, Shih-Chieh; Coulom, Rémi; Lin, Shun-Shii

doi:10.1007/978-3-642-17928-0_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6515)

Included in the following conference series: International Conference on Computers and Games

Abstract

Simulation balancing is a new technique for tuning the parameters of a playout policy in a Monte-Carlo game-playing program. So far, this algorithm has been tested only in a very artificial setting: it was limited to 5×5 and 6×6 Go, and it required a stronger external program to serve as a supervisor. In this paper, the effectiveness of simulation balancing is demonstrated in a more realistic setting. A state-of-the-art program, Erica, learned an improved playout policy on the 9×9 board without any external expert to provide position evaluations; the evaluations were collected by letting the program analyze positions by itself. The previous version of Erica learned its pattern weights with the minorization-maximization algorithm. Thanks to simulation balancing, its winning rate against Fuego 0.4 improved from 69% to 78%.
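
To make the idea of tuning playout parameters concrete, the following is a minimal sketch of a simulation-balancing style update on a toy problem. It is not Erica's implementation. It assumes a softmax playout policy over move features and uses the policy-gradient form of the update as it is usually described: estimate the mean playout outcome V from m simulations, estimate a gradient g from n further simulations, and move the weights by alpha * (target - V) * g. The toy "position" (a list of steps, each with per-move features and rewards), the function names, and all numeric settings are hypothetical and chosen only for illustration. In the paper the target evaluations come from the program's own analysis of positions; here the target is simply given.

```python
import math
import random

def softmax_policy(theta, features):
    """Move probabilities proportional to exp(theta . phi(a))."""
    scores = [sum(t * f for t, f in zip(theta, phi)) for phi in features]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def simulate(theta, position, rng):
    """Play one toy playout (hypothetical stand-in for a real game).

    `position` is a list of steps; each step is (features, rewards).
    Returns the outcome z (sum of rewards of the moves played) and the
    summed gradient of log pi over those moves.
    """
    z = 0.0
    grad = [0.0] * len(theta)
    for features, rewards in position:
        probs = softmax_policy(theta, features)
        a = rng.choices(range(len(probs)), weights=probs)[0]
        z += rewards[a]
        for i in range(len(theta)):
            # Gradient of log softmax: phi(a) minus the policy-weighted mean.
            mean_phi = sum(p * phi[i] for p, phi in zip(probs, features))
            grad[i] += features[a][i] - mean_phi
    return z, grad

def balancing_step(theta, position, target, alpha, m, n, rng):
    """One update: theta += alpha * (target - V) * g."""
    # V: mean outcome of m playouts under the current policy.
    v = sum(simulate(theta, position, rng)[0] for _ in range(m)) / m
    # g: mean of (outcome * summed log-policy gradient) over n playouts.
    g = [0.0] * len(theta)
    for _ in range(n):
        z, grad = simulate(theta, position, rng)
        for i in range(len(theta)):
            g[i] += z * grad[i] / n
    return [t + alpha * (target - v) * gi for t, gi in zip(theta, g)]

# Toy usage: one step, two moves; move 0 "wins" (reward 1), move 1 "loses".
rng = random.Random(0)
position = [([[1.0], [0.0]], [1.0, 0.0])]
theta = [0.0]
for _ in range(200):
    theta = balancing_step(theta, position, target=0.8, alpha=10.0,
                           m=200, n=200, rng=rng)
p_win = softmax_policy(theta, position[0][0])[0]
print(round(p_win, 2))
```

In this toy setting the mean playout outcome equals the probability of choosing move 0, so the loop should drive that probability toward the target of 0.8. The point of the sketch is that the weights are adjusted to make the average playout result match a target evaluation, rather than to make individual moves match an expert's choices.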

Author information

Authors and Affiliations

Dept. of CSIE, National Taiwan Normal University, Taiwan, R.O.C.
Shih-Chieh Huang & Shun-Shii Lin
Université de Lille, CNRS, INRIA, France
Rémi Coulom

Editor information

Editors and Affiliations

Tilburg Center for Cognition and Communication (TiCC), Tilburg University, P.O. Box 90153, 5000LE, Tilburg, The Netherlands
H. Jaap van den Herik & Aske Plaat
Japan Advanced Institute of Science and Technology, Research Unit for Computers and Games, 1-1, Asahidai, 923-1292, Nomi, Ishikawa, Japan
Hiroyuki Iida

About this paper

Cite this paper

Huang, SC., Coulom, R., Lin, SS. (2011). Monte-Carlo Simulation Balancing in Practice. In: van den Herik, H.J., Iida, H., Plaat, A. (eds) Computers and Games. CG 2010. Lecture Notes in Computer Science, vol 6515. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17928-0_8

DOI: https://doi.org/10.1007/978-3-642-17928-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17927-3
Online ISBN: 978-3-642-17928-0
eBook Packages: Computer Science, Computer Science (R0)
