Playout Policy Adaptation for Games

Cazenave, Tristan

doi:10.1007/978-3-319-27992-3_3

Conference paper
First Online: 25 December 2015

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9525)

Abstract

Monte-Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm, named Playout Policy Adaptation (PPA), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of these games, PPA is better than UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.
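
The abstract does not spell out how the playout policy is learned. As a minimal sketch of one way such online adaptation could work, assume each move has a weight, playouts sample moves with probability proportional to the exponential of that weight, and after each playout the weights of the winner's moves are raised while the weights of all moves available at the same step are lowered in proportion to their probability. The function names, the `alpha` step size, and the use of plain strings as move identifiers are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def softmax_probs(weights, moves):
    """Return selection probabilities proportional to exp(weight) per move."""
    exps = [math.exp(weights.get(m, 0.0)) for m in moves]
    total = sum(exps)
    return [e / total for e in exps]


def choose_move(weights, moves, rng=random):
    """Sample one move from the softmax distribution over the weights."""
    return rng.choices(moves, weights=softmax_probs(weights, moves), k=1)[0]


def adapt(weights, winner_steps, alpha=1.0):
    """Shift weight toward the moves the playout winner actually played.

    winner_steps: list of (legal_moves, played_move) pairs, one per
    decision made by the winning side during the playout (assumed format).
    Returns a new dict; the input weights are left untouched.
    """
    new_weights = dict(weights)
    for legal_moves, played in winner_steps:
        # Probabilities come from the old weights, so every step of one
        # playout is scored against the same policy.
        probs = softmax_probs(weights, legal_moves)
        for move, p in zip(legal_moves, probs):
            new_weights[move] = new_weights.get(move, 0.0) - alpha * p
        new_weights[played] = new_weights.get(played, 0.0) + alpha
    return new_weights


# Hypothetical usage: the winner chose "a" when "a" and "b" were legal.
weights = adapt({}, [(["a", "b"], "a")], alpha=1.0)
probs_after = softmax_probs(weights, ["a", "b"])
```

Starting from empty weights, one adaptation step on a two-move position moves probability mass toward the played move. The total weight across the moves is unchanged, because the increment for the played move cancels the decrements spread over all legal moves.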

Notes

1. For brevity, we use ‘he’ and ‘his’ whenever ‘he or she’ and ‘his or her’ are meant.

Author information

Authors and Affiliations

LAMSADE, Université Paris-Dauphine, Paris, France
Tristan Cazenave

Corresponding author

Correspondence to Tristan Cazenave.

Editor information

Editors and Affiliations

Leiden University, Leiden, The Netherlands
Aske Plaat
Leiden University, Leiden, The Netherlands
Jaap van den Herik
Leiden University, Leiden, The Netherlands
Walter Kosters

About this paper

Cite this paper

Cazenave, T. (2015). Playout Policy Adaptation for Games. In: Plaat, A., van den Herik, J., Kosters, W. (eds) Advances in Computer Games. ACG 2015. Lecture Notes in Computer Science, vol 9525. Springer, Cham. https://doi.org/10.1007/978-3-319-27992-3_3

DOI: https://doi.org/10.1007/978-3-319-27992-3_3
Published: 25 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27991-6
Online ISBN: 978-3-319-27992-3
eBook Packages: Computer Science, Computer Science (R0)
