Playout Policy Adaptation for Games

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9525)

Abstract

Monte-Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm, named Playout Policy Adaptation (PPA), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of these games, PPA is better than UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.
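
The abstract does not spell out how the playout policy is learned, so the following is only a minimal sketch of one plausible online scheme, not the paper's exact procedure. It assumes a softmax playout policy over per-move weights and a gradient-style update applied to the moves of the playout's winner. The names `softmax_choice`, `adapt`, `winner_steps` and the step size `alpha` are hypothetical, introduced here for illustration.

```python
import math
import random


def softmax_choice(moves, weights, rng):
    """Sample a move with probability proportional to exp(weight).

    Unseen moves default to weight 0.0, so an empty table gives the
    uniform random playout policy used by the UCT baseline.
    """
    ws = [weights.get(m, 0.0) for m in moves]
    top = max(ws)  # subtract the max for numerical stability
    exps = [math.exp(w - top) for w in ws]
    return rng.choices(moves, weights=exps, k=1)[0]


def adapt(weights, winner_steps, alpha=1.0):
    """Return a new weight table after one playout (assumed update rule).

    winner_steps is a list of (legal_moves, played_move) pairs recorded
    for the side that won the playout. Each played move gains alpha, and
    every legal move loses alpha times its current softmax probability,
    so the net change at each step sums to zero.
    """
    updated = dict(weights)
    for legal, played in winner_steps:
        # Probabilities are computed from the weights before adaptation.
        ws = [weights.get(m, 0.0) for m in legal]
        top = max(ws)
        exps = [math.exp(w - top) for w in ws]
        z = sum(exps)
        for m, e in zip(legal, exps):
            updated[m] = updated.get(m, 0.0) - alpha * e / z
        updated[played] = updated.get(played, 0.0) + alpha
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    table = adapt({}, [(["a", "b"], "a")], alpha=1.0)
    print(table)
    print(softmax_choice(["a", "b"], table, rng))
```

In this sketch the table starts empty, which makes the first playouts uniform; as playouts accumulate, moves that appear in winning playouts are sampled more often. How moves are encoded as keys (for example, including the player or board location) is left open here.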

Notes

  1. For brevity, we use ‘he’ and ‘his’, whenever ‘he or she’ and ‘his or her’ are meant.

Author information

Correspondence to Tristan Cazenave.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cazenave, T. (2015). Playout Policy Adaptation for Games. In: Plaat, A., van den Herik, J., Kosters, W. (eds) Advances in Computer Games. ACG 2015. Lecture Notes in Computer Science, vol 9525. Springer, Cham. https://doi.org/10.1007/978-3-319-27992-3_3

  • DOI: https://doi.org/10.1007/978-3-319-27992-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27991-6

  • Online ISBN: 978-3-319-27992-3

  • eBook Packages: Computer Science, Computer Science (R0)
