Computational Experiments with the RAVE Heuristic

  • Conference paper
Computers and Games (CG 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6515)

Abstract

The Monte-Carlo tree search algorithm Upper Confidence bounds applied to Trees (UCT) has become extremely popular in computer games research. The Rapid Action Value Estimation (RAVE) heuristic is a strong estimator that often improves the performance of UCT-based algorithms. However, there are situations where RAVE misleads the search, whereas pure UCT search can find the correct solution. Two games, the simple abstract game Sum of Switches (SOS) and the game of Go, are used to study the behavior of the RAVE heuristic. In SOS, RAVE updates are manipulated to mimic game situations where RAVE misleads the search. Such false RAVE updates are used to create RAVE overestimates and underestimates. A study of the distributions of mean and RAVE values reveals great differences between Go and SOS. While the RAVE-max update rule is able to correct extreme cases of RAVE underestimation, it is not effective in settings closer to practice or in Go.
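To make the interplay between mean and RAVE values concrete, the following minimal sketch blends a move's Monte-Carlo mean with its RAVE estimate. The abstract does not specify any formula, so everything here is an assumption for illustration: the function names, the constant `k`, the weighting schedule beta = sqrt(k / (3n + k)) (one commonly used hand-tuned schedule), and the reading of RAVE-max as substituting max(mean, RAVE) for the RAVE value. It is not taken from the paper's implementation.

```python
import math


def rave_weight(n: int, k: float = 1000.0) -> float:
    """Weight placed on the RAVE estimate after n real visits.

    Assumed schedule: starts at 1 when n == 0 and decays toward 0,
    so the plain mean takes over as real samples accumulate.
    """
    return math.sqrt(k / (3.0 * n + k))


def blended_value(mean: float, n: int, rave: float, k: float = 1000.0) -> float:
    """Linear blend of the Monte-Carlo mean and the RAVE value."""
    beta = rave_weight(n, k)
    return (1.0 - beta) * mean + beta * rave


def blended_value_rave_max(mean: float, n: int, rave: float, k: float = 1000.0) -> float:
    """Assumed RAVE-max variant: use max(mean, rave) in place of rave.

    This can only raise the blended value, so it targets RAVE
    underestimates and leaves overestimates untouched.
    """
    return blended_value(mean, n, max(mean, rave), k)


if __name__ == "__main__":
    # Hypothetical move whose mean is good (0.7) but whose RAVE value is low (0.2).
    for visits in (10, 100, 1000, 10000):
        plain = blended_value(0.7, visits, 0.2)
        rmax = blended_value_rave_max(0.7, visits, 0.2)
        print(f"n={visits:>5}  blended={plain:.3f}  rave_max={rmax:.3f}")
```

With few visits the blended value stays close to the low RAVE estimate, which is the kind of underestimate the abstract describes; under the assumed RAVE-max rule the same move is scored at its mean instead.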

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tom, D., Müller, M. (2011). Computational Experiments with the RAVE Heuristic. In: van den Herik, H.J., Iida, H., Plaat, A. (eds) Computers and Games. CG 2010. Lecture Notes in Computer Science, vol 6515. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17928-0_7

  • DOI: https://doi.org/10.1007/978-3-642-17928-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17927-3

  • Online ISBN: 978-3-642-17928-0

  • eBook Packages: Computer Science, Computer Science (R0)