On Robustness of CMAB Algorithms: Experimental Approach

  • Conference paper
Computer Games (CGW 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 504)

Abstract

In online planning with a team of cooperative agents, the decision of which actions the agents should execute can be modeled straightforwardly as a Combinatorial Multi-Armed Bandit (CMAB) problem. Like the most prominent approaches to online planning with a polynomial number of possible actions, state-of-the-art algorithms for online planning with an exponential number of actions are based on Monte-Carlo sampling. However, these techniques cannot be used without a proper selection of the subset of actions to sample. The most recent algorithms tackling this problem assume linearity with respect to the combinations of the actions.

In this paper, we experimentally analyze the robustness of two state-of-the-art algorithms, NMC and LSI, for online planning with combinatorial actions in various setups of Real-Time and Turn-Taking Strategy games.
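To make the linearity assumption mentioned in the abstract concrete, the following minimal Python sketch treats the reward of a combined action as a noisy sum of per-agent contributions and estimates each agent's action values by Monte-Carlo sampling. It is an illustration only and is not an implementation of NMC or LSI; the function names, the Gaussian noise model, and the uniform sampling of the other agents' actions are all assumptions made for this example.

```python
import random


def linear_reward(combo, weights, noise, rng):
    """Noisy reward of a combined action under the (assumed) linearity:
    the sum of each agent's own contribution plus Gaussian noise."""
    return sum(weights[i][a] for i, a in enumerate(combo)) + rng.gauss(0.0, noise)


def sample_local_estimates(weights, samples_per_arm, noise, rng):
    """Estimate the value of every (agent, action) pair by sampling.

    While agent i is fixed to action a, the other agents act uniformly at
    random. Under linearity their contribution is, in expectation, the same
    offset for every action of agent i, so the ranking of agent i's actions
    is preserved.
    """
    n_agents = len(weights)
    estimates = []
    for i in range(n_agents):
        agent_estimates = []
        for a in range(len(weights[i])):
            total = 0.0
            for _ in range(samples_per_arm):
                combo = [rng.randrange(len(weights[j])) for j in range(n_agents)]
                combo[i] = a
                total += linear_reward(combo, weights, noise, rng)
            agent_estimates.append(total / samples_per_arm)
        estimates.append(agent_estimates)
    return estimates


def recommend(estimates):
    """Combine the best local action of each agent into one combined action."""
    return tuple(max(range(len(e)), key=e.__getitem__) for e in estimates)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-agent action values: 3 agents with 3 actions each.
    weights = [[0.1, 0.9, 0.2], [0.5, 0.0, 0.3], [0.2, 0.4, 1.0]]
    estimates = sample_local_estimates(weights, 2000, 0.1, rng)
    print(recommend(estimates))
```

With three agents and three actions each, the sketch samples 9 local arms instead of enumerating all 27 combined actions; the gap grows exponentially with the number of agents. If the true reward is not linear in the agents' actions, the local ranking is no longer guaranteed to identify the best combination.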

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Komenda, A., Shleyfman, A., Domshlak, C. (2014). On Robustness of CMAB Algorithms: Experimental Approach. In: Cazenave, T., Winands, M.H.M., Björnsson, Y. (eds) Computer Games. CGW 2014. Communications in Computer and Information Science, vol 504. Springer, Cham. https://doi.org/10.1007/978-3-319-14923-3_2

  • DOI: https://doi.org/10.1007/978-3-319-14923-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14922-6

  • Online ISBN: 978-3-319-14923-3

  • eBook Packages: Computer Science, Computer Science (R0)
