On Robustness of CMAB Algorithms: Experimental Approach

Komenda, Antonín; Shleyfman, Alexander; Domshlak, Carmel

doi:10.1007/978-3-319-14923-3_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 504)

Included in the following conference series:

Workshop on Computer Games

Abstract

In online planning with a team of cooperative agents, the decision of which actions the agents should execute can be modeled straightforwardly as a Combinatorial Multi-Armed Bandit (CMAB) problem. Like the most prominent approaches to online planning with a polynomial number of possible actions, state-of-the-art algorithms for online planning with an exponential number of actions are based on Monte-Carlo sampling. However, these techniques cannot be used without a proper selection of an appropriate subset of actions. The most recent algorithms tackling this problem rely on an assumption of linearity with respect to the combinations of the actions.

In this paper, we experimentally analyze the robustness of two state-of-the-art algorithms, NMC and LSI, for online planning with combinatorial actions in various setups of Real-Time and Turn-Taking Strategy games.
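
The linearity assumption mentioned in the abstract can be illustrated with a small sketch. It is not the NMC or LSI algorithm from the paper; it only shows, under stated assumptions, why linearity helps with an exponential action space. All names here (`make_linear_reward`, `estimate_arm_values`, `greedy_joint_action`) and the toy reward model are hypothetical. A joint action is assumed to be a tuple with one action index per agent, and the reward is assumed to be a noisy sum of per-agent contributions.

```python
import random


def make_linear_reward(weights, noise=0.1, rng=None):
    """Hypothetical toy reward: sum of per-agent contributions plus Gaussian noise.

    weights[i][a] is the (assumed) contribution of agent i choosing action a.
    """
    rng = rng or random.Random(0)

    def reward(joint_action):
        mean = sum(weights[i][a] for i, a in enumerate(joint_action))
        return mean + rng.gauss(0.0, noise)

    return reward


def estimate_arm_values(reward, n_agents, n_actions, samples_per_arm, rng):
    """Monte-Carlo estimate for every (agent, action) pair.

    The pair is held fixed while the remaining agents act uniformly at random,
    so only n_agents * n_actions quantities are estimated instead of
    n_actions ** n_agents joint-action values.
    """
    estimates = [[0.0] * n_actions for _ in range(n_agents)]
    for i in range(n_agents):
        for a in range(n_actions):
            total = 0.0
            for _ in range(samples_per_arm):
                joint = [rng.randrange(n_actions) for _ in range(n_agents)]
                joint[i] = a
                total += reward(tuple(joint))
            estimates[i][a] = total / samples_per_arm
    return estimates


def greedy_joint_action(estimates):
    """Under the linearity assumption, pick each agent's best action independently."""
    return tuple(max(range(len(row)), key=row.__getitem__) for row in estimates)
```

If the reward really is linear, each agent's estimates are shifted by the same constant (the average contribution of the other agents), so the per-agent ranking is preserved and the greedy combination is a reasonable candidate. When the reward is not linear, this shortcut can select poor combinations, which is the kind of sensitivity a robustness study would probe.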

Author information

Authors and Affiliations

Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
Antonín Komenda, Alexander Shleyfman & Carmel Domshlak

Editor information

Editors and Affiliations

Université Paris-Dauphine, Paris, France
Tristan Cazenave
Games and AI Group, Department of Knowledge Engineering, Faculty of Humanities and Sciences, Maastricht University, Maastricht, The Netherlands
Mark H. M. Winands
School of Computer Science, Reykjavik University, Iceland
Yngvi Björnsson

Cite this paper

Komenda, A., Shleyfman, A., Domshlak, C. (2014). On Robustness of CMAB Algorithms: Experimental Approach. In: Cazenave, T., Winands, M.H.M., Björnsson, Y. (eds) Computer Games. CGW 2014. Communications in Computer and Information Science, vol 504. Springer, Cham. https://doi.org/10.1007/978-3-319-14923-3_2

DOI: https://doi.org/10.1007/978-3-319-14923-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14922-6
Online ISBN: 978-3-319-14923-3
eBook Packages: Computer Science, Computer Science (R0)