Abstract
In online planning with a team of cooperative agents, the decision of which actions the agents should execute can be modeled straightforwardly as a Combinatorial Multi-Armed Bandit (CMAB) problem. Like the most prominent approaches to online planning with a polynomial number of possible actions, state-of-the-art algorithms for online planning with an exponential number of actions are based on Monte-Carlo sampling. However, these techniques cannot be used without a proper selection of the appropriate subset of actions. The most recent algorithms tackling this problem assume linearity with respect to the combinations of the actions.
In this paper, we experimentally analyze the robustness of two state-of-the-art algorithms, NMC and LSI, for online planning with combinatorial actions in various setups of Real-Time and Turn-Taking Strategy games.
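To make the linearity assumption concrete, the following minimal Python sketch shows how it lets Monte-Carlo sampling cope with an exponential joint action space. It is not an implementation of NMC or LSI; the reward model, the weights, and all function names are hypothetical and chosen only for illustration. The sketch assumes the expected reward of a joint action is the sum of one term per agent, so the best joint action can be assembled from per-agent estimates instead of enumerating every combination.

```python
import random


def make_linear_reward(weights, rng, noise=0.1):
    """Hypothetical reward model: the mean reward of a joint action
    is the sum of one weight per (agent, local action) pair."""
    def reward(joint_action):
        mean = sum(weights[i][a] for i, a in enumerate(joint_action))
        return mean + rng.gauss(0.0, noise)  # noisy Monte-Carlo sample
    return reward


def estimate_local_values(reward, n_agents, n_actions, samples, rng):
    """Sample random joint actions and credit each observed reward
    to every (agent, local action) pair that took part in it."""
    totals = [[0.0] * n_actions for _ in range(n_agents)]
    counts = [[0] * n_actions for _ in range(n_agents)]
    for _ in range(samples):
        joint = tuple(rng.randrange(n_actions) for _ in range(n_agents))
        r = reward(joint)
        for i, a in enumerate(joint):
            totals[i][a] += r
            counts[i][a] += 1
    return [
        [totals[i][a] / counts[i][a] if counts[i][a] else 0.0
         for a in range(n_actions)]
        for i in range(n_agents)
    ]


def greedy_joint_action(local_values):
    """Under the linearity assumption, the best joint action combines
    each agent's individually best local action."""
    return tuple(max(range(len(v)), key=v.__getitem__) for v in local_values)


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative weights for 3 agents with 3 local actions each.
    weights = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.4], [0.5, 0.1, 1.0]]
    reward = make_linear_reward(weights, rng)
    values = estimate_local_values(reward, 3, 3, 3000, rng)
    print(greedy_joint_action(values))
```

With n agents and k local actions each, the sketch maintains n·k estimates rather than k^n. When the true reward is not linear in the per-agent choices, these estimates can point to a poor joint action, which is the kind of sensitivity a robustness analysis examines.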
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Komenda, A., Shleyfman, A., Domshlak, C. (2014). On Robustness of CMAB Algorithms: Experimental Approach. In: Cazenave, T., Winands, M.H.M., Björnsson, Y. (eds) Computer Games. CGW 2014. Communications in Computer and Information Science, vol 504. Springer, Cham. https://doi.org/10.1007/978-3-319-14923-3_2
DOI: https://doi.org/10.1007/978-3-319-14923-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14922-6
Online ISBN: 978-3-319-14923-3
eBook Packages: Computer Science (R0)