Recommended Reading
Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multi-armed bandit problem. SIAM J Comput 32(1):48–77
Berry D, Fristedt B (1985) Bandit problems: sequential allocation of experiments. Chapman and Hall, London/New York
Cesa-Bianchi N, Lugosi G (2006) Prediction, learning, and games. Cambridge University Press, New York
Gittins JC (1989) Multi-armed bandit allocation indices. Wiley, New York
Gittins J, Jones D (1972) A dynamic allocation index for sequential design of experiments. In: Progress in statistics, European meeting of statisticians, Budapest, vol 1, pp 241–266
Lai TL, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv Appl Math 6:4–22
Mannor S, Tsitsiklis JN (2004) The sample complexity of exploration in the multi-armed bandit problem. J Mach Learn Res 5:623–648
Robbins H (1952) Some aspects of the sequential design of experiments. Bull Am Math Soc 58:527–535
Copyright information
© 2017 Springer Science+Business Media New York
About this entry
Cite this entry
Mannor, S. (2017). k-Armed Bandit. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_424
DOI: https://doi.org/10.1007/978-1-4899-7687-1_424
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering