Abstract
Classification bandits are a new class of multi-armed bandit problems in which an agent must classify a given set of arms as positive or negative, depending on whether the number of bad arms is at least \(N_2\) or at most \(N_1(<N_2)\), using as few arm draws as possible. In our problem setting, bad arms are imperfectly characterized as the arms with above-threshold expected rewards (losses). We develop a method of reducing classification bandits to simpler one-threshold classification bandits and propose an algorithm for the problem that classifies a given set of arms correctly with a specified confidence. Our numerical experiments demonstrate the effectiveness of the proposed method.
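The problem statement above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: it is a naive uniform-sampling baseline with Hoeffding confidence intervals, included only to make the classification rule concrete. The function names, the Bernoulli loss model, the use of `>=` for "above-threshold", and the treatment of the region strictly between \(N_1\) and \(N_2\) as unlabeled are all assumptions made for illustration.

```python
import math
import random


def true_label(means, threshold, n1, n2):
    """Ground-truth class computed from the (normally unknown) expected losses."""
    # Assumption: an arm is "bad" when its expected loss is >= threshold.
    bad = sum(1 for m in means if m >= threshold)
    if bad >= n2:
        return "positive"
    if bad <= n1:
        return "negative"
    return None  # strictly between N1 and N2: no label is assumed here


def naive_classify(means, threshold, n1, n2, delta=0.05,
                   max_rounds=100_000, seed=0):
    """Hypothetical baseline, NOT the paper's method: draw all arms uniformly."""
    rng = random.Random(seed)
    k = len(means)
    sums = [0.0] * k
    for t in range(1, max_rounds + 1):
        # One draw per arm per round; losses are assumed Bernoulli(mean).
        for i, m in enumerate(means):
            sums[i] += 1.0 if rng.random() < m else 0.0
        # Hoeffding radius with a union bound over all arms and rounds,
        # so every interval holds simultaneously with probability >= 1 - delta.
        radius = math.sqrt(math.log(2 * k * t * (t + 1) / delta) / (2 * t))
        surely_bad = sum(1 for s in sums if s / t - radius >= threshold)
        surely_good = sum(1 for s in sums if s / t + radius < threshold)
        if surely_bad >= n2:
            return "positive", t * k   # at least N2 arms are confidently bad
        if k - surely_good <= n1:
            return "negative", t * k   # at most N1 arms can still be bad
    return None, max_rounds * k        # draw budget exhausted without a decision


label, draws = naive_classify([0.9, 0.8, 0.1, 0.1], threshold=0.5, n1=1, n2=2)
print(label, draws)
```

This baseline spends the same number of draws on every arm. A method that stops sampling arms once they are resolved, or that exploits the asymmetry between the two stopping conditions, would be expected to need fewer draws.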
Notes
- 1.
Histopathologists usually diagnose whether cells are cancerous by inspecting their morphological characteristics, which is subject to human bias, whereas Raman measurements are considered to enable a more reliable judgment of cell states.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tabata, K., Nakamura, A., Komatsuzaki, T. (2021). Classification Bandits: Classification Using Expected Rewards as Imperfect Discriminators. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_6
DOI: https://doi.org/10.1007/978-3-030-75015-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75014-5
Online ISBN: 978-3-030-75015-2
eBook Packages: Computer Science, Computer Science (R0)