Classification Bandits: Classification Using Expected Rewards as Imperfect Discriminators

Tabata, Koji; Nakumura, Atsuyoshi; Komatsuzaki, Tamiki

doi:10.1007/978-3-030-75015-2_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12705)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

A classification bandits problem is a new class of multi-armed bandit problems in which an agent must classify a given set of arms as positive or negative, depending on whether the number of bad arms is at least \(N_2\) or at most \(N_1\) (\(N_1 < N_2\)), using as few arm draws as possible. In our problem setting, bad arms are imperfectly characterized as the arms with above-threshold expected rewards (losses). We develop a method of reducing classification bandits to simpler one-threshold classification bandits and propose an algorithm for the problem that classifies a given set of arms correctly with a specified confidence. Our numerical experiments demonstrate the effectiveness of the proposed method.
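
To make the problem setting concrete, the following is a minimal sketch of a naive baseline, not the algorithm proposed in the paper. It assumes Bernoulli losses, a single threshold `theta` that defines a bad arm, round-robin sampling, and Hoeffding confidence intervals with a union bound over arms and rounds. The function name `classify_arms` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def classify_arms(means, theta, n1, n2, delta=0.05, max_rounds=10_000, seed=0):
    """Naive round-robin baseline (illustrative only, not the paper's method).

    means : true expected losses of the arms (assumed Bernoulli here)
    theta : assumed single threshold; an arm is "bad" if its mean >= theta
    n1,n2 : answer "negative" if #bad <= n1, "positive" if #bad >= n2
    delta : allowed error probability
    Returns (label, total_draws); label is None if the budget runs out.
    """
    rng = random.Random(seed)
    k = len(means)
    sums = [0.0] * k  # cumulative observed loss per arm

    for t in range(1, max_rounds + 1):
        # Draw every arm once per round (uniform sampling).
        for i, mu in enumerate(means):
            sums[i] += 1.0 if rng.random() < mu else 0.0

        # Hoeffding radius with a union bound over k arms and all rounds t.
        radius = math.sqrt(math.log(4 * k * t * t / delta) / (2 * t))

        # Arms whose whole confidence interval lies on one side of theta.
        surely_bad = sum(1 for s in sums if s / t - radius >= theta)
        surely_good = sum(1 for s in sums if s / t + radius < theta)

        if surely_bad >= n2:
            return "positive", t * k
        if k - surely_good <= n1:  # at most n1 arms can still be bad
            return "negative", t * k

    return None, max_rounds * k


if __name__ == "__main__":
    print(classify_arms([0.9, 0.9, 0.9, 0.1, 0.1], theta=0.5, n1=1, n2=3))
    print(classify_arms([0.1, 0.1, 0.1, 0.1, 0.9], theta=0.5, n1=1, n2=3))
```

This baseline draws all arms equally often, so it can spend many draws on arms that no longer affect the decision. An adaptive sampling rule would aim to avoid exactly that cost.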

Notes

1. Histopathologists usually diagnose whether cells are cancerous by inspecting their morphological characteristics, a judgment subject to human bias, whereas Raman measurements are considered to allow cell states to be judged more reliably.

Author information

Authors and Affiliations

Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo, Japan
Koji Tabata & Tamiki Komatsuzaki
Graduate School/Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
Atsuyoshi Nakumura
Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo, Japan
Koji Tabata & Tamiki Komatsuzaki

Corresponding author

Correspondence to Koji Tabata.

Editor information

Editors and Affiliations

Microsoft, Hyderabad, India
Manish Gupta
Indian Institute of Technology Bombay, Mumbai, India
Ganesh Ramakrishnan

About this paper

Cite this paper

Tabata, K., Nakumura, A., Komatsuzaki, T. (2021). Classification Bandits: Classification Using Expected Rewards as Imperfect Discriminators. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_6

DOI: https://doi.org/10.1007/978-3-030-75015-2_6
Published: 03 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75014-5
Online ISBN: 978-3-030-75015-2
eBook Packages: Computer Science, Computer Science (R0)
