Classification Bandits: Classification Using Expected Rewards as Imperfect Discriminators

  • Conference paper
Trends and Applications in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12705)

Abstract

The classification bandits problem is a new class of multi-armed bandit problems in which an agent must classify a given set of arms as positive or negative, depending on whether the number of bad arms is at least \(N_2\) or at most \(N_1\) (\(N_1 < N_2\)), while drawing arms as few times as possible. In our problem setting, bad arms are imperfectly characterized as the arms with above-threshold expected rewards (losses). We develop a method of reducing classification bandits to simpler one-threshold classification bandits and propose an algorithm for the problem that classifies a given set of arms correctly with a specified confidence. Our numerical experiments demonstrate the effectiveness of the proposed method.
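
To make the problem statement concrete, the following is a minimal Python sketch of a naive baseline. It is not the algorithm proposed in the paper, whose details are not given in the abstract. It assumes rewards lie in [0, 1], a single threshold `theta`, and that "positive" means at least `n2` arms are bad. The function name `classify_arms`, the `pull` callback, and the round-robin sampling with Hoeffding confidence intervals are all illustrative assumptions.

```python
import math
import random


def classify_arms(pull, num_arms, theta, n1, n2, delta=0.05, max_rounds=100_000):
    """Naive round-robin baseline (an illustration, NOT the paper's method).

    pull(i) must return a reward in [0, 1] for arm i (assumption).
    An arm is "bad" when its expected reward (loss) is above theta.
    Returns (label, total_draws), where label is "positive",
    "negative", or "undecided" if max_rounds is exhausted.
    """
    sums = [0.0] * num_arms
    for t in range(1, max_rounds + 1):
        # Draw every arm once per round.
        for i in range(num_arms):
            sums[i] += pull(i)
        # Hoeffding radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4 * num_arms * t * t / delta) / (2 * t))
        means = [s / t for s in sums]
        # Arms whose lower confidence bound exceeds the threshold.
        surely_bad = sum(m - radius > theta for m in means)
        # Arms whose upper confidence bound still reaches the threshold.
        maybe_bad = sum(m + radius >= theta for m in means)
        if surely_bad >= n2:
            return "positive", t * num_arms
        if maybe_bad <= n1:
            return "negative", t * num_arms
    return "undecided", max_rounds * num_arms


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.9, 0.8, 0.85, 0.1, 0.2]  # hypothetical Bernoulli arms
    label, draws = classify_arms(
        lambda i: 1.0 if rng.random() < true_means[i] else 0.0,
        num_arms=5, theta=0.5, n1=1, n2=3,
    )
    print(label, draws)
```

Because this baseline samples all arms uniformly, it spends draws on arms that are already resolved. Reducing that cost is the kind of saving the "as few draws as possible" objective targets.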

Notes

  1. Histopathologists usually diagnose whether cells are cancerous by inspecting their morphological characteristics, a process subject to human bias; Raman measurements are considered to enable a more reliable judgment of the cell state.

Author information

Corresponding author

Correspondence to Koji Tabata.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Tabata, K., Nakamura, A., Komatsuzaki, T. (2021). Classification Bandits: Classification Using Expected Rewards as Imperfect Discriminators. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_6

  • DOI: https://doi.org/10.1007/978-3-030-75015-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75014-5

  • Online ISBN: 978-3-030-75015-2

  • eBook Packages: Computer Science (R0)
