Multiclass Classification Using Dilute Bandit Feedback

Batra, Gaurav; Manwani, Naresh

doi:10.1007/978-3-030-89188-6_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13031)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

This paper introduces a new online learning framework for multiclass classification called learning with diluted bandit feedback. At every time step, the algorithm predicts a candidate label set instead of a single label for the observed example. It then receives feedback from the environment indicating whether the actual label lies in this candidate label set. This feedback is called "diluted bandit feedback". Learning in this setting is even more challenging than in the bandit feedback setting, as there is more uncertainty in the supervision. We propose an algorithm for multiclass classification using dilute bandit feedback (MC-DBF), which uses an exploration-exploitation strategy to predict the candidate set in each trial. We show that the proposed algorithm achieves a mistake bound of $\mathcal{O}(T^{1-\frac{1}{m+2}})$ when the candidate label set size (in each step) is m. We demonstrate the effectiveness of the proposed approach with extensive simulations.
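The interaction protocol described in the abstract can be illustrated with a minimal sketch. This is not the MC-DBF algorithm itself: the function names, the `gamma` exploration parameter, and the simple "top-m with probability 1 - gamma, otherwise a uniformly random set of size m" rule are assumptions made for illustration, and the learner's update step is omitted entirely.

```python
import random


def predict_candidate_set(scores, m, gamma, rng):
    """Hypothetical explore/exploit rule (an assumption, not the paper's rule).

    With probability gamma, explore by drawing m distinct labels uniformly at
    random; otherwise exploit by returning the m highest-scoring labels.
    """
    k = len(scores)
    if rng.random() < gamma:
        return frozenset(rng.sample(range(k), m))
    ranked = sorted(range(k), key=lambda label: scores[label], reverse=True)
    return frozenset(ranked[:m])


def dilute_feedback(candidate_set, true_label):
    """Environment's reply: 1 if the true label is in the set, else 0."""
    return int(true_label in candidate_set)


def run_trials(score_fn, examples, m, gamma, seed=0):
    """Count trials in which the true label was NOT in the predicted set.

    score_fn is a placeholder for whatever model produces per-label scores;
    no learning happens in this sketch.
    """
    rng = random.Random(seed)
    misses = 0
    for x, y in examples:
        candidates = predict_candidate_set(score_fn(x), m, gamma, rng)
        misses += 1 - dilute_feedback(candidates, y)
    return misses
```

Note that the learner only ever sees the single bit returned by `dilute_feedback`; even when that bit is 1, it does not learn which of the m candidate labels was the correct one, which is the extra uncertainty relative to ordinary bandit feedback.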

Notes

1. T is the number of trials.
2. Note that this setting is exactly opposite to the partial label setting [3, 5]. In the partial label setting, the ground truth is a labelled subset, and the algorithm predicts a single label.
3. We see that $\sum_{A} Z(A) = 1$ as follows.
$$\begin{aligned} \sum_{A} Z(A) &= \sum_{A} \mathbb{P}(b_1)\cdots\mathbb{P}(b_m \mid b_1,\dots,b_{m-1}) = \sum_{b_1}\mathbb{P}(b_1)\cdots\sum_{b_m}\frac{\mathbb{P}(b_m)}{1-\mathbb{P}(b_1)-\dots-\mathbb{P}(b_{m-1})} \end{aligned}$$
But $\sum_{b_i} \frac{\mathbb{P}(b_i)}{1-\mathbb{P}(b_1)-\dots-\mathbb{P}(b_{i-1})} = 1$, because the sum over $b_i$ ranges over the labels not already drawn. Thus, $\sum_{A} Z(A) = 1$.
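The identity in Note 3 can be checked numerically on a small example. The sketch below assumes that A is indexed as an ordered sequence of draws $(b_1,\dots,b_m)$ made without replacement, as the nested sums in the note suggest; the function names and the example distribution `p` are hypothetical.

```python
from itertools import permutations
from math import isclose


def sequence_probability(seq, p):
    """Probability of drawing the labels in seq, in order, without replacement.

    Each factor is P(b_i) / (1 - P(b_1) - ... - P(b_{i-1})), matching the
    conditional probabilities written out in Note 3.
    """
    prob = 1.0
    used = 0.0  # probability mass of the labels already drawn
    for label in seq:
        prob *= p[label] / (1.0 - used)
        used += p[label]
    return prob


def total_probability(p, m):
    """Sum Z(A) over all ordered sequences of m distinct labels."""
    labels = range(len(p))
    return sum(sequence_probability(seq, p) for seq in permutations(labels, m))


# Example distribution over 4 labels (illustrative values only).
p = [0.5, 0.2, 0.2, 0.1]
assert isclose(total_probability(p, 2), 1.0)
assert isclose(total_probability(p, 3), 1.0)
```

The check works for the same reason the note gives: summing the last factor over the labels not yet drawn yields 1, so the nested sums collapse one at a time from the inside out.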

References

1. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). Software available from tensorflow.org. http://tensorflow.org/
2. Arora, M., Manwani, N.: Exact passive-aggressive algorithms for multiclass classification using bandit feedbacks. In: Proceedings of The 12th Asian Conference on Machine Learning, vol. 129, pp. 369–384, 18–20 November 2020, Bangkok, Thailand (2020)
3. Arora, M., Manwani, N.: Exact passive aggressive algorithm for multiclass classification using partial labels. In: 8th ACM IKDD CODS and 26th COMAD, pp. 38–46 (2021)
4. Beygelzimer, A., Orabona, F., Zhang, C.: Efficient online bandit multiclass learning with $\tilde{O}(\sqrt{T})$ regret. CoRR abs/1702.07958 (2017). http://arxiv.org/abs/1702.07958
5. Bhattacharjee, R., Manwani, N.: Online algorithms for multiclass classification using partial labels. In: Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp. 249–260 (2020)
6. Crammer, K., Singer, Y.: Ultraconservative online algorithms for multiclass problems. J. Mach. Learn. Res. 3, 951–991 (2003)
7. Fink, M., Shalev-Shwartz, S., Singer, Y., Ullman, S.: Online multiclass learning by interclass hypothesis sharing, pp. 313–320 (2006). https://doi.org/10.1145/1143844.1143884
8. Hazan, E., Kale, S.: NEWTRON: an efficient bandit algorithm for online multiclass prediction. In: Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 24, pp. 891–899. Curran Associates, Inc. (2011). https://proceedings.neurips.cc/paper/2011/file/fde9264cf376fffe2ee4ddf4a988880d-Paper.pdf
9. Hazan, E., Kale, S.: NEWTRON: an efficient bandit algorithm for online multiclass prediction. In: Proceedings of the 24th International Conference on Neural Information Processing Systems, pp. 891–899 (2011)
10. Kakade, S.M., Shalev-Shwartz, S., Tewari, A.: Efficient bandit algorithms for online multiclass prediction. In: Proceedings of the 25th International Conference on Machine Learning, ICML 2008, pp. 440–447 (2008)
11. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
12. LeCun, Y., Cortes, C., Burges, C.: MNIST handwritten digit database. ATT Labs [Online], vol. 2 (2010). http://yann.lecun.com/exdb/mnist
13. Matsushima, S., Shimizu, N., Yoshida, K., Ninomiya, T., Nakagawa, H.: Exact passive-aggressive algorithm for multiclass classification using support class. In: Proceedings of the SIAM International Conference on Data Mining, SDM 2010, Columbus, Ohio, USA, pp. 303–314 (2010)
14. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.: Reading digits in natural images with unsupervised feature learning. NIPS (2011)
15. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014). http://arxiv.org/abs/1409.1556
16. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms (2017)

Author information

Authors and Affiliations

Machine Learning Lab, KCIS, IIIT Hyderabad, Hyderabad, India
Gaurav Batra & Naresh Manwani

Editor information

Editors and Affiliations

MIMOS Berhad, Kuala Lumpur, Malaysia
Duc Nghia Pham
Sirindhorn International Institute of Science and Technology, Thammasat University, Mueang Pathum Thani, Thailand
Thanaruk Theeramunkong
Data61, CSIRO, Brisbane, QLD, Australia
Guido Governatori
Department of Philosophy, Tsinghua University, Beijing, China
Fenrong Liu

About this paper

Cite this paper

Batra, G., Manwani, N. (2021). Multiclass Classification Using Dilute Bandit Feedback. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol 13031. Springer, Cham. https://doi.org/10.1007/978-3-030-89188-6_5

DOI: https://doi.org/10.1007/978-3-030-89188-6_5
Published: 25 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89187-9
Online ISBN: 978-3-030-89188-6
eBook Packages: Computer Science, Computer Science (R0)
