Abstract
This paper introduces a new online learning framework for multiclass classification called learning with diluted bandit feedback. At every time step, the algorithm predicts a candidate label set, rather than a single label, for the observed example. It then receives feedback from the environment on whether the actual label lies in this candidate label set. This feedback is called “diluted bandit feedback”. Learning in this setting is even more challenging than in the bandit feedback setting, as there is more uncertainty in the supervision. We propose an algorithm for multiclass classification using dilute bandit feedback (MC-DBF), which uses an exploration-exploitation strategy to predict the candidate set in each trial. We show that the proposed algorithm achieves an \(\mathcal {O}(T^{1-\frac{1}{m+2}})\) mistake bound when the candidate label set size (in each step) is \(m\). We demonstrate the effectiveness of the proposed approach with extensive simulations.
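The interaction protocol described in the abstract can be sketched as follows. This is a minimal illustration, not MC-DBF itself: the learner here is a placeholder that guesses a random set of `m` labels, and the names `diluted_feedback`, `mistake_bound_exponent`, and `run_protocol` are hypothetical. The sketch only shows the one-bit set-membership feedback and evaluates the exponent in the stated \(\mathcal {O}(T^{1-\frac{1}{m+2}})\) bound for a given `m`.

```python
from fractions import Fraction
import random


def diluted_feedback(candidate_set, true_label):
    """One-bit reply from the environment: is the true label in the set?"""
    return true_label in candidate_set


def mistake_bound_exponent(m):
    """Exponent 1 - 1/(m+2) of T in the stated O(T^(1 - 1/(m+2))) bound."""
    return Fraction(1) - Fraction(1, m + 2)


def run_protocol(true_labels, num_classes, m, seed=0):
    """Play the protocol with a placeholder learner that guesses at random.

    Assumption: the random guess only stands in for a learner; it is NOT
    the MC-DBF prediction rule. Returns the number of rounds in which the
    true label was outside the predicted candidate set.
    """
    rng = random.Random(seed)
    misses = 0
    for y in true_labels:
        # Predict a set of m distinct labels instead of a single label.
        candidate_set = set(rng.sample(range(num_classes), m))
        # The learner only ever sees this single bit, never y itself.
        if not diluted_feedback(candidate_set, y):
            misses += 1
    return misses
```

For example, `mistake_bound_exponent(1)` gives 2/3 and `mistake_bound_exponent(2)` gives 3/4, so the exponent of \(T\) in the bound grows toward 1 as the candidate set size increases.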
Notes
- 1.
\(T\) is the number of trials.
- 2.
- 3.
We see that \(\sum _{A}Z(A) = 1\) as follows.
$$\begin{aligned} \sum _{A} Z(A)&= \sum _{A} \mathbb {P}(b_1)\dots \mathbb {P}(b_m|b_1,\dots ,b_{m-1})= \sum _{b_1}\mathbb {P}(b_1)\dots \sum _{b_m}\frac{\mathbb {P}(b_m)}{1-\mathbb {P}(b_1)-\dots -\mathbb {P}(b_{m-1})} \end{aligned}$$
But \(\sum _{b_i} \frac{\mathbb {P}(b_i)}{1-\mathbb {P}(b_1)-\dots -\mathbb {P}(b_{i-1})} = 1\), because the sum over \(b_i\) ranges over the labels not already chosen as \(b_1,\dots ,b_{i-1}\). Thus, \(\sum _{A}Z(A) = 1\).
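The normalisation argument in this note can be checked numerically. The sketch below assumes, as the expansion in the note suggests, that the sum over \(A\) runs over ordered tuples \((b_1,\dots ,b_m)\) of distinct labels drawn without replacement; the function names `z_ordered` and `total_mass` and the example probabilities are hypothetical.

```python
from itertools import permutations
import math


def z_ordered(labels, probs):
    """Probability of drawing `labels` in this order without replacement.

    Each factor is P(b_i) / (1 - P(b_1) - ... - P(b_{i-1})).
    """
    prob, used = 1.0, 0.0
    for b in labels:
        prob *= probs[b] / (1.0 - used)
        used += probs[b]
    return prob


def total_mass(probs, m):
    """Sum of z_ordered over every ordered tuple of m distinct labels."""
    num_labels = len(probs)
    return sum(z_ordered(t, probs) for t in permutations(range(num_labels), m))


# Illustrative label distribution over three classes (made-up values).
example_probs = [0.5, 0.3, 0.2]
print(math.isclose(total_mass(example_probs, 2), 1.0))
```

The innermost sum collapses to 1 first, then the next one, and so on outward, which is why the total comes out as 1 for any `m` smaller than the number of labels.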
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Batra, G., Manwani, N. (2021). Multiclass Classification Using Dilute Bandit Feedback. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol 13031. Springer, Cham. https://doi.org/10.1007/978-3-030-89188-6_5
DOI: https://doi.org/10.1007/978-3-030-89188-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89187-9
Online ISBN: 978-3-030-89188-6
eBook Packages: Computer Science, Computer Science (R0)