Abstract
In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})\) expected bound for Lipschitz loss functions. We prove two types of regret bounds that improve on this result. For smooth loss functions, we propose an algorithm with an \(O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum ^K_{i=1}L_T(f^*_i))^{\frac{2}{3}})\) expected bound, where \(L_T(f^*_i)\) is the cumulative loss of the optimal hypothesis in \(\mathbb {H}_{i} =\{f\in \mathcal {H}_i:\Vert f\Vert _{\mathcal {H}_i}\le U\}\). This data-dependent bound preserves the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an \(O(U\sqrt{KT}\ln ^{\frac{2}{3}}{T})\) expected bound that asymptotically improves the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds that match or improve the previous \(O(\sqrt{T\ln {K}} +\Vert f\Vert ^2_{\mathcal {H}_i}\max \{\sqrt{T},\frac{T}{\sqrt{\mathcal {R}}}\})\) expected bound, where \(\mathcal {R}\) is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
This work was supported in part by the National Natural Science Foundation of China under grants No. 62076181.
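To make the setting concrete, the following is a minimal, hypothetical sketch of online kernel selection under bandit feedback: an EXP3-style sampling distribution over K candidate kernels, where only the sampled kernel's loss is observed each round, importance-weighted loss estimates drive the weight update, and each kernel's hypothesis is updated by a functional gradient step. All names (`BanditKernelSelection`, `gaussian_kernel`, the learning rates) are illustrative assumptions, not the algorithms proposed in the paper.

```python
import numpy as np

def gaussian_kernel(sigma):
    # Gaussian kernel with bandwidth sigma (one candidate per sigma).
    return lambda x, y: np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

class BanditKernelSelection:
    """EXP3-style online kernel selection under bandit feedback (sketch).

    Each candidate kernel keeps its own expansion f_i = sum_j a_j k_i(s_j, .).
    Per round: sample a kernel, predict, observe only that kernel's loss,
    update the sampling weights with an importance-weighted loss estimate,
    and take a functional gradient step on the sampled hypothesis.
    """

    def __init__(self, kernels, eta_w=0.1, eta_f=0.1, gamma=0.1):
        self.kernels = kernels          # list of kernel functions
        self.K = len(kernels)
        self.eta_w = eta_w              # bandit (weight) learning rate
        self.eta_f = eta_f              # hypothesis learning rate
        self.gamma = gamma              # uniform exploration rate
        self.logw = np.zeros(self.K)    # log-weights over kernels
        self.expansions = [[] for _ in range(self.K)]  # (coef, point) pairs

    def _probs(self):
        w = np.exp(self.logw - self.logw.max())  # stable softmax
        p = w / w.sum()
        return (1 - self.gamma) * p + self.gamma / self.K

    def predict(self, x):
        p = self._probs()
        i = np.random.choice(self.K, p=p)        # sample one kernel
        f_x = sum(a * self.kernels[i](s, x) for a, s in self.expansions[i])
        return i, f_x, p[i]

    def update(self, i, p_i, x, f_x, y):
        loss = (f_x - y) ** 2                    # observed only for kernel i
        self.logw[i] -= self.eta_w * loss / p_i  # importance-weighted update
        grad = 2.0 * (f_x - y)                   # d(loss)/d(f(x))
        self.expansions[i].append((-self.eta_f * grad, x))
        return loss
```

Under full-information feedback every kernel's loss would be observed each round; the bandit setting above observes one, which is what drives the extra dependence on K in the regret bounds discussed in the abstract.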
Notes
- 1.
\(\textrm{poly}(\Vert f\Vert _{\mathcal {H}_i})=\Vert f\Vert ^2_{\mathcal {H}_i}+1\). The original paper shows a \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)\sqrt{KT})\) expected regret bound. We will clarify the difference in Sect. 2.
- 2.
- 3.
- 4.
The codes are available at https://github.com/JunfLi-TJU/OKS-Bandit.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Li, J., Liao, S. (2023). Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13716. Springer, Cham. https://doi.org/10.1007/978-3-031-26412-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26411-5
Online ISBN: 978-3-031-26412-2
eBook Packages: Computer Science (R0)