
Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Abstract

In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})\) expected bound for Lipschitz loss functions. We prove two types of regret bounds that improve on this bound. For smooth loss functions, we propose an algorithm with an \(O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum ^K_{i=1}L_T(f^*_i))^{\frac{2}{3}})\) expected bound, where \(L_T(f^*_i)\) is the cumulative loss of the optimal hypothesis in \(\mathbb {H}_{i} =\{f\in \mathcal {H}_i:\Vert f\Vert _{\mathcal {H}_i}\le U\}\). This data-dependent bound retains the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an \(O(U\sqrt{KT}\ln ^{\frac{2}{3}}{T})\) expected bound that asymptotically improves on the previous bound. We further apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds that match or improve the previous \(O(\sqrt{T\ln {K}} +\Vert f\Vert ^2_{\mathcal {H}_i}\max \{\sqrt{T},\frac{T}{\sqrt{\mathcal {R}}}\})\) expected bound, where \(\mathcal {R}\) is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
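A quick check of the "retains the previous worst-case bound" claim (our addition, assuming losses bounded in \([0,1]\)): since \(L_T(f^*_i)\le T\) for every \(i\), the data-dependent bound satisfies \(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum ^K_{i=1}L_T(f^*_i))^{\frac{2}{3}}\le U^{\frac{2}{3}}K^{-\frac{1}{3}}(KT)^{\frac{2}{3}}=U^{\frac{2}{3}}K^{\frac{1}{3}}T^{\frac{2}{3}}\), which recovers the \(K^{\frac{1}{3}}T^{\frac{2}{3}}\) worst-case rate while being much smaller whenever \(L_T(f^*_i)\) is small for most kernels.

To fix ideas, the sketch below shows the generic protocol that online kernel selection under bandit feedback instantiates: sample one candidate kernel per round from an exponential-weights distribution, observe only the sampled kernel's loss, update the distribution with an importance-weighted loss estimate, and take an online-gradient step for the sampled hypothesis constrained to \(\{f\in \mathcal {H}_i:\Vert f\Vert _{\mathcal {H}_i}\le U\}\). This is a minimal illustration, not the algorithms proposed in the paper; the class name BanditOKS, the Gaussian kernels, the square loss, and all step sizes are assumptions made for the example.

    import numpy as np

    def gauss_k(x, z, sigma):
        # Gaussian kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2)).
        return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

    class BanditOKS:
        # Illustrative sketch only: EXP3-style sampling over K candidate
        # kernels, plus an online-gradient step for the sampled kernel.
        def __init__(self, sigmas, U=1.0, eta_w=0.05, eta_f=0.2, seed=0):
            self.sigmas = list(sigmas)             # one Gaussian width per kernel
            self.K = len(self.sigmas)
            self.U, self.eta_w, self.eta_f = U, eta_w, eta_f
            self.log_w = np.zeros(self.K)          # log exponential weights
            self.sv = [[] for _ in range(self.K)]  # (point, coefficient) pairs
            self.rng = np.random.default_rng(seed)

        def _f(self, i, x):
            # f_i(x) = sum_j alpha_j k(x_j, x) for the i-th hypothesis.
            return sum(a * gauss_k(xj, x, self.sigmas[i]) for xj, a in self.sv[i])

        def _project(self, i):
            # Rescale so that ||f_i||_{H_i} <= U (naive O(n^2) RKHS norm).
            pts = self.sv[i]
            sq = sum(a * b * gauss_k(xa, xb, self.sigmas[i])
                     for xa, a in pts for xb, b in pts)
            norm = np.sqrt(max(sq, 0.0))
            if norm > self.U:
                self.sv[i] = [(x, a * self.U / norm) for x, a in pts]

        def step(self, x, y):
            # Sample one kernel; under bandit feedback only its loss is seen.
            p = np.exp(self.log_w - self.log_w.max())
            p /= p.sum()
            i = self.rng.choice(self.K, p=p)
            err = self._f(i, x) - y
            loss = err ** 2                        # square loss (smooth)
            # Importance weighting keeps the estimate unbiased in expectation.
            self.log_w[i] -= self.eta_w * loss / p[i]
            # Gradient of the square loss w.r.t. f is 2*err*k(x, .), so add a
            # support point, then project back onto the U-ball.
            self.sv[i].append((x, -self.eta_f * 2.0 * err))
            self._project(i)
            return loss

For example, learner = BanditOKS(sigmas=[0.5, 1.0, 2.0]) processes a stream via learner.step(x_t, y_t); the kernel widths and learning rates here are placeholders, not tuned values from the paper's experiments.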

This work was supported in part by the National Natural Science Foundation of China under Grant No. 62076181.


Notes

  1. \(\textrm{poly}(\Vert f\Vert _{\mathcal {H}_i})=\Vert f\Vert ^2_{\mathcal {H}_i}+1\). The original paper shows an \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)\sqrt{KT})\) expected regret bound. We will clarify the difference in Sect. 2; a brief comparison of the two rates is sketched after these notes.

  2. http://archive.ics.uci.edu/ml/datasets.php.

  3. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.

  4. The code is available at https://github.com/JunfLi-TJU/OKS-Bandit.
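To make footnote 1 concrete (our addition, not taken from Sect. 2 of the paper): the originally claimed \(\sqrt{KT}\) rate and the \(K^{\frac{1}{3}}T^{\frac{2}{3}}\) rate stated in the abstract differ by a polynomial factor in \(T/K\), since \(K^{\frac{1}{3}}T^{\frac{2}{3}}=\sqrt{KT}\cdot (\frac{T}{K})^{\frac{1}{6}}\ge \sqrt{KT}\) whenever \(T\ge K\). In the usual regime \(T\gg K\), the \(K^{\frac{1}{3}}T^{\frac{2}{3}}\) bound is therefore strictly weaker than the \(\sqrt{KT}\) bound.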


Author information


Corresponding author

Correspondence to Shizhong Liao.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 502 KB)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, J., Liao, S. (2023). Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback. In: Amini, M.-R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13716. Springer, Cham. https://doi.org/10.1007/978-3-031-26412-2_21


  • DOI: https://doi.org/10.1007/978-3-031-26412-2_21


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26411-5

  • Online ISBN: 978-3-031-26412-2

  • eBook Packages: Computer Science, Computer Science (R0)
