Abstract
Online kernel selection is critical to online kernel learning and must address the exploration-exploitation dilemma: we explore new kernels to find the best one while exploiting the kernel that has performed best in the past. In this paper, we propose a novel multi-armed bandit solution to this dilemma in online kernel selection. We first map each candidate kernel to an arm of a multi-armed bandit problem. Unlike typical multi-armed bandit models, in which only one kernel is selected at each round, we sample multiple kernels with replacement according to a probability distribution. We then make predictions with the hypotheses learned in the random feature spaces specified by the selected kernels, and incur multiple losses, which we refer to as multiple bandit feedbacks. Finally, we use all the feedbacks to update the probability distribution. We prove that the proposed approach enjoys a sub-linear expected regret bound. Experimental results on benchmark datasets show that the proposed approach performs comparably to existing online kernel selection methods.
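The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm; it only mirrors the four steps (sample several kernels with replacement, predict in random feature spaces, collect one loss per selected kernel, update the distribution). The following are assumptions made for illustration: Gaussian kernels indexed by hypothetical widths `sigmas`, random Fourier features in the style of [12], a hinge loss clipped to \([0, B]\), prediction by averaging the selected hypotheses, an Exp3-style update in the spirit of [1], and importance weighting by the inclusion probability \(1-(1-p_{t,j})^m\). The function name and all default parameter values are hypothetical.

```python
import numpy as np


def make_rff(d, sigma, D, rng):
    """Random Fourier feature map approximating the Gaussian kernel
    exp(-||x - x'||^2 / (2 * sigma^2))."""
    W = rng.normal(scale=1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return lambda x: np.sqrt(2.0 / D) * np.cos(x @ W + b)


def online_kernel_selection_sketch(X, y, sigmas, m=2, D=50, gamma=0.1,
                                   eta=0.5, beta=0.1, B=2.0, seed=0):
    """Illustrative multi-feedback bandit loop; labels y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    T, d = X.shape
    K = len(sigmas)
    feats = [make_rff(d, s, D, rng) for s in sigmas]  # one feature map per kernel
    w = np.zeros((K, D))   # one linear hypothesis per kernel
    omega = np.ones(K)     # bandit weights, one per arm (kernel)
    mistakes = 0
    for t in range(T):
        # Mix the weight distribution with uniform exploration, so p_j >= gamma / K.
        p = (1.0 - gamma) * omega / omega.sum() + gamma / K
        # Draw m kernels with replacement; S holds the distinct ones.
        S = np.unique(rng.choice(K, size=m, replace=True, p=p))
        z = {j: feats[j](X[t]) for j in S}
        scores = {j: w[j] @ z[j] for j in S}
        # Assumption: the final prediction averages the selected hypotheses.
        if np.mean(list(scores.values())) * y[t] <= 0:
            mistakes += 1
        for j in S:
            margin = y[t] * scores[j]
            loss = min(B, max(0.0, 1.0 - margin))  # hinge loss clipped to [0, B]
            # Assumption: importance-weight by the probability that j is in S.
            incl = 1.0 - (1.0 - p[j]) ** m
            omega[j] *= np.exp(-beta * loss / incl)
            if margin < 1.0:
                # Gradient step on the (unclipped) hinge loss.
                w[j] += eta * y[t] * z[j]
        omega /= omega.sum()  # renormalise for numerical stability only
    p = (1.0 - gamma) * omega / omega.sum() + gamma / K
    return p, mistakes
```

Calling `online_kernel_selection_sketch(X, y, sigmas=[0.1, 1.0, 10.0])` on a data matrix `X` and labels `y` returns the final sampling distribution over the candidate kernels together with the number of online mistakes. Only the kernels in `S` receive feedback in a round, which is what separates this setting from full-information expert advice.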
References
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48–77 (2002)
Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends® Mach. Learn. 5(1), 1–122 (2012)
Chang, C., Lin, C.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Tech. 2(3), 1–27 (2011)
Chen, B., Liang, J., Zheng, N., Príncipe, J.C.: Kernel least mean square with adaptive kernel size. Neurocomputing 191, 95–106 (2016)
Dekel, O., Shalev-Shwartz, S., Singer, Y.: The Forgetron: a kernel-based perceptron on a fixed budget. In: Proceedings of the 19th Annual Conference on Neural Information Processing Systems (NIPS), pp. 259–266 (2005)
Fan, H., Song, Q., Shrestha, S.B.: Kernel online learning with adaptive kernel width. Neurocomputing 175, 233–242 (2016)
Foster, D.J., Kale, S., Mohri, M., Sridharan, K.: Parameter-free online learning via model selection. In: Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), pp. 6022–6032 (2017)
Han, Z., Liao, S.: Stochastic online kernel selection with instantaneous loss in random feature space. In: Liu, D., Xie, S., Li, Y., El-Alfy, E.S. (eds.) ICONIP 2017, vol. 10634, pp. 33–42. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70087-8_4
Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml
Lu, J., Hoi, S.C.H., Wang, J., Zhao, P., Liu, Z.: Large scale online kernel learning. J. Mach. Learn. Res. 17, 1–43 (2016)
Nguyen, T.D., Le, T., Bui, H., Phung, D.: Large-scale online kernel learning with random feature reparameterization. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2543–2549 (2017)
Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: Proceedings of the 21st Annual Conference on Neural Information Processing Systems (NIPS), pp. 1177–1184 (2007)
Shalev-Shwartz, S.: Online learning and online convex optimization. Found. Trends® Mach. Learn. 4(2), 107–194 (2012)
Tossou, A.C.Y., Dimitrakakis, C.: Achieving privacy in the adversarial multi-armed bandit. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI), pp. 2653–2659 (2017)
Yang, T., Mahdavi, M., Jin, R., Yi, J., Hoi, S.C.H.: Online kernel selection: algorithms and evaluations. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI), pp. 1197–1202 (2012)
Acknowledgments
The work was supported in part by the National Natural Science Foundation of China under grant No. 61673293.
Appendix: Proof Sketch of Theorem 1
Proof
Let \(\ell _{t,i} \in [0,B],~B \ge 1\). We first state the following two facts.
Let \(W_t = \sum ^K_{i=1}\omega _{t,i}\). With the proof of Theorem 3.1 in [1], we obtain
where we utilize the fact \(\forall x \ge 0, e^{-x} \le 1-x + \frac{x^2}{2}\). Furthermore, with the fact \(\forall x \in \mathbb {R}, 1+x \le e^x\), taking logarithms and summing over t gives
Besides, \(\forall \kappa _j\in \mathcal {K}\),
Combining (11) and (12), we obtain
Let \(S_t = \{\kappa _{i_1}, \kappa _{i_2}, \ldots , \kappa _{i_{\vert S_t\vert }}\}\) and \(i_1 \ne i_2\ne \ldots \ne i_{\vert S_t\vert }\). Then, we have \(p(\forall \kappa _j \in S_t) = p_{t,j}\cdot \delta _{t,j}\). If \(\vert S_t\vert < m\),
Otherwise, if \(\vert S_t\vert = m\),
We can bound \(\delta _{t,j} \le \vert S_t\vert \). For clarity of analysis, we write \(\ell _{t,j}\) as \(\ell (\mathbf {w}_{t,j})\) and introduce the notation \(\mathbb {I}^t_{j} = \mathbbm {1}(\kappa _j \in S_t)\). Let \(\mathbf {w}^*_j \in \mathcal {H}_{R,j}\) be the best linear model. According to the standard analysis of online convex optimization, we have
Then, we get
Taking expectation with respect to \(S_1, S_2, \ldots , S_t\) gives
where we apply the facts \(p_{t,j}>\frac{\gamma }{K}, \delta _{t,j} \le \vert S_t\vert \) and
We also have
Let \(\eta = \sqrt{\frac{\Vert \mathbf {w}^*_j\Vert ^2\gamma }{KL^2T}}\). According to (13), (14) and (15), we obtain
Let \(\gamma = a^{\frac{1}{3}}_1(2b_1)^{-\frac{2}{3}}T^{-\frac{1}{3}}, a_1 = \Vert \mathbf {w}^*_j \Vert ^2KL^2\) and \( b_1 = \frac{(2+\beta B)B}{2}\). Then, we get
Next, we bound the difference between \(\sum ^T_{t=1}\ell (\mathbf {w}^*_j)\) and \(\sum ^T_{t=1}\ell (f^*_j)\), where \(f^*_j \in \mathcal {H}_j\). With the analysis of “Fourier Online Gradient Descent” [10], we have
and \(\Vert \mathbf {w}^*_j \Vert ^2 \le (1+\epsilon )\Vert f^*_j \Vert ^2_1\) with high probability, according to Claim 1 in [12]. Combining (16) and (17) yields
where \(a_2 = (1+\epsilon )\Vert f^*_j \Vert _1^2KL^2\), which completes the proof.
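As a rough sanity check on the parameter choices in the proof, the stated \(\gamma \) can be recovered from a simple trade-off. The intermediate bounds are not reproduced here, so the form of \(g(\gamma )\) below is an assumption: it supposes that, after substituting \(\eta \), the \(\gamma \)-dependent part of the regret bound consists of an online gradient descent term \(\sqrt{a_1T/\gamma }\) and an exploration term \(b_1\gamma T\), with \(a_1\) and \(b_1\) as defined above. This is a sketch of the reasoning, not the authors' derivation.

```latex
\begin{align*}
g(\gamma) &= \sqrt{\frac{a_1 T}{\gamma}} + b_1 \gamma T, \\
g'(\gamma) &= -\tfrac{1}{2}\sqrt{a_1 T}\,\gamma^{-3/2} + b_1 T = 0
  \;\Longrightarrow\;
  \gamma = a_1^{\frac{1}{3}} (2 b_1)^{-\frac{2}{3}} T^{-\frac{1}{3}}, \\
g(\gamma) &= a_1^{\frac{1}{3}} (2 b_1)^{\frac{1}{3}} T^{\frac{2}{3}}
  + \tfrac{1}{2}\, a_1^{\frac{1}{3}} (2 b_1)^{\frac{1}{3}} T^{\frac{2}{3}}
  = \tfrac{3}{2}\,(2 a_1 b_1)^{\frac{1}{3}}\, T^{\frac{2}{3}}.
\end{align*}
```

Under this assumed form, the minimiser coincides with the \(\gamma \) used in the proof, and the resulting \(T^{\frac{2}{3}}\) rate is consistent with the sub-linear expected regret claimed in the abstract.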
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Liao, S. (2018). Online Kernel Selection with Multiple Bandit Feedbacks in Random Feature Space. In: Liu, W., Giunchiglia, F., Yang, B. (eds) Knowledge Science, Engineering and Management. KSEM 2018. Lecture Notes in Computer Science, vol 11062. Springer, Cham. https://doi.org/10.1007/978-3-319-99247-1_27
DOI: https://doi.org/10.1007/978-3-319-99247-1_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99246-4
Online ISBN: 978-3-319-99247-1
eBook Packages: Computer Science, Computer Science (R0)