Abstract
Recently, sparse kernel methods such as the Relevance Vector Machine (RVM) have become very popular for solving regression problems. The sparsity and performance of these methods depend on selecting an appropriate kernel function, which is typically achieved using a cross-validation procedure. In this paper, we propose a modification to the incremental RVM learning method that also learns the location and scale parameters of Gaussian kernels during model training. More specifically, to effectively model signals with different characteristics at various locations, we learn different parameter values for each kernel, resulting in a very flexible model. To avoid overfitting, we use a sparsity-enforcing prior that controls the effective number of parameters of the model. Finally, we apply the proposed method to one-dimensional and two-dimensional artificial signals, and evaluate its performance on two real-world datasets.
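To make the model form concrete, the following is a minimal sketch of a regression function built from Gaussian kernels that each have their own location and scale. It is not the authors' implementation and does not perform any learning: the function names, the kernel parametrization exp(-(x - c)^2 / (2 s^2)), and all numeric values are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(x, center, scale):
    """Gaussian basis function with its own location (center) and width (scale).

    The parametrization used here is an assumption; the paper may use another.
    """
    return math.exp(-((x - center) ** 2) / (2.0 * scale ** 2))

def predict(x, weights, centers, scales):
    """Evaluate a kernel model as a weighted sum of per-kernel Gaussians."""
    return sum(w * gaussian_kernel(x, c, s)
               for w, c, s in zip(weights, centers, scales))

# Hypothetical model with two kernels: a narrow one for sharp local
# structure and a wide one for a smooth region of the signal.
weights = [1.0, 0.5]
centers = [0.0, 3.0]
scales = [0.2, 2.0]

value_at_zero = predict(0.0, weights, centers, scales)
```

Because every kernel carries its own `center` and `scale`, a narrow kernel can capture an abrupt change while a wide one covers a slowly varying region. In the approach the abstract describes, these parameters are learned during training, and a sparsity-enforcing prior limits how many kernels remain in the model.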
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tzikas, D., Likas, A., Galatsanos, N. (2008). Incremental Relevance Vector Machine with Kernel Learning. In: Darzentas, J., Vouros, G.A., Vosinakis, S., Arnellos, A. (eds) Artificial Intelligence: Theories, Models and Applications. SETN 2008. Lecture Notes in Computer Science, vol 5138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87881-0_27
DOI: https://doi.org/10.1007/978-3-540-87881-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87880-3
Online ISBN: 978-3-540-87881-0
eBook Packages: Computer Science, Computer Science (R0)