Abstract
The relevance vector machine (RVM) is a state-of-the-art technique for constructing sparse kernel regression models [1,2,3,4]. It not only produces a much sparser model but also provides better generalization performance than the standard support vector machine (SVM). In both the RVM and the SVM, the relevance vectors (RVs) and support vectors (SVs) are selected from the set of input vectors, which may limit model flexibility. In this paper we propose a new sparse kernel model, the Relevance Units Machine (RUM). The RUM follows the idea of the RVM under the Bayesian framework but removes the constraint that the RVs must be selected from the input vectors: it treats the relevance units as part of the model's parameters. As a result, the RUM retains all the advantages of the RVM while offering superior sparsity. The new algorithm is demonstrated to possess considerable computational advantages over well-known state-of-the-art algorithms.
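To make the idea concrete, the following sketch (ours, not the authors' implementation) fits a kernel expansion y(x) = Σ_m w_m k(x, u_m) in which the M unit locations u_m are free parameters optimized jointly with the weights w, rather than being selected from the training inputs as in the RVM or SVM. As a simplifying assumption it minimizes a regularized squared-error objective in place of the paper's Bayesian evidence maximization, and every name in it (rum_fit, n_units, and so on) is illustrative.

import numpy as np
from scipy.optimize import minimize

def rbf(X, U, gamma):
    # Gaussian kernel matrix between inputs X (N, d) and units U (M, d).
    d2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def rum_fit(X, y, n_units=5, gamma=0.1, lam=1e-3, seed=0):
    # Units start at random training inputs but are then free parameters,
    # updated jointly with the weights; this is the key departure from RVM.
    rng = np.random.default_rng(seed)
    N, d = X.shape
    U0 = X[rng.choice(N, size=n_units, replace=False)]
    theta0 = np.concatenate([U0.ravel(), np.zeros(n_units)])

    def objective(theta):
        U = theta[:n_units * d].reshape(n_units, d)
        w = theta[n_units * d:]
        r = rbf(X, U, gamma) @ w - y
        # Ridge penalty stands in for the Bayesian prior over the weights.
        return 0.5 * (r @ r) + 0.5 * lam * (w @ w)

    res = minimize(objective, theta0, method="L-BFGS-B")
    return res.x[:n_units * d].reshape(n_units, d), res.x[n_units * d:]

if __name__ == "__main__":
    # Toy sinc regression, the benchmark commonly used in the RVM literature.
    rng = np.random.default_rng(1)
    X = np.linspace(-10, 10, 100).reshape(-1, 1)
    y = np.sinc(X.ravel() / np.pi) + 0.05 * rng.normal(size=100)  # sin(x)/x plus noise
    U, w = rum_fit(X, y)
    rmse = np.sqrt(np.mean((rbf(X, U, 0.1) @ w - y) ** 2))
    print(f"RMSE {rmse:.3f}, units at {np.sort(U.ravel()).round(2)}")

Because the learned units may settle anywhere in input space rather than coinciding with training points, fewer of them are needed for a given fit, which illustrates the sparsity advantage the abstract claims for the RUM.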
References
Tipping, M.: The relevance vector machine. In: Solla, S., Leen, T., Müller, K. (eds.) Advances in Neural Information Processing Systems, vol. 12. MIT Press, Cambridge (2000)
Bishop, C., Tipping, M.: Variational relevance vector machines. In: Boutilier, C., Goldszmidt, M. (eds.) Uncertainty in Artificial Intelligence 2000, pp. 46–53. Morgan Kaufmann, San Francisco (2000)
Tipping, M.: Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211–244 (2001)
Tipping, M., Faul, A.: Fast marginal likelihood maximisation for sparse Bayesian models. In: Bishop, C., Frey, B. (eds.) Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Key West, FL (January 2003)
Schölkopf, B., Smola, A.: Learning with Kernels. MIT Press, Cambridge (2002)
Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
Poggio, T., Girosi, F.: A sparse representation for function approximation. Neural Computation 10, 1445–1454 (1998)
Chen, S.: Local regularization assisted orthogonal least squares regression. Neurocomputing 69, 559–585 (2006)
Kruif, B., Vries, T.: Support-vector-based least squares for learning non-linear dynamics. In: Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, USA, pp. 10–13 (2002)
Gestel, T., Espinoza, M., Suykens, J., Brasseur, C., De Moor, B.: Bayesian input selection for nonlinear regression with LS-SVMs. In: Proceedings of the 13th IFAC Symposium on System Identification, Rotterdam, The Netherlands, pp. 27–29 (2003)
Valyon, J., Horváth, G.: A generalized LS-SVM. In: Principe, J., Giles, L., Morgan, N., Wilson, E. (eds.) Proceedings of the 13th IFAC Symposium on System Identification, Rotterdam, The Netherlands (2003)
Suykens, J., Van Gestel, T., De Brabanter, J., De Moor, B., Vandewalle, J.: Least Squares Support Vector Machines. World Scientific, Singapore (2002)
Drezet, P., Harrison, R.: Support vector machines for system identification. In: Proceedings of UKACC Int. Conf. Control 1998, Swansea, U.K., pp. 688–692 (1998)
Gao, J., Antolovich, M., Kwan, P.H.: L1 lasso and its Bayesian inference. In: 21st Australasian Joint Conference on Artificial Intelligence, New Zealand (submitted, 2008)
Wang, G., Yeung, D.Y., Lochovsky, F.: The kernel path in kernelized LASSO. In: International Conference on Artificial Intelligence and Statistics, pp. 580–587. MIT Press, San Juan (2007)
Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. Royal Statist. Soc. B 58, 267–288 (1996)
Roth, V.: The generalized lasso. IEEE Transactions on Neural Networks 15(1), 16–28 (2004)
Wu, M., Schölkopf, B., Bakir, G.: A direct method for building sparse kernel learning algorithms. Journal of Machine Learning Research 7, 603–624 (2006)
Burges, C.: Simplified support vector decision rules. In: Proc. 13th International Conference on Machine Learning, pp. 71–77. Morgan Kaufmann, San Mateo (1996)
Snelson, E., Ghahramani, Z.: Sparse Gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems 18, pp. 1257–1264. MIT Press, Cambridge (2006)
Gao, J.: Robust L1 principal component analysis and its Bayesian variational inference. Neural Computation 20, 555–572 (2008)
Lawrence, N.: Probabilistic non-linear principal component analysis with Gaussian process latent variable models. Journal of Machine Learning Research 6, 1783–1816 (2005)
Billings, S., Chen, S., Backhouse, R.: The identification of linear and nonlinear models of a turbocharged automotive diesel engine. Mech. Syst. Signal Processing 3(2), 123–142 (1989)