Abstract
In this paper we propose a strategy for constructing data-driven kernels that are automatically determined by the training examples. Their associated Reproducing Kernel Hilbert Spaces arise from finite sets of linearly independent functions, which can be interpreted as weak classifiers or regressors learned from the training data. Within the Tikhonov regularization framework, the only free parameter to be optimized is the regularizer, which controls the trade-off between empirical error and smoothness of the solution. A generalization error bound based on Rademacher complexity is provided, offering a means of controlling overfitting.
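To make the construction concrete, the following is a minimal sketch, not the authors' exact algorithm: a small set of weak learners fitted on the training set plays the role of the linearly independent functions, their outputs define a feature map and hence a data-driven kernel, and a Tikhonov-regularized (kernel ridge) problem is solved in which the regularizer `lam` is the only free parameter. The stump-based weak learners and the names `fit_weak_learners`, `lam`, and `n_learners` are illustrative assumptions.

```python
# Sketch: data-driven kernel from weak learners + Tikhonov regularization.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_weak_learners(X, y, n_learners=10, seed=0):
    """Fit shallow trees on bootstrap samples; their outputs stand in for
    the finite set of linearly independent functions spanning the RKHS."""
    rng = np.random.default_rng(seed)
    learners = []
    for _ in range(n_learners):
        idx = rng.integers(0, len(X), size=len(X))
        learners.append(DecisionTreeRegressor(max_depth=1).fit(X[idx], y[idx]))
    return learners

def feature_map(learners, X):
    # phi(x) = (f_1(x), ..., f_m(x)); the induced kernel is K(x, z) = phi(x) . phi(z)
    return np.column_stack([t.predict(X) for t in learners])

def kernel_ridge_fit(K, y, lam):
    # Tikhonov regularization: alpha = (K + lam * n * I)^(-1) y
    n = K.shape[0]
    return np.linalg.solve(K + lam * n * np.eye(n), y)

# Usage on synthetic data
X = np.random.randn(100, 5)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(100)
learners = fit_weak_learners(X, y)
Phi = feature_map(learners, X)
K = Phi @ Phi.T                                 # data-driven kernel matrix
alpha = kernel_ridge_fit(K, y, lam=0.1)
X_new = np.random.randn(10, 5)
y_pred = feature_map(learners, X_new) @ Phi.T @ alpha   # sum_i alpha_i K(x_new, x_i)
```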
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Merler, S., Jurman, G., Furlanello, C. (2007). Deriving the Kernel from Training Data. In: Haindl, M., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2007. Lecture Notes in Computer Science, vol 4472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72523-7_4
DOI: https://doi.org/10.1007/978-3-540-72523-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72481-0
Online ISBN: 978-3-540-72523-7
eBook Packages: Computer Science (R0)