Abstract
During the last decade, research on Mercer kernel-based learning algorithms has flourished [294, 226, 289]. These algorithms include, for example, the support vector machine (SVM) [63], kernel principal component analysis (KPCA) [289], and kernel Fisher discriminant analysis (KFDA) [219]. The common property of these methods is that they operate linearly, because they are explicitly expressed in terms of inner products in a transformed data space that is a reproducing kernel Hilbert space (RKHS). Most often they correspond to nonlinear operators in the original data space, yet they remain relatively easy to compute thanks to the so-called “kernel trick”. The kernel trick is no trick at all; it refers to a property of the RKHS that allows inner products in a potentially infinite-dimensional feature space to be computed by a simple kernel evaluation in the input space. This computational saving is one of the main appeals of the RKHS approach. At first glance it may even seem to defeat the “no free lunch” theorem (getting something for nothing), but in fact the price of working in an RKHS is the need for regularization and the heavy memory requirements of these methods. Kernel-based methods (sometimes also called Mercer kernel methods) have been applied successfully in several applications, such as pattern and object recognition [194], time series prediction [225], and DNA and protein analysis [350], to name just a few.
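To make the kernel trick concrete, here is a minimal sketch (not taken from the chapter; it assumes Python with NumPy) for the homogeneous second-degree polynomial kernel k(x, y) = (x · y)^2 on R^2, whose explicit feature map is phi(x) = (x1^2, sqrt(2) x1 x2, x2^2):

import numpy as np

def phi(x):
    # Explicit map into the three-dimensional feature space.
    return np.array([x[0]**2, np.sqrt(2.0) * x[0] * x[1], x[1]**2])

def k(x, y):
    # Kernel evaluation carried out entirely in the input space.
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

print(np.dot(phi(x), phi(y)))  # inner product in feature space: 16.0
print(k(x, y))                 # same value from one kernel evaluation: 16.0

Both lines print 16.0: the feature-space inner product is obtained by a single kernel evaluation in the input space, which is precisely the saving described above. For kernels such as the Gaussian, whose feature space is infinite-dimensional, the explicit map phi cannot even be written down, yet the kernel evaluation remains just as cheap.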
References
Aronszajn N., The theory of reproducing kernels and their applications, Proc. Cambridge Philos. Soc., 39:133–153, 1943.
Babich G., Camps O., Weighted Parzen windows for pattern classification. IEEE Trans. Pattern Anal. Mach. Intell., 18(5):567–570, 1996.
Bach F., Jordan M., Kernel independent component analysis, J. Mach. Learn. Res., 3:1–48, 2002.
Beirlant J., Zuijlen M., The empirical distribution function and strong laws for functions of order statistics of uniform spacings, J. Multivar. Anal., 16:300–317, 1985.
Carnell A., Richardson D., Linear algebra for time series of spikes. In Proc. European Symp. on Artificial Neural Networks, pp. 363–368, Bruges, Belgium, 2005.
Cortes C., Vapnik V., Support vector networks. Mach. Learn., 20:273–297, 1995.
Cover T., Classification and generalization capabilities of linear threshold units, Rome Air Force technical documentary report RADC-TDR-64-32, Tech. Rep., Feb. 1964.
Dayan P., Abbott L.F., Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge, MA, 2001.
Di Marzio M., Taylor C., Kernel density classification and boosting: An L2 analysis. Statist Comput., 15(2):113–123, 2004.
Diggle P., Marron J.S., Equivalence of smoothing parameter selectors in density and intensity estimation. J. Am. Statist. Assoc. 83(403):793–800, 1988.
Girolami M., Orthogonal series density estimation and the kernel eigenvalue problem. Neural Comput., 14(3):669–688, 2002.
Gretton A., Herbrich R., Smola A., Bousquet O., Schölkopf B., Kernel methods for measuring independence, J. Mach. Learn. Res., 6:2075–2129, 2005.
Gyorfi L., van der Meulen E., On nonparametric estimation of entropy functionals, in Nonparametric Functional Estimation and Related Topics, (G. Roussas, Ed.), Kluwer Academic, Amsterdam, 1990, pp. 81–95.
Jenssen R., Erdogmus D., Principe J., Eltoft T., Some equivalences between kernel and information theoretic methods, J. VLSI Signal Process., 45:49–65, 2006.
LeCun Y., Jackel L., Bottou L., Brunot A., Cortes C., Denker J., Drucker H., Guyon I., Müller U., Säckinger E., Simard P., Vapnik V., Learning algorithms for classification: A comparison on handwritten digit recognition. Neural Netw., pp. 261–276, 1995.
Mercer J., Functions of positive and negative type, and their connection with the theory of integral equations, Philosoph. Trans. Roy. Soc. Lond., 209:415–446, 1909.
Mika S., Rätsch G., Weston J., Schölkopf B., Müller K., Fisher discriminant analysis with kernels. In Proceedings of IEEE International Workshop on Neural Networks for Signal Processing, pages 41–48, Madison, USA, August 23–25, 1999.
Moore E., On properly positive Hermitian matrices, Bull. Amer. Math. Soc., 23(59):66–67, 1916.
Müller K., Smola A., Rätsch G., Schölkopf B., Kohlmorgen J., Vapnik V., Predicting time series with support vector machines. In Proceedings of International Conference on Artificial Neural Networks, Lecture Notes in Computer Science, volume 1327, pages 999–1004, Springer-Verlag, Berlin, 1997.
Müller K., Mika S., Rätsch G., Tsuda K., Schölkopf B., An introduction to kernel-based learning algorithms. IEEE Trans. Neural Netw., 12(2):181–201, 2001.
Paiva A.R.C., Park I., Principe J.C., A reproducing kernel Hilbert space framework for spike train signal processing. Neural Comput., 21(2):424–449, 2009.
Parzen E., Statistical inference on time series by Hilbert space methods, Tech. Report 23, Stat. Dept., Stanford Univ., 1959.
Ramsay J., Silverman B., Functional Data Analysis. Springer-Verlag, New York, 1997.
Schölkopf B., Smola A., Müller K., Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput., 10:1299–1319, 1998.
Schölkopf B., Burges C.J.C., Smola A.J. (Eds.), Advances in Kernel Methods: Support Vector Learning. MIT Press, Cambridge, MA, 1999.
Schölkopf B., Smola A., Learning with Kernels. MIT Press, Cambridge, MA, 2002.
Schrauwen B., Campenhout J.V., Linking non-binned spike train kernels to several existing spike train distances. Neurocomputing, 70(7–8):1247–1253, 2007.
Shawe-Taylor J., Cristianini N., Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK, 2004.
Snyder D.L., Random Point Process in Time and Space. John Wiley & Sons, New York, 1975.
Vapnik V., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
Xu J., Pokharel P., Jeong K., Principe J., An explicit construction of a reproducing Gaussian kernel Hilbert space, Proc. IEEE Int. Conf. Acoustics Speech and Signal Processing, Toulouse, France, 2006.
Zien A., Rätsch G., Mika S., Schölkopf B., Lengauer T., Müller K., Engineering support vector machine kernels that recognize translation initiation sites in DNA. Bioinformatics, 16:906–914, 2000.
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Xu, J., Jenssen, R., Paiva, A., Park, I. (2010). A Reproducing Kernel Hilbert Space Framework for ITL. In: Information Theoretic Learning. Information Science and Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-1570-2_9
Print ISBN: 978-1-4419-1569-6
Online ISBN: 978-1-4419-1570-2