
Kernel Methods

  • Reference work entry in the Encyclopedia of Machine Learning and Data Mining

Abstract

Over the past decade, kernel methods have gained considerable popularity in machine learning. Linear estimators have long been favored for their convenience in analysis and computation, yet nonlinear dependencies are intrinsic to many real applications and indispensable for effective modeling. Kernel methods can offer the best of both worlds: the reproducing kernel Hilbert space provides a convenient way to model nonlinearity, while the estimation remains linear. Kernels also offer significant flexibility in analyzing generic non-Euclidean objects such as graphs, sets, and dynamic systems. Moreover, kernels induce a rich function space in which functional optimization can be performed efficiently. Kernels have also been used to define statistical models via exponential families or Gaussian processes, and they can be factorized by graphical models. Indeed, kernel methods have been applied to almost every task in machine learning.
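The abstract's central point, that a reproducing kernel lets estimation stay linear while the fitted function is nonlinear in the inputs, can be illustrated with kernel ridge regression: the dual solution is obtained from a single linear system in the Gram matrix. The following NumPy sketch is illustrative only (the function names, the Gaussian kernel choice, and the parameter values are assumptions, not taken from this entry):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gram matrix of the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Y**2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq_dists)

def fit_kernel_ridge(X, y, lam=0.1, gamma=1.0):
    """Dual kernel ridge solution: alpha = (K + lam * I)^{-1} y.

    The predictor f(x) = sum_i alpha_i k(x_i, x) is nonlinear in x,
    but alpha solves a plain linear system -- estimation stays linear.
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_test, gamma=1.0):
    """Evaluate f at new points via kernel expansion over training data."""
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Toy 1-D regression: recover a sine curve from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

alpha = fit_kernel_ridge(X, y, lam=0.1, gamma=1.0)
y_hat = predict(X, alpha, X)
```

Note that no explicit nonlinear feature map is ever computed: the kernel evaluations alone carry the nonlinearity, which is what allows the same template to work for structured objects (graphs, sets, sequences) once a suitable kernel is defined.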

Recommended Reading

  • Aronszajn N (1950) Theory of reproducing kernels. Trans Am Math Soc 68:337–404

  • Bach FR, Jordan MI (2002) Kernel independent component analysis. J Mach Learn Res 3:1–48

  • Boser B, Guyon I, Vapnik V (1992) A training algorithm for optimal margin classifiers. In: Haussler D (ed) Proceedings of annual conference on computational learning theory. ACM Press, Pittsburgh, pp 144–152

  • Collins M, Globerson A, Koo T, Carreras X, Bartlett P (2008) Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks. J Mach Learn Res 9:1775–1822

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

  • Haussler D (1999) Convolution kernels on discrete structures. Technical report UCSC-CRL-99-10, UC Santa Cruz

  • Hofmann T, Schölkopf B, Smola AJ (2008) Kernel methods in machine learning. Ann Stat 36(3):1171–1220

  • Lampert CH (2009) Kernel methods in computer vision. Found Trends Comput Graph Vis 4(3):193–285

  • Poggio T, Girosi F (1990) Networks for approximation and learning. Proc IEEE 78(9):1481–1497

  • Schölkopf B, Smola A (2002) Learning with kernels. MIT Press, Cambridge

  • Schölkopf B, Smola AJ, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10:1299–1319

  • Schölkopf B, Tsuda K, Vert J-P (2004) Kernel methods in computational biology. MIT Press, Cambridge

  • Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  • Smola A, Vishwanathan SVN, Le Q (2007a) Bundle methods for machine learning. In: Koller D, Singer Y (eds) Advances in neural information processing systems, vol 20. MIT Press, Cambridge

  • Smola AJ, Gretton A, Song L, Schölkopf B (2007b) A Hilbert space embedding for distributions. In: International conference on algorithmic learning theory, Sendai. Volume 4754 of LNAI. Springer, pp 13–31

  • Smola AJ, Schölkopf B, Müller K-R (1998) The connection between regularization operators and support vector kernels. Neural Netw 11(5):637–649

  • Steinwart I, Christmann A (2008) Support vector machines. Information science and statistics. Springer, New York

  • Taskar B, Guestrin C, Koller D (2004) Max-margin Markov networks. In: Thrun S, Saul L, Schölkopf B (eds) Advances in neural information processing systems, vol 16. MIT Press, Cambridge, pp 25–32

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Wahba G (1990) Spline models for observational data. Volume 59 of CBMS-NSF regional conference series in applied mathematics. SIAM, Philadelphia

Author information

Corresponding author

Correspondence to Xinhua Zhang.

Copyright information

© 2017 Springer Science+Business Media New York

Cite this entry

Zhang, X. (2017). Kernel Methods. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_144
