
A Short Introduction to Learning with Kernels

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2600)

Abstract

We briefly describe the main ideas of statistical learning theory, support vector machines, and kernel feature spaces. This includes a derivation of the support vector optimization problem for classification and regression, the ν-trick, various kernels, and an overview of applications of kernel methods.

The present article is based on [23].

If the outputs are not in {±1}, the situation gets more complex; cf. [34].
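
To give a concrete flavor of the optimization problem the chapter derives, the soft-margin support vector classifier can be sketched as follows. This is a minimal outline in the common C-parameterized notation (the symbols w, b, ξ, Φ, and α follow the usual textbook convention rather than being taken from the chapter); the chapter itself also develops the ν-parameterized variant and the regression case.

\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i
\quad\text{subject to}\quad y_i\bigl(\langle w,\Phi(x_i)\rangle + b\bigr)\ge 1-\xi_i,\quad \xi_i\ge 0.

The dual problem touches the inputs only through inner products in feature space, which a kernel k(x_i,x_j)=\langle\Phi(x_i),\Phi(x_j)\rangle evaluates without computing Φ explicitly:

\max_{\alpha}\;\; \sum_{i=1}^{m}\alpha_i-\frac{1}{2}\sum_{i,j=1}^{m}\alpha_i\alpha_j\,y_i y_j\,k(x_i,x_j)
\quad\text{subject to}\quad 0\le\alpha_i\le C,\quad \sum_{i=1}^{m}\alpha_i y_i=0.

Typical kernels in this framework include the polynomial kernel k(x,x')=\langle x,x'\rangle^d and the Gaussian kernel k(x,x')=\exp(-\|x-x'\|^2/(2\sigma^2)).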

References

  1. M. A. Aizerman, É. M. Braverman, and L. I. Rozonoér. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, 25:821–837, 1964.

  2. N. Alon, S. Ben-David, N. Cesa-Bianchi, and D. Haussler. Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM, 44(4):615–631, 1997.

  3. N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68:337–404, 1950.

  4. P. L. Bartlett and J. Shawe-Taylor. Generalization performance of support vector machines and other pattern classifiers. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods-Support Vector Learning, pages 43–54, Cambridge, MA, 1999. MIT Press.

  5. C. Berg, J. P. R. Christensen, and P. Ressel. Harmonic Analysis on Semigroups. Springer, New York, 1984.

  6. D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, 1995.

  7. V. Blanz, B. Schölkopf, H. Bülthoff, C. Burges, V. Vapnik, and T. Vetter. Comparison of view-based object recognition algorithms using realistic 3D models. In C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen, and B. Sendhoff, editors, Artificial Neural Networks ICANN’96, pages 251–256, Berlin, 1996. Springer Lecture Notes in Computer Science, Vol. 1112.

  8. B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the Annual Conference on Computational Learning Theory, pages 144–152, Pittsburgh, PA, July 1992. ACM Press.

  9. C. J. C. Burges and B. Schölkopf. Improving the accuracy and speed of support vector learning machines. In M. C. Mozer, M. I. Jordan, and T. Petsche, editors, Advances in Neural Information Processing Systems 9, pages 375–381, Cambridge, MA, 1997. MIT Press.

  10. C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.

  11. D. DeCoste and B. Schölkopf. Training invariant support vector machines. Machine Learning, 2002. Accepted for publication. Also: Technical Report JPL-MLTR-00-1, Jet Propulsion Laboratory, Pasadena, CA, 2000.

  12. D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, Computer Science Department, UC Santa Cruz, 1999.

  13. J. Mercer. Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society, London, A 209:415–446, 1909.

  14. E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In J. Principe, L. Giles, N. Morgan, and E. Wilson, editors, Neural Networks for Signal Processing VII-Proceedings of the 1997 IEEE Workshop, pages 276–285, New York, 1997. IEEE.

  15. J. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods-Support Vector Learning, pages 185–208, Cambridge, MA, 1999. MIT Press.

  16. T. Poggio. On optimal nonlinear associative recall. Biological Cybernetics, 19:201–209, 1975.

  17. B. Schölkopf. Support Vector Learning. R. Oldenbourg Verlag, München, 1997. Doctoral dissertation, TU Berlin. Download: http://www.kernel-machines.org.

  18. B. Schölkopf, C. Burges, and V. Vapnik. Extracting support data for a given task. In U. M. Fayyad and R. Uthurusamy, editors, Proceedings, First International Conference on Knowledge Discovery & Data Mining, Menlo Park, 1995. AAAI Press.

  19. B. Schölkopf, C. J. C. Burges, and A. J. Smola. Advances in Kernel Methods-Support Vector Learning. MIT Press, Cambridge, MA, 1999.

  20. B. Schölkopf, J. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson. Estimating the support of a high-dimensional distribution. Neural Computation, 13(7), 2001.

  21. B. Schölkopf, A. Smola, and K.-R. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299–1319, 1998.

  22. B. Schölkopf, A. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12:1207–1245, 2000.

  23. B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.

  24. A. Smola, B. Schölkopf, and K.-R. Müller. The connection between regularization operators and support vector kernels. Neural Networks, 11:637–649, 1998.

  25. A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans. Advances in Large Margin Classifiers. MIT Press, Cambridge, MA, 2000.

  26. A. J. Smola, Z. L. Óvári, and R. C. Williamson. Regularization with dot-product kernels. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 308–314. MIT Press, 2001.

  27. A. J. Smola and B. Schölkopf. On a kernel-based method for pattern recognition, regression, approximation and operator inversion. Algorithmica, 22:211–231, 1998.

  28. V. Vapnik. Estimation of Dependences Based on Empirical Data [in Russian]. Nauka, Moscow, 1979. (English translation: Springer, New York, 1982).

  29. V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.

  30. V. Vapnik and A. Chervonenkis. Theory of Pattern Recognition [in Russian]. Nauka, Moscow, 1974. (German Translation: W. Wapnik & A. Tscherwonenkis, Theorie der Zeichenerkennung, Akademie-Verlag, Berlin, 1979).

  31. V. Vapnik and A. Lerner. Pattern recognition using generalized portrait method. Automation and Remote Control, 24:774–780, 1963.

  32. G. Wahba. Spline Models for Observational Data, volume 59 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, 1990.

  33. C. Watkins. Dynamic alignment kernels. In A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 39–50, Cambridge, MA, 2000. MIT Press.

  34. J. Weston, O. Chapelle, A. Elisseeff, B. Schölkopf, and V. Vapnik. Kernel dependency estimation. Technical Report 98, Max Planck Institute for Biological Cybernetics, 2002.

  35. R. C. Williamson, A. J. Smola, and B. Schölkopf. Generalization bounds for regularization networks and support vector machines via entropy numbers of compact operators. IEEE Transactions on Information Theory, 2001.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schölkopf, B., Smola, A.J. (2003). A Short Introduction to Learning with Kernels. In: Mendelson, S., Smola, A.J. (eds) Advanced Lectures on Machine Learning. Lecture Notes in Computer Science, vol 2600. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36434-X_2

  • DOI: https://doi.org/10.1007/3-540-36434-X_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00529-2

  • Online ISBN: 978-3-540-36434-4
