
Generalization Error Analysis for Polynomial Kernel Methods — Algebraic Geometrical Approach

  • Conference paper

Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2714)

Abstract

The generalization properties of learning classifiers with a polynomial kernel function are examined here. We first show that the generalization error of the learning machine depends on the properties of the separating curve, that is, the intersection of the input surface and the true separating hyperplane in the feature space. When the input space is one-dimensional, the problem decomposes into as many one-dimensional problems as there are intersection points. Otherwise, the generalization error is determined by the class of the separating curve. Next, we consider how the class of the separating curve depends on the true separating function. The class is maximal when the true separating polynomial function is irreducible and smaller otherwise. In either case, the class depends only on the true function, not on the dimension of the feature space. These results imply that the generalization error does not increase as the dimension of the feature space grows, and that so-called overmodeling does not occur in kernel learning.
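The abstract's central picture is that a linear separator in the polynomial feature space pulls back to a polynomial curve in the input space. The paper's own constructions are not reproduced on this page, so the following Python sketch is only a minimal illustration of that geometry, assuming the standard inhomogeneous degree-2 polynomial kernel k(x, y) = (x·y + 1)² and its well-known explicit feature map; the weight vector w and the helper names phi and decision are hypothetical choices made for the example.

```python
import numpy as np

# Degree-2 polynomial kernel k(x, y) = (x . y + 1)^2 on 2-D inputs.
# Its explicit feature map (up to the usual sqrt(2) scalings) is
#   phi(x) = (1, sqrt(2) x1, sqrt(2) x2, x1^2, sqrt(2) x1 x2, x2^2),
# so a hyperplane <w, phi(x)> = 0 in the 6-D feature space pulls back
# to a degree-2 polynomial curve (a conic) in the input plane.

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel (hypothetical helper)."""
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1,
                     np.sqrt(2) * x2,
                     x1 ** 2,
                     np.sqrt(2) * x1 * x2,
                     x2 ** 2])

def kernel(x, y):
    return (np.dot(x, y) + 1.0) ** 2

# Sanity check: the inner product of feature vectors reproduces the kernel.
rng = np.random.default_rng(0)
x, y = rng.normal(size=2), rng.normal(size=2)
assert np.isclose(np.dot(phi(x), phi(y)), kernel(x, y))

# A hyperplane <w, phi(x)> = 0 in feature space: this particular w gives
# the separating curve x1^2 + x2^2 - 1 = 0, the unit circle, in input space.
w = np.array([-1.0, 0.0, 0.0, 1.0, 0.0, 1.0])

def decision(x):
    return np.dot(w, phi(x))

print(decision(np.array([0.0, 0.0])))  # -1.0: inside the circle
print(decision(np.array([2.0, 0.0])))  #  3.0: outside the circle
```

In this toy setup the feature-space hyperplane corresponds to a degree-2 separating curve in the input plane, and that degree is fixed by the kernel, not by the six-dimensional feature space, which echoes the abstract's point that the separating curve, rather than the feature-space dimension, carries the relevant complexity.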





Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ikeda, K. (2003). Generalization Error Analysis for Polynomial Kernel Methods — Algebraic Geometrical Approach. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds) Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003. Lecture Notes in Computer Science, vol 2714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44989-2_25


  • DOI: https://doi.org/10.1007/3-540-44989-2_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40408-8

  • Online ISBN: 978-3-540-44989-8
