A Few Notes on Statistical Learning Theory

Advanced Lectures on Machine Learning

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2600)

Abstract

In these notes our aim is to survey recent (and not so recent) results regarding the mathematical foundations of learning theory. The focus in this article is on the theoretical side and not on the applied one; hence, we shall not present examples which may be interesting from the practical point of view but have little theoretical significance. This survey is far from complete and focuses on problems the author finds interesting (an opinion which is not necessarily shared by the majority of the learning community). Relevant books which present a more evenly balanced approach are, for example, [1, 4, 34, 35].

I would like to thank Jyrki Kivinen for his valuable comments, which improved this manuscript considerably.

References

  1. M. Anthony, P.L. Bartlett: Neural Network Learning: Theoretical Foundations, Cambridge University Press, 1999.

  2. N. Alon, S. Ben-David, N. Cesa-Bianchi, D. Haussler: Scale sensitive dimensions, uniform convergence and learnability, J. of ACM 44 (4), 615–631, 1997.

  3. O. Bousquet: A Bennett concentration inequality and its application to suprema of empirical processes, preprint.

  4. L. Devroye, L. Györfi, G. Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, 1996.

  5. R.M. Dudley: Real Analysis and Probability, Chapman and Hall, 1993.

  6. R.M. Dudley: The sizes of compact subsets of Hilbert space and continuity of Gaussian processes, J. of Functional Analysis 1, 290–330, 1967.

  7. R.M. Dudley: Central limit theorems for empirical measures, Annals of Probability 6(6), 899–929, 1978.

  8. R.M. Dudley: Uniform Central Limit Theorems, Cambridge Studies in Advanced Mathematics 63, Cambridge University Press, 1999.

  9. E. Giné, J. Zinn: Some limit theorems for empirical processes, Annals of Probability, 12(4), 929–989, 1984.

  10. D. Haussler: Sphere packing numbers for subsets of Boolean n-cube with bounded Vapnik-Chervonenkis dimension, J. of Combinatorial Theory (A) 69, 217–232, 1995.

  11. W. Hoeffding: Probability inequalities for sums of bounded random variables, J. of the American Statistical Association, 58, 13–30, 1963.

  12. V. Koltchinskii, D. Panchenko: Rademacher processes and bounding the risk of function learning, High Dimensional Probability, II (Seattle, WA, 1999), 443–457, Progr. Probab., 47, Birkhäuser.

  13. R. Latala, K. Oleszkiewicz: On the best constant in the Khintchine-Kahane inequality, Studia Math. 109(1), 101–104, 1994.

  14. M. Ledoux: The Concentration of Measure Phenomenon, Mathematical Surveys and Monographs, Vol. 89, AMS, 2001.

  15. M. Ledoux, M. Talagrand: Probability in Banach Spaces: Isoperimetry and Processes, Springer, 1991.

  16. W.S. Lee, P.L. Bartlett, R.C. Williamson: The Importance of Convexity in Learning with Squared Loss, IEEE Transactions on Information Theory 44 (5), 1974–1980, 1998.

  17. P. Massart: About the constants in Talagrand’s concentration inequality for empirical processes, Annals of Probability, 28(2), 863–884, 2000.

  18. S. Mendelson: Rademacher averages and phase transitions in Glivenko-Cantelli classes, IEEE Transactions on Information Theory, 48(1), 251–263, 2002.

  19. S. Mendelson: Improving the sample complexity using global data, IEEE Transactions on Information Theory, 48(7), 1977–1991, 2002.

  20. S. Mendelson: Geometric parameters of kernel machines, in Proceedings of the 15th Annual Conference on Computational Learning Theory COLT02, Jyrki Kivinen and Robert H. Sloan (Eds.), Lecture Notes in Computer Science 2375, Springer, 29–43, 2002.

  21. S. Mendelson, R. Vershynin: Entropy, combinatorial dimensions and random averages, in Proceedings of the 15th Annual Conference on Computational Learning Theory COLT02, Jyrki Kivinen and Robert H. Sloan (Eds.), Lecture Notes in Computer Science 2375, Springer, 14–28, 2002.

  22. S. Mendelson, R. Vershynin: Entropy and the combinatorial dimension, Inventiones Mathematicae, to appear.

  23. S. Mendelson, R.C. Williamson: Agnostic learning nonconvex classes of functions, in Proceedings of the 15th Annual Conference on Computational Learning Theory COLT02, Jyrki Kivinen and Robert H. Sloan (Eds.), Lecture Notes in Computer Science 2375, Springer, 1–13, 2002.

  24. V.D. Milman, G. Schechtman: Asymptotic Theory of Finite Dimensional Normed Spaces, Lecture Notes in Mathematics 1200, Springer, 1986.

  25. A. Pajor: Sous-espaces $\ell_1^n$ des espaces de Banach, Hermann, Paris, 1985.

  26. G. Pisier: The volume of convex bodies and Banach space geometry, Cambridge University Press, 1989.

  27. E. Rio: Une inégalité de Bennett pour les maxima de processus empiriques, preprint.

  28. N. Sauer: On the density of families of sets, J. Combinatorial Theory (A), 13, 145–147, 1972.

  29. S. Shelah: A combinatorial problem: stability and orders for models and theories in infinitary languages, Pacific Journal of Mathematics, 41, 247–261, 1972.

  30. V.N. Sudakov: Gaussian processes and measures of solid angles in Hilbert space, Soviet Mathematics. Doklady 12, 412–415, 1971.

  31. M. Talagrand: Type, infratype and the Elton-Pajor theorem, Inventiones Mathematicae, 107, 41–59, 1992.

  32. M. Talagrand: Sharper bounds for Gaussian and empirical processes, Annals of Probability, 22(1), 28–76, 1994.

  33. A.W. Van der Vaart, J.A. Wellner: Weak Convergence and Empirical Processes, Springer-Verlag, 1996.

  34. V. Vapnik: Statistical Learning Theory, Wiley, 1998.

  35. M. Vidyasagar: The Theory of Learning and Generalization, Springer-Verlag, 1996.

  36. V. Vapnik, A. Chervonenkis: Necessary and sufficient conditions for uniform convergence of means to mathematical expectations, Theory Prob. Applic. 26(3), 532–553, 1981.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mendelson, S. (2003). A Few Notes on Statistical Learning Theory. In: Mendelson, S., Smola, A.J. (eds) Advanced Lectures on Machine Learning. Lecture Notes in Computer Science, vol 2600. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36434-X_1

  • DOI: https://doi.org/10.1007/3-540-36434-X_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00529-2

  • Online ISBN: 978-3-540-36434-4

  • eBook Packages: Springer Book Archive
