Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3120)

Abstract

One of the nice properties of kernel classifiers such as SVMs is that they often produce sparse solutions. However, the decision functions of these classifiers cannot always be used to estimate the conditional probability of the class label. We investigate the relationship between these two properties and show that these are intimately related: sparseness does not occur when the conditional probabilities can be unambiguously estimated. We consider a family of convex loss functions and derive sharp asymptotic bounds for the number of support vectors. This enables us to characterize the exact trade-off between sparseness and the ability to estimate conditional probabilities for these loss functions.
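
The contrast drawn in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes the standard pointwise conditional risk eta * phi(t) + (1 - eta) * phi(-t), where eta is the conditional probability of the positive class, and compares the hinge loss with the logistic loss, which is chosen here only as a familiar example of a loss whose minimizer determines eta. All function names are hypothetical, and the grid search is a crude stand-in for exact minimization.

```python
import math


def hinge(t):
    # Hinge loss used by the standard SVM.
    return max(0.0, 1.0 - t)


def logistic(t):
    # Logistic loss, used here only as a contrasting example (an assumption).
    return math.log1p(math.exp(-t))


def conditional_risk_minimizer(loss, eta, lo=-5.0, hi=5.0, steps=2001):
    """Grid-search the t minimizing eta*loss(t) + (1 - eta)*loss(-t)."""
    best_t, best_val = lo, float("inf")
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        val = eta * loss(t) + (1.0 - eta) * loss(-t)
        # Keep the first grid point that is strictly better.
        if val < best_val - 1e-12:
            best_t, best_val = t, val
    return best_t


def sigmoid(t):
    # Inverts the log-odds: maps the logistic minimizer back to eta.
    return 1.0 / (1.0 + math.exp(-t))


if __name__ == "__main__":
    for eta in (0.6, 0.9):
        t_hinge = conditional_risk_minimizer(hinge, eta)
        t_logit = conditional_risk_minimizer(logistic, eta)
        print(eta, t_hinge, t_logit, sigmoid(t_logit))
```

Under these assumptions the hinge minimizer sits at the same point for eta = 0.6 and eta = 0.9, so eta cannot be read back from it, while the logistic minimizer approximates the log-odds log(eta / (1 - eta)) and therefore identifies eta.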

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bartlett, P.L., Tewari, A. (2004). Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_39

  • DOI: https://doi.org/10.1007/978-3-540-27819-1_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22282-8

  • Online ISBN: 978-3-540-27819-1

  • eBook Packages: Springer Book Archive
