Support Vector Machines and the Bayes Rule in Classification

Abstract

The Bayes rule is the optimal classification rule when the underlying distribution of the data is known. In practice the underlying distribution is unknown, and classification rules must be “learned” from the data. One way to derive classification rules in practice is to implement the Bayes rule approximately by estimating an appropriate classification function. Traditional statistical methods use the estimated log odds ratio as the classification function. Support vector machines (SVMs) are one type of large margin classifier, and the relationship between SVMs and the Bayes rule has not been clear. In this paper, it is shown that the asymptotic targets of SVMs are classification functions directly related to the Bayes rule. The rate of convergence of the solutions of SVMs to their corresponding target functions is established explicitly for SVMs with quadratic or higher-order loss functions and spline kernels. Simulations illustrate the relation between SVMs and the Bayes rule in other cases. These results help to explain the success of SVMs in many classification studies, and make it easier to compare SVMs with traditional statistical methods.
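
To make the connection concrete, here is a brief sketch of the quantities involved, under the standard coding Y ∈ {−1, +1} with p(x) = P(Y = 1 | X = x). The coding and notation are assumptions for illustration; the abstract itself does not fix them.

```latex
% Let p(x) = P(Y = 1 \mid X = x), with Y \in \{-1, +1\}.
% The Bayes rule predicts the more probable class:
\phi_B(x) = \operatorname{sign}\!\left( p(x) - \tfrac{1}{2} \right)

% Traditional statistical methods estimate the log odds ratio,
f(x) = \log \frac{p(x)}{1 - p(x)},

% while the population minimizer of the hinge loss (1 - y f)_+ is
f^*(x) = \operatorname{sign}\!\left( p(x) - \tfrac{1}{2} \right),

% and that of the squared hinge loss (1 - y f)_+^2 is
f^*(x) = 2\,p(x) - 1.

% Each of these functions has the same sign as p(x) - 1/2, so thresholding
% any of them at zero implements the Bayes rule.
```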

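The paper's simulations are not reproduced on this page, so the following is only a stand-in sketch: the Gaussian model, sample size, and kernel settings are all assumptions for illustration, not the paper's actual experiments. It fits an RBF-kernel SVM to data from a one-dimensional model in which the Bayes rule is known to be sign(x), then measures how often the fitted SVM's decisions agree with that rule.

```python
# Illustrative sketch: compare a fitted SVM's decisions with the Bayes rule
# on data whose distribution we control. All settings here (model, sample
# size, kernel, C, gamma) are assumptions, not the paper's setup.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# X | Y = +1 ~ N(+1, 1) and X | Y = -1 ~ N(-1, 1), with equal priors,
# so the Bayes rule in this model is simply sign(x).
n = 500
y = rng.choice([-1, 1], size=n)
x = (y + rng.normal(size=n)).reshape(-1, 1)

svm = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(x, y)

# Evaluate agreement with the Bayes rule on a fine test grid
# (ties at x = 0 are broken toward -1).
grid = np.linspace(-3.0, 3.0, 601).reshape(-1, 1)
bayes = np.where(grid.ravel() > 0, 1, -1)
agreement = np.mean(svm.predict(grid) == bayes)
print(f"Agreement with the Bayes rule: {agreement:.3f}")
```

With a reasonable sample size and tuning, the agreement should be close to 1, consistent with the claim that the asymptotic target of the SVM implements the Bayes rule.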

Cite this article

Lin, Y. Support Vector Machines and the Bayes Rule in Classification. Data Mining and Knowledge Discovery 6, 259–275 (2002). https://doi.org/10.1023/A:1015469627679