Abstract
The Bayes rule is the optimal classification rule when the underlying distribution of the data is known. In practice the underlying distribution is unknown, and classification rules must be "learned" from the data. One way to derive classification rules in practice is to implement the Bayes rule approximately by estimating an appropriate classification function. Traditional statistical methods use the estimated log odds ratio as the classification function. Support vector machines (SVMs) are one type of large margin classifier, and the relationship between SVMs and the Bayes rule has not been clear. In this paper it is shown that the asymptotic target of an SVM is a classification function directly related to the Bayes rule. The rate of convergence of the SVM solution to its target function is established explicitly for SVMs with quadratic or higher order loss functions and spline kernels. Simulations illustrate the relation between SVMs and the Bayes rule in other cases. These results help explain the success of SVMs in many classification studies and make it easier to compare SVMs with traditional statistical methods.
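The connection between the SVM target and the Bayes rule can be checked numerically for the standard hinge loss: at a point x with p = P(Y = +1 | x), the expected hinge loss p(1 - f)₊ + (1 - p)(1 + f)₊ is minimized at f = sign(2p - 1), which carries the same sign as the Bayes rule. The sketch below (function names `hinge_risk` and `hinge_minimizer` are illustrative, not from the paper) verifies this pointwise minimizer by a simple grid search:

```python
import numpy as np

def hinge_risk(f, p):
    # Expected hinge loss at a point x with P(Y=+1|x) = p,
    # for a candidate function value f(x) = f:
    #   p * (1 - f)_+ + (1 - p) * (1 + f)_+
    return p * max(0.0, 1.0 - f) + (1.0 - p) * max(0.0, 1.0 + f)

def hinge_minimizer(p, grid=np.linspace(-2.0, 2.0, 4001)):
    # Grid-search the pointwise minimizer of the expected hinge loss.
    risks = [hinge_risk(f, p) for f in grid]
    return float(grid[int(np.argmin(risks))])

for p in [0.1, 0.3, 0.7, 0.9]:
    f_star = hinge_minimizer(p)
    bayes = 1 if p > 0.5 else -1
    # The minimizer sits at +1 or -1, matching the sign of the Bayes rule.
    print(p, f_star, bayes)
```

The grid search recovers f* ≈ +1 whenever p > 1/2 and f* ≈ -1 whenever p < 1/2, so the sign of the hinge-loss minimizer coincides with the Bayes classification at every point.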
Lin, Y. Support Vector Machines and the Bayes Rule in Classification. Data Mining and Knowledge Discovery 6, 259–275 (2002). https://doi.org/10.1023/A:1015469627679