Margin Distribution Bounds on Generalization

Shawe-Taylor, John; Cristianini, Nello

doi:10.1007/3-540-49097-3_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1572)

Included in the following conference series:

European Conference on Computational Learning Theory

Abstract

A number of results have bounded the generalization of a classifier in terms of its margin on the training points. There has been some debate about whether the minimum margin is the best measure of the distribution of training set margin values with which to estimate the generalization. Freund and Schapire [6] have shown how a different function of the margin distribution can be used to bound the number of mistakes of an on-line learning algorithm for a perceptron, as well as to give an expected error bound. We show that a slight generalization of their construction can be used to give a PAC-style bound on the tail of the distribution of the generalization errors that arise from a given sample size. We also derive an algorithm for optimizing the new measure for general kernel-based learning machines. Some preliminary experiments are presented.

References

[1] Peter Bartlett, Pattern Classification in Neural Networks, IEEE Transactions on Information Theory, to appear.
[2] Peter Bartlett and John Shawe-Taylor, Generalization Performance of Support Vector Machines and Other Pattern Classifiers, in Advances in Kernel Methods: Support Vector Learning, Bernhard Schölkopf, Christopher J.C. Burges, and Alexander J. Smola (eds.), MIT Press, Cambridge, USA, 1998.
[3] C. Cortes and V. Vapnik, Support-Vector Networks, Machine Learning, 20(3): 273–297, September 1995.
[4] Nello Cristianini, John Shawe-Taylor, and Peter Sykacek, Bayesian Classifiers are Large Margin Hyperplanes in a Hilbert Space, in J. Shavlik (ed.), Machine Learning: Proceedings of the Fifteenth International Conference, Morgan Kaufmann Publishers, San Francisco, CA.
[5] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[6] Yoav Freund and Robert E. Schapire, Large Margin Classification Using the Perceptron Algorithm, Proceedings of the Eleventh Annual Conference on Computational Learning Theory, 1998.
[7] Leonid Gurvits, A note on a scale-sensitive dimension of linear bounded functionals in Banach spaces, in Proceedings of Algorithmic Learning Theory, ALT-97, and as NECI Technical Report, 1997.
[8] Norbert Klasner and Hans Ulrich Simon, From Noise-Free to Noise-Tolerant and from On-line to Batch Learning, Proceedings of the Eighth Annual Conference on Computational Learning Theory, COLT'95, 1995, pp. 250–257.
[9] R. Schapire, Y. Freund, P. Bartlett, and W. Sun Lee, Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods, in D.H. Fisher, Jr. (ed.), Proceedings of International Conference on Machine Learning, ICML'97, Nashville, Tennessee, July 1997, Morgan Kaufmann Publishers, pp. 322–330.
[10] John Shawe-Taylor, Peter L. Bartlett, Robert C. Williamson, and Martin Anthony, Structural Risk Minimization over Data-Dependent Hierarchies, to appear in IEEE Trans. on Inf. Theory, and NeuroCOLT Technical Report NC-TR-96-053, 1996. (ftp://ftp.dcs.rhbnc.ac.uk/pub/neurocolt/tech reports).
[11] John Shawe-Taylor and Robert C. Williamson, Generalization Performance of Classifiers in Terms of Observed Covering Numbers, submitted to EuroCOLT'99, 1998.
[12] Vladimir N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[13] Vladimir N. Vapnik, Estimation of Dependences Based on Empirical Data, Springer-Verlag, New York, 1982.
[14] Vladimir N. Vapnik, Esther Levin, and Yann Le Cun, Measuring the VC-dimension of a learning machine, Neural Computation, 6:851–876, 1994.

Author information

Authors and Affiliations

Royal Holloway, University of London, London
John Shawe-Taylor & Nello Cristianini
University of Bristol, Bristol
John Shawe-Taylor & Nello Cristianini

Editor information

Editors and Affiliations

Lehrstuhl für Informatik II, Universität Dortmund, D-44221, Dortmund, Germany
Paul Fischer
Fakultät für Mathematik, Ruhr Universität Bochum, D-44780, Bochum, Germany
Hans Ulrich Simon

About this paper

Cite this paper

Shawe-Taylor, J., Cristianini, N. (1999). Margin Distribution Bounds on Generalization. In: Fischer, P., Simon, H.U. (eds) Computational Learning Theory. EuroCOLT 1999. Lecture Notes in Computer Science, vol 1572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49097-3_21

DOI: https://doi.org/10.1007/3-540-49097-3_21
Published: 19 November 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65701-9
Online ISBN: 978-3-540-49097-5
eBook Packages: Springer Book Archive
