Abstract
A neural tree is a feedforward neural network with at most one edge outgoing from each node. We investigate the number of examples that a learning algorithm needs when using neural trees as the hypothesis class, and give bounds for this sample complexity in terms of the VC dimension. We consider trees consisting of threshold, sigmoidal, and linear gates. In particular, we show that the class of threshold trees and the class of sigmoidal trees on n inputs both have VC dimension Ω(n log n). This bound is asymptotically tight for the class of threshold trees. We also present an upper bound for this class in which the constants involved are considerably smaller than in a previous calculation. Finally, we argue that the VC dimension of threshold or sigmoidal trees cannot become larger by allowing the nodes to compute linear functions. This sheds some light on a recent result that exhibited neural networks with quadratic VC dimension.
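The sample-complexity bounds alluded to above rest on the standard PAC-learning relationship between VC dimension and sample size (Blumer et al., cited in the references below): a consistent learner over a hypothesis class of VC dimension d achieves error at most ε with confidence at least 1 − δ from

m(ε, δ) = O((1/ε) · (d · log(1/ε) + log(1/δ)))

examples, and on the order of d/ε examples are necessary. A VC dimension of Ω(n log n) for trees on n inputs therefore translates into a sample-size lower bound of the same order.

To make the tree structure concrete, the following is a minimal Python sketch of evaluating a threshold tree. The representation (Leaf and Gate classes and an evaluate function) is a hypothetical illustration of the definition in the abstract, not code from the paper: each node has at most one outgoing edge, so the network is a tree rooted at a single output gate, and each leaf reads one input variable.

    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class Leaf:
        index: int                      # which input variable this leaf reads

    @dataclass
    class Gate:
        children: List[Union["Gate", Leaf]]
        weights: List[float]            # one weight per incoming edge
        threshold: float

    def evaluate(node, x):
        # A leaf passes its input bit through unchanged.
        if isinstance(node, Leaf):
            return x[node.index]
        # A threshold gate fires iff the weighted sum of its
        # children's outputs reaches the threshold.
        s = sum(w * evaluate(c, x) for w, c in zip(node.weights, node.children))
        return 1 if s >= node.threshold else 0

    # Example: a depth-2 threshold tree computing (x0 AND x1) OR x2.
    and_gate = Gate([Leaf(0), Leaf(1)], [1.0, 1.0], 2.0)
    root = Gate([and_gate, Leaf(2)], [1.0, 1.0], 1.0)
    assert evaluate(root, [1, 1, 0]) == 1
    assert evaluate(root, [0, 1, 0]) == 0

Sigmoidal and linear trees, also considered in the paper, differ only in the gate function: the threshold comparison is replaced by a sigmoid of the weighted sum or by the weighted sum itself.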
References
D. Angluin, L. Hellerstein, and M. Karpinski. Learning read-once formulas with queries. Journal of the Association for Computing Machinery, 40:185–210, 1993.
M. Anthony and N. Biggs. Computational Learning Theory. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, 1992.
E. B. Baum and D. Haussler. What size net gives valid generalization? Neural Computation, 1:151–160, 1989.
A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association for Computing Machinery, 36:929–965, 1989.
T. M. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, 14:326–334, 1965.
T. M. Cover. Capacity problems for linear machines. In L. N. Kanal, editor, Pattern Recognition, pages 283–289, Thompson Book Co., Washington, 1968.
P. W. Goldberg and M. R. Jerrum. Bounding the Vapnik-Chervonenkis dimension of concept classes parameterized by real numbers. Machine Learning, 18:131–148, 1995.
M. Golea, M. Marchand, and T. R. Hancock. On learning µ-Perceptron networks with binary weights. In S. J. Hanson, J. D. Cowan, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 591–598. Morgan Kaufmann, San Mateo, CA, 1993.
M. Golea, M. Marchand, and T. R. Hancock. On learning µ-Perceptron networks on the uniform distribution. Neural Networks, 9:67–82, 1996.
T. R. Hancock, M. Golea, and M. Marchand. Learning nonoverlapping Perceptron networks from examples and membership queries. Machine Learning, 16:161–183, 1994.
D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100:78–150, 1992.
M. Karpinski and A. Macintyre. Polynomial bounds for VC dimension of sigmoidal and general pfaffian neural networks. Journal of Computer and System Sciences, 54:169–176, 1997.
P. Koiran and E. D. Sontag. Neural networks with quadratic VC dimension. Journal of Computer and System Sciences, 54:190–198, 1997.
N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.
W. Maass. Neural nets with superlinear VC-dimension. Neural Computation, 6:877–884, 1994.
W. Maass. Vapnik-Chervonenkis dimension of neural nets. In M. A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 1000–1003. MIT Press, Cambridge, Mass., 1995.
W. Maass, G. Schnitger, and E. D. Sontag. A comparison of the computational power of sigmoid and Boolean threshold circuits. In V. Roychowdhury, K.-Y. Siu, and A. Orlitsky, editors, Theoretical Advances in Neural Computation and Learning, pages 127–151. Kluwer, Boston, 1994.
W. Maass and G. Turán. Lower bound methods and separation results for on-line learning models. Machine Learning, 9:107–145, 1992.
A. Sakurai. Tighter bounds of the VC-dimension of three-layer networks. In Proceedings of the World Congress on Neural Networks WCNN’93, volume 3, pages 540–543, 1993.
L. Schläfli. Theorie der vielfachen Kontinuität. Zürcher & Furrer, Zürich, 1901. Reprinted in: L. Schläfli, Gesammelte Mathematische Abhandlungen, Band I, Birkhäuser, Basel, 1950.
J. Shawe-Taylor. Sample sizes for threshold networks with equivalences. Information and Computation, 118:65–72, 1995.
L. G. Valiant. A theory of the learnable. Communications of the ACM, 27:1134–1142, 1984.
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schmitt, M. (1998). On the Sample Complexity for Neural Trees. In: Richter, M.M., Smith, C.H., Wiehagen, R., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 1998. Lecture Notes in Computer Science, vol. 1501. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49730-7_26
DOI: https://doi.org/10.1007/3-540-49730-7_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65013-3
Online ISBN: 978-3-540-49730-1