Abstract
A neural tree is a feedforward neural network with at most one edge outgoing from each node. We investigate the number of examples that a learning algorithm needs when using neural trees as the hypothesis class, and give bounds for this sample complexity in terms of the VC dimension. We consider trees consisting of threshold, sigmoidal, and linear gates. In particular, we show that the class of threshold trees and the class of sigmoidal trees on n inputs both have VC dimension Ω(n log n). This bound is asymptotically tight for the class of threshold trees. We also present an upper bound for this class in which the constants involved are considerably smaller than in a previous calculation. Finally, we argue that the VC dimension of threshold or sigmoidal trees cannot become larger by allowing the nodes to compute linear functions. This sheds some light on a recent result that exhibited neural networks with quadratic VC dimension.
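The sample-complexity bounds alluded to above rest on the standard PAC-learning relationship between VC dimension and sample size (Blumer et al., cited in the references below): a consistent learner over a hypothesis class of VC dimension d achieves error at most ε with confidence at least 1 − δ from

m(ε, δ) = O((1/ε) · (d · log(1/ε) + log(1/δ)))

examples, and on the order of d/ε examples are necessary. A VC dimension of Ω(n log n) for trees on n inputs therefore translates into a sample-size lower bound of the same order.

To make the tree structure concrete, the following is a minimal Python sketch of evaluating a threshold tree. The representation (Leaf and Gate classes and an evaluate function) is a hypothetical illustration of the definition in the abstract, not code from the paper: each node has at most one outgoing edge, so the network is a tree rooted at a single output gate, and each leaf reads one input variable.

    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class Leaf:
        index: int                      # which input variable this leaf reads

    @dataclass
    class Gate:
        children: List[Union["Gate", Leaf]]
        weights: List[float]            # one weight per incoming edge
        threshold: float

    def evaluate(node, x):
        # A leaf passes its input bit through unchanged.
        if isinstance(node, Leaf):
            return x[node.index]
        # A threshold gate fires iff the weighted sum of its
        # children's outputs reaches the threshold.
        s = sum(w * evaluate(c, x) for w, c in zip(node.weights, node.children))
        return 1 if s >= node.threshold else 0

    # Example: a depth-2 threshold tree computing (x0 AND x1) OR x2.
    and_gate = Gate([Leaf(0), Leaf(1)], [1.0, 1.0], 2.0)
    root = Gate([and_gate, Leaf(2)], [1.0, 1.0], 1.0)
    assert evaluate(root, [1, 1, 0]) == 1
    assert evaluate(root, [0, 1, 0]) == 0

Sigmoidal and linear trees, also considered in the paper, differ only in the gate function: the threshold comparison is replaced by a sigmoid of the weighted sum or by the weighted sum itself.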
References
D. Angluin, L. Hellerstein, and M. Karpinski. Learning read-once formulas with queries. Journal of the Association for Computing Machinery, 40:185–210, 1993.
M. Anthony and N. Biggs. Computational Learning Theory. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, 1992.
E. B. Baum and D. Haussler. What size net gives valid generalization? Neural Computation, 1:151–160, 1989.
A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association for Computing Machinery, 36:929–965, 1989.
T. M. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, 14:326–334, 1965.
T. M. Cover. Capacity problems for linear machines. In L. N. Kanal, editor, Pattern Recognition, pages 283–289, Thompson Book Co., Washington, 1968.
P. W. Goldberg and M. R. Jerrum. Bounding the Vapnik-Chervonenkis dimension of concept classes parameterized by real numbers. Machine Learning, 18:131–148, 1995.
M. Golea, M. Marchand, and T. R. Hancock. On learning µ-Perceptron networks with binary weights. In S. J. Hanson, J. D. Cowan, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 591–598. Morgan Kaufmann, San Mateo, CA, 1993.
M. Golea, M. Marchand, and T. R. Hancock. On learning µ-Perceptron networks on the uniform distribution. Neural Networks, 9:67–82, 1996.
T. R. Hancock, M. Golea, and M. Marchand. Learning nonoverlapping Perceptron networks from examples and membership queries. Machine Learning, 16:161–183, 1994.
D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100:78–150, 1992.
M. Karpinski and A. Macintyre. Polynomial bounds for VC dimension of sigmoidal and general pfaffian neural networks. Journal of Computer and System Sciences, 54:169–176, 1997.
P. Koiran and E. D. Sontag. Neural networks with quadratic VC dimension. Journal of Computer and System Sciences, 54:190–198, 1997.
N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.
W. Maass. Neural nets with superlinear VC-dimension. Neural Computation, 6:877–884, 1994.
W. Maass. Vapnik-Chervonenkis dimension of neural nets. In M. A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 1000–1003. MIT Press, Cambridge, Mass., 1995.
W. Maass, G. Schnitger, and E. D. Sontag. A comparison of the computational power of sigmoid and Boolean threshold circuits. In V. Roychowdhury, K.-Y. Siu, and A. Orlitsky, editors, Theoretical Advances in Neural Computation and Learning, pages 127–151. Kluwer, Boston, 1994.
W. Maass and G. Turán. Lower bound methods and separation results for on-line learning models. Machine Learning, 9:107–145, 1992.
A. Sakurai. Tighter bounds of the VC-dimension of three-layer networks. In Proceedings of the World Congress on Neural Networks WCNN’93, volume 3, pages 540–543, 1993.
L. Schläfli. Theorie der vielfachen Kontinuität. Zürcher & Furrer, Zürich, 1901. Reprinted in: L. Schläfli, Gesammelte Mathematische Abhandlungen, Band I, Birkhäuser, Basel, 1950.
J. Shawe-Taylor. Sample sizes for threshold networks with equivalences. Information and Computation, 118:65–72, 1995.
L. G. Valiant. A theory of the learnable. Communications of the ACM, 27:1134–1142, 1984.
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schmitt, M. (1998). On the Sample Complexity for Neural Trees. In: Richter, M.M., Smith, C.H., Wiehagen, R., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 1998. Lecture Notes in Computer Science, vol. 1501. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49730-7_26
DOI: https://doi.org/10.1007/3-540-49730-7_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65013-3
Online ISBN: 978-3-540-49730-1