Abstract
Most of the work on the Vapnik-Chervonenkis dimension of neural networks has focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length; we are therefore particularly interested in the case k ≫ w. Ignoring multiplicative constants, the main results say roughly the following:
- For architectures with activation σ = any fixed nonlinear polynomial, the VC dimension is ≈ wk.
- For architectures with activation σ = any fixed piecewise polynomial, the VC dimension is between wk and w²k.
- For architectures with activation σ = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w² + w log(wk)}.
- For the standard sigmoid σ(x) = 1/(1 + e⁻ˣ), the VC dimension is between wk and w⁴k².
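To make these growth rates concrete, here is a minimal Python sketch, our own illustration rather than anything from the paper: it evaluates the shape of each bound for sample values of w and k, ignoring multiplicative constants just as the abstract does. The function name and the choice of base-2 logarithms are assumptions of ours.

import math

def recurrent_vc_bounds(w: int, k: int) -> dict:
    # (Lower, upper) bound shapes quoted in the abstract, up to
    # multiplicative constants; keys name the activation class.
    return {
        "polynomial":           (w * k, w * k),
        "piecewise_polynomial": (w * k, w**2 * k),
        "threshold":            (w * math.log2(k / w),
                                 min(w * k * math.log2(w * k),
                                     w**2 + w * math.log2(w * k))),
        "sigmoid":              (w * k, w**4 * k**2),
    }

# Fixed net (w constant), growing input length k -- the regime k >> w.
for w, k in [(10, 1_000), (10, 100_000)]:
    print(f"w={w}, k={k}: {recurrent_vc_bounds(w, k)}")

In the k ≫ w regime the printout shows the threshold upper bound settling on the w² + w log(wk) term, which grows only logarithmically in k, while the gap between lower and upper bounds is widest in the sigmoid case (wk versus w⁴k²).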
This research was carried out in part while visiting DIMACS and the Rutgers Center for Systems and Control (SYCON) at Rutgers University.
This research was supported in part by US Air Force Grant AFOSR-94-0293.