Abstract
The generalization ability of discrete-time partially recurrent networks is examined. It is well known that the VC dimension of recurrent networks is infinite in most interesting cases, so the standard VC analysis cannot be applied directly. We establish guarantees for specific situations in which the transition function forms a contraction or the probability of long inputs is restricted. For the general case, we derive posterior bounds which take the input data into account. They are obtained via a generalization of the luckiness framework to the agnostic setting. The general formalism makes it possible to focus on representative parts of the data as well as to handle more general situations such as long-term prediction.
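To fix ideas, the setting can be sketched as follows (an illustrative formalization with assumed notation, not quoted from the paper): a discrete-time recurrent network processes an input sequence $x_1, \dots, x_T$ through a state-transition function $f$ and a readout $g$,

$$ s_t = f(s_{t-1}, x_t), \qquad \hat{y} = g(s_T), \qquad s_0 \text{ fixed.} $$

Under this notation, the contraction condition referred to above would require $\|f(s, x) - f(s', x)\| \le C\,\|s - s'\|$ for all inputs $x$ and some constant $C < 1$, so that the influence of early states decays along long sequences.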
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Hammer, B. (2001). On the Generalization Ability of Recurrent Networks. In: Dorffner, G., Bischof, H., Hornik, K. (eds) Artificial Neural Networks — ICANN 2001. ICANN 2001. Lecture Notes in Computer Science, vol 2130. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44668-0_102
DOI: https://doi.org/10.1007/3-540-44668-0_102
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42486-4
Online ISBN: 978-3-540-44668-2