On the Generalization Ability of Recurrent Networks

  • Conference paper
  • In: Artificial Neural Networks — ICANN 2001 (ICANN 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2130)


Abstract

The generalization ability of discrete-time partially recurrent networks is examined. It is well known that the VC dimension of recurrent networks is infinite in most interesting cases, so standard VC analysis cannot be applied directly. We derive guarantees for specific situations in which the transition function forms a contraction or the probability of long inputs is restricted. For the general case, we derive posterior bounds that take the input data into account; they are obtained via a generalization of the luckiness framework to the agnostic setting. The general formalism makes it possible to focus on representative parts of the data, as well as to handle more general situations such as long-term prediction.
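To make the contraction condition concrete, here is a minimal numerical sketch (my own illustration, not a construction from the paper) for a standard tanh recurrence h_{t+1} = tanh(W h_t + U x_t). Because tanh is 1-Lipschitz, rescaling the recurrent weight matrix W to a spectral norm below 1 makes the state transition a contraction, so the influence of the initial state decays geometrically with the input length; restrictions of this kind are what make finite generalization bounds possible. All dimensions and constants below are arbitrary.

    # Sketch: enforcing and checking the contraction property of a tanh
    # recurrence h_{t+1} = tanh(W h_t + U x_t). Not code from the paper.
    import numpy as np

    rng = np.random.default_rng(0)
    d_state, d_in, T = 8, 4, 50

    W = rng.normal(size=(d_state, d_state))
    W *= 0.9 / np.linalg.norm(W, 2)  # rescale so the spectral norm is 0.9 < 1
    U = rng.normal(size=(d_state, d_in))

    def run(h, xs):
        # Iterate the transition function over an input sequence.
        for x in xs:
            h = np.tanh(W @ h + U @ x)
        return h

    xs = rng.normal(size=(T, d_in))
    h0a = rng.normal(size=d_state)
    h0b = rng.normal(size=d_state)

    # Under the contraction, trajectories started from different initial
    # states but fed the same inputs satisfy
    #   ||h_T^a - h_T^b|| <= 0.9**T * ||h_0^a - h_0^b||.
    gap_T = np.linalg.norm(run(h0a, xs) - run(h0b, xs))
    bound = 0.9**T * np.linalg.norm(h0a - h0b)
    print(gap_T, bound)  # gap_T should be far below bound

The decay factor 0.9 is chosen only for illustration; any spectral norm strictly below 1 yields the same qualitative behaviour.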




Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hammer, B. (2001). On the Generalization Ability of Recurrent Networks. In: Dorffner, G., Bischof, H., Hornik, K. (eds) Artificial Neural Networks — ICANN 2001. ICANN 2001. Lecture Notes in Computer Science, vol 2130. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44668-0_102

  • DOI: https://doi.org/10.1007/3-540-44668-0_102

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42486-4

  • Online ISBN: 978-3-540-44668-2

  • eBook Packages: Springer Book Archive
