
A non-convergent on-line training algorithm for neural networks

  • Methodology for Data Analysis, Task Selection and Nets Design
  • Conference paper in Biological and Artificial Computation: From Neuroscience to Technology (IWANN 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1240)

Abstract

Stopped training is a method to avoid over-fitting of neural network models by preventing an iterative optimization method from reaching a local minimum of the objective function. It is motivated by the observation that over-fitting occurs gradually as training progresses. The stopping time is typically determined by monitoring the expected generalization performance of the model, as approximated by the error on a validation set. In this paper we propose to use an analytic estimate for this purpose. However, such estimates require knowledge of the analytic form of the objective function used for training the network and are only applicable when the weights correspond to a local minimum of this objective function. For this reason, we propose the use of an auxiliary, regularized objective function. The algorithm is “self-contained” and does not require splitting the data into a training set and a separate validation set.
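
For intuition, here is a minimal sketch in Python of the kind of procedure the abstract describes: train on an auxiliary weight-decay (regularized) objective and track an analytic generalization estimate instead of a validation error. The estimator used, Akaike's final prediction error E_train * (n + p)/(n - p), is a stand-in assumption rather than the estimate derived in the paper, and all data, architecture, and hyper-parameter choices are illustrative.

```python
# A minimal sketch of the idea above, NOT the paper's exact algorithm:
# train on an auxiliary weight-decay (regularized) objective and select
# the stopping point with an analytic generalization estimate computed
# from the training error alone. Akaike's FPE, E_train * (n + p)/(n - p),
# stands in for the paper's estimator; all other choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: noisy sine, n samples, one input.
n = 200
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * X) + 0.1 * rng.standard_normal((n, 1))

# Single-hidden-layer tanh network.
h = 10
W1 = 0.5 * rng.standard_normal((1, h)); b1 = np.zeros(h)
W2 = 0.5 * rng.standard_normal((h, 1)); b2 = np.zeros(1)
p = W1.size + b1.size + W2.size + b2.size  # number of weights

lam, lr = 1e-3, 0.05            # weight-decay strength, learning rate
best_est, best_weights = np.inf, None

for epoch in range(2001):
    # Forward pass.
    Z = np.tanh(X @ W1 + b1)
    err = Z @ W2 + b2 - y

    # Gradients of the auxiliary objective: MSE + lam * ||w||^2.
    g = 2.0 * err / n
    gW2 = Z.T @ g + 2.0 * lam * W2
    gb2 = g.sum(axis=0)
    gZ = (g @ W2.T) * (1.0 - Z**2)
    gW1 = X.T @ gZ + 2.0 * lam * W1
    gb1 = gZ.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

    # Periodically evaluate the analytic estimate; keep the weights that
    # minimize it. This replaces monitoring error on a validation set.
    if epoch % 20 == 0:
        mse = float((err**2).mean())
        est = mse * (n + p) / (n - p)   # FPE-style estimate (assumption)
        if est < best_est:
            best_est = est
            best_weights = (W1.copy(), b1.copy(), W2.copy(), b2.copy())

print(f"best analytic generalization estimate: {best_est:.4f}")
```

Note that a faithful implementation would evaluate the estimate only where the gradient of the auxiliary objective is close to zero, since, as the abstract points out, such analytic estimates are valid only at a local minimum of the objective; the fixed 20-epoch schedule above is a simplification. Because the estimate uses the training error alone, no data is withheld for validation, which mirrors the “self-contained” property.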

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Utans, J. (1997). A non-convergent on-line training algorithm for neural networks. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032551

  • DOI: https://doi.org/10.1007/BFb0032551

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63047-0

  • Online ISBN: 978-3-540-69074-0
