Abstract
Stopped training is a method for avoiding over-fitting in neural network models by preventing an iterative optimization method from converging to a local minimum of the objective function. It is motivated by the observation that over-fitting sets in gradually as training progresses. The stopping time is typically determined by monitoring the model's expected generalization performance, as approximated by the error on a separate validation set. In this paper we propose to use an analytic estimate of generalization performance instead. Such estimates, however, require knowledge of the analytic form of the objective function used to train the network and are valid only when the weights correspond to a local minimum of that objective function. For this reason, we propose the use of an auxiliary, regularized objective function. The resulting algorithm is self-contained and does not require splitting the data into a training set and a separate validation set.
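For contrast with the analytic approach the abstract proposes, the conventional validation-based stopping rule it describes can be sketched as follows. This is a minimal illustration only: the toy regression task, feature map, learning rate, and patience threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: noisy linear target fitted with an
# over-parameterized polynomial model, so over-fitting can occur.
def features(x, degree=8):
    return np.vander(x, degree + 1, increasing=True)

x_train = rng.uniform(-1, 1, 30)
y_train = 2.0 * x_train + rng.normal(0.0, 0.3, 30)
x_val = rng.uniform(-1, 1, 30)
y_val = 2.0 * x_val + rng.normal(0.0, 0.3, 30)

X_train, X_val = features(x_train), features(x_val)
w = np.zeros(X_train.shape[1])

def mse(X, y, w):
    return np.mean((X @ w - y) ** 2)

best_w, best_val = w.copy(), np.inf
patience, bad_steps, lr = 50, 0, 0.05

for step in range(5000):
    # One full-batch gradient step on the training objective.
    grad = 2.0 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= lr * grad

    # Monitor the validation error as a proxy for generalization.
    val_err = mse(X_val, y_val, w)
    if val_err < best_val:
        best_val, best_w, bad_steps = val_err, w.copy(), 0
    else:
        bad_steps += 1
        if bad_steps > patience:
            # Stop before the optimizer converges to a minimum.
            break

print(round(best_val, 3))
```

Note that this rule sacrifices part of the data for validation, which is exactly the cost the paper's self-contained, analytically monitored variant is designed to avoid.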
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Utans, J. (1997). A non-convergent on-line training algorithm for neural networks. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032551
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0