Abstract
We present in this article a new technique for optimizing the regularization parameter of a cost function. On the one hand, the derivatives of the cost function with respect to the weights are used to optimize the network; on the other hand, the derivatives of the cost function with respect to the regularization parameter are used to optimize the smoothness of the function realized by the network. We show that by alternating between these two gradient descent optimizations we regulate the smoothness of a neural network. We present the results of this algorithm on a task designed to clearly expose the network's level of smoothness.
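The paper itself is not reproduced on this page, but the alternation described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it assumes a Tikhonov-style weight-decay penalty, and since the abstract does not specify how the regularization parameter is driven, it adapts lam using a one-step hypergradient of a held-out validation error. The toy task, the hypergradient rule, and parameters such as eta_lam are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: noisy sine curve fitted with polynomial features and
# a Tikhonov-style weight-decay penalty lam * ||w||^2 (assumed, not from the paper).
def features(x, degree=9):
    return np.vander(x, degree + 1, increasing=True)

x_tr = rng.uniform(-1, 1, 40); y_tr = np.sin(3 * x_tr) + 0.2 * rng.normal(size=40)
x_va = rng.uniform(-1, 1, 40); y_va = np.sin(3 * x_va) + 0.2 * rng.normal(size=40)
X_tr, X_va = features(x_tr), features(x_va)

w = np.zeros(X_tr.shape[1])
lam = 1e-2                    # regularization parameter, adapted during training
eta_w, eta_lam = 0.05, 0.5    # step sizes for the two gradient descents (illustrative)

def data_grad(X, y, w):
    # Gradient of the mean squared data error with respect to the weights.
    return 2 * X.T @ (X @ w - y) / len(y)

for epoch in range(2000):
    # Phase 1: gradient step on the weights for the penalized training cost
    #   C(w, lam) = ||X w - y||^2 / N + lam * ||w||^2
    g_data = data_grad(X_tr, y_tr, w)
    g_reg = 2 * w
    w = w - eta_w * (g_data + lam * g_reg)

    # Phase 2: gradient step on lam. Assumption (not stated in the abstract):
    # lam is moved so as to reduce a held-out validation error, using the
    # one-step hypergradient dE_val/dlam ~= -eta_w * grad_w(E_val) . grad_w(E_reg),
    # obtained by differentiating the weight update above with respect to lam.
    hypergrad = -eta_w * data_grad(X_va, y_va, w) @ g_reg
    lam = max(lam - eta_lam * hypergrad, 0.0)

print(f"final lam = {lam:.4g}, val MSE = {np.mean((X_va @ w - y_va) ** 2):.4f}")
```

Other update rules for lam are possible (for example, descending a separate smoothness criterion rather than a validation error); the sketch only shows the alternating structure of the two gradient descents.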
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Czernichow, T. (1997). A double gradient algorithm to optimize regularization. In: Gerstner, W., Germond, A., Hasler, M., Nicoud, JD. (eds) Artificial Neural Networks — ICANN'97. ICANN 1997. Lecture Notes in Computer Science, vol 1327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0020169
DOI: https://doi.org/10.1007/BFb0020169
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63631-1
Online ISBN: 978-3-540-69620-9
eBook Packages: Springer Book Archive