Abstract
A backpropagation learning algorithm for feedforward neural networks with an adaptive learning rate is derived. The algorithm is based upon minimising the instantaneous output error and does not include any simplifications encountered in the corresponding Least Mean Square (LMS) algorithms for linear adaptive filters. The backpropagation algorithm with an adaptive learning rate, which is derived based upon the Taylor series expansion of the instantaneous output error, is shown to exhibit behaviour similar to that of the Normalised LMS (NLMS) algorithm. Indeed, the derived optimal adaptive learning rate of a neural network trained by backpropagation degenerates to the learning rate of the NLMS for a linear activation function of a neuron. By continuity, the optimal adaptive learning rate for neural networks imposes additional stabilisation effects upon the traditional backpropagation learning algorithm.
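The derivation the abstract summarises can be sketched for a single neuron; the notation below (input $\mathbf{x}(k)$, weights $\mathbf{w}(k)$, activation $\Phi$) is our own, not taken from the paper. With output $y(k) = \Phi(\mathbf{x}^T(k)\mathbf{w}(k))$ and instantaneous error $e(k) = d(k) - y(k)$, a first-order Taylor expansion of the a posteriori error about the current weights gives

$$\bar{e}(k) \approx e(k)\left[1 - \eta(k)\,\left(\Phi'\big(\mathbf{x}^T(k)\mathbf{w}(k)\big)\right)^2 \|\mathbf{x}(k)\|_2^2\right],$$

and setting $\bar{e}(k) = 0$ yields the optimal adaptive learning rate

$$\eta_{\mathrm{opt}}(k) = \frac{1}{\left(\Phi'\big(\mathbf{x}^T(k)\mathbf{w}(k)\big)\right)^2\,\|\mathbf{x}(k)\|_2^2}.$$

For a linear activation, $\Phi' \equiv 1$ and $\eta_{\mathrm{opt}}(k) = 1/\|\mathbf{x}(k)\|_2^2$, which is the NLMS step size, consistent with the degeneracy the abstract states.

A minimal numerical sketch of the resulting update for a single tanh neuron follows; the function names and the regularisation term eps are our own assumptions, not the paper's.

import numpy as np

def tanh_prime(v):
    # Derivative of tanh: Phi'(v) = 1 - tanh(v)^2
    return 1.0 - np.tanh(v) ** 2

def adaptive_lr_step(w, x, d, eps=1e-8):
    """One gradient step for a single tanh neuron using the
    Taylor-series-derived adaptive learning rate (sketch)."""
    net = x @ w
    e = d - np.tanh(net)                  # instantaneous output error
    g = tanh_prime(net)                   # activation derivative at the operating point
    eta = 1.0 / (g ** 2 * (x @ x) + eps)  # adaptive learning rate; eps guards division by zero
    return w + eta * e * g * x            # gradient-descent weight update

# With a linear activation (g = 1) the step degenerates to NLMS:
#   w <- w + e * x / (||x||^2 + eps)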
Cite this article
Mandic, D.P., Chambers, J.A. Towards the Optimal Learning Rate for Backpropagation. Neural Processing Letters 11, 1–5 (2000). https://doi.org/10.1023/A:1009686825582