Dynamic Learning with the EM Algorithm for Neural Networks

de Freitas, J.F.G.; Niranjan, M.; Gee, A.H.

doi:10.1023/A:1008103718973

Abstract

In this paper, we derive an EM algorithm for nonlinear state space models. We use it to estimate jointly the neural network weights, the model uncertainty and the noise in the data. In the E-step we apply a forward-backward Rauch-Tung-Striebel smoother to compute the network weights. For the M-step, we derive expressions to compute the model uncertainty and the measurement noise. We find that the method is intrinsically very powerful, simple and stable.
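The E-step/M-step split described above can be illustrated with a minimal sketch. It is not the authors' derivation: it assumes a random-walk model for the weights, w_t = w_{t-1} + d_t with d_t ~ N(0, Q), and a linear scalar observation y_t = h_t' w_t + v_t with v_t ~ N(0, R), where the row H[t] stands in for the linearised network output. The function name em_random_walk, the fixed prior (m0, P0), the starting values q0 and r0, and the synthetic data are all hypothetical choices made for illustration. The E-step runs a Kalman filter followed by a Rauch-Tung-Striebel backward pass; the M-step re-estimates Q and R from the smoothed moments using the standard linear-Gaussian updates.

```python
import numpy as np

def em_random_walk(H, y, n_iter=20, q0=1.0, r0=1.0):
    """EM sketch for w_t = w_{t-1} + d_t, y_t = H[t] @ w_t + v_t."""
    T, d = H.shape
    Q, R = q0 * np.eye(d), r0
    m0, P0 = np.zeros(d), 10.0 * np.eye(d)  # assumed fixed prior on the first weights
    for _ in range(n_iter):
        # E-step, forward pass: Kalman filter
        w_pred = np.zeros((T, d)); P_pred = np.zeros((T, d, d))
        w_filt = np.zeros((T, d)); P_filt = np.zeros((T, d, d))
        for t in range(T):
            w_pred[t] = m0 if t == 0 else w_filt[t - 1]
            P_pred[t] = P0 if t == 0 else P_filt[t - 1] + Q
            h = H[t]
            s = h @ P_pred[t] @ h + R          # innovation variance
            K = P_pred[t] @ h / s              # Kalman gain
            w_filt[t] = w_pred[t] + K * (y[t] - h @ w_pred[t])
            P_filt[t] = P_pred[t] - np.outer(K, h @ P_pred[t])
        # E-step, backward pass: Rauch-Tung-Striebel smoother
        w_s = w_filt.copy(); P_s = P_filt.copy()
        J = np.zeros((T - 1, d, d))            # smoother gains
        for t in range(T - 2, -1, -1):
            J[t] = P_filt[t] @ np.linalg.inv(P_pred[t + 1])
            w_s[t] = w_filt[t] + J[t] @ (w_s[t + 1] - w_pred[t + 1])
            P_s[t] = P_filt[t] + J[t] @ (P_s[t + 1] - P_pred[t + 1]) @ J[t].T
        # M-step: measurement noise from expected squared residuals
        resid = y - np.einsum("td,td->t", H, w_s)
        R = np.mean(resid**2 + np.einsum("td,tde,te->t", H, P_s, H))
        # M-step: process noise from expected squared weight increments
        Q_sum = np.zeros((d, d))
        for t in range(T - 1):
            dw = w_s[t + 1] - w_s[t]
            C = P_s[t + 1] @ J[t].T            # Cov(w_{t+1}, w_t | all data)
            Q_sum += np.outer(dw, dw) + P_s[t + 1] + P_s[t] - C - C.T
        Q = Q_sum / (T - 1)
        Q = 0.5 * (Q + Q.T)                    # remove round-off asymmetry
    return w_s, Q, R

# Synthetic example (all values are illustrative assumptions)
rng = np.random.default_rng(0)
T, d = 200, 2
H = rng.normal(size=(T, d))
w_true = np.cumsum(rng.normal(scale=0.1, size=(T, d)), axis=0)
y = np.einsum("td,td->t", H, w_true) + rng.normal(scale=0.3, size=T)
w_s, Q, R = em_random_walk(H, y)
```

For a neural network, one plausible adaptation would replace each row H[t] with the gradient of the network output with respect to the weights, evaluated at the current estimate, as in extended Kalman filter training. That substitution is an assumption of this sketch rather than something stated in the abstract.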

Author information

Authors and Affiliations

Computer Science Division, University of California, 387 Soda Hall, Berkeley, CA, 94720-1776, USA
J.F.G. de Freitas
Department of Computer Science, University of Sheffield, Regent Court, Portobello Street, Sheffield, S1 4DP
M. Niranjan
Engineering Department, Cambridge University, Trumpington Street, Cambridge, CB2 1PZ, England
A.H. Gee

About this article

Cite this article

de Freitas, J., Niranjan, M. & Gee, A. Dynamic Learning with the EM Algorithm for Neural Networks. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 26, 119–131 (2000). https://doi.org/10.1023/A:1008103718973

Published: 01 August 2000
Issue Date: August 2000
DOI: https://doi.org/10.1023/A:1008103718973
