Abstract
The Expectation-Maximization (EM) algorithm is widely used, in industry as well as in academia, for parameter estimation within a Maximum Likelihood (ML) framework in the presence of missing data. It is well known that EM converges reliably in many cases of practical interest. To the best of our knowledge, however, results stating under which conditions EM converges fast are available only for specific cases. In this paper, we analyze the connection of the EM algorithm to other ascent methods as well as the convergence rates of the EM algorithm in general, including nonlinear models, and apply the results to the PMHT model. We compare EM with other well-known iterative schemes such as gradient and Newton-type methods. It is shown that EM attains Newton-type convergence in the case of well-separated objects, and that a Newton-EM combination is robust and efficient even for closely spaced targets.
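The flavor of the EM iteration discussed in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not taken from the paper: a two-component 1D Gaussian mixture with equal weights and known common variance, where only the component means are estimated. The function name `em_means` and the data-generation setup are illustrative choices; the well-separated means mimic the "well-separated objects" regime in which the paper shows EM converges fast.

```python
import math
import random

def em_means(data, mu, sigma=1.0, iters=50):
    """EM for a two-component 1D Gaussian mixture with equal weights and
    known common variance sigma; only the means are estimated.
    A toy analogue of the well-separated-objects setting."""
    mu1, mu2 = mu
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point
        r = []
        for x in data:
            p1 = math.exp(-0.5 * ((x - mu1) / sigma) ** 2)
            p2 = math.exp(-0.5 * ((x - mu2) / sigma) ** 2)
            r.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted sample means (closed form here)
        s1 = sum(r)
        s2 = len(data) - s1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / s1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / s2
    return mu1, mu2

# Synthetic data: two well-separated clusters around -5 and +5
random.seed(0)
data = [random.gauss(-5.0, 1.0) for _ in range(500)] + \
       [random.gauss(5.0, 1.0) for _ in range(500)]
m1, m2 = em_means(data, mu=(-1.0, 1.0))
```

In this well-separated regime the responsibilities quickly become nearly 0/1, so each EM step is close to a full reassignment-and-average update and the iteration converges in very few steps; for closely spaced clusters the responsibilities stay diffuse and convergence slows, which is the situation the paper's Newton-EM combination targets.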
About this article
Cite this article
Springer, T., Urban, K. Comparison of the EM algorithm and alternatives. Numer Algor 67, 335–364 (2014). https://doi.org/10.1007/s11075-013-9794-8