Comparison of the EM algorithm and alternatives

Original Paper, published in Numerical Algorithms.

Abstract

The Expectation-Maximization (EM) algorithm is widely used, also in industry, for parameter estimation within a Maximum Likelihood (ML) framework in the presence of missing data. It is well known that EM converges well in several cases of practical interest. To the best of our knowledge, however, results showing under which conditions EM converges fast are available only for specific cases. In this paper, we analyze the connection of the EM algorithm to other ascent methods, as well as the convergence rates of the EM algorithm in general, including nonlinear models, and apply the results to the Probabilistic Multi-Hypothesis Tracking (PMHT) model. We compare EM with other well-known iterative schemes such as gradient and Newton-type methods. It is shown that EM reaches Newton-type convergence in the case of well-separated objects, and that a Newton-EM combination is robust and efficient even for closely spaced targets.
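To illustrate the kind of comparison the abstract describes, here is a small self-contained sketch (not the paper's code, and much simpler than the PMHT setting): EM for a two-component one-dimensional Gaussian mixture with equal weights and known unit variances, followed by a Newton refinement of the log-likelihood. All function names and the finite-difference derivatives are illustrative assumptions; the paper itself works with analytic quantities.

```python
import numpy as np

def responsibilities(x, mu):
    # E-step: posterior probability of component 0 (equal weights, unit variances)
    d0 = np.exp(-0.5 * (x - mu[0]) ** 2)
    d1 = np.exp(-0.5 * (x - mu[1]) ** 2)
    return d0 / (d0 + d1)

def em_step(x, mu):
    # M-step: responsibility-weighted means
    r0 = responsibilities(x, mu)
    return np.array([np.sum(r0 * x) / np.sum(r0),
                     np.sum((1 - r0) * x) / np.sum(1 - r0)])

def log_lik(x, mu):
    # Mixture log-likelihood up to an additive constant
    d0 = np.exp(-0.5 * (x - mu[0]) ** 2)
    d1 = np.exp(-0.5 * (x - mu[1]) ** 2)
    return np.sum(np.log(0.5 * d0 + 0.5 * d1))

def newton_step(x, mu, h=1e-5):
    # One Newton step on the log-likelihood; gradient and Hessian are
    # approximated by central finite differences (a stand-in for the
    # analytic derivatives one would use in practice)
    n = len(mu)
    g = np.zeros(n)
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        g[i] = (log_lik(x, mu + ei) - log_lik(x, mu - ei)) / (2 * h)
        for j in range(n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (log_lik(x, mu + ei + ej) - log_lik(x, mu + ei - ej)
                       - log_lik(x, mu - ei + ej) + log_lik(x, mu - ei - ej)) / (4 * h * h)
    return mu - np.linalg.solve(H, g)

# Well-separated components: EM alone already converges quickly here
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

mu = np.array([-1.0, 1.0])   # crude initial guess
for _ in range(5):           # a few cheap EM steps to get near the optimum
    mu = em_step(x, mu)
for _ in range(3):           # Newton refinement near the maximizer
    mu = newton_step(x, mu)
print(mu)
```

The hybrid mirrors the strategy in the abstract: EM steps are cheap and stable far from the optimum, while Newton steps exploit fast local convergence once the iterate is close; for closely spaced components the Newton phase is where the combination pays off.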



Author information


Correspondence to Karsten Urban.


About this article

Cite this article

Springer, T., Urban, K. Comparison of the EM algorithm and alternatives. Numer Algor 67, 335–364 (2014). https://doi.org/10.1007/s11075-013-9794-8

