Degeneracy of the EM algorithm for the MLE of multivariate Gaussian mixtures and dynamic constraints

https://doi.org/10.1016/j.csda.2010.10.026

Abstract

EM algorithms for multivariate normal mixture decomposition have recently been proposed in order to maximize the likelihood function over a constrained parameter space that has no singularities and a reduced number of spurious local maxima. However, such approaches require some a priori information about the eigenvalues of the covariance matrices. The behavior of the EM algorithm near a degenerate solution is investigated. The theoretical results obtained suggest a new kind of constraint based on the dissimilarity between two consecutive updates of the eigenvalues of each covariance matrix. The performance of such a “dynamic” constraint is evaluated through numerical experiments.

Section snippets

The problem

The EM algorithm is a well-known and largely studied general purpose method for maximum likelihood estimation in incomplete data problems, see e.g. Dempster et al. (1977) and McLachlan and Krishnan (2008). For a given i.i.d. random sample $\{x_n\}_{n=1,\dots,N}$ of size $N$ drawn from the density $f(x;\psi)$, where $x \in \mathbb{R}^q$ and the parameter $\psi$ assumes values in a subset $\Psi$ of a suitable Euclidean space, the EM algorithm generates a sequence of estimates $\{\psi^{(m)}\}_m$, where $\psi^{(0)}$ denotes the initial guess and $\psi^{(m)} \in \Psi$ for $m \in \mathbb{N}$,
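To make the iteration concrete, the following is a minimal sketch (ours, not the paper's) of a single EM step for a $k$-component multivariate normal mixture in plain NumPy: the E-step evaluates the posterior membership probabilities and the M-step updates the weights, means and covariance matrices. The function name em_step and the optional reg ridge parameter are illustrative assumptions.

```python
import numpy as np

def em_step(X, weights, means, covs, reg=0.0):
    """One EM iteration for a k-component Gaussian mixture.

    X: (N, q) data matrix; weights: (k,); means: (k, q); covs: (k, q, q).
    reg: optional ridge added to each covariance (0 = unconstrained EM).
    """
    N, q = X.shape
    k = len(weights)

    # E-step: posterior probabilities phi[n, j] of component membership.
    log_dens = np.empty((N, k))
    for j in range(k):
        diff = X - means[j]
        cov = covs[j] + reg * np.eye(q)
        sign, logdet = np.linalg.slogdet(cov)
        maha = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(cov), diff)
        log_dens[:, j] = np.log(weights[j]) - 0.5 * (q * np.log(2 * np.pi) + logdet + maha)
    log_dens -= log_dens.max(axis=1, keepdims=True)   # numerical stability before exponentiating
    phi = np.exp(log_dens)
    phi /= phi.sum(axis=1, keepdims=True)

    # M-step: update weights, means and covariance matrices.
    Nj = phi.sum(axis=0)
    new_weights = Nj / N
    new_means = (phi.T @ X) / Nj[:, None]
    new_covs = np.empty_like(covs)
    for j in range(k):
        diff = X - new_means[j]
        new_covs[j] = (phi[:, j, None] * diff).T @ diff / Nj[j]
    return new_weights, new_means, new_covs, phi
```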

Theoretical results

In this section we extend previous results due to Biernacki and Chretien (2003) to the multivariate case.

Let $D$ be a subset of $\{1,2,\dots,N\}$ with $d$ elements and assume $d \le q$. Let us denote by $j_0$ ($1 \le j_0 \le k$) a degenerate component of the mixture (1), and set the vector $v_0 = [\{1/\phi_{nj_0}\}_{n \in D}, \{\phi_{nj_0}\}_{n \notin D}]$, where $\phi_{nj_0}$ is defined in (2). For a degenerate component the Euclidean norm $\|v_0\|$ is small.
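As an illustration of why a component concentrating on $d \le q$ points degenerates (a numerical check of ours, not part of the paper's argument): the sample covariance matrix computed from $d \le q$ points in $\mathbb{R}^q$ has rank at most $d-1$, so its smallest eigenvalues are numerically zero.

```python
import numpy as np

rng = np.random.default_rng(0)
q, d = 5, 3                      # dimension q, number of points d <= q
points = rng.standard_normal((d, q))

# The sample covariance of d <= q points in R^q has rank at most d-1 < q,
# so its smallest eigenvalues are (numerically) zero: the fitted component
# is degenerate and its density blows up along the null directions.
S = np.cov(points, rowvar=False, bias=True)
eigvals = np.linalg.eigvalsh(S)
print(eigvals)                   # the q - (d-1) smallest eigenvalues are ~0
```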

Furthermore, we shall consider the following assumption about the eigenvalues and the eigenvectors of the covariance

A dynamic constraint on the eigenvalues

The last result presented in the previous section states that if the EM algorithm fits a degenerate component, say $j_0$, then the smallest eigenvalue of $\Sigma_{j_0}$ tends to zero at an exponential rate. This suggests that during the EM iterations the eigenvalues may vary very rapidly. Such behavior is particularly dangerous when the current parameter estimates are far from the optimal solution, as in the first iterations of the algorithm. Thus, we conjecture that such bad behavior should
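The snippet does not show the exact form of the constraint, so the following is only a rough sketch under our own assumptions: each eigenvalue of the updated covariance matrix is kept within a factor $\vartheta$ of the corresponding eigenvalue from the previous iteration, with $\vartheta$ playing the role of a tuning threshold such as the $\vartheta_a$ used in the experiments below. The function name constrain_eigenvalues and the eigenvalue matching by sorted order are our choices, not the paper's.

```python
import numpy as np

def constrain_eigenvalues(cov_new, cov_old, theta=1.111):
    """Hypothetical dynamic constraint: limit how much the eigenvalues of a
    component covariance matrix may change between consecutive EM iterations.

    Eigenvalues of cov_new are clipped to [lam_old / theta, lam_old * theta],
    where lam_old are the eigenvalues of cov_old (matched by sorted order)
    and theta > 1 controls how fast the eigenvalues are allowed to move.
    """
    lam_old, _ = np.linalg.eigh(cov_old)
    lam_new, V = np.linalg.eigh(cov_new)
    lam_clipped = np.clip(lam_new, lam_old / theta, lam_old * theta)
    # Rebuild the constrained covariance from the clipped spectrum.
    return (V * lam_clipped) @ V.T
```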

Numerical experiments

In this section we present numerical experiments on both simulated and real data in order to evaluate and compare the performance of the proposed dynamic constraint under different settings. In particular, we consider the following six algorithms:

  • U (Unconstrained): Ordinary EM. One random starting point.

  • U2 (Unconstrained): Ordinary EM. Two random starting points; the solution giving the highest likelihood value is chosen.

  • LCS (Lower dynamically constrained, strong): Constrained EM algorithm with $\vartheta_a = 1.111$ and

Concluding remarks

In this paper we have addressed two main issues. The first part has been devoted to extending to the multivariate case some results on the convergence of the EM algorithm given in Biernacki and Chretien (2003). In particular, our main result generalizes Theorem 2 of that paper: we showed that near degeneracy the smallest eigenvalue of the degenerate component's covariance matrix tends to zero at an exponential rate. Based on this result, in the second part of the paper we have

Acknowledgements

The authors sincerely thank the Associate editor and the anonymous referees for their very helpful comments and suggestions.


Cited by (33)

  • Improved Inference of Gaussian Mixture Copula Model for Clustering and Reproducibility Analysis using Automatic Differentiation

    2022, Econometrics and Statistics
    Citation Excerpt:

    Degeneracy of maximum likelihood in Gaussian mixture models has been well studied. The likelihood of a GMM with unrestricted covariance matrices is unbounded (Day, 1969; Ingrassia, 2004; Chen and Tan, 2009; Ingrassia and Rocci, 2011). This leads to degeneracy when the component covariance matrices become singular.

  • A general hidden state random walk model for animal movement

    2017, Computational Statistics and Data Analysis