Parsimonious Gaussian mixture models

Statistics and Computing

Abstract

Parsimonious Gaussian mixture models are developed using a latent Gaussian model that is closely related to the factor analysis model. These models provide a unified modeling framework that includes the mixtures of probabilistic principal component analyzers and mixtures of factor analyzers models as special cases.
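For orientation, the standard factor analysis model can be written as below. The notation is an assumption made here for illustration and is not taken from the abstract: x is a p-dimensional observation, u is a q-dimensional latent factor with q < p, Λ is a p × q loading matrix, and Ψ is a diagonal noise covariance matrix.

```latex
\begin{align*}
\mathbf{x} &= \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{u} + \boldsymbol{\epsilon},
\qquad \mathbf{u} \sim N(\mathbf{0}, \mathbf{I}_q),
\quad \boldsymbol{\epsilon} \sim N(\mathbf{0}, \boldsymbol{\Psi}), \\
\mathbf{x} &\sim N\!\left(\boldsymbol{\mu},\;
\boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}\right).
\end{align*}
```

In this notation, restricting Ψ to the isotropic form ψI gives the probabilistic principal component analysis covariance structure, and letting μ, Λ and Ψ differ across mixture components gives a mixture of factor analyzers.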

In particular, a class of eight parsimonious Gaussian mixture models based on the mixtures of factor analyzers model is introduced, and the maximum likelihood estimates of the parameters in these models are found using an alternating expectation-conditional maximization (AECM) algorithm. The class of models includes parsimonious models that have not previously been developed.
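One way to see why such models are parsimonious is to count covariance parameters. The sketch below assumes, since the abstract does not spell it out, that each of G components has covariance Λ_g Λ_g' + Ψ_g with a p × q loading matrix Λ_g and diagonal Ψ_g, and that the eight models arise from three yes/no constraints: loadings shared across components, noise covariance shared across components, and isotropic noise (Ψ_g = ψ_g I). The function names and the C/U labels (constrained/unconstrained) are shorthand introduced here for illustration.

```python
from itertools import product


def covariance_param_count(p, q, G, common_loadings, common_noise, isotropic_noise):
    """Count free covariance parameters under the assumed constraints.

    p: number of observed variables, q: number of latent factors,
    G: number of mixture components.
    """
    # Free parameters in one p x q loading matrix, after removing the
    # q(q-1)/2 parameters lost to rotational indeterminacy.
    per_loading = p * q - q * (q - 1) // 2
    n_loading = per_loading if common_loadings else G * per_loading

    # One noise covariance is a single scalar if isotropic, else p diagonal terms.
    per_noise = 1 if isotropic_noise else p
    n_noise = per_noise if common_noise else G * per_noise

    return n_loading + n_noise


def all_models(p, q, G):
    """Return {label: count} for the eight constraint combinations."""
    counts = {}
    for flags in product([True, False], repeat=3):
        # Label each model by its three constraints: C = constrained, U = unconstrained.
        label = "".join("C" if f else "U" for f in flags)
        counts[label] = covariance_param_count(p, q, G, *flags)
    return counts


if __name__ == "__main__":
    for label, count in all_models(p=13, q=2, G=3).items():
        print(label, count)
```

Under these assumptions, with p = 13, q = 2 and G = 3, the fully constrained model has 26 covariance parameters and the fully unconstrained one has 114, compared with G·p(p+1)/2 = 273 for three unrestricted covariance matrices.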

These models are applied to the analysis of chemical and physical properties of Italian wines and the chemical properties of coffee; the models are shown to give excellent clustering performance.

Corresponding author

Correspondence to Thomas Brendan Murphy.

Cite this article

McNicholas, P.D., Murphy, T.B. Parsimonious Gaussian mixture models. Stat Comput 18, 285–296 (2008). https://doi.org/10.1007/s11222-008-9056-0

  • DOI: https://doi.org/10.1007/s11222-008-9056-0
