Identifiable finite mixtures of location models for clustering mixed-mode data

Statistics and Computing

Abstract

For clustering mixed categorical and continuous data, Lawrence and Krzanowski (1996) proposed a finite mixture model in which component densities conform to the location model. In the graphical models literature the location model is known as the homogeneous Conditional Gaussian model. In this paper it is shown that their model is not identifiable without imposing additional restrictions. Specifically, for g groups and m locations, $(g!)^{m-1}$ distinct sets of parameter values (not including permutations of the group mixing parameters) produce the same likelihood function. Excessive shrinkage of parameter estimates in a simulation experiment reported by Lawrence and Krzanowski (1996) is shown to be an artifact of the model's non-identifiability. Identifiable finite mixture models can be obtained by imposing restrictions on the conditional means of the continuous variables. These new identified models are assessed in simulation experiments. The conditional mean structure of the continuous variables in the restricted location mixture models is similar to that in the underlying variable mixture models proposed by Everitt (1988), but the restricted location mixture models are more computationally tractable.
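
The non-identifiability claim in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a single continuous variable with a common standard deviation across groups and locations, and the names `mixture_density`, `permute_within_location`, and `n_equivalent_sets` are hypothetical. It relabels the groups inside one location only, refactors the joint weights into new mixing and location probabilities, and leaves the joint density unchanged.

```python
import math
import numpy as np

def mixture_density(s, x, alpha, p, mu, sigma):
    """Joint density of (location s, continuous value x).

    alpha: (g,) group mixing proportions
    p:     (g, m) location probabilities within each group (rows sum to 1)
    mu:    (g, m) conditional means of x given group and location
    sigma: common standard deviation (assumed scalar for this sketch)
    """
    w = alpha * p[:, s]  # joint weight of group i and location s
    z = (x - mu[:, s]) / sigma
    dens = np.exp(-0.5 * z ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return float(np.sum(w * dens))

def permute_within_location(alpha, p, mu, s, perm):
    """Relabel groups inside location s only, then refactor the weights."""
    w = alpha[:, None] * p          # (g, m) joint group-by-location weights
    w_new, mu_new = w.copy(), mu.copy()
    w_new[:, s] = w[perm, s]        # permute weights in column s
    mu_new[:, s] = mu[perm, s]      # permute the matching means
    alpha_new = w_new.sum(axis=1)   # new mixing proportions
    p_new = w_new / alpha_new[:, None]
    return alpha_new, p_new, mu_new

def n_equivalent_sets(g, m):
    """Count from the abstract: (g!)^(m-1), excluding global relabelings."""
    return math.factorial(g) ** (m - 1)

# Illustrative values only (g = 2 groups, m = 2 locations).
alpha = np.array([0.4, 0.6])
p = np.array([[0.3, 0.7], [0.5, 0.5]])
mu = np.array([[0.0, 1.0], [3.0, 5.0]])
alpha2, p2, mu2 = permute_within_location(alpha, p, mu, s=1, perm=[1, 0])
```

Here `alpha2` differs from `alpha`, yet `mixture_density` returns the same value for every `(s, x)` under both parameter sets, which is the kind of equivalence the count $(g!)^{m-1}$ refers to.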

References

  • Celeux, G. and Govaert, G. (1995) Gaussian parsimonious clustering models. Pattern Recognition, 28, 781–793.

  • Everitt, B. S. (1988) A finite mixture model for the clustering of mixed-mode data. Statistics and Probability Letters, 6, 305–309.

  • Everitt, B. S. and Merette, C. (1990) The clustering of mixed-mode data: a comparison of possible approaches. Journal of Applied Statistics, 17, 283–297.

  • Krzanowski, W. J. (1993) The location model for mixtures of categorical and continuous variables. Journal of Classification, 10, 25–49.

  • Lawrence, C. J. and Krzanowski, W. J. (1996) Mixture separation for mixed-mode data. Statistics and Computing, 6, 85–92.

  • McLachlan, G. J. and Basford, K. E. (1988) Mixture Models: Inference and Applications to Clustering, Marcel Dekker, New York.

  • McLachlan, G. J. and Krishnan, T. (1997) The EM Algorithm and Extensions, Wiley, New York.

  • Redner, R. A. and Walker, H. F. (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Review, 26, 195–239.

  • Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985) Statistical Analysis of Finite Mixture Distributions, Wiley, New York.

  • Whittaker, J. (1990) Graphical Models in Applied Multivariate Statistics, Wiley, Chichester.

  • Yakowitz, S. J. and Spragins, J. D. (1968) On the identifiability of finite mixtures. Annals of Mathematical Statistics, 40, 1728–1735.

Cite this article

Willse, A., Boik, R.J. Identifiable finite mixtures of location models for clustering mixed-mode data. Statistics and Computing 9, 111–121 (1999). https://doi.org/10.1023/A:1008842432747
