Mixture structure analysis using the Akaike Information Criterion and the bootstrap

Solka, Jeffrey L.; Wegman, Edward J.; Priebe, Carey E.; Poston, Wendy L.; Rogers, George W.

doi:10.1023/A:1008924323509


Published: August 1998

Statistics and Computing, volume 8, pages 177–188 (1998)


Abstract

Given i.i.d. observations x_1, x_2, ..., x_n drawn from a mixture of normal terms, one is often interested in determining the number of terms in the mixture and their defining parameters. Although determining the number of terms is intractable under the most general assumptions, the mixture structure can be elucidated given appropriate restrictions on the underlying mixture. This paper examines a new approach to this problem: data-driven mixture models are fitted to resampled (bootstrap) data sets and then pruned using the Akaike Information Criterion (AIC). Results of applying the procedure to artificially generated data sets and to a real-world data set are provided.
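The general idea of combining AIC with the bootstrap can be illustrated with a small sketch. This is not the authors' pruning procedure, whose details are not given in the abstract. It swaps in a simpler, commonly used stand-in: for each bootstrap resample, fit univariate normal mixtures with 1 to max_terms terms by EM, score each with AIC = -2 log L + 2k (with k = 3m - 1 free parameters for an m-term univariate mixture), and tally which number of terms wins. All function names, the quantile-based initialization, the variance floor min_var, and the default settings are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, ws, mus, vs):
    """Density of the normal mixture at x."""
    return sum(w * normal_pdf(x, mu, v) for w, mu, v in zip(ws, mus, vs))


def fit_mixture(xs, m, iters=50, min_var=1e-3):
    """Fit an m-term univariate normal mixture by EM; return the log-likelihood.

    Assumptions of this sketch: deterministic start at evenly spaced sample
    quantiles, and a variance floor (min_var) so that a term cannot collapse
    onto repeated points, which bootstrap resamples always contain.
    """
    n = len(xs)
    s = sorted(xs)
    mean = sum(xs) / n
    var0 = max(sum((x - mean) ** 2 for x in xs) / n, min_var)
    ws = [1.0 / m] * m
    mus = [s[int((j + 0.5) * n / m)] for j in range(m)]
    vs = [var0] * m
    for _ in range(iters):
        # E-step: posterior probability of each term for each observation.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, mu, v) for w, mu, v in zip(ws, mus, vs)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(m):
            nj = max(sum(r[j] for r in resp), 1e-12)
            ws[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            vs[j] = max(var_j, min_var)
    return sum(math.log(mixture_pdf(x, ws, mus, vs) or 1e-300) for x in xs)


def aic(loglik, m):
    """AIC for an m-term univariate normal mixture (3m - 1 free parameters)."""
    return -2.0 * loglik + 2.0 * (3 * m - 1)


def select_terms(xs, max_terms=3):
    """Return the number of terms with the smallest AIC."""
    scores = {m: aic(fit_mixture(xs, m), m) for m in range(1, max_terms + 1)}
    return min(scores, key=scores.get)


def bootstrap_term_counts(xs, n_boot=20, max_terms=3, seed=0):
    """Tally the AIC-selected number of terms over bootstrap resamples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        resample = [rng.choice(xs) for _ in xs]  # sample with replacement
        counts[select_terms(resample, max_terms)] += 1
    return counts


# Usage: artificial data from two well-separated normal terms.
data_rng = random.Random(1)
data = [data_rng.gauss(-4.0, 1.0) for _ in range(100)]
data += [data_rng.gauss(4.0, 1.0) for _ in range(100)]
counts = bootstrap_term_counts(data)
print(sorted(counts.items()))
```

The tally across resamples gives a rough picture of how stable the AIC choice is under sampling variation. The variance floor is scale dependent, so it would need adjusting for data measured in very different units.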



Author information

Authors and Affiliations

Systems Research and Technology Department, Advanced Computation Technology Group, Code B10, Dahlgren Division of the Naval Surface Warfare Center, Dahlgren, VA, 22448-5100, USA
Jeffrey L. Solka, Wendy L. Poston & George W. Rogers
Center for Computational Statistics, George Mason University, Fairfax, VA, 22030-4444, USA
Edward J. Wegman
Department of Mathematical Sciences, The Johns Hopkins University, Baltimore, MD, 21218, USA
Carey E. Priebe


About this article

Cite this article

Solka, J.L., Wegman, E.J., Priebe, C.E. et al. Mixture structure analysis using the Akaike Information Criterion and the bootstrap. Statistics and Computing 8, 177–188 (1998). https://doi.org/10.1023/A:1008924323509
