Estimation of finite mixtures with symmetric components

Statistics and Computing

Abstract

Background knowledge sometimes makes it clear that a population under investigation consists, in some proportions, of a known number of subpopulations whose distributions belong to the same, yet unknown, family. A parametric family is commonly used in practice, but a nonparametric family can be considered instead to avoid distributional misspecification. In this article, we propose using a mixture-based nonparametric family for the component distribution in a finite mixture model, in contrast to recent research that uses a kernel-based approach. In particular, we present a semiparametric maximum likelihood estimation procedure for the model parameters and address the bandwidth selection problem with several popular model selection methods. Empirical comparisons through simulation studies and three real data sets suggest that estimators based on our mixture-based approach are more efficient than those based on the kernel-based approach, in terms of both parameter estimation and overall density estimation.
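
As a rough illustration of the kind of model the abstract describes, the sketch below evaluates a finite mixture whose components share one symmetric density, shifted to different locations. The specific construction is an assumption made for illustration only, not taken from the article: the symmetric component is built as a discrete mixture of normal pairs centred at +theta and -theta with common standard deviation h, where h plays the role of the bandwidth. All function names and numeric values are hypothetical, and no estimation is performed.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def symmetric_component(x, thetas, weights, h):
    """Assumed symmetric-about-zero density: a weighted mixture of normal
    pairs centred at +theta and -theta, each with standard deviation h."""
    return sum(
        w * 0.5 * (normal_pdf(x, t, h) + normal_pdf(x, -t, h))
        for t, w in zip(thetas, weights)
    )


def mixture_density(x, props, locs, thetas, weights, h):
    """Finite mixture: each subpopulation uses the same symmetric
    component density, shifted to its own location."""
    return sum(
        p * symmetric_component(x - m, thetas, weights, h)
        for p, m in zip(props, locs)
    )


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule, used only as a sanity check."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


# Hypothetical parameter values, chosen only for illustration.
props = [0.3, 0.7]      # mixing proportions of the two subpopulations
locs = [-2.0, 3.0]      # centres of symmetry of the two components
thetas = [0.0, 1.5]     # support points of the assumed inner mixing distribution
weights = [0.6, 0.4]    # masses at those support points
h = 0.5                 # bandwidth-like scale parameter

total_mass = integrate(
    lambda x: mixture_density(x, props, locs, thetas, weights, h), -20.0, 20.0
)
print(f"integral of the mixture density over [-20, 20]: {total_mass:.6f}")
```

Under these assumptions, a smaller h lets the component density follow finer structure and a larger h smooths it, which is why choosing h resembles a model selection problem.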

Corresponding author

Correspondence to Yong Wang.

About this article

Cite this article

Chee, CS., Wang, Y. Estimation of finite mixtures with symmetric components. Stat Comput 23, 233–249 (2013). https://doi.org/10.1007/s11222-011-9305-5

  • DOI: https://doi.org/10.1007/s11222-011-9305-5
