On bootstrapping the number of components in finite mixtures of Poisson distributions

Statistics and Computing

Abstract

Finite mixture models arise naturally as models of unobserved population heterogeneity. The population is assumed to consist of an unknown number k of subpopulations with parameters λ1, ..., λk and corresponding weights p1, ..., pk. Because of the irregularity of the parameter space, the log-likelihood-ratio statistic (LRS) does not have a χ² limit distribution, which makes it difficult to use the LRS to test for the number of components. These problems are circumvented by using the nonparametric bootstrap: the mixture algorithm is applied B times to bootstrap samples drawn with replacement from the original sample. The number of components k is then estimated as the mode of the bootstrap distribution of k. The approach is illustrated using the Times newspaper data and investigated in a simulation study for mixtures of Poisson data.
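The resampling scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: the abstract does not say how the mixture algorithm determines k within each bootstrap sample, so the sketch substitutes a plain EM fit for k = 1, ..., k_max with selection by BIC. All function names (`fit_poisson_mixture`, `estimate_k_bic`, `bootstrap_k`) and the defaults for `k_max`, `n_iter` and `B` are hypothetical.

```python
import math
import random
from collections import Counter


def poisson_logpmf(x, lam):
    """Log of the Poisson probability mass function."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def fit_poisson_mixture(data, k, n_iter=200):
    """Plain EM for a k-component Poisson mixture.

    Returns the log-likelihood evaluated in the last E-step.
    """
    counts = Counter(data)  # distinct values with their frequencies
    xs = sorted(counts)
    n = len(data)
    lo, hi = xs[0], xs[-1]
    # Assumption: start with means spread over the observed range.
    lams = [max(lo + (hi - lo) * (j + 0.5) / k, 1e-3) for j in range(k)]
    ws = [1.0 / k] * k
    loglik = -math.inf
    for _ in range(n_iter):
        new_w = [0.0] * k
        new_s = [0.0] * k
        loglik = 0.0
        for x in xs:
            f = counts[x]
            dens = [w * math.exp(poisson_logpmf(x, lam))
                    for w, lam in zip(ws, lams)]
            tot = max(sum(dens), 1e-300)  # guard against underflow
            loglik += f * math.log(tot)
            for j in range(k):
                r = f * dens[j] / tot  # responsibility times frequency
                new_w[j] += r
                new_s[j] += r * x
        # M-step: keep the old mean if a component has emptied out.
        lams = [max(s / w, 1e-3) if w > 1e-12 else lam
                for s, w, lam in zip(new_s, new_w, lams)]
        ws = [w / n for w in new_w]
    return loglik


def estimate_k_bic(data, k_max=4):
    """Stand-in for the mixture algorithm: choose k by BIC (an assumption)."""
    n = len(data)
    best_k, best_bic = 1, math.inf
    for k in range(1, k_max + 1):
        # k means plus k - 1 free weights give 2k - 1 parameters.
        bic = -2.0 * fit_poisson_mixture(data, k) + (2 * k - 1) * math.log(n)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k


def bootstrap_k(data, estimate_k, B=200, seed=0):
    """Nonparametric bootstrap: resample with replacement B times,
    estimate k on each resample, and return the mode with all counts."""
    rng = random.Random(seed)
    n = len(data)
    dist = Counter(estimate_k(rng.choices(data, k=n)) for _ in range(B))
    return dist.most_common(1)[0][0], dist
```

A call such as `mode, dist = bootstrap_k(counts, estimate_k_bic, B=500)` returns the modal k together with the full bootstrap distribution, which is worth inspecting because a flat distribution signals an unstable choice. Passing the per-sample estimator as an argument keeps the resampling loop independent of whichever mixture algorithm is actually used.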

Author information

Corresponding author

Correspondence to Peter Schlattmann.

About this article

Cite this article

Schlattmann, P. On bootstrapping the number of components in finite mixtures of Poisson distributions. Stat Comput 15, 179–188 (2005). https://doi.org/10.1007/s11222-005-1307-8

  • DOI: https://doi.org/10.1007/s11222-005-1307-8
