Modeling data with a truncated and inflated Poisson distribution

  • Original Paper
  • Published in: Statistical Methods & Applications

Abstract

Zero-inflated Poisson regression is commonly used to analyze data with excessive zeros. Although many models have been developed to fit zero-inflated data, most of them depend strongly on the special features of the individual data. For example, new models are needed for data that are both truncated and inflated. In this paper, we propose a new model that is sufficiently flexible to model inflation and truncation simultaneously. It is a mixture of a multinomial logistic and a truncated Poisson regression: the multinomial logistic component models the occurrence of excessive counts, and the truncated Poisson regression models the counts that are assumed to follow a truncated Poisson distribution. The performance of the proposed model is evaluated through simulation studies, and it is found to have the smallest mean absolute error and the best model fit. In the empirical example, the data are truncated with inflated values of zero and fourteen, and the results show that our model fits better than the competing models.
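To make the mixture structure concrete, the following minimal Python sketch shows one plausible reading of such a model. It is an illustration under stated assumptions, not the parametrization used in the paper: the function names, the softmax form of the mixing weights with the truncated Poisson component as the reference class, and the numerical values are all hypothetical. Only the right-truncated Poisson probability follows the form that appears in the appendix.

```python
import math


def truncated_poisson_pmf(y, eta, K):
    """Right-truncated Poisson probability on {0, ..., K} with rate exp(eta)."""
    lam = math.exp(eta)
    norm = sum(lam ** z / math.factorial(z) for z in range(K + 1))
    return (lam ** y / math.factorial(y)) / norm


def mixing_weights(logits):
    """Multinomial-logistic (softmax) weights.

    Assumption: one logit per inflated value, and the truncated Poisson
    component is the reference class with logit fixed at 0 (last weight).
    """
    exps = [math.exp(g) for g in logits] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


def inflated_truncated_pmf(y, eta, K, inflated_points, logits):
    """Mixture of point masses at the inflated values and a truncated Poisson."""
    weights = mixing_weights(logits)
    # Contribution of the truncated Poisson component.
    p = weights[-1] * truncated_poisson_pmf(y, eta, K)
    # Extra point mass if y is one of the inflated values.
    for point, w in zip(inflated_points, weights):
        if y == point:
            p += w
    return p


def total_mass(eta=1.5, K=14, inflated_points=(0, 14), logits=(-1.0, -2.0)):
    """Sum of the mixture probabilities over the support {0, ..., K}."""
    return sum(
        inflated_truncated_pmf(y, eta, K, inflated_points, logits)
        for y in range(K + 1)
    )


if __name__ == "__main__":
    print(total_mass())  # should be 1 up to floating-point rounding
```

In a regression setting, `eta` and each entry of `logits` would be replaced by linear predictors in the covariates; the inflated values 0 and 14 are chosen here only to echo the empirical example.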


Acknowledgements

This research is supported by the Ministry of Science and Technology, Taiwan, R.O.C., under research grants MOST 103-2410-H-305-041 and 103-2118-M-305-001.

Author information

Corresponding author

Correspondence to Ting Hsiang Lin.

Appendix

Proof of Lemma 1

Since

$$\begin{aligned}&\frac{\partial }{\partial \eta _{i}}\left[ {\left( {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{y_{i}}}{y_{i}!}}\right) \left( {\sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}}\right) ^{-1}}\right] \\&\quad =\left( {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{y_{i} }}{y_{i}!}}\right) \left( {y_{i}\sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i} }}\right) ^{z}}{z!}- \sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}z}{z!}}}}\right) \left( {\sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}}\right) ^{-2}\\&\quad =\left[ {\left( {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{y_{i}}}{y_{i}!}}\right) \left( {\sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}}\right) ^{-1}}\right] \left[ y_{i}- \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}z}{z!}}\right) \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1}\right] , \end{aligned}$$

hence, writing \(f_{\eta _{i}}(y_{i})\) for the bracketed truncated Poisson probability above, we have

$$\begin{aligned} \frac{\partial }{\partial \eta _{i}}\left[ {f_{\eta _{i}}(y_{i})}\right]= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}- \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i} }}\right) ^{z}z}{z!}}\right) \left( \sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}- \left( 0+\sum \limits _{z=1}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}z}{z!}}\right) \left( \sum \limits _{z=0}^K {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}- \left( \sum \limits _{z=1}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{(z-1)!}}\right) \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}- \left( \text{ e }^{\eta _{i}}\sum \limits _{z=1}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z-1}}{(z-1)!}}\right) \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}-\text{ e }^{\eta _{i}} \left( \sum \limits _{z=0}^{K-1}{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}-\text{ e }^{\eta _{i}} \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}- \frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{K}}{K!}\right) \left( \sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}\right) ^{-1} \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}-\text{ e }^{\eta _{i}} \left( {1-\left( {\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{K}}{K!}} \right) \left( {\sum \limits _{z=0}^K{\frac{\left( {\text{ e }^{\eta _{i}}}\right) ^{z}}{z!}}}\right) ^{-1}}\right) \right] \\
= & {} f_{\eta _{i}}(y_{i}) \left[ y_{i}-\text{ e }^{\eta _{i}} \left( {1-f_{\eta _{i}}(K)}\right) \right] . \end{aligned}$$

This proves the assertion. \(\square \)
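The identity just derived can be checked numerically. The short Python sketch below compares the closed form \(f_{\eta _{i}}(y_{i})\left[ y_{i}-\text{e}^{\eta _{i}}(1-f_{\eta _{i}}(K))\right] \) with a central finite difference in \(\eta _{i}\). The function names, the step size and the values of \(\eta _{i}\) and \(K\) are illustrative assumptions and are not taken from the paper.

```python
import math


def truncated_poisson_pmf(y, eta, K):
    """Right-truncated Poisson probability f_eta(y) on {0, ..., K}."""
    lam = math.exp(eta)
    norm = sum(lam ** z / math.factorial(z) for z in range(K + 1))
    return (lam ** y / math.factorial(y)) / norm


def lemma_derivative(y, eta, K):
    """Closed form from the proof: f(y) * [y - e^eta * (1 - f(K))]."""
    f_y = truncated_poisson_pmf(y, eta, K)
    f_K = truncated_poisson_pmf(K, eta, K)
    return f_y * (y - math.exp(eta) * (1.0 - f_K))


def numeric_derivative(y, eta, K, h=1e-6):
    """Central finite-difference approximation of d f_eta(y) / d eta."""
    upper = truncated_poisson_pmf(y, eta + h, K)
    lower = truncated_poisson_pmf(y, eta - h, K)
    return (upper - lower) / (2.0 * h)


def max_abs_gap(eta=0.7, K=14):
    """Largest discrepancy between the two derivatives over y = 0, ..., K."""
    return max(
        abs(lemma_derivative(y, eta, K) - numeric_derivative(y, eta, K))
        for y in range(K + 1)
    )


if __name__ == "__main__":
    print(max_abs_gap())
```

Because the probabilities \(f_{\eta _{i}}(y)\) sum to one over \(y=0,\ldots ,K\), the closed-form derivatives should also sum to zero over the support, which gives a second simple consistency check.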

Cite this article

Tsai, MH., Lin, T.H. Modeling data with a truncated and inflated Poisson distribution. Stat Methods Appl 26, 383–401 (2017). https://doi.org/10.1007/s10260-017-0377-z
