Abstract
We propose the zero-modified Poisson–Shanker regression model as an alternative for modeling overdispersed count data that exhibit inflation or deflation of zeros in the presence of covariates. The zero modification is incorporated through the zero-truncated Poisson–Shanker distribution. A simple reparameterization of the probability function allows the zero-modified Poisson–Shanker distribution to be written as a hurdle model, so the proposed model can be fitted without prior information about the type of zero modification present in a given dataset. Standard Bayesian procedures are used for estimation and inference. A simulation study illustrates the performance of the developed methodology. The usefulness of the proposed model is evaluated on a real dataset of fetal death notifications in Bahia State, Brazil. A sensitivity study based on the Kullback–Leibler divergence is performed to detect points that may influence the parameter estimates, and randomized quantile residuals are used for model validation. A general comparison of the proposed model with some well-known discrete distributions is also provided.
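As a brief sketch of the hurdle reparameterization mentioned above, the probability functions can be written as follows. The symbols \(\theta\), \(p\), \(\omega\) and \(\delta_{0}\) are notation assumed here for illustration and may differ from the notation used in the full text. The first line is the Poisson–Shanker probability function obtained by mixing a Poisson rate over a Shanker density with parameter \(\theta\).

```latex
% Assumed notation (not taken from the full text):
%   \theta > 0      : parameter of the baseline Poisson-Shanker distribution
%   p               : zero-modification parameter
%   \omega \in [0,1]: probability of a positive count after reparameterization
%   \delta_0(y)     : indicator that y = 0
\begin{align*}
P(Y=y;\theta) &= \frac{\theta^{2}}{\theta^{2}+1}\,
                 \frac{y+\theta^{2}+\theta+1}{(\theta+1)^{y+2}},
                 \qquad y = 0, 1, 2, \ldots \\[4pt]
P_{\mathrm{ZM}}(Y=y;\theta,p) &= (1-p)\,\delta_{0}(y) + p\,P(Y=y;\theta),
                 \qquad 0 \le p \le \frac{1}{1-P(Y=0;\theta)} \\[4pt]
\omega &= p\,\bigl[1-P(Y=0;\theta)\bigr] \\[4pt]
P_{\mathrm{ZM}}(Y=y;\theta,\omega) &= (1-\omega)\,\delta_{0}(y)
                 + \omega\,\frac{P(Y=y;\theta)}{1-P(Y=0;\theta)}\,
                 \bigl[1-\delta_{0}(y)\bigr]
\end{align*}
```

Under this parameterization, \(\omega < 1-P(Y=0;\theta)\) corresponds to zero inflation, \(\omega > 1-P(Y=0;\theta)\) to zero deflation, and \(\omega = 1\) to the zero-truncated distribution. Because \(\omega\) ranges over \([0,1]\) regardless of \(\theta\), the type of zero modification does not need to be specified before fitting.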
Acknowledgements
The authors are grateful for the insightful comments and constructive suggestions provided by the associate editor and the anonymous referees. Also, the first author would like to thank the Federal Technology University of Paraná and the Araucária Foundation for the financial support during this research.
Cite this article
Bertoli, W., Conceição, K.S., Andrade, M.G. et al. On the zero-modified Poisson–Shanker regression model and its application to fetal deaths notification data. Comput Stat 33, 807–836 (2018). https://doi.org/10.1007/s00180-017-0788-1
DOI: https://doi.org/10.1007/s00180-017-0788-1