
General mixed Poisson regression models with varying dispersion

Statistics and Computing

Abstract

A general class of mixed Poisson regression models is introduced. This class is based on a mixing between the Poisson distribution and a distribution belonging to the exponential family. In this way, we unify several overdispersed models that have been studied separately, such as the negative binomial and Poisson inverse Gaussian models. We consider a regression structure for both the mean and dispersion parameters of the mixed Poisson models, thus extending, and in some cases correcting, previous models considered in the literature. An expectation–maximization (EM) algorithm is proposed for estimating the parameters, and some diagnostic measures based on the EM algorithm are considered. We also obtain an explicit expression for the observed information matrix. An empirical illustration is presented to show the performance of our class of mixed Poisson models. Supplementary material accompanies this paper.

References

  • Atkinson, A.C.: Plots, Transformations and Regression. Oxford University Press, Oxford (1985)

  • Breslow, N.E.: Extra-Poisson variation in log-linear models. J. R. Stat. Soc. Ser. C 33, 38–44 (1984)

  • Brillinger, D.R.: The natural variability of vital rates and associated statistics (with discussion). Biometrics 42, 693–734 (1986)

  • Cameron, A.C., Trivedi, P.K.: Regression Analysis of Count Data. Cambridge University Press, New York (1998)

  • Cook, R.D.: Detection of influential observations in linear regression. Technometrics 19, 15–18 (1977)

  • Cook, R.D.: Assessment of local influence. J. R. Stat. Soc. Ser. B 48, 133–169 (1986)

  • Dean, C.B., Nielsen, J.D.: Generalized linear mixed models: a review and some extensions. Lifetime Data Anal. 13, 497–512 (2007)

  • Dean, C.B., Lawless, J., Willmot, G.E.: A mixed Poisson inverse Gaussian regression model. Can. J. Stat. 17, 171–182 (1989)

  • Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. Ser. B 39, 1–38 (1977)

  • Dunn, P.K., Smyth, G.K.: dglm: Double generalized linear models. R package version 1.6.2. http://CRAN.R-project.org/package=dglm (2012)

  • Engel, J.: Models for response data showing extra-Poisson variation. Stat. Neerlandica 38, 159–167 (1984)

  • Ghitany, M.E., Karlis, D., Al-Mutairi, D.K., Al-Awadhi, F.A.: An EM algorithm for multivariate mixed Poisson regression models and its application. Appl. Math. Sci. 6, 6843–6856 (2012)

  • Hall, D.: Zero-inflated Poisson and binomial regression with random effects: a case study. Biometrics 56, 1030–1039 (2000)

  • Hilbe, J.M.: Negative Binomial Regression. Cambridge University Press, New York (2008)

  • Hinde, J., Demétrio, C.G.B.: Overdispersion: models and estimation. Comput. Stat. Data Anal. 27, 151–170 (1998)

  • Holla, M.S.: On a Poisson-inverse Gaussian distribution. Metrika 11, 115–121 (1966)

  • Karlis, D.: A general EM approach for maximum likelihood estimation in mixed Poisson regression models. Stat. Model. 1, 305–318 (2001)

  • Karlis, D., Xekalaki, E.: Mixed Poisson distributions. Int. Stat. Rev. 73, 35–58 (2005)

  • Kassahun, W., Neyens, T., Faes, C., Molenberghs, G., Verbeke, G.: A zero-inflated overdispersed hierarchical Poisson model. Stat. Model. 14, 439–456 (2014)

  • Lambert, D.: Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34, 1–14 (1992)

  • Lawless, J.F.: Negative binomial and mixed Poisson regression. Can. J. Stat. 15, 209–225 (1987)

  • Lesaffre, E., Verbeke, G.: Local influence in linear mixed model. Biometrics 54, 570–583 (1998)

  • Lim, H.K., Li, W.K., Yu, P.L.H.: Zero-inflated Poisson regression mixture model. Comput. Stat. Data Anal. 71, 151–158 (2014)

  • Louis, T.A.: Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. Ser. B 44, 226–233 (1982)

  • Ord, J.K., Whitmore, G.A.: The Poisson-inverse Gaussian distribution as a model for species abundance. Commun. Stat. Theory Methods 15, 853–871 (1986)

  • Poon, W., Poon, Y.: Conformal normal curvature and assessment of local influence. J. R. Stat. Soc. Ser. B 61, 51–61 (1999)

  • R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org (2014)

  • Ridout, M., Hinde, J., Demétrio, C.G.B.: A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics 57, 219–223 (2001)

  • Rigby, R.A., Stasinopoulos, D.M.: Generalized additive models for location, scale and shape (with discussion). Appl. Stat. 54, 507–554 (2005)

  • Sankaran, M.: Mixtures by the inverse Gaussian distribution. Sankhya B 30, 455–458 (1968)

  • Sellers, K.F., Shmueli, G.: A flexible regression model for count data. Ann. Appl. Stat. 4, 943–961 (2010)

  • Sichel, H.S.: On a family of discrete distributions particularly suited to represent long-tailed frequency data. In: Proceedings of the Third Symposium on Mathematical Statistics, Pretoria, CSIR, pp. 51–97 (1971)

  • Willmot, G.E.: The Poisson-inverse Gaussian distribution as an alternative to the negative binomial. Scand. Actuar. J. 87, 113–127 (1989)

  • Wu, C.F.J.: On the convergence properties of the EM algorithm. Ann. Stat. 11, 95–103 (1983)

  • Xie, F.C., Wei, B.C.: Influence analysis for Poisson inverse Gaussian regression models based on the EM algorithm. Metrika 67, 49–62 (2008)

  • Xie, F.C., Lin, J.G., Wei, B.C.: Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics. J. Appl. Stat. 41, 1383–1392 (2014)

  • Yau, K.K.W., Wang, K., Lee, A.H.: Zero-inflated negative binomial mixed regression modeling of over-dispersed count data with extra zeros. Biom. J. 45, 437–452 (2003)

  • Zhu, H.T., Lee, S.Y.: Local influence for incomplete-data models. J. R. Stat. Soc. Ser. B 63, 111–126 (2001)

  • Zhu, H.T., Lee, S.Y., Wei, B.C., Zhu, J.: Case-deletion measures for models with incomplete data. Biometrika 88, 727–737 (2001)

Acknowledgments

We would like to thank two referees for their careful reading and suggestions, which improved the paper. The authors gratefully acknowledge financial support from FAPEMIG (W. Barreto-Souza) and CNPq-Brazil (A.B. Simas).

Author information

Corresponding author

Correspondence to Wagner Barreto-Souza.

Electronic supplementary material

Supplementary material 1 (pdf 84 KB)

Appendix

We begin this section by providing corrections to some results presented in Xie and Wei (2008).

In Xie and Wei (2008, p. 52), the correct expression for \(E(\nu _i^{-1}|Y_o,\theta ^{(r)})\) in the case \(y_i=0\) is \(\lambda ^{(r)}\big [ 1+\sqrt{(\lambda ^{(r)})^{-1}(2\mu _i+(\lambda ^{(r)})^{-1})}\big ]\) instead of \(\lambda ^{(r)}\big [ 1+\sqrt{(\lambda ^{(r)})^{-1}}(2\mu _i+(\lambda ^{(r)})^{-1})\big ]\).

In Xie and Wei (2008, p. 58), where one finds “\(-T\varvec{1}_n\varvec{1}_m^TS\) \(- \beta ^TS\varvec{1}_mGX\)”, the correct expression is “\(-T\varvec{1}_n\varvec{1}_m^TS\) \(-GX\beta \varvec{1}_mS\)”.

Finally, the perturbation of responses (case 3) found in Xie and Wei (2008, p. 58) does not make sense: the random variable \(Y_i\) is discrete and therefore does not admit infinitesimal perturbations (any perturbation of \(Y_i\) must be integer-valued), so this perturbation scheme is invalid and should not be used. What does make sense is an infinitesimal perturbation of the latent variable Z, since it is continuous. We provide such a perturbation scheme in the present article (see the perturbation of the hidden variable scheme).

In what follows we present the proofs of two propositions given in the main text.

Proof of Proposition 1

We have that

$$\begin{aligned} E(Z|Y=y)= & {} \dfrac{\mu ^y}{y!p(y;\mu ,\phi )}\int _0^\infty e^{-\mu z}z^{y+1}\\&\times \,\exp \{\phi [z\xi _0-b(\xi _0)]+c(z;\phi )\}dz\\= & {} \dfrac{y+1}{\mu p(y;\mu ,\phi )}\int _0^\infty \dfrac{\mu ^{y+1}}{(y+1)!}e^{-\mu z}z^{y+1}\\&\times \,\exp \{\phi [z\xi _0-b(\xi _0)]+c(z;\phi )\}dz\\= & {} \dfrac{(y+1)p(y+1;\mu ,\phi )}{\mu p(y;\mu ,\phi )}. \end{aligned}$$

To obtain the second conditional expectation, we first get the conditional moment generating function of g(Z) given \(Y=y\), that is

$$\begin{aligned} E\left( \exp \{t\,g(Z)\}|Y=y\right)&=\dfrac{\mu ^y}{y!p(y;\mu ,\phi )}\int _0^\infty e^{-\mu z}z^y\\&\quad \times \,\exp \{\phi [z\xi _0-b(\xi _0)]+d(\phi )\\&\quad +\,(\phi +t)g(z)+h(z)\}dz\\&=\exp \left\{ \phi b\left( \dfrac{\phi \xi _0}{\phi +t}\right) -d(\phi +t)\right. \\&\quad +\,\left. d(\phi )-\phi b(\xi _0)\right\} \\&\quad \times \,\dfrac{p(y;\mu ^*_t,\phi +t)}{p(y;\mu ,\phi )}, \end{aligned}$$

with \(\mu _t^*\) as defined in the proposition. Hence, computing the derivative of \(E\left( \exp \{t\,g(Z)\}|Y=y\right) \) with respect to t at \(t=0\), after some simplifications we get the desired result. \(\square \)
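The identity for \(E(Z|Y=y)\) derived above can be sanity-checked numerically in the negative binomial special case, where the mixing distribution is gamma and \(Z|Y=y\) is gamma with shape \(y+\phi\) and rate \(\mu +\phi\), so that \(E(Z|Y=y)=(y+\phi)/(\mu +\phi)\). The sketch below is illustrative only; the mapping of \((\mu ,\phi )\) to scipy's `nbinom` parameterization (success probability \(\phi /(\phi +\mu )\)) is our assumption, not notation from the paper.

```python
# Numerical check of Proposition 1 in the NB (gamma-mixing) case:
#   E(Z | Y=y) = (y+1) p(y+1; mu, phi) / (mu * p(y; mu, phi))
# should equal (y + phi) / (mu + phi), the mean of the gamma posterior.
from scipy.stats import nbinom

def nb_pmf(y, mu, phi):
    # NB pmf with mean mu and dispersion phi (assumed scipy convention:
    # size = phi, success probability = phi / (phi + mu)).
    return nbinom.pmf(y, phi, phi / (phi + mu))

mu, phi = 2.5, 1.7
for y in range(6):
    lhs = (y + 1) * nb_pmf(y + 1, mu, phi) / (mu * nb_pmf(y, mu, phi))
    rhs = (y + phi) / (mu + phi)
    assert abs(lhs - rhs) < 1e-10
```

The same ratio-of-pmfs structure is what makes the E-step of the EM algorithm cheap: only two pmf evaluations per observation are needed.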

Proof of Proposition 2

A simple computation shows that

$$\begin{aligned} \left. \frac{\partial Q_{[i]}}{\partial {\varvec{\beta }}}({\varvec{\theta }};\widehat{{\varvec{\theta }}})\right| _{{\varvec{\theta }}=\widehat{{\varvec{\theta }}}} = a_i\mathbf x_i, \end{aligned}$$

and

$$\begin{aligned} \left. \frac{\partial Q_{[i]}}{\partial \varvec{\alpha }}({\varvec{\theta }};\widehat{{\varvec{\theta }}})\right| _{{\varvec{\theta }}=\widehat{{\varvec{\theta }}}} = b_i\mathbf w_i. \end{aligned}$$

Furthermore,

$$\begin{aligned} \left. \frac{\partial ^2 Q}{\partial {\varvec{\beta }}\partial {\varvec{\beta }}^\top }({\varvec{\theta }};\widehat{{\varvec{\theta }}})\right| _{{\varvec{\theta }}=\widehat{{\varvec{\theta }}}}= & {} \sum _{i=1}^n -\mu _i\lambda _i\mathbf x_i\mathbf x_i^\top \\= & {} -\mathbf X^\top \mathbf G_1\mathbf X, \end{aligned}$$

and

$$\begin{aligned} \left. \frac{\partial ^2 Q}{\partial \varvec{\alpha }\partial \varvec{\alpha }^\top }({\varvec{\theta }};\widehat{{\varvec{\theta }}})\right| _{{\varvec{\theta }}=\widehat{{\varvec{\theta }}}}= & {} \sum _{i=1}^n \phi _i\Big (\xi _0\lambda _i-b(\xi _0)\\&+\,\kappa _i+d'(\phi _i)+d''(\phi _i)\phi _i\Big )\mathbf w_i\mathbf w_i^\top \\= & {} -\mathbf W^\top \mathbf G_2\mathbf W. \end{aligned}$$

Thus

$$\begin{aligned} \left. \frac{\partial ^2 Q}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }({\varvec{\theta }};\widehat{{\varvec{\theta }}})\right| _{{\varvec{\theta }}=\widehat{{\varvec{\theta }}}} = - \begin{bmatrix} \mathbf X^\top \mathbf G_1\mathbf X&\quad \mathbf 0 \\ \mathbf 0&\quad \mathbf W^\top \mathbf G_2\mathbf W \end{bmatrix}. \end{aligned}$$

\(\square \)
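The block-diagonal form above is straightforward to mirror in code. The following sketch assembles \(-\operatorname{diag}(\mathbf X^\top \mathbf G_1\mathbf X,\ \mathbf W^\top \mathbf G_2\mathbf W)\) and checks the matrix expression against the elementwise sums from the proof; the design matrices and the diagonal weights are random placeholders, not quantities from a fitted model.

```python
import numpy as np

# Assemble -diag(X^T G1 X, W^T G2 W), the matrix from Proposition 2,
# with G1 = diag(mu_i * lambda_i) and G2 the corresponding dispersion
# weights.  Toy values throughout: g1, g2, X, W are placeholders.
rng = np.random.default_rng(0)
n, p, q = 8, 3, 2
X, W = rng.normal(size=(n, p)), rng.normal(size=(n, q))
g1, g2 = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)

A = X.T @ np.diag(g1) @ X          # X^T G1 X
B = W.T @ np.diag(g2) @ W          # W^T G2 W
H = -np.block([[A, np.zeros((p, q))],
               [np.zeros((q, p)), B]])

# Cross-check against the sums appearing in the proof.
A_sum = sum(g1[i] * np.outer(X[i], X[i]) for i in range(n))
assert np.allclose(A, A_sum)
```

The zero off-diagonal blocks reflect the orthogonality of the mean and dispersion regression parameters in the Q-function.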

The elements of the observed information matrix (6) can be obtained by using the following quantities:

$$\begin{aligned} E\left( -\dfrac{\partial \ell _c^2}{\partial \beta _j\partial \beta _l}\big |\mathbf{Y}\right) =\sum _{i=1}^n\mu _i\lambda _ix_{ij}x_{il}, \end{aligned}$$

for    \(j,l=1,\ldots ,p\),

$$\begin{aligned} E\left( -\dfrac{\partial \ell _c^2}{\partial \alpha _j\partial \alpha _l}\big |\mathbf{Y}\right)&=\sum _{i=1}^n\phi _i\{b(\xi _0)-\xi _0\lambda _i-\kappa _i-d'(\phi _i)\\&\quad -\,\phi _id''(\phi _i)\}w_{ij}w_{il}, \end{aligned}$$

for    \(j,l=1,\ldots ,q\),

$$\begin{aligned} E\left( -\dfrac{\partial \ell _c^2}{\partial \beta _j\partial \alpha _l}\big |\mathbf{Y}\right) =0, \end{aligned}$$

for    \(j=1,\ldots ,p\) and \(l=1,\ldots ,q\),

$$\begin{aligned} E\left( \dfrac{\partial \ell _c}{\partial \beta _j}\dfrac{\partial \ell _c}{\partial \beta _l}\big |\mathbf{Y}\right) =\sum _{i=1}^n(y_i^2-2y_i\mu _i\lambda _i +\mu _i^2\gamma _i)x_{ij}x_{il}+\\ \sum _{i\ne k}(y_i-\mu _i\lambda _i)(y_k-\mu _k\lambda _k)x_{ij}x_{kl}, \end{aligned}$$

for    \(j,l=1,\ldots ,p\),

$$\begin{aligned}&E\left( \dfrac{\partial \ell _c}{\partial \beta _j}\dfrac{\partial \ell _c}{\partial \alpha _l}\big |\mathbf{Y}\right) \\&\quad =\sum _{i=1}^n\phi _i\{y_i[\xi _0\lambda _i+\kappa _i+d'(\phi _i)-b(\xi _0)]\\&\qquad -\,\mu _i[\xi _0\gamma _i+\rho _i+\lambda _i(d'(\phi _i)-b(\xi _0))]\} x_{ij}w_{il}\\&\qquad +\,\sum _{i\ne k}\phi _k(y_i-\mu _i\lambda _i)(\xi _0\lambda _k-b(\xi _0)\\&\qquad +\,\kappa _k+d'(\phi _k)) x_{ij}w_{kl}, \end{aligned}$$

for    \(j=1,\ldots ,p\) and \(l=1,\ldots ,q\),

$$\begin{aligned} E\left( \dfrac{\partial \ell _c}{\partial \alpha _j}\dfrac{\partial \ell _c}{\partial \alpha _l}\big |\mathbf{Y}\right)&=\sum _{i=1}^n\phi _i^2w_{ij}w_{il}\{(d'(\phi _i)-b(\xi _0))^2\\&\quad +\,2(d'(\phi _i)-b(\xi _0))(\xi _0\lambda _i+\kappa _i)\\&\quad +\,\xi _0^2\gamma _i+2\xi _0\rho _i+\nu _i\}\\&\quad +\,\sum _{i\ne k}\phi _i\phi _k(\xi _0\lambda _i-b(\xi _0)\\&\quad +\,\kappa _i+d'(\phi _i))\\&\quad \times \,(\xi _0\lambda _k-b(\xi _0)+\kappa _k+d'(\phi _k)) w_{ij}w_{kl}, \end{aligned}$$

for    \(j,l=1,\ldots ,q\), where \(\lambda _i\) and \(\kappa _i\) are defined as before, and \(\gamma _i=E(Z_i^2|\mathbf{Y})\), \(\rho _i=E(Z_i g(Z_i)|\mathbf{Y})\) and \(\nu _i=E(g(Z_i)^2|\mathbf{Y})\), for \(i=1,\ldots ,n\).

The conditional expectations that appear above are given explicitly in the following proposition.

Proposition 3

Let \(Y\sim \hbox {MP}(\mu ,\phi )\) with latent random effect Z belonging to the exponential family as defined previously. Then, we have that

$$\begin{aligned}&E(Z^2|Y=y)=\dfrac{(y+1)(y+2)p(y+2;\mu ,\phi )}{\mu ^2 p(y;\mu ,\phi )},\\&E(g(Z)^2|Y=y)=(d'(\phi )+\xi _0)^2-2(d'(\phi )+\xi _0)\\&\quad \times \,\phi ^{-1}\xi _0^2b''(\xi _0)+\dfrac{d\,p(y;\mu ^*_t,\phi +t)/dt|_{t=0}}{p(y;\mu ,\phi )}\\&\quad +\,2\phi ^{-1}\xi _0-d''(\phi )+\dfrac{d^2p(y;\mu ^*_t,\phi +t)/dt^2|_{t=0}}{p(y;\mu ,\phi )} \end{aligned}$$

and

$$\begin{aligned}&E(Z g(Z)|Y=y)=\dfrac{(y+1)p(y+1;\mu ,\phi )}{\mu p(y;\mu ,\phi )}\\&\quad \times \left\{ \dfrac{d\,p(y+1;\mu ^*_t,\phi +t)/dt|_{t=0}}{p(y+1;\mu ,\phi )}-\xi _0-d'(\phi )\right\} . \end{aligned}$$
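As with Proposition 1, the first identity of Proposition 3 can be checked in the NB special case, where \(Z|Y=y\) is gamma with shape \(y+\phi\) and rate \(\mu +\phi\), so \(E(Z^2|Y=y)=(y+\phi )(y+\phi +1)/(\mu +\phi )^2\). Again, the scipy NB parameterization below is an assumption on our part.

```python
# Check E(Z^2 | Y=y) = (y+1)(y+2) p(y+2; mu, phi) / (mu^2 p(y; mu, phi))
# against the second moment of the gamma posterior in the NB case.
from scipy.stats import nbinom

mu, phi = 3.0, 0.8
pmf = lambda y: nbinom.pmf(y, phi, phi / (phi + mu))  # assumed mapping

for y in range(5):
    lhs = (y + 1) * (y + 2) * pmf(y + 2) / (mu**2 * pmf(y))
    rhs = (y + phi) * (y + phi + 1) / (mu + phi) ** 2
    assert abs(lhs - rhs) < 1e-10
```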

Corollary 1

For the PIG case, we have that

$$\begin{aligned} \rho= & {} -1/2,\\ \nu= & {} \left\{ \begin{array}{ll} \dfrac{1}{4\phi ^2}\{\phi (2\mu +\phi )+3\left[ 1+\sqrt{\phi (2\mu +\phi )}\right] \},&{}\quad \hbox {for}\,\, y=0,\\ \dfrac{(2\mu +\phi )^{1/2}}{4\phi ^{3/2}}[1+\sqrt{\phi (2\mu +\phi )}],&{}\quad \hbox {for}\,\, y=1,\\ \dfrac{\mu ^2}{4y(y-1)}\dfrac{p(y-2;\mu ,\phi )}{p(y;\mu ,\phi )},&{}\quad \hbox {for}\,\, y\ge 2. \\ \end{array}\right. \end{aligned}$$

In the NB case, we have that

$$\begin{aligned} \rho= & {} \dfrac{y+\phi }{\mu +\phi }[\Psi (y+\phi +1)-\log (\mu +\phi )],\\ \nu= & {} \Psi '(y+\phi )+\Psi (y+\phi )^2-2\Psi (y+\phi )\log (\mu +\phi )\\&+\log ^2(\mu +\phi ), \end{aligned}$$

where \(\Psi '(x)=d\Psi (x)/dx\).
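The NB closed forms can also be verified by Monte Carlo. Assuming, as the formulas above suggest, that \(g(z)=\log z\) for gamma mixing and that \(Z|Y=y\) is gamma with shape \(y+\phi\) and rate \(\mu +\phi\), we have \(\rho =E(Z\log Z|Y=y)\) and \(\nu =E(\log ^2 Z|Y=y)\). The sample size and tolerances in this sketch are illustrative choices.

```python
# Monte Carlo check of rho and nu in the NB case.
import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(42)
mu, phi, y = 2.0, 1.5, 3
a, b = y + phi, mu + phi                 # posterior shape and rate
z = rng.gamma(a, 1.0 / b, size=2_000_000)

# Closed forms from the corollary (Psi = digamma, Psi' = trigamma).
rho = (y + phi) / (mu + phi) * (digamma(y + phi + 1) - np.log(mu + phi))
nu = (polygamma(1, y + phi) + digamma(y + phi) ** 2
      - 2 * digamma(y + phi) * np.log(mu + phi) + np.log(mu + phi) ** 2)

assert abs(np.mean(z * np.log(z)) - rho) < 2e-2
assert abs(np.mean(np.log(z) ** 2) - nu) < 2e-2
```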

About this article

Cite this article

Barreto-Souza, W., Simas, A.B. General mixed Poisson regression models with varying dispersion. Stat Comput 26, 1263–1280 (2016). https://doi.org/10.1007/s11222-015-9601-6
