Effect of covariate misspecifications in the marginalized zero-inflated Poisson model

Samuel Iddi; Esther O. Nwoko

doi:10.1515/mcma-2017-0106

Published by De Gruyter May 18, 2017

From the journal Monte Carlo Methods and Applications

Abstract

Count outcomes are often modelled using Poisson regression. However, this model imposes a strict mean-variance relationship that is unappealing in many contexts. Many studies in the life sciences produce count outcomes with an excess of zeros. These excess zeros introduce extra dispersion that the traditional Poisson regression cannot account for. The zero-inflated Poisson (ZIP) and zero-inflated negative binomial models are popular alternatives. Zero-inflated models comprise two key components: a logistic part, which models the zeros, and a Poisson component, which handles the positive counts. Both components allow the inclusion of covariates. Civettini and Hines [3] investigated misspecification effects in zero-inflated negative binomial regression models. Long, Preisser, Herring and Golin [10] proposed the marginalized zero-inflated Poisson (MZIP) model, which allows a direct marginal interpretation of fixed-effect estimates and so overcomes the often sub-population-specific interpretation of traditional zero-inflated models. In this research, the effects of misspecifying components of the MZIP regression model are investigated through a comprehensive simulation study. Two kinds of incorrect specification of the components of an MZIP model were considered, namely ‘Omission’ and ‘Misspecification’. Bias, standard error (precision) of estimates and mean square error (MSE) are computed while varying the sample size. Type I error rates are also evaluated for the misspecified models. Results of a Monte Carlo simulation are reported. It was observed that omissions in both parts of the model lead to biases in the estimated parameters. The intercept parameters were the most severely affected. Furthermore, for all types of omission, parameters in the zero-inflated part of the model were more strongly affected than those in the Poisson part in terms of both bias and MSE. Generally, bias and MSE decrease as the sample size increases for all parameters. It was also found that misspecification can increase, preserve or decrease the type I error rates, depending on the sample size.
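
The simulation design described in the abstract can be illustrated with a minimal sketch; it is not the authors' code. It assumes the MZIP formulation of Long et al. [10], in which the zero-inflation probability ψ follows a logit model and the overall (marginal) mean ν follows a log-linear model, so that the Poisson mean of the non-degenerate component is μ = ν / (1 − ψ). The function names, the single binary covariate and all coefficient values are hypothetical, and the subgroup sample mean is used only as a simple stand-in for maximum likelihood fitting, which is omitted here.

```python
import math
import random
import statistics


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate with Knuth's multiplication method (fine for small mu)."""
    threshold = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_mzip(rng, n, gamma, alpha):
    """Simulate n (x, y) pairs from an MZIP model with one binary covariate x.

    gamma: (intercept, slope) of the logit model for the zero-inflation probability psi.
    alpha: (intercept, slope) of the log-linear model for the marginal mean nu.
    Both parameter names are assumptions made for this sketch.
    """
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        psi = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * x)))
        nu = math.exp(alpha[0] + alpha[1] * x)
        mu = nu / (1.0 - psi)  # Poisson mean implied by the marginal mean
        y = 0 if rng.random() < psi else poisson_draw(rng, mu)
        data.append((x, y))
    return data


def monte_carlo_summary(estimates, truth):
    """Bias, empirical standard error and MSE of replicated estimates."""
    bias = statistics.fmean(estimates) - truth
    std_err = statistics.stdev(estimates)
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return bias, std_err, mse


if __name__ == "__main__":
    rng = random.Random(2017)
    gamma, alpha = (-1.0, 0.5), (0.5, 0.3)  # hypothetical coefficient values
    truth = math.exp(alpha[0] + alpha[1])   # marginal mean when x = 1
    estimates = []
    for _ in range(200):                    # 200 replicates of size 500
        sample = simulate_mzip(rng, 500, gamma, alpha)
        estimates.append(statistics.fmean([y for x, y in sample if x == 1]))
    print(monte_carlo_summary(estimates, truth))
```

Because ν is modelled directly, the mean of the simulated outcomes at x = 1 should be close to exp(alpha[0] + alpha[1]) whatever the zero-inflation coefficients are, which is the marginal interpretation the abstract refers to.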

Keywords: Marginal model; maximum likelihood estimation; misspecification; logistic model; omission; Poisson model; simulations; type I error rate; zero-inflation

MSC 2010: 62Jxx; 00A72

Funding statement: Samuel Iddi gratefully acknowledges financial support from University of Ghana, through ORID Research Grant.

A Supplementary appendix

Table 2

Results of the correct and misspecified models based on 500 simulations for sample of size 500.

Models	Quantity	β0	β1	β2	β3	α0	α1	α2	α3
CM	Est	0.2395	0.4075	0.2496	1.39867	0.6039	-2.0217	0.2530	-1.4982
	Std Err	0.0997	0.0793	0.0186	0.10830	0.1739	0.1871	0.0303	0.2497
	Bias	-0.0105	0.0075	-0.0004	-0.0013	0.0039	-0.0217	0.0030	0.0018
OMIT1	Est	0.1926	0.3106	0.3432	1.31915	0.9577	-1.9394	–	-1.4105
	Std Err	0.0977	0.0811	0.0066	0.10580	0.1664	0.2101	–	0.2630
	Bias	-0.0574	-0.0894	0.0932	-0.0808	0.3577	0.0606	–	0.0895
OMIT2	Est	0.4505	0.4376	–	1.37737	0.2318	-1.5703	0.5151	-1.0777
	Std Err	0.1060	0.0812	–	0.12110	0.1680	0.1430	0.0122	0.2226
	Bias	0.2005	0.0376	–	-0.0226	-0.3682	0.4297	0.2651	0.4223
OMIT3	Est	0.6946	0.5295	–	1.36019	0.9309	-1.9813	–	-1.4009
	Std Err	0.0957	0.0818	–	0.10340	0.1634	0.2065	–	0.2529
	Bias	0.4446	0.1295	–	-0.0398	0.3309	0.0187	–	0.0991
OMIT4	Est	0.7166	0.4406	–	0.90191	-0.3501	-1.4844	0.5046	–
	Std Err	0.0886	0.0846	–	0.07160	0.1215	0.1450	0.0115	–
	Bias	0.4666	0.0406	–	-0.4981	-0.9501	0.5156	0.2546	–
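
The Bias rows of Table 2 can be related to the Est rows with a short calculation. The sketch below assumes that bias is defined as the estimate minus the true value, and it approximates MSE as squared bias plus squared standard error, which holds only if the reported Std Err reflects the Monte Carlo standard deviation of the estimates; the preview does not confirm either point. The helper name `approx_mse` is hypothetical.

```python
# Correct-model (CM) row of Table 2: estimates and biases as printed.
params = ["β0", "β1", "β2", "β3", "α0", "α1", "α2", "α3"]
cm_est = [0.2395, 0.4075, 0.2496, 1.39867, 0.6039, -2.0217, 0.2530, -1.4982]
cm_bias = [-0.0105, 0.0075, -0.0004, -0.0013, 0.0039, -0.0217, 0.0030, 0.0018]

# If bias = estimate - true value, the generating values follow by subtraction.
implied_truth = [round(e - b, 4) for e, b in zip(cm_est, cm_bias)]


def approx_mse(bias, std_err):
    """Approximate MSE as squared bias plus variance (assumes Std Err ~ Monte Carlo SD)."""
    return bias ** 2 + std_err ** 2


# Example: the intercept α0 under the correct model and under OMIT1.
mse_cm = approx_mse(0.0039, 0.1739)
mse_omit1 = approx_mse(0.3577, 0.1664)

print(dict(zip(params, implied_truth)))
print(mse_cm, mse_omit1)
```

Under these assumptions the subtraction gives 0.25, 0.4, 0.25 and 1.4 for β0 to β3 and 0.6, -2.0, 0.25 and -1.5 for α0 to α3 (to four decimals), and the approximate MSE of α0 is several times larger under OMIT1 than under the correct model, driven almost entirely by the bias term.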

Table 3

Results of the correct and misspecified models based on 500 simulations for sample of size 500.

Models	Quantity	β0	β1	β2	β3	α0	α1	α2	α3
CMMIS1	Est	0.2453	0.4034	–	1.39386	0.6104	-2.0051	0.2522	-1.5044
	Std Err	0.1067	0.0841	–	0.12190	0.1894	0.2122	0.0271	0.2878
	Bias	-0.0047	0.0034	–	-0.0061	0.0104	-0.0051	0.0022	-0.0044
MIS1	Est	0.2486	0.4040	0.0028	1.39413	0.6156	-2.0109	0.2533	-1.5055
	Std Err	0.1106	0.0842	0.0252	0.12210	0.1955	0.2137	0.0697	0.2888
	Bias	-0.0014	0.0040	0.0028	-0.0059	0.0156	-0.0109	0.0033	-0.0055
CMMIS2	Est	0.2463	0.4030	0.2512	1.39887	0.5969	-2.0016	0.2484	–
	Std Err	0.1204	0.1145	0.0273	0.07170	0.1525	0.1894	0.0370	–
	Bias	-0.0037	0.0030	0.0012	-0.0011	-0.0031	-0.0016	-0.0016	–
MIS2	Est	0.2433	0.4049	0.2506	1.40758	0.6068	-2.0063	0.2484	0.0052
	Std Err	0.1315	0.1155	0.0274	0.13200	0.1935	0.1898	0.0372	0.2334
	Bias	-0.0067	0.0049	0.0006	0.00760	0.0068	-0.0063	-0.0016	0.0052
CMMIS3	Est	0.2486	0.4022	0.2483	–	0.5948	-2.0034	0.2540	–
	Std Err	0.1213	0.1264	0.0347	–	0.1588	0.2006	0.0466	–
	Bias	-0.0014	0.0022	-0.0017	–	-0.0052	-0.0034	0.0040	–
MIS3	Est	0.2439	0.4027	0.2488	-0.00792	0.5858	-2.0048	0.2519	-0.0001
	Std Err	0.1469	0.1266	0.0347	0.16540	0.2102	0.2015	0.0472	0.2766
	Bias	-0.0061	0.0027	-0.0012	-0.00790	-0.0142	-0.0048	0.0019	-0.0001
CMMIS4	Est	0.2434	0.4049	0.2499	–	0.6034	-2.0262	–	-1.4980
	Std Err	0.0746	0.0824	0.0046	–	0.1602	0.2727	–	0.2797
	Bias	-0.0066	0.0049	-0.0001	–	0.0034	-0.0262	–	0.0020
MIS4	Est	0.2540	0.4029	0.2491	-0.00513	0.6076	-2.0131	0.0030	-1.5068
	Std Err	0.1026	0.0850	0.0087	0.12390	0.2015	0.2730	0.0308	0.3445
	Bias	0.0040	0.0029	-0.0009	-0.00510	0.0076	-0.0131	0.0030	-0.0068
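
In Table 3 the MIS models appear to include covariates whose true coefficients are zero, since the reported bias of those coefficients equals the estimate (up to rounding). For such coefficients, the type I error rate mentioned in the abstract can be estimated as the share of simulation replicates in which the null hypothesis of a zero coefficient is rejected. The sketch below assumes a two-sided Wald test at the 5% level; the preview does not state which test the authors used, and the function name and example inputs are hypothetical.

```python
from statistics import NormalDist


def type1_error_rate(estimates, std_errors, level=0.05):
    """Share of replicates in which a two-sided Wald test rejects H0: coefficient = 0."""
    critical = NormalDist().inv_cdf(1.0 - level / 2.0)
    rejections = sum(
        1 for est, se in zip(estimates, std_errors) if abs(est / se) > critical
    )
    return rejections / len(estimates)


# Hypothetical per-replicate estimates and standard errors, for illustration only.
example_rate = type1_error_rate([0.0, 0.5, -0.5, 0.1], [0.1, 0.1, 0.1, 0.1])
print(example_rate)
```

The function needs the estimate and standard error from every replicate, so the averaged values printed in Table 3 are not sufficient inputs on their own.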

References

[1] A. Agresti, Foundations of Linear and Generalized Linear Models, John Wiley & Sons, Hoboken, 2015.

[2] A. Agresti, B. Caffo and P. Ohman-Strickland, Examples in which misspecification of a random effects distribution reduces efficiency, and possible remedies, Comput. Statist. Data Anal. 47 (2004), no. 3, 639–653. doi:10.1016/j.csda.2003.12.009

[3] A. J. Civettini and E. Hines, Misspecification effects in zero-inflated negative binomial regression models: Common cases, Annual Meeting of the Southern Political Science Association, New Orleans, 2005.

[4] P. J. Heagerty, Marginally specified logistic-normal models for longitudinal binary data, Biometrics 55 (1999), 688–698. doi:10.1111/j.0006-341X.1999.00688.x

[5] S. Iddi and K. Doku-Amponsah, Statistical model for overdispersed count outcome with many zeros: An approach for direct marginal inference, South African J. Stat. 50 (2016), 313–330. doi:10.37920/sasj.2016.50.2.9

[6] S. Iddi and G. Molenberghs, A combined overdispersed and marginalized multilevel model, Comput. Statist. Data Anal. 56 (2012), 1944–1951. doi:10.1016/j.csda.2011.11.021

[7] D. Lambert, Zero-inflated Poisson regression, with an application to defects in manufacturing, Technometrics 34 (1992), no. 1, 1–14. doi:10.2307/1269547

[8] S. Litière, A. Alonso and G. Molenberghs, Type I and type II error under random-effects misspecification in generalized linear mixed models, Biometrics 63 (2007), no. 4, 1038–1044. doi:10.1111/j.1541-0420.2007.00782.x

[9] S. Litière, A. Alonso and G. Molenberghs, The impact of a misspecified random-effects distribution on the estimation and the performance of inferential procedures in generalized linear mixed models, Stat. Med. 27 (2008), 3125–3144. doi:10.1002/sim.3157

[10] D. L. Long, J. Preisser, A. Herring and C. Golin, A marginalized zero-inflated regression model with overall exposure effects, Stat. Med. 33 (2014), 5151–5165. doi:10.1002/sim.6293

[11] R. W. M. Wedderburn, Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method, Biometrika 61 (1974), 439–447. doi:10.1093/biomet/61.3.439

[12] W. F. W. Yaacob, M. A. Lazim and Y. B. Wah, A practical approach in modelling count data, Proceedings of the Regional Conference on Statistical Sciences (Malaysia 2010), IEEE Press, Piscataway (2010), 176–183.

Received: 2016-11-9

Accepted: 2017-4-28

Published Online: 2017-5-18

Published in Print: 2017-6-1
