Abstract
The generalized linear model is an indispensable tool for analyzing non-Gaussian response data, and both canonical and non-canonical link functions are widely used. When missing values are present, many existing methods rely heavily on an unverifiable assumption about the missing data mechanism and fail when that assumption is violated. This paper proposes a missing data mechanism that is as general as possible: it covers both ignorable and nonignorable missing data, as well as missing values in the response and in the covariates. Under this general mechanism, the authors adopt an approximate conditional likelihood method to estimate the unknown parameters. The authors rigorously establish regularity conditions under which the unknown parameters are identifiable under the approximate conditional likelihood approach. For identifiable parameters, the authors prove the asymptotic normality of the estimators obtained by maximizing the approximate conditional likelihood. Simulation studies are conducted to evaluate the finite-sample performance of the proposed estimators and of estimators from some existing methods. Finally, the authors present a biomarker analysis from a prostate cancer study to illustrate the proposed method.
Additional information
This paper was supported by the Chinese 111 Project B14019, the US National Science Foundation under Grant Nos. DMS-1305474 and DMS-1612873, and the US National Institutes of Health Award UL1TR001412.
This paper was recommended for publication by Editor-in-Chief GAO Xiao-Shan.
Cite this article
Zhao, J., Shao, J. Approximate conditional likelihood for generalized linear models with general missing data mechanism. J Syst Sci Complex 30, 139–153 (2017). https://doi.org/10.1007/s11424-017-6188-3