Abstract
This article proposes a robust method for analysing longitudinal continuous responses with informative dropouts and potential outliers by using the multivariate t-distribution. We specify a dropout mechanism and a missing covariate distribution and incorporate them into the complete-data log-likelihood. Unlike existing approaches, which mainly focus on inference for the regression mean and the dropout process, our approach aims to reveal the dynamics in the location function, the marginal scale function and the association by jointly and parsimoniously modeling the location and dependence structures. A parametric fractional imputation algorithm is developed to speed up the computation associated with the EM algorithm for maximum likelihood estimation with missing data. The resulting estimators are shown to be consistent and asymptotically normally distributed. Data examples and simulations demonstrate the effectiveness of the proposed approach.
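To illustrate why a t-distribution gives robustness to outliers, the following is a minimal sketch, not the authors' algorithm. It assumes a univariate sample with complete data and a fixed degrees-of-freedom value `nu`, and it runs the standard EM updates for the location and scale of a Student-t model. The function name `t_location_scale_em` and the toy data are hypothetical. The E-step weight shrinks as an observation moves away from the current location, so an outlier contributes little to the estimate.

```python
from statistics import mean, pvariance


def t_location_scale_em(y, nu=4.0, n_iter=200, tol=1e-10):
    """EM for a univariate Student-t location/scale model with fixed nu.

    Illustrative only: no dropouts, no covariates, no dependence structure.
    """
    mu = mean(y)            # start from the (non-robust) sample mean
    sigma2 = pvariance(y)   # and the sample variance
    weights = [1.0] * len(y)
    for _ in range(n_iter):
        # E-step: expected latent scale weights; large residuals get small weights.
        weights = [(nu + 1.0) / (nu + (yi - mu) ** 2 / sigma2) for yi in y]
        # M-step: weighted location, then weighted scale around the new location.
        new_mu = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
        new_sigma2 = sum(w * (yi - new_mu) ** 2 for w, yi in zip(weights, y)) / len(y)
        converged = abs(new_mu - mu) < tol and abs(new_sigma2 - sigma2) < tol
        mu, sigma2 = new_mu, new_sigma2
        if converged:
            break
    return mu, sigma2, weights


# Toy data: six observations near 10 and one gross outlier.
y = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
mu_hat, sigma2_hat, w = t_location_scale_em(y)
```

On this toy sample the outlier pulls the sample mean well above 10, while the t-based location estimate stays close to the bulk of the data because the outlier receives the smallest weight.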




Acknowledgements
We thank the Editor and two referees for their constructive comments and suggestions, which have greatly improved the paper.
This work is supported by the National Key Research and Development Plan (No. 2016YFC0800100) and the NSFC of China (Nos. 11671374, 71771203, 71631006).
Cite this article
Zhang, W., Xie, F. & Tan, J. A robust joint modeling approach for longitudinal data with informative dropouts. Comput Stat 35, 1759–1783 (2020). https://doi.org/10.1007/s00180-020-00972-6
DOI: https://doi.org/10.1007/s00180-020-00972-6