Abstract
Variable selection for varying coefficient models involves two tasks: separating varying effects from constant effects, and selecting the variables with nonzero varying effects and those with nonzero constant effects. This paper proposes a unified variable selection approach, called the double-penalized quadratic inference functions method, for varying coefficient models with longitudinal data. The proposed method not only separates varying coefficients from constant coefficients, but also estimates and selects the nonzero varying coefficients and the nonzero constant coefficients. It is suitable for variable selection in linear models, varying coefficient models, and partial linear varying coefficient models. Under regularity conditions, the proposed method is consistent in both the separation and the selection of varying and constant coefficients. The resulting estimators of the varying coefficients attain the optimal convergence rate for nonparametric function estimation, and the estimators of the nonzero constant coefficients are consistent and asymptotically normal. Finally, the authors investigate the finite sample performance of the proposed method through simulation studies and a real data analysis. The results show that the proposed method performs better than the existing competing method.
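For orientation, the model class the abstract refers to can be written in a standard form. The notation below is an assumption introduced for illustration and does not come from the abstract; the penalized criterion in particular is only a schematic of what a double penalty on a quadratic inference function might look like, not the authors' exact objective.

```latex
% Varying coefficient model for subject i at its j-th observation time t_{ij}
% (assumed notation: y response, x covariates, \beta_k coefficient functions)
y_{ij} = \sum_{k=1}^{p} x_{ijk}\,\beta_k(t_{ij}) + \varepsilon_{ij},
\qquad i = 1,\dots,n,\; j = 1,\dots,m_i .

% Each coefficient function falls into one of three classes:
%   \beta_k(t) \equiv 0            (irrelevant covariate)
%   \beta_k(t) \equiv c_k \neq 0   (nonzero constant effect)
%   \beta_k(t) non-constant        (varying effect)

% Schematic double-penalized criterion (assumed form): Q_n is a quadratic
% inference function, \bar\beta_k is the constant part of \beta_k, and
% p_{\lambda_1}, p_{\lambda_2} are two penalty functions.
Q_n(\beta)
  + n \sum_{k=1}^{p} p_{\lambda_1}\bigl(\lVert \beta_k - \bar\beta_k \rVert\bigr)
  + n \sum_{k=1}^{p} p_{\lambda_2}\bigl(\lvert \bar\beta_k \rvert\bigr)
```

Under this reading, a model in which every coefficient is constant is a linear model, one in which every coefficient varies is a varying coefficient model, and a mixture of the two is a partial linear varying coefficient model. The first penalty would target the separation of varying from constant effects, and the second the selection of nonzero constant effects.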
Additional information
This paper was supported in part by the National Science Foundation of China under Grant Nos. 12071305 and 71803001, in part by the National Social Science Foundation of China under Grant No. 19BTJ014, in part by the University Social Science Research Project of Anhui Province under Grant No. SK2020A0051, and in part by the Social Science Foundation of the Ministry of Education of China under Grant Nos. 19YJCZH250 and 21YJAZH081.
About this article
Cite this article
Xu, X., Zhou, Y., Zhang, K. et al. Unified Variable Selection for Varying Coefficient Models with Longitudinal Data. J Syst Sci Complex 36, 822–842 (2023). https://doi.org/10.1007/s11424-022-2109-1
DOI: https://doi.org/10.1007/s11424-022-2109-1
Profiles
- Mingtao Zhao