Average Estimation of Semiparametric Models for High-Dimensional Longitudinal Data

Journal of Systems Science and Complexity

Abstract

Model averaging has received much attention in recent years. This paper considers semiparametric model averaging for high-dimensional longitudinal data. To minimize the prediction error, the authors estimate the model weights using a leave-subject-out cross-validation procedure. The proposed method is proved to be asymptotically optimal in the sense that leave-subject-out cross-validation asymptotically achieves the lowest possible prediction loss. Simulation studies show that the proposed model averaging method performs much better than some commonly used model selection and model averaging methods.
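
As a rough illustration of the weight-selection idea described in the abstract, the following Python sketch chooses model averaging weights by leave-subject-out cross-validation. It is not the authors' procedure. It makes these assumptions: the candidate models are plain linear regressions on nested covariate sets (the paper uses semiparametric models), the weights are restricted to the unit simplex, the data are simulated, and the minimization uses a simple Frank-Wolfe iteration. All function and variable names are hypothetical.

```python
import numpy as np


def leave_subject_out_predictions(X, y, subject, columns):
    """For each subject, fit OLS on all other subjects and predict that subject's rows."""
    pred = np.empty_like(y, dtype=float)
    Xm = X[:, columns]
    for s in np.unique(subject):
        held = subject == s  # all repeated measurements of one subject are held out together
        beta, *_ = np.linalg.lstsq(Xm[~held], y[~held], rcond=None)
        pred[held] = Xm[held] @ beta
    return pred


def cv_weights(residuals, n_iter=500):
    """Minimize ||residuals @ w||^2 over the simplex (Frank-Wolfe, exact line search)."""
    n_models = residuals.shape[1]
    w = np.zeros(n_models)
    # Start at the best single model so the averaged loss can only improve on it.
    w[np.argmin((residuals ** 2).sum(axis=0))] = 1.0
    for _ in range(n_iter):
        r = residuals @ w
        grad = 2.0 * residuals.T @ r
        d = -w.copy()
        d[np.argmin(grad)] += 1.0  # direction towards the best vertex of the simplex
        Ed = residuals @ d
        denom = Ed @ Ed
        if denom <= 1e-15:
            break
        gamma = np.clip(-(r @ Ed) / denom, 0.0, 1.0)
        if gamma == 0.0:
            break
        w = w + gamma * d  # convex combination, so w stays on the simplex
    return w


# Simulated longitudinal data (an assumption for illustration only):
# 40 subjects, 5 observations each, with a subject-level random intercept.
rng = np.random.default_rng(0)
n_subjects, n_obs, p = 40, 5, 6
subject = np.repeat(np.arange(n_subjects), n_obs)
X = rng.normal(size=(n_subjects * n_obs, p))
beta_true = np.array([1.0, 0.5, 0.25, 0.0, 0.0, 0.0])
random_intercept = rng.normal(size=n_subjects)[subject]
y = X @ beta_true + random_intercept + rng.normal(size=subject.size)

# Nested candidate models: the first k covariates, k = 1, ..., p.
candidates = [list(range(k)) for k in range(1, p + 1)]
P = np.column_stack(
    [leave_subject_out_predictions(X, y, subject, c) for c in candidates]
)
E = y[:, None] - P  # leave-subject-out residuals, one column per candidate model

weights = cv_weights(E)
cv_single = (E ** 2).sum(axis=0)          # CV loss of each candidate on its own
cv_average = ((E @ weights) ** 2).sum()   # CV loss of the weighted average
print("weights:", weights.round(3))
print("CV loss (average):", cv_average, "best single model:", cv_single.min())
```

Because the weights sum to one, the residual of the averaged prediction is the same weighted combination of the candidate residuals, which is why the criterion can be written in terms of the residual matrix alone. Holding out whole subjects, rather than single observations, keeps correlated repeated measurements from leaking between the fitting and validation sets.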


Author information

Corresponding author

Correspondence to Guohua Zou.

Additional information

This research was supported by the Ministry of Science and Technology of China under Grant No. 2016YFB0502301, Academy for Multidisciplinary Studies of Capital Normal University, and the National Natural Science Foundation of China under Grant Nos. 11971323 and 11529101.

This paper was recommended for publication by Editor LI Qizhai.

Cite this article

Zhao, Z., Zou, G. Average Estimation of Semiparametric Models for High-Dimensional Longitudinal Data. J Syst Sci Complex 33, 2013–2047 (2020). https://doi.org/10.1007/s11424-020-9343-1

  • DOI: https://doi.org/10.1007/s11424-020-9343-1
