Abstract
Rather than selecting a single ridge parameter in ridge regression, this paper considers a frequentist model averaging approach that combines a set of ridge estimators with different ridge parameters when the response is randomly right censored. Within this context, we propose a weighted least squares ridge estimator for the unknown regression parameter. A new Mallows-type weight choice criterion is then developed to allocate model weights, in which the unknown distribution function of the censoring random variable is replaced by its Kaplan–Meier estimator and the covariance matrix of the random errors is replaced by its averaging estimator. Under some mild conditions, we show that when the fitting model is misspecified, the resulting model averaging estimator achieves optimality in the sense of minimizing the loss function. When the fitting model is correctly specified, by contrast, the model averaging estimator of the regression parameter is root-n consistent. Additionally, for the weight vector obtained by minimizing the new criterion, we establish its rate of convergence to the infeasible optimal weight vector. Simulation results show that our method outperforms some existing methods. A real dataset is also analyzed for illustration.
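The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes the weighted least squares step uses inverse-probability-of-censoring weights δ_i / Ĝ(Y_i−), where Ĝ is the Kaplan–Meier estimate of the censoring survival function, and it takes the model weights as given rather than minimizing the Mallows-type criterion, which is not reproduced in this section. All function names are hypothetical, and ties between event and censoring times are ignored.

```python
def censoring_survival_left(y, delta):
    """Kaplan-Meier estimate of P(C >= y_i) for each i (left limit at y_i).

    Censoring is treated as the 'event', so observations with delta == 0
    produce the jumps. Assumes no ties between event and censoring times.
    """
    surv = []
    for yi in y:
        s = 1.0
        jump_times = sorted({yj for yj, dj in zip(y, delta) if dj == 0 and yj < yi})
        for t in jump_times:
            at_risk = sum(1 for yj in y if yj >= t)
            censored = sum(1 for yj, dj in zip(y, delta) if yj == t and dj == 0)
            s *= 1.0 - censored / at_risk
        surv.append(s)
    return surv


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_ridge(x, y, w, lam):
    """Minimise sum_i w_i (y_i - x_i'b)^2 + lam * ||b||^2 over b."""
    p = len(x[0])
    gram = [[sum(wi * xi[j] * xi[k] for xi, wi in zip(x, w)) + (lam if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    rhs = [sum(wi * xi[j] * yi for xi, yi, wi in zip(x, y, w)) for j in range(p)]
    return solve(gram, rhs)


def average_ridge(x, y, delta, lams, model_weights):
    """Combine ridge fits over the grid `lams` with the given simplex weights."""
    surv = censoring_survival_left(y, delta)
    # Assumed weighting: censored observations get weight 0, uncensored ones
    # are inflated by the inverse censoring survival probability.
    w = [d / s for d, s in zip(delta, surv)]
    betas = [weighted_ridge(x, y, w, lam) for lam in lams]
    p = len(x[0])
    return [sum(mw * b[j] for mw, b in zip(model_weights, betas)) for j in range(p)]


if __name__ == "__main__":
    # Toy data: intercept plus one covariate; delta == 0 marks censoring.
    x = [[1.0, 0.0], [1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5]]
    y = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8]
    delta = [1, 0, 1, 1, 0, 1]
    lams = [0.0, 0.1, 1.0]
    equal_weights = [1.0 / len(lams)] * len(lams)  # placeholder, not optimised
    print(average_ridge(x, y, delta, lams, equal_weights))
```

In the paper the weight vector is chosen by minimizing the new criterion; the equal weights above only stand in for that step so that the averaging itself can be seen in isolation.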




Data Availability
No datasets were generated or analysed during the current study.
Acknowledgements
The authors would like to thank the reviewers and editors for their careful reading and constructive comments. This work was supported by the Important Natural Science Foundation of Colleges and Universities of Anhui Province (No. KJ2021A0930, No. KJ2021A0929) and the Research Project of Hefei Normal University (No. 2023XTTDZD06, No. 2023XTQTZD28).
Author information
Contributions
Zeng and Cheng wrote the main manuscript text and Hu prepared all the figures and tables. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Zeng, J., Hu, G. & Cheng, W. A Mallows-type model averaging estimator for ridge regression with randomly right censored data. Stat Comput 34, 159 (2024). https://doi.org/10.1007/s11222-024-10472-y
DOI: https://doi.org/10.1007/s11222-024-10472-y