Abstract
Ridge regression is a widely used method for mitigating the multicollinearity problem that often arises in multiple linear regression. It is well known that the ridge regression estimator can be derived within the Bayesian framework as the posterior mode under a multivariate normal prior. However, the ridge regression model with a copula-based multivariate prior has not been employed in the Bayesian framework. Motivated by the multicollinearity problem caused by an interaction term, we adopt a vine copula to construct a copula-based joint prior distribution. For selected copulas and hyperparameters, we propose Bayesian ridge estimators and credible intervals for the regression coefficients. A simulation study compares the performance of four priors on the regression coefficients: the Clayton, Gumbel, and Gaussian copula priors, and the trivariate normal prior. The simulations demonstrate that the Archimedean (Clayton and Gumbel) copula priors give more accurate estimates in the presence of multicollinearity than the other priors. Finally, a real dataset is analyzed, and the Bayesian ridge estimators are compared with some frequentist estimators.
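
The link between the ridge estimator and the posterior mode under a normal prior can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a linear model with known error variance `sigma2`, an independent normal prior with variance `tau2` on each coefficient, and a ridge penalty equal to `sigma2 / tau2`. The function names, the simulated design (two nearly collinear covariates plus their interaction), and all numerical values are illustrative assumptions.

```python
import numpy as np


def ridge_estimator(X, y, lam):
    """Closed-form ridge estimator: solve (X'X + lam * I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def neg_log_posterior_grad(beta, X, y, sigma2, tau2):
    """Gradient of the negative log posterior for
    y ~ N(X beta, sigma2 * I) with prior beta ~ N(0, tau2 * I)."""
    return -(X.T @ (y - X @ beta)) / sigma2 + beta / tau2


# Simulated data (illustrative values only).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # nearly collinear with x1
X = np.column_stack([x1, x2, x1 * x2])   # third column is the interaction term
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

sigma2, tau2 = 1.0, 4.0                  # assumed error and prior variances
beta_ridge = ridge_estimator(X, y, lam=sigma2 / tau2)

# The log posterior is strictly concave, so a zero gradient marks its mode.
grad_at_ridge = neg_log_posterior_grad(beta_ridge, X, y, sigma2, tau2)
print(np.allclose(grad_at_ridge, 0.0, atol=1e-8))
```

The gradient of the negative log posterior vanishes at the ridge solution, so under these assumptions the ridge estimator coincides with the posterior mode. A copula-based prior replaces the independent normal prior used here with a dependent joint prior, in which case the mode generally has no closed form.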








Acknowledgements
The authors thank the Editor, the Associate Editor, and two referees for their valuable suggestions, which improved the paper. This research was supported by JSPS KAKENHI Grant Number JP21K12127.
Cite this article
Michimae, H., Emura, T. Bayesian ridge estimators based on copula-based joint prior distributions for regression coefficients. Comput Stat 37, 2741–2769 (2022). https://doi.org/10.1007/s00180-022-01213-8
DOI: https://doi.org/10.1007/s00180-022-01213-8