A novel Bayesian approach for variable selection in linear regression models

https://doi.org/10.1016/j.csda.2019.106881
Open access under a Creative Commons license

Abstract

A novel Bayesian approach to the problem of variable selection in multiple linear regression models is proposed. In particular, a hierarchical setting is presented which allows for the direct specification of a priori beliefs about the number of nonzero regression coefficients as well as beliefs that particular coefficients are nonzero. This is achieved by introducing a new prior for a random set that holds the indices of the predictors with nonzero regression coefficients. To guarantee numerical stability, a g-prior with an additional ridge parameter is adopted for the unknown regression coefficients. To simulate from the joint posterior distribution, an intelligent random walk Metropolis–Hastings algorithm that is able to switch between different models is proposed. For the model transitions, a novel proposal is used which prefers to add a priori or empirically important predictors to the model and to remove less important ones. Tests on real and simulated data show that the algorithm performs at least on par with, and often better than, other well-established methods. Finally, it is proven that, under mild assumptions, the presented approach is consistent in terms of model selection.
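
To make the ingredients described above more concrete, the following is a minimal Python sketch of one add/remove Metropolis–Hastings step over the set of active predictors, scored by a g-prior marginal likelihood in which a ridge term is added to the Gram matrix for numerical stability. The particular marginal-likelihood formula, the uniform coordinate-flip proposal, the implicit uniform prior over models, and the default values of g and the ridge parameter r are illustrative assumptions, not the paper's exact specification; in particular, the paper's proposal favours a priori or empirically important predictors rather than choosing a coordinate uniformly, and its hierarchical prior on the random index set is not reproduced here.

```python
import numpy as np

def log_marginal_gprior_ridge(X, y, idx, g=100.0, r=1e-3):
    """Log marginal likelihood (up to an additive constant) of the submodel
    using the predictors in idx, under a Zellner-type g-prior in which a
    ridge term r*I stabilises the Gram matrix.  Illustrative variant only."""
    n = len(y)
    if len(idx) == 0:                        # null model: residual sum of squares is y'y
        return -0.5 * n * np.log(y @ y)
    Xg = X[:, idx]
    A = Xg.T @ Xg + r * np.eye(len(idx))     # ridge-stabilised Gram matrix
    beta_hat = np.linalg.solve(A, Xg.T @ y)  # stabilised least-squares fit
    shrink = g / (1.0 + g)                   # g-prior shrinkage factor
    rss = y @ y - shrink * (y @ Xg @ beta_hat)
    return -0.5 * len(idx) * np.log(1.0 + g) - 0.5 * n * np.log(rss)

def mh_step(X, y, idx, rng, g=100.0, r=1e-3):
    """One add/remove Metropolis-Hastings move on the index set of active
    predictors.  A coordinate is flipped uniformly at random (a symmetric
    proposal), so the acceptance ratio reduces to the ratio of marginal
    likelihoods under a uniform prior over models."""
    p = X.shape[1]
    current = set(idx)
    j = int(rng.integers(p))                 # predictor to toggle in or out
    proposal = current ^ {j}                 # symmetric difference flips predictor j
    log_cur = log_marginal_gprior_ridge(X, y, sorted(current), g, r)
    log_new = log_marginal_gprior_ridge(X, y, sorted(proposal), g, r)
    if np.log(rng.random()) < log_new - log_cur:
        return sorted(proposal)              # accept the model switch
    return sorted(current)                   # reject and keep the current model

# Example usage on simulated data: only the first three coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(100)
model = []
for _ in range(2000):
    model = mh_step(X, y, model, rng)
print("selected predictors:", model)
```

Iterating mh_step yields a Markov chain of index sets whose visit frequencies approximate posterior model probabilities under this simplified setup; the paper's sampler follows the same accept/reject logic but with an informed proposal and the hierarchical prior described in the abstract.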

Keywords

Variable selection
Hierarchical Bayes
g-prior with ridge parameter
Model uncertainty
Metropolis–Hastings algorithm
Consistency


Supplementary material is available for this work. Based on the prostate cancer data provided by Stamey et al. (1989), the performance of the proposed approach is compared with other recent variable selection methods.