Abstract
In parametric regression models, the sign of a coefficient often plays an important role in its interpretation. One possible approach to model selection in these situations is to consider a loss function that formulates prediction of the sign of a coefficient as a decision problem. Taking a Bayesian approach, we extend this idea of a sign-based loss for selection to more complex situations. In generalized additive models we consider prediction of the sign of the derivative of an additive term at a set of predictors. Being able to predict the sign of the derivative at some point (that is, whether a term is increasing or decreasing) is one approach to selection of terms in additive modelling when interpretation is the main goal. For models with interactions, prediction of the sign of a higher-order derivative can be used similarly. Our sign-based strategy for selection has several advantages: one can work in a full or encompassing model without needing to specify priors on a model space or on parameters in submodels, and avoiding a search over a large model space can simplify computation. We also consider shrinkage prior specifications on smoothing parameters that allow for good predictive performance in models with large numbers of terms without the need for selection, and a frequentist calibration of the parameter in our sign-based loss function when it is desired to control a false selection rate for interpretation.
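To illustrate the kind of decision rule the abstract describes, here is a minimal sketch in Python. It assumes posterior draws of the derivative of an additive term at a point (in practice these would come from MCMC output), and uses one common parameterization of a sign-based loss: unit loss for declaring the wrong sign and loss 1/(k+1) for making no call, which leads to declaring a sign only when its posterior probability exceeds k/(k+1). The names, the simulated draws, and this exact parameterization are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of the derivative f'(x0) of an additive
# term at a point x0; in a real analysis these come from MCMC samples.
deriv_draws = rng.normal(loc=0.8, scale=0.5, size=5000)

def sign_decision(draws, k):
    """Sign-based decision rule (illustrative parameterization).

    Assumed losses: 0 for a correct sign, 1 for a wrong sign, and
    1/(k+1) for making no call. Minimizing posterior expected loss
    then declares a sign only when its posterior probability exceeds
    k/(k+1); otherwise no decision is made.
    """
    p_pos = np.mean(draws > 0)        # posterior P(f'(x0) > 0)
    threshold = k / (k + 1.0)
    if p_pos > threshold:
        return "+"
    if 1.0 - p_pos > threshold:
        return "-"
    return "no decision"

# With k = 9 a sign is reported only if its posterior probability
# exceeds 0.9; larger k makes the rule more conservative.
print(sign_decision(deriv_draws, k=9))
```

Calibrating k is what controls the trade-off between making confident sign calls and abstaining, which is where the frequentist calibration of the loss parameter mentioned in the abstract enters.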
This research was supported by an Australian Research Council grant.
Cite this article
Nott, D.J., Li, J.: A sign based loss approach to model selection in nonparametric regression. Stat Comput 20, 485–498 (2010). https://doi.org/10.1007/s11222-009-9139-6