Abstract
In many regression tasks, in addition to an accurate estimate of the conditional mean of the target distribution, an indication of the predictive uncertainty is also required. There are two principal sources of this uncertainty: the noise process contaminating the data and the uncertainty in estimating the model parameters from a limited sample of training data. Both can be summarised in the predictive variance, which can then be used to give confidence intervals. In this paper, we present various schemes for providing predictive variances for kernel ridge regression, especially in the case of heteroscedastic regression, where the variance of the noise process contaminating the data is a smooth function of the explanatory variables. The use of leave-one-out cross-validation is shown to eliminate the bias inherent in estimates of the predictive variance. Results obtained on all three regression tasks comprising the Predictive Uncertainty Challenge demonstrate the value of this approach.
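To make the abstract's idea concrete, the following is a minimal sketch, not the estimator developed in the paper: it fits a dual-variable kernel ridge regression model, computes the closed-form leave-one-out residuals e_i = (y_i − f(x_i)) / (1 − H_ii) with hat matrix H = K(K + λI)⁻¹ (a standard PRESS-type identity for ridge estimators), and then fits a second kernel machine to the log squared residuals to obtain a smooth, input-dependent noise-variance estimate. All function names, parameter values, and the toy data are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def krr_fit(X, y, lam, gamma=1.0):
    """Dual-variable kernel ridge regression: alpha = (K + lam*I)^{-1} y."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return K, alpha

def loo_residuals(K, y, alpha, lam):
    """Closed-form leave-one-out residuals for ridge-type estimators:
    e_i = (y_i - f(x_i)) / (1 - H_ii), where H = K (K + lam*I)^{-1}."""
    H = K @ np.linalg.inv(K + lam * np.eye(len(y)))
    return (y - K @ alpha) / (1.0 - np.diag(H))

# Toy heteroscedastic data: the noise standard deviation grows
# smoothly with x, as in the setting the abstract describes.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, 200))[:, None]
y = np.sinc(X[:, 0]) + (0.05 + 0.1 * (X[:, 0] + 3)) * rng.standard_normal(200)

# Mean model, then LOO residuals (roughly unbiased, unlike training residuals).
K, alpha = krr_fit(X, y, lam=0.1, gamma=0.5)
e = loo_residuals(K, y, alpha, lam=0.1)

# Second KRR model on log squared LOO residuals gives a smooth,
# strictly positive estimate of the input-dependent noise variance.
Kz, beta = krr_fit(X, np.log(e**2 + 1e-12), lam=1.0, gamma=0.5)
noise_var = np.exp(Kz @ beta)
```

The key point illustrated is the one the abstract makes: ordinary training residuals systematically understate the noise level because the model has already fitted part of the noise, whereas leave-one-out residuals remove that bias at negligible extra cost, since they are available in closed form for kernel ridge regression.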
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Cawley, G.C., Talbot, N.L.C., Chapelle, O. (2006). Estimating Predictive Variances with Kernel Ridge Regression. In: Quiñonero-Candela, J., Dagan, I., Magnini, B., d'Alché-Buc, F. (eds) Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment. MLCW 2005. Lecture Notes in Computer Science, vol. 3944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11736790_5
Print ISBN: 978-3-540-33427-9
Online ISBN: 978-3-540-33428-6