Abstract
For a given prediction model, some predictions may be reliable while others may not. The model's average accuracy cannot provide a reliability estimate for a single, particular prediction, yet such per-prediction reliability estimates are important in risk-sensitive applications of machine learning (e.g., medicine, engineering, business). We define empirical measures for estimating prediction accuracy in regression. The proposed measures are based on sensitivity analysis of regression models and estimate the reliability of each individual regression prediction, in contrast to the average prediction reliability of a given model. We study the empirical sensitivity properties of five regression models (linear regression, locally weighted regression, regression trees, neural networks, and support vector machines) and, for all five models, the relation between the reliability measures and the distribution of learning examples with prediction errors. We show that the suggested methodology is appropriate for only three of the studied models (regression trees, neural networks, and support vector machines) and test the proposed estimates with these three models. The results of our experiments on 48 data sets indicate significant correlations between the proposed measures and the prediction error.
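The core idea of local sensitivity analysis can be sketched as follows: predict a label for a new example, add that example back into the training set with a slightly perturbed label, retrain, and measure how much the prediction moves. A large shift suggests an unreliable prediction. The sketch below is a minimal illustration of this idea, not the authors' exact procedure; the function name `local_sensitivity`, the perturbation scheme (a fraction `eps` of the label range in both directions), and the use of scikit-learn's `DecisionTreeRegressor` are all our assumptions for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def local_sensitivity(model_cls, X, y, x_new, eps=0.1, **kwargs):
    """Sensitivity-based reliability estimate for one prediction (sketch).

    Fits the model, predicts a label K for x_new, then retrains on the
    training set augmented with (x_new, K +/- eps * label_range) and
    returns the largest resulting change in the prediction for x_new.
    A larger value indicates a less reliable prediction.
    """
    x_row = np.asarray(x_new).reshape(1, -1)
    base = model_cls(**kwargs).fit(X, y)
    k = base.predict(x_row)[0]            # initial prediction K
    span = y.max() - y.min()              # range of training labels

    deltas = []
    for sign in (+1.0, -1.0):
        # augment the training set with x_new and a perturbed label
        X_aug = np.vstack([X, x_row])
        y_aug = np.append(y, k + sign * eps * span)
        k_pert = model_cls(**kwargs).fit(X_aug, y_aug).predict(x_row)[0]
        deltas.append(abs(k_pert - k))
    return max(deltas)


# usage: estimate reliability of one prediction on synthetic data
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = X[:, 0] + 0.1 * rng.standard_normal(100)
r = local_sensitivity(DecisionTreeRegressor, X, y,
                      np.array([0.2, -0.3]), random_state=0)
```

Comparing `r` across test examples (rather than reading it as an absolute error bound) is the natural use: examples with larger sensitivity values are the ones whose predictions the paper's methodology flags as less reliable.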
Bosnić, Z., Kononenko, I. Estimation of individual prediction reliability using the local sensitivity analysis. Appl Intell 29, 187–203 (2008). https://doi.org/10.1007/s10489-007-0084-9