
Prediction intervals in supervised learning for model evaluation and discrimination

Applied Intelligence

Abstract

In this paper we explore prediction intervals and how they can be used for model evaluation and discrimination in the supervised regression setting on medium-sized datasets. We review three different methods for constructing prediction intervals and the statistics used to evaluate them. Using simple datasets, we illustrate what the prediction intervals look like, how the different methods behave, and how the intervals can be used for the graphical evaluation of models. We then propose a combined method for constructing prediction intervals and explore its performance with two voting schemes for combining the predictions of a diverse ensemble of models. All methods are tested on a large collection of datasets, on which we evaluate the individual methods and their aggregated variants for their ability to select the best predictions. The analysis of correlations between the root mean squared error and our evaluation statistic shows that both the stability and the reliability of the results increase as the techniques become more elaborate. We confirm that the methodology is suitable for the graphical comparison of individual models and is a viable way of discriminating among model candidates.
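To make the objects of study concrete, the following Python sketch shows one simple way to construct and evaluate prediction intervals: 90 % intervals taken from the empirical quantiles of a bagged ensemble of regression trees, scored by empirical coverage and mean interval width. This is an illustrative sketch only, not the authors' procedure; the synthetic data generator and all variable names are assumptions introduced for the example.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical synthetic regression data with heteroscedastic noise.
    x = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(x[:, 0]) + rng.normal(scale=0.1 + 0.1 * np.abs(x[:, 0]), size=n)
    return x, y

x_train, y_train = make_data(500)
x_test, y_test = make_data(200)

# Bagged ensemble: each member is a tree fit on a bootstrap resample.
n_members = 100
member_preds = np.empty((n_members, len(x_test)))
for m in range(n_members):
    idx = rng.integers(0, len(x_train), size=len(x_train))
    tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=m)
    tree.fit(x_train[idx], y_train[idx])
    member_preds[m] = tree.predict(x_test)

# 90 % intervals from the 5th and 95th percentiles of the member predictions.
lower = np.quantile(member_preds, 0.05, axis=0)
upper = np.quantile(member_preds, 0.95, axis=0)

# Two common evaluation statistics: empirical coverage (share of targets that
# fall inside the interval) and mean interval width; a good method keeps
# coverage close to the nominal 90 % while keeping the intervals narrow.
coverage = np.mean((y_test >= lower) & (y_test <= upper))
mean_width = np.mean(upper - lower)
print(f"coverage = {coverage:.3f}, mean width = {mean_width:.3f}")

Note that quantiles of ensemble point predictions capture only the disagreement among the models, not the residual noise, so such intervals tend to under-cover; the methods reviewed and combined in the paper address this more carefully.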



Notes

  1. http://goo.gl/sBh6Dp


Author information

Correspondence to Darko Pevec.


Cite this article

Pevec, D., Kononenko, I. Prediction intervals in supervised learning for model evaluation and discrimination. Appl Intell 42, 790–804 (2015). https://doi.org/10.1007/s10489-014-0632-z
