Abstract
Bootstrapping is a simple technique typically used to assess the accuracy of estimates of model parameters, replacing sometimes unwieldy theory with a plug-in principle and computer simulation. Common uses include variance estimation and confidence interval construction for model parameters. It also provides a way to estimate the prediction accuracy of regression models with continuous or class-valued outcomes. In this paper we review some of these applications of the bootstrap, focusing on bootstrap estimates of prediction error, and also explore how the bootstrap can improve the prediction accuracy of unstable models such as tree-structured classifiers through aggregation. The improvement can typically be attributed to variance reduction in the classical regression setting and, more generally, to a smoothing of decision boundaries in the classification setting. These advances have important implications for how atmospheric prediction models can be improved, and illustrations of this will be shown. For class-valued outcomes, an interesting graphic known as the CAT scan can be constructed to help understand the aggregated decision boundary. This will be illustrated using simulated data.
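The two uses of the bootstrap named above, assessing accuracy via resampling and improving an unstable learner via aggregation (bagging), can be sketched in a few lines. The following is a minimal illustration, not the paper's method: the function names, the toy data, and the one-split regression stump standing in for a tree-structured model are all hypothetical choices made for this sketch.

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Plug-in bootstrap standard error: resample the data with
    replacement and take the spread of the statistic over resamples."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]
    return statistics.stdev(reps)

def fit_stump(xs, ys):
    """One-split regression stump: a deliberately unstable base learner,
    since small changes in the data can move the split point."""
    best = None
    for s in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= s]
        right = [y for x, y in zip(xs, ys) if x > s]
        ml, mr = statistics.mean(left), statistics.mean(right)
        err = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, s, ml, mr)
    if best is None:  # degenerate resample with a single unique x
        m = statistics.mean(ys)
        return lambda x: m
    _, s, ml, mr = best
    return lambda x: ml if x <= s else mr

def bagged_predict(xs, ys, x_new, n_boot=200, seed=1):
    """Bagging: fit the stump on bootstrap resamples of the (x, y)
    pairs and average the predictions, smoothing the fitted rule."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(stump(x_new))
    return statistics.mean(preds)

# Assessment: the bootstrap SE of the sample mean tracks s / sqrt(n).
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]
se_boot = bootstrap_se(data, statistics.mean)
se_theory = statistics.stdev(data) / len(data) ** 0.5

# Improvement: near a step, a single stump predicts a hard 0 or 1,
# while the bagged prediction is smoothed between the two levels.
xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5   # step function with a jump between x=4 and x=5
smooth = bagged_predict(xs, ys, 4.0)
```

The second example makes the smoothing effect concrete: each bootstrap resample places the stump's split slightly differently, so averaging over resamples turns the single stump's hard jump into a graded transition near the boundary.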
Cite this article
Rao, J.S. Bootstrapping to Assess and Improve Atmospheric Prediction Models. Data Mining and Knowledge Discovery 4, 29–41 (2000). https://doi.org/10.1023/A:1009876615946