
Smoothed Bagging with Kernel Bandwidth Selectors


Abstract

Recently, a combined approach of bagging (bootstrap aggregating) and noise addition was proposed and shown to yield significantly improved generalization performance. However, the level of noise introduced, a crucial factor, was determined by trial and error. That procedure is not only ad hoc but also time consuming, since bagging involves training a committee of networks. Here we propose a principled and computationally less expensive procedure for computing the level of noise. The idea comes from kernel density estimation (KDE), a non-parametric probability density estimation method in which kernel functions, such as the Gaussian, are placed on the data points. A kernel bandwidth selector is a numerical method for finding the width of a kernel function (called the bandwidth). The computed bandwidth can be used as the variance of the added noise. The proposed approach makes the trial-and-error procedure unnecessary and thus provides a much faster way of finding an appropriate level of noise. In addition, experimental results show that the proposed approach improves upon bagging, particularly for noisy data.
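To make the procedure concrete, the following is a minimal sketch of the idea, not the authors' implementation: each bootstrap replicate is jittered with zero-mean Gaussian noise whose variance is taken from a data-driven bandwidth (Silverman's rule of thumb is used below as a simple stand-in for the bandwidth selectors studied in the paper), and the committee's predictions are averaged. The function names (silverman_bandwidth, smoothed_bagging, committee_predict) and the use of scikit-learn decision trees as base learners, rather than the neural networks trained in the paper, are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for a 1-D sample.

    Used here as a simple plug-in selector; the paper considers
    dedicated kernel bandwidth selectors."""
    n = len(x)
    sigma = np.std(x, ddof=1)
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    scale = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * scale * n ** (-1.0 / 5.0)


def smoothed_bagging(X, y, n_estimators=25, random_state=0):
    """Train a committee on noise-perturbed bootstrap replicates.

    Per the abstract, the computed bandwidth is used as the variance
    of the Gaussian noise added to each resampled input."""
    rng = np.random.default_rng(random_state)
    n, d = X.shape
    # One bandwidth per input dimension.
    bandwidth = np.array([silverman_bandwidth(X[:, j]) for j in range(d)])
    noise_std = np.sqrt(bandwidth)  # bandwidth plays the role of the variance
    committee = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)                 # bootstrap resample
        noise = rng.normal(0.0, noise_std, size=(n, d))  # smoothing noise
        member = DecisionTreeRegressor(random_state=random_state)
        member.fit(X[idx] + noise, y[idx])
        committee.append(member)
    return committee


def committee_predict(committee, X):
    """Average the predictions of the committee members."""
    return np.mean([m.predict(np.asarray(X)) for m in committee], axis=0)
```

Under these assumptions, smoothed_bagging(X_train, y_train) returns the trained committee and committee_predict(committee, X_test) its averaged prediction; a classification variant would average class probabilities or votes instead.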


Cite this article

Lee, S., Cho, S. Smoothed Bagging with Kernel Bandwidth Selectors. Neural Processing Letters 14, 157–168 (2001). https://doi.org/10.1023/A:1012403711980
