Abstract
Recently, a combined approach of bagging (bootstrap aggregating) and noise addition was proposed and shown to improve generalization performance significantly. However, the level of noise introduced, a crucial factor, was determined by trial and error. That procedure is not only ad hoc but also time consuming, since bagging involves training a committee of networks. Here we propose a principled and computationally less expensive procedure for computing the level of noise. The idea comes from kernel density estimation (KDE), a non-parametric probability density estimation method in which kernel functions such as the Gaussian are placed on the data points. A kernel bandwidth selector is a numerical method for finding the width (called the bandwidth) of a kernel function, and the computed bandwidth can serve as the variance of the added noise. The proposed approach makes the trial-and-error procedure unnecessary, and thus provides a much faster way of finding an appropriate level of noise. In addition, experimental results show that it outperforms bagging, particularly on noisy data.
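The bandwidth-as-noise-level idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes Silverman's rule-of-thumb bandwidth selector (the paper's references also cover least-squares cross-validation and the Sheather-Jones method), treats the bandwidth as the variance of the added Gaussian noise as the abstract states, and uses synthetic data; all function names are illustrative.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for a 1-D sample:
    h = 1.06 * sigma * n^(-1/5). One of several possible selectors."""
    n = len(x)
    sigma = np.std(x, ddof=1)
    return 1.06 * sigma * n ** (-1 / 5)

def smoothed_bootstrap_samples(x, y, n_models, rng):
    """Generate bootstrap resamples whose inputs are perturbed by
    Gaussian noise with variance equal to the KDE bandwidth, so the
    noise level no longer needs to be tuned by trial and error."""
    h = silverman_bandwidth(x)
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))
        # Noise variance = bandwidth h, hence std = sqrt(h);
        # only the resampled inputs are perturbed.
        noise = rng.normal(0.0, np.sqrt(h), size=len(x))
        yield x[idx] + noise, y[idx]

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.sin(x)
# One perturbed bootstrap set per committee member.
samples = list(smoothed_bootstrap_samples(x, y, n_models=5, rng=rng))
```

Each generated pair would train one member of the bagged committee; the committee's prediction is the usual average over members.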
Lee, S., Cho, S. Smoothed Bagging with Kernel Bandwidth Selectors. Neural Processing Letters 14, 157–168 (2001). https://doi.org/10.1023/A:1012403711980