Voting over Multiple Condensed Nearest Neighbors

Abstract

Lazy learning methods like the k-nearest neighbor classifier require storing the whole training set and may be too costly when this set is large. The condensed nearest neighbor classifier incrementally stores a subset of the sample, thus decreasing storage and computation requirements. We propose to train multiple such subsets and take a vote over them, thereby combining predictions from a set of concept descriptions. We investigate two voting schemes: simple voting, where voters have equal weight, and weighted voting, where weights depend on the classifiers' confidences in their predictions. We consider ways to form such subsets for improved performance: when the training set is small, voting improves performance considerably; if the training set is not small, the voters converge to similar solutions and voting gains nothing. To alleviate this, when the training set is of intermediate size we use bootstrapping to generate smaller training sets over which the voters are trained, and when it is large we partition it into smaller, mutually exclusive subsets on which the voters are trained. Simulation results on six datasets show that the approach works well. We also review methods for combining multiple learners; the idea of taking a vote over multiple learners can be applied with any type of learning scheme.
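To make the procedure concrete, the following is a minimal sketch, in Python with NumPy, of the two ingredients described above: Hart-style condensing of a training set into a stored subset, and simple (equal-weight) voting over several such subsets, each built from a bootstrap resample of the data. It is an illustration written for this summary rather than the paper's implementation; the function names (condense, train_voters, predict), the single condensing pass, and the toy data are all assumptions.

```python
import numpy as np

def nearest_label(x, X_sub, y_sub):
    # 1-NN rule: return the class of the stored prototype closest to x.
    d = np.linalg.norm(X_sub - x, axis=1)
    return y_sub[int(np.argmin(d))]

def condense(X, y, rng):
    # Hart-style incremental condensing, one pass for brevity (the original
    # rule repeats passes until no further points are added): a point is
    # stored only if the subset collected so far misclassifies it.
    order = rng.permutation(len(X))
    keep = [order[0]]
    for i in order[1:]:
        if nearest_label(X[i], X[keep], y[keep]) != y[i]:
            keep.append(i)
    return X[keep], y[keep]

def train_voters(X, y, n_voters=5, seed=0):
    # Each voter condenses its own bootstrap resample of the training set,
    # so the stored subsets (and hence the voters) differ from one another.
    rng = np.random.default_rng(seed)
    voters = []
    for _ in range(n_voters):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
        voters.append(condense(X[idx], y[idx], rng))
    return voters

def predict(x, voters):
    # Simple voting: every condensed subset casts one equal-weight vote.
    votes = np.array([nearest_label(x, Xs, ys) for Xs, ys in voters])
    labels, counts = np.unique(votes, return_counts=True)
    return labels[int(np.argmax(counts))]

# Toy usage (illustrative data): two well-separated classes.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
              [5., 5.], [5., 6.], [6., 5.], [6., 6.]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
voters = train_voters(X, y, n_voters=3)
print(predict(np.array([0.5, 0.5]), voters))  # expected: 0
```

A weighted-voting variant would replace the majority count in predict with confidence-derived weights (for example, based on each voter's distance to its nearest stored prototype), and training on mutually exclusive partitions rather than bootstrap resamples only changes how idx is drawn in train_voters.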

References

  • Aha, D. W., Kibler, D. &, Albert, M. K. (1991). Instance-Based Learning Algorithms. Machine Learning 6: 37–66.

    Google Scholar 

  • Alpaydin, E. (1990). Neural Models of Incremental Supervised and Unsupervised Learning, PhD dissertation, No 869, Department d'Informatique, Ecole Polytechnique Fédérale de Lausanne, Switzerland, 1990.

    Google Scholar 

  • Alpaydin, E. (1991). GAL: Networks that Grow When They Learn and Shrink When They Forget, Berkeley CA, TR–91–032: International Computer Science Institute.

    Google Scholar 

  • Alpaydin, E. (1993). Multiple Networks for Function Learning. IEEE International Conference on Neural Networks, March, San Francisco CA 1: 9–14.

  • Alpaydin, E. & Gürgen, F. (1995). Comparison of Kernel Estimators, Perceptrons and Radial-Basis Functions for OCR and Speech Classification. Neural Computing and Applications 3: 38–49.

    Google Scholar 

  • Benediktsson, J. A. & Swain, P. H. (1992). Consensus Theoretic Classification Methods. IEEE Transactions on Systems, Man, and Cybernetics 22: 688–704.

    Google Scholar 

  • Breiman, L. (1992). Stacked Regressions, TR-367. Department of Statistics, University of California, Berkeley.

    Google Scholar 

  • Drucker, H., Schapire, R. & Simard, P. (1993). Improving Performance in Neural Networks Using a Boosting Algorithm. In Hanson S. J. Cowan J. & Giles L. (eds.) Advances in Neural Information Processing Systems 5, 42–49. Morgan Kaufmann.

  • Duda, R. O. & Hart, P. E. (1973). Pattern Classification and Scene Analysis. Wiley and Sons.

  • Gates, G. W. (1972). The Reduced Nearest Neighbor Rule. IEEE Transactions on Information Theory 18: 431–433.

    Google Scholar 

  • Guyon, I., Poujoud, I., Personnaz, L., Dreyfus, G., Denker, J. & le Cun, Y. (1989). Comparing Different Neural Architectures for Classifying Handwritten Digits. International Joint Conference on Neural Networks. Washington, USA.

  • Hansen, L. K. & Salamon, P. (1990). Neural Network Ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence 12: 993–1001.

    Google Scholar 

  • Härdle, W. (1990). Applied Nonparametric Regression. Econometric Society Monographs, Cambridge University Press.

  • Hart, P. E. (1968). The Condensed Nearest Neighbor Rule. IEEE Transactions on Information Theory 14: 515–516.

    Google Scholar 

  • Hastie, T. & Tibshirani, R. (1990). Generalized Additive Models. Chapman Hall.

  • Hines, E. L., Gianna, C. C. & Gardner, J. W. (1993). Neural Network Based Electronic Nose Using Constructive Algorithms. In Taylor, M. and Lisboa P. (eds.) Techniques and Application of Neural Networks, 135–154. Ellis Horwood.

  • Jacobs, R. A., Jordan, M. I., Nowlan, S. J. & Hinton, G. E. (1991). Adaptive Mixtures of Local Experts. Neural Computation 3: 79–87.

    Google Scholar 

  • Krogh, A. & Vedelsby, J. (1995). Neural Network Ensembles, Cross Validation, and Active Learning. In Tesauro, G., Touretzky, D. S. & Leen T. K. (eds.) Advances in Neural Information Processing Systems 7. MIT Press.

  • LeBlanc, M. & Tibshirani, R. (1994). Combining Estimates in Regression and Classification. Department of Statistics, University of Toronto.

  • Lincoln, W. P. & Skrzypek, J. (1990). Synergy of Clustering Multiple Back Propagation Networks. In Touretzky D (ed.) Advances in Neural Information Processing Systems 2, 650–657. Morgan Kaufmann.

  • Mani, G. (1991). Lowering Variance of Decisions by using Artificial Neural Network Ensembles. Neural Computation 3: 484–486.

    Google Scholar 

  • Meir, R. (1994). Bias, Variance and the Combination of Estimators: The Case of Linear Least Squares. Department of Electrical Engineering, Technion.

  • Murphy, P. M. (1994). UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/∼mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.

    Google Scholar 

  • Omohundro, S. M. (1987). Efficient Algorithms with Neural Network Behaviour. Complex Systems 1: 273–347.

    Google Scholar 

  • Perrone, M. P. (1993). Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization. PhD Thesis, Department of Physics, Brown University.

  • Preparata, F. P. & Shamos, M. I. (1985). Computational Geometry. Springer.

  • Reignier, P., Hansen, V. & Crowley, J. (1995). Incremental Supervised Learning for Mobile Robot Reactive Control. In Rembold, U. et al. (eds.) Intelligent Autonomous Systems, 287–294. IOS Press.

  • Rogova, G. (1994). Combining the Results of Several Neural Network Classifiers. Neural Networks 7: 777–781.

    Google Scholar 

  • Schaffer, C. (1994). A conservation law for generalization performance. In Proceedings of the Eleventh International Conference on Machine Learning, 259–265. New Brunswick, NJ: Morgan Kaufmann.

    Google Scholar 

  • Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall.

  • Stanfill, C. & Waltz, D. (1986). Toward Memory-Based Reasoning. Communications of the ACM 29: 1213–1228.

    Google Scholar 

  • Tresp, V. & Taniguchi, M. (1995). Combining Estimators Using Non-Constant Weighting Functions. In Tesauro, G., Touretzky, D. S. & Leen T. K. (eds.) Advances in Neural Information Processing Systems 7. MIT Press.

  • Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.

  • Wolpert, D. H. (1992). Stacked Generalization. Neural Networks 5: 241–259.

    Google Scholar 

  • Xu, L., Krzyzak, A. & Suen, C. Y. (1992). Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition. IEEE Transactions on Systems, Man, and Cybernetics 22: 418–435.

    Google Scholar 

  • Zhang, X., Mesirov, J. P. & Waltz, D. L. (1992). Hybrid System for Protein Secondary Structure Prediction. Journal of Molecular Biology 225: 1049–1063.

    Google Scholar 

Download references

Cite this article

Alpaydin, E. Voting over Multiple Condensed Nearest Neighbors. Artificial Intelligence Review 11, 115–132 (1997). https://doi.org/10.1023/A:1006563312922
