Abstract
Feature subset selection has emerged as a useful technique for creating diversity in ensembles, particularly in classification ensembles. In this paper we argue that this diversity needs to be monitored while the ensemble is being created. We propose the entropy of the outputs of the ensemble members as a measure of ensemble diversity. Further, we show that the associated conditional entropy works well as a loss function (error measure), and that the entropy in the ensemble is a good predictor of the reduction in error due to the ensemble. These measures are evaluated on a medical prediction problem and are shown to predict the performance of the ensemble well. We also show that the entropy measure of diversity has the added advantage that it seems to model the change in diversity with the size of the ensemble.
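As an illustration of the kind of measure the abstract describes, the sketch below computes the Shannon entropy of the class labels that the ensemble members output for each example, then averages it over a data set. This is only one plausible reading of "an entropy measure of the outputs of the ensemble members"; the abstract does not give the exact formula, and the function names `vote_entropy` and `ensemble_diversity`, the log base, and the averaging step are assumptions made for this example.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the labels predicted by the members for one example.

    Assumption: each ensemble member outputs a single hard class label.
    """
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def ensemble_diversity(predictions):
    """Mean per-example vote entropy (hypothetical aggregate, not from the paper).

    predictions[i][j] is the label predicted by member j for example i.
    """
    return sum(vote_entropy(row) for row in predictions) / len(predictions)


if __name__ == "__main__":
    # Four members, three examples: unanimous, evenly split, and 3-to-1.
    preds = [
        [1, 1, 1, 1],
        [0, 1, 0, 1],
        [1, 1, 1, 0],
    ]
    for row in preds:
        print(row, round(vote_entropy(row), 3))
    print("mean diversity:", round(ensemble_diversity(preds), 3))
```

Under this reading, a unanimous ensemble has zero entropy (no diversity), and an evenly split two-class vote has the maximum of one bit.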
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cunningham, P., Carney, J. (2000). Diversity versus Quality in Classification Ensembles Based on Feature Selection. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_12
DOI: https://doi.org/10.1007/3-540-45164-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
eBook Packages: Springer Book Archive