Feature Selection for Ensembles of Simple Bayesian Classifiers

Tsymbal, Alexey; Puuronen, Seppo; Patterson, David

doi:10.1007/3-540-48050-1_63

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2366)

Abstract

A popular method for creating an accurate classifier from a set of training data is to train several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. However, the simple Bayesian classifier has much broader applicability than previously thought. Besides its high classification accuracy, it also has advantages in terms of simplicity, learning speed, classification speed, storage space, and incrementality. One way to generate an ensemble of simple Bayesian classifiers is to use different feature subsets, as in the random subspace method. In this paper we present a technique for building ensembles of simple Bayesian classifiers in random subspaces. We also consider a hill-climbing-based refinement cycle, which improves the accuracy and diversity of the base classifiers. We conduct a number of experiments on a collection of real-world and synthetic data sets. In many cases the ensembles of simple Bayesian classifiers have significantly higher accuracy than the single “global” simple Bayesian classifier. We consider several methods for integrating simple Bayesian classifiers. Dynamic integration makes better use of ensemble diversity than static integration.
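
The random subspace idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes categorical features, Laplace smoothing, and plain majority voting as the (static) integration step, none of which the abstract specifies. The names `SimpleBayes`, `random_subspace_ensemble`, and `vote` are hypothetical, and the hill-climbing refinement and dynamic integration are not shown.

```python
import math
import random
from collections import Counter, defaultdict


class SimpleBayes:
    """Naive Bayes over categorical features, restricted to a feature subset."""

    def __init__(self, features):
        self.features = list(features)  # indices of the features this member sees

    def fit(self, X, y):
        self.n = len(y)
        self.class_counts = Counter(y)
        self.counts = Counter()          # (feature, value, label) -> row count
        self.values = defaultdict(set)   # feature -> distinct values seen
        for row, label in zip(X, y):
            for f in self.features:
                self.counts[(f, row[f], label)] += 1
                self.values[f].add(row[f])
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, c in self.class_counts.items():
            score = math.log(c / self.n)  # log prior
            for f in self.features:
                # Laplace-smoothed conditional probability (an assumption here)
                num = self.counts[(f, row[f], label)] + 1
                den = c + len(self.values[f])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def random_subspace_ensemble(X, y, n_members, subspace_size, seed=0):
    """Train one SimpleBayes per randomly drawn feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    return [
        SimpleBayes(rng.sample(range(n_features), subspace_size)).fit(X, y)
        for _ in range(n_members)
    ]


def vote(ensemble, row):
    """Static integration by majority vote over the members' predictions."""
    ballots = Counter(member.predict(row) for member in ensemble)
    return ballots.most_common(1)[0][0]


# Toy data (made up for illustration): each feature tends to equal the label.
X = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0],
     [1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = random_subspace_ensemble(X, y, n_members=5, subspace_size=2)
prediction = vote(ensemble, [0, 0, 0])
```

Each member sees only `subspace_size` of the features, so members can disagree on a given instance; that disagreement is the diversity an integration method would then exploit.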

Author information

Authors and Affiliations

University of Jyväskylä, P.O.Box 35, FIN-40351, Jyväskylä, Finland
Alexey Tsymbal & Seppo Puuronen
Northern Ireland Knowledge Engineering Laboratory, University of Ulster, UK
David Patterson

Editor information

Editors and Affiliations

UFR d’Informatique, Université Claude Bernard Lyon I, 8, boulevard Niels Bohr, 69622, Villeurbanne Cedex, France
Mohand-Saïd Hacid
Dept. of Computer Science College of IT, University of North Carolina, Charlotte, NC, 28223, USA
Zbigniew W. Raś
Bât. L. Equipe de Recherche en Ingénierie des Connaissances, Université Lumière Lyon 2, 5, avenue Pierre Mendès-France, 69676, Bron Cedex, France
Djamel A. Zighed
LRI, Université Paris Sud, Bât. 490, 91405, Orsay Cedex, France
Yves Kodratoff

About this paper

Cite this paper

Tsymbal, A., Puuronen, S., Patterson, D. (2002). Feature Selection for Ensembles of Simple Bayesian Classifiers. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_63

DOI: https://doi.org/10.1007/3-540-48050-1_63
Published: 21 June 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive