Abstract
Recent research has demonstrated the benefits of using ensembles of classifiers for classification problems. Such ensembles are commonly constructed by machine learning methods that manipulate the training set so as to create a set of base classifiers that are both accurate and diverse. Feature selection techniques, which apply different heuristics when generating the base classifiers, can be adjusted to the characteristics of a specific domain. In this paper we consider and experiment with the contextual feature merit measure as a feature selection heuristic. Our new algorithm uses the diversity of the ensemble as the evaluation function in a refinement cycle. We have evaluated the algorithm on seven data sets from the UCI repository. The experimental results show that, for all these data sets, ensemble feature selection based on the contextual merit with a suitable initial number of features produces an ensemble whose weighted-voting accuracy is never lower than that of C4.5 alone using all features.
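The abstract names two computational ingredients: the contextual merit measure of Hong (1997), used here as a feature selection heuristic, and weighted voting, used to combine the base classifiers. The Python sketch below is a minimal illustration of both ideas under stated assumptions, not the authors' implementation: it assumes numeric features scaled to [0, 1], a fixed number k of nearest counter-class examples per instance, and 1/D^2 pair weighting, all of which are simplifications of the published measure.

import numpy as np

def contextual_merit(X, y, k=5):
    """Simplified contextual merit ranking, in the spirit of Hong (1997).

    Each example is paired with its k nearest counter-class examples
    (distance summed over all features); each pair adds its per-feature
    distances, weighted by 1/D**2, to the merit of every feature.
    Features are assumed numeric and scaled to [0, 1]; k and the pair
    weighting are simplifying assumptions, not the paper's settings.
    """
    n_examples, n_features = X.shape
    merits = np.zeros(n_features)
    for i in range(n_examples):
        counter = y != y[i]                    # examples of other classes
        if not counter.any():
            continue
        diffs = np.abs(X[counter] - X[i])      # per-feature distances
        D = diffs.sum(axis=1)                  # total pair distances
        for j in np.argsort(D)[:k]:            # k nearest counterexamples
            if D[j] > 0:
                merits += diffs[j] / D[j] ** 2
    return merits

def weighted_vote(predictions, weights, n_classes):
    """Combine base-classifier predictions by weights, e.g. accuracies."""
    totals = np.zeros(n_classes)
    for pred, w in zip(predictions, weights):
        totals[pred] += w
    return int(np.argmax(totals))

# Toy check: feature 0 separates the two classes, feature 1 is constant.
X = np.array([[0.1, 0.5], [0.2, 0.5], [0.9, 0.5], [0.8, 0.5]])
y = np.array([0, 0, 1, 1])
print(contextual_merit(X, y, k=2))             # merit concentrates on feature 0
print(weighted_vote([0, 1, 1], [0.9, 0.6, 0.7], n_classes=2))  # -> 1

In the paper's algorithm these pieces sit inside a refinement cycle that evaluates candidate feature subsets by the diversity of the resulting ensemble; that outer loop is not reproduced here.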
References
Apte, C., Hong, S.J., Hosking, J.R.M., Lepre, J., Pednault, E.P.D., Rosen, B.K.: Decomposition of Heterogeneous Classification Problems. Advances in Intelligent Data Analysis, Springer-Verlag, London (1997) 17–28.
Battiti, R., Colla, A.M.: Democracy in Neural Nets: Voting Schemes for Classification. Neural Networks, Vol. 7, No. 4 (1994) 691–707.
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group, Belmont, California (1984).
Cost, S., Salzberg, S.: A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features. Machine Learning, Vol. 10, No. 1 (1993) 57–78.
Dietterich, T.: Machine Learning Research: Four Current Directions. AI Magazine, Vol. 18, No. 4 (1997) 97–136.
Hansen, L., Salamon, P.: Neural Network Ensembles. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12 (1990) 993–1001.
Hong, S.J.: Use of Contextual Information for Feature Ranking and Discretization. IEEE Transactions on Knowledge and Data Engineering, Vol. 9, No. 5 (1997) 718–730.
John, G.H.: Enhancements to the Data Mining Process, PhD Thesis, Computer Science Department, School of Engineering, Stanford University (1997).
Kohavi, R., John, G.H.: Wrappers for Feature Subset Selection. Artificial Intelligence, Special Issue on Relevance, Vol. 97, No. 1–2 (1997) 273–324.
Kohavi, R., John, G.H.: The Wrapper Approach. In: Liu, H., Motoda, H. (eds.) Feature Selection for Knowledge Discovery in Databases, Springer-Verlag (1998).
Kohavi, R., Sommerfield, D., Dougherty, J.: Data Mining Using MLC++: A Machine Learning Library in C++. Tools with Artificial Intelligence, IEEE CS Press (1996) 234–245.
Merz, C.J., Murphy, P.M.: UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Department of Information and Computer Science, University of California, Irvine, CA (1998).
Opitz, D.: Feature Selection for Ensembles. In: Proceedings of the 16th National Conference on Artificial Intelligence (AAAI), Orlando, Florida (1999) 379–384.
Opitz, D., Maclin, R.: Popular Ensemble Methods: An Empirical Study. Journal of Artificial Intelligence Research, Vol. 11 (1999) 169–198.
Opitz, D., Shavlik, J.: Generating Accurate and Diverse Members of a Neural-Network Ensemble. Advances in Neural Information Processing Systems, Vol. 8 (1996) 881–887.
Oza, N., Tumer, K.: Dimensionality Reduction Through Classifier Ensembles. Technical Report NASA-ARC-IC-1999-126, NASA Ames Research Center (1999).
Prodromidis, A.L., Stolfo, S.J., Chan, P.K.: Pruning Classifiers in a Distributed Meta-Learning System. In: Proceedings of the 1st National Conference on New Information Technologies (1998) 151–160.
Quinlan, J.R.: C4.5: programs for machine learning. Morgan Kaufmann, San Mateo, California (1993).
Schapire, R.E., Freund, Y., Bartlett, P., Lee, W.S.: Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods. The Annals of Statistics, Vol. 26, No. 5 (1998) 1651–1686.
Schapire, R.E.: A Brief Introduction to Boosting. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (1999).
Skrypnyk, I., Puuronen, S.: Ensembles of Classifiers Based on Contextual Features. In: Proceedings of the 4th International Conference “New Information Technologies” (NITe’2000), Minsk, Belarus (2000) (to appear).
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Puuronen, S., Skrypnyk, I., Tsymbal, A.: Ensemble Feature Selection Based on the Contextual Merit. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, Vol. 2114. Springer, Berlin, Heidelberg (2001). https://doi.org/10.1007/3-540-44801-2_12
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3