
Ensemble Feature Selection Based on the Contextual Merit

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Abstract

Recent research has demonstrated the benefits of using ensembles of classifiers for classification problems. Ensembles are commonly constructed by machine learning methods that manipulate the training set so as to produce a diverse set of accurate base classifiers. Feature selection techniques that apply different heuristics when generating the base classifiers can be tuned to the characteristics of a specific domain. In this paper we consider and experiment with the contextual feature merit measure as a feature selection heuristic, and we use ensemble diversity as the evaluation function in a new algorithm with a refinement cycle. We evaluated the algorithm on seven data sets from the UCI repository. The experimental results show that, for all these data sets, ensemble feature selection based on the contextual merit, given a suitable initial number of features, produces an ensemble whose weighted-voting accuracy is never lower than that of C4.5 alone using all the features.
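The abstract leaves the contextual merit (CM) measure itself implicit. As a rough illustration only, the Python sketch below ranks numeric features by a simplified contextual merit in the spirit of Hong's formulation: per-feature distances are capped at 1 using a threshold of half the observed value range, and each example contributes weights of 1/D^2 for its k nearest counter-class examples, where D is the total inter-example distance. The threshold choice, the default k of roughly log2 of the counter-class count, and the helper name contextual_merit are assumptions made for this sketch, not details taken from the paper.

    import numpy as np

    def contextual_merit(X, y, k=None):
        """Simplified contextual merit (CM) ranking for numeric features.

        Assumes at least two classes are present in y.
        """
        n, m = X.shape
        # Per-feature distance thresholds: half the observed value range.
        t = 0.5 * (X.max(axis=0) - X.min(axis=0))
        t[t == 0] = 1.0  # guard against constant features
        merit = np.zeros(m)
        for i in range(n):
            counter = np.where(y != y[i])[0]  # examples of the other classes
            # Per-feature distance components, capped at 1.
            d = np.minimum(np.abs(X[counter] - X[i]) / t, 1.0)
            D = d.sum(axis=1)  # total distance to each counter-class example
            kk = k if k is not None else max(1, int(np.log2(len(counter))))
            nearest = np.argsort(D)[:kk]  # k nearest counter-class examples
            w = 1.0 / np.maximum(D[nearest], 1e-12) ** 2
            merit += (w[:, None] * d[nearest]).sum(axis=0)
        return merit  # higher merit = more relevant feature

    # Tiny smoke test: feature 0 drives the class, so it should score highest.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = (X[:, 0] > 0).astype(int)
    print(contextual_merit(X, y))

In an algorithm like the one the abstract describes, such a ranking would supply each base classifier with its initial feature subset; the refinement cycle would then adjust the subsets using ensemble diversity as the evaluation function, and weighted voting would combine the resulting base classifiers.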




Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Puuronen, S., Skrypnyk, I., Tsymbal, A. (2001). Ensemble Feature Selection Based on the Contextual Merit. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_12


  • DOI: https://doi.org/10.1007/3-540-44801-2_12


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42553-3

  • Online ISBN: 978-3-540-44801-3

  • eBook Packages: Springer Book Archive
