Abstract
Feature selection is an effective technique for dimensionality reduction in classification, a central task in data mining. It searches for an "optimal" subset of features. The search strategies under consideration fall into three categories: complete, heuristic, and probabilistic. Existing algorithms adopt various measures to evaluate the goodness of feature subsets. This work focuses on one such measure, consistency. We study its properties in comparison with other major measures, along with the different ways this measure can be used in the search for feature subsets. We also conduct an empirical study to examine the pros and cons of these search methods when combined with consistency. Through this extensive exercise, we aim to provide a comprehensive view of the measure, its relations to other measures, and a guideline for using it with different search strategies when facing a new application.
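To make the abstract's central idea concrete: a consistency measure is typically operationalized as an inconsistency rate, where two instances are inconsistent if they agree on every selected feature but carry different class labels. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names `inconsistency_rate` and `lvf` are hypothetical, and the random (Las-Vegas-style) search shown is one plausible variant of the probabilistic strategy the abstract mentions.

```python
import random
from collections import Counter, defaultdict

def inconsistency_rate(data, labels, subset):
    """Fraction of instances that are inconsistent when the data are
    projected onto the feature indices in `subset`.  Within each group of
    instances that match on all selected features, every instance beyond
    the majority class counts as inconsistent."""
    groups = defaultdict(list)
    for row, label in zip(data, labels):
        groups[tuple(row[i] for i in subset)].append(label)
    inconsistent = sum(len(ys) - max(Counter(ys).values())
                       for ys in groups.values())
    return inconsistent / len(data)

def lvf(data, labels, n_features, max_tries=1000, seed=0):
    """A simple probabilistic (random-subset) search: repeatedly draw a
    random feature subset no larger than the current best, and keep it if
    its inconsistency rate does not exceed that of the full feature set."""
    rng = random.Random(seed)
    threshold = inconsistency_rate(data, labels, list(range(n_features)))
    best = list(range(n_features))
    for _ in range(max_tries):
        k = rng.randint(1, len(best))
        candidate = sorted(rng.sample(range(n_features), k))
        if inconsistency_rate(data, labels, candidate) <= threshold:
            best = candidate  # smaller (or equal-size) consistent subset
    return best
```

For example, on a toy dataset where feature 0 alone determines the class, `inconsistency_rate(data, labels, [0])` is 0 and the random search converges to a single-feature subset. Monotonicity (a superset is never more consistent than its subsets) is what makes this measure attractive for the complete and heuristic strategies as well.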
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Dash, M., Liu, H., Motoda, H. (2000). Consistency Based Feature Selection. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science(), vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_12
DOI: https://doi.org/10.1007/3-540-45571-X_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive