Hybrid search of feature subsets

  • Induction (Decision Tree Pruning, Feature Selection, Feature Discretization)
  • Conference paper
  • Conference: PRICAI’98: Topics in Artificial Intelligence (PRICAI 1998)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1531)

Abstract

Feature selection is a search for an “optimal” subset of features. Class separability is commonly used as a basic feature selection criterion. Instead of maximizing class separability, as is usual in the literature, this work adopts a criterion that aims to maintain the discriminating power of the data. After examining the pros and cons of two existing feature selection algorithms, we propose a hybrid algorithm that combines probabilistic and complete search and retains the advantages of both. It first runs LVF (probabilistic search) to reduce the number of features, then runs Automatic Branch and Bound (ABB, complete search). By imposing a limit on the total running time, we obtain an approximation algorithm. The empirical study suggests that dividing the time equally between the two phases yields nearly the best performance, and that the hybrid search algorithm substantially outperforms earlier methods in general.
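
The two-phase idea described above is simple enough to sketch in code. The Python snippet below is a minimal, hypothetical illustration, not the authors' implementation: an LVF-style random search shrinks the feature set within half of a time budget, and a simplified complete search that exploits the monotonicity of the inconsistency rate (standing in for ABB) runs in the other half. All function names and the toy data are invented for the example.

import random
import time
from collections import defaultdict
from itertools import combinations


def inconsistency_rate(data, labels, subset):
    # Group rows by their values on `subset`; rows in a group that do not
    # belong to the group's majority class count as inconsistent.
    groups = defaultdict(list)
    for row, y in zip(data, labels):
        groups[tuple(row[i] for i in subset)].append(y)
    inconsistent = sum(len(ys) - max(ys.count(c) for c in set(ys))
                       for ys in groups.values())
    return inconsistent / len(data)


def lvf(data, labels, n_features, allowed_rate, time_budget):
    # Phase 1 (probabilistic): LVF-style random sampling of subsets, keeping
    # the smallest subset whose inconsistency rate stays within the bound.
    best = list(range(n_features))
    deadline = time.time() + time_budget
    while time.time() < deadline:
        cand = random.sample(range(n_features), random.randint(1, len(best)))
        if inconsistency_rate(data, labels, cand) <= allowed_rate:
            best = sorted(cand)
    return best


def complete_search(data, labels, start_subset, allowed_rate, time_budget):
    # Phase 2 (complete): shrink the LVF result further. Because the
    # inconsistency rate is monotonic (dropping features never lowers it),
    # if no subset of size s is legal then no smaller subset can be, so the
    # search can stop early -- the same pruning principle ABB relies on.
    best = list(start_subset)
    deadline = time.time() + time_budget
    for size in range(len(start_subset) - 1, 0, -1):
        found = False
        for cand in combinations(start_subset, size):
            if time.time() > deadline:       # time limit -> approximation
                return best
            if inconsistency_rate(data, labels, cand) <= allowed_rate:
                best, found = list(cand), True
                break                        # a legal subset exists; try smaller
        if not found:
            break                            # monotonicity: nothing smaller works
    return best


def hybrid_search(data, labels, allowed_rate=0.0, total_time=2.0):
    # Split the time budget equally between the two phases, the allocation
    # the paper's experiments found to work close to best.
    n_features = len(data[0])
    reduced = lvf(data, labels, n_features, allowed_rate, total_time / 2)
    return complete_search(data, labels, reduced, allowed_rate, total_time / 2)


if __name__ == "__main__":
    # Toy data: features 0 and 1 determine the class, features 2-4 are noise.
    random.seed(0)
    data = [[random.randint(0, 1) for _ in range(5)] for _ in range(200)]
    labels = [row[0] ^ row[1] for row in data]
    print("selected features:", hybrid_search(data, labels))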

References

  1. H. Almuallim and T.G. Dietterich. Learning with many irrelevant features. In Proceedings of the Ninth National Conference on Artificial Intelligence, pages 547–552, Anaheim, California, 1991, AAAI Press/The MIT Press, Menlo Park, California.

  2. A.L. Blumer, A. Ehrenfeucht, D. Haussler, and M.K. Warmuth. Occam’s razor. In J.W. Shavlik and T.G. Dietterich, editors, Readings in Machine Learning, pages 201–204. Morgan Kaufmann, 1990.

  3. G. Brassard and P. Bratley. Fundamentals of Algorithmics. Prentice Hall, New Jersey, 1996.

  4. M. Dash and H. Liu. Feature selection methods for classifications. Intelligent Data Analysis: An International Journal, 1(3), 1997. http://www-east.elsevier.com/ida/free.htm.

  5. P.A. Devijver and J. Kittler. Pattern Recognition: A Statistical Approach. Prentice Hall International, 1982.

  6. K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129–134. Menlo Park: AAAI Press/The MIT Press, 1992.

  7. R. Kohavi. Wrappers for performance enhancement and oblivious decision graphs. PhD thesis, Department of Computer Science, Stanford University, Stanford, CA, 1995.

  8. D. Koller and M. Sahami. Toward optimal feature selection. In L. Saitta, editor, Machine Learning: Proceedings of the Thirteenth International Conference (ICML-96), July 3–6, 1996, pages 284–292, Bari, Italy, 1996. San Francisco: Morgan Kaufmann Publishers.

  9. I. Kononenko. Estimating attributes: Analysis and extension of RELIEF. In F. Bergadano and L. De Raedt, editors, Proceedings of the European Conference on Machine Learning, April 6–8, pages 171–182, Catania, Italy, 1994. Berlin: Springer-Verlag.

  10. P. Langley. Selection of relevant features in machine learning. In Proceedings of the AAAI Fall Symposium on Relevance. AAAI Press, 1994.

  11. H. Liu and H. Motoda. Feature Selection for Knowledge Discovery and Data Mining. Boston: Kluwer Academic Publishers, 1998.

  12. H. Liu, H. Motoda, and M. Dash. A monotonic measure for optimal feature selection. In C. Nedellec and C. Rouveirol, editors, Machine Learning: ECML-98, April 21–23, 1998, pages 101–106, Chemnitz, Germany, April 1998. Berlin Heidelberg: Springer-Verlag.

  13. H. Liu and R. Setiono. A probabilistic approach to feature selection—a filter solution. In L. Saitta, editor, Proceedings of the International Conference on Machine Learning (ICML-96), July 3–6, 1996, pages 319–327, Bari, Italy, 1996. San Francisco, CA: Morgan Kaufmann Publishers.

  14. C.J. Merz and P.M. Murphy. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html Irvine, CA: University of California, Department of Information and Computer Science, 1996.

  15. P.M. Narendra and K. Fukunaga. A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers, C-26(9):917–922, September 1977.

  16. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.

  17. T.W. Rauber. Inductive Pattern Classification Methods-Features-Sensors. PhD thesis, Dept. of Electrical Engineering, Universidade Nova de Lisboa, Lisboa, 1994.

  18. J. C. Schlimmer. Efficiently inducing determinations: a complete and systematic search algorithm that uses optimal pruning. In Proceedings of the Tenth International Conference on Machine Learning, pages 284–290, 1993.

  19. W. Siedlecki and J. Sklansky. On automatic feature selection. International Journal of Pattern Recognition and Artificial Intelligence, 2:197–220, 1988.

  20. S. Watanabe. Pattern Recognition: Human and Mechanical. Wiley Interscience, 1985.

  21. A. Zell et al. Stuttgart neural network simulator (SNNS), user manual, version 4.1. Technical Report 6/95, Institute for Parallel and Distributed High Performance Systems (IPVR), University of Stuttgart, FTP: ftp.informatik.unistuttgart.de/pub/SNNS, 1995.

Author information

Authors

M. Dash and H. Liu

Editor information

Hing-Yan Lee and Hiroshi Motoda

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dash, M., Liu, H. (1998). Hybrid search of feature subsets. In: Lee, HY., Motoda, H. (eds) PRICAI’98: Topics in Artificial Intelligence. PRICAI 1998. Lecture Notes in Computer Science, vol 1531. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095273

  • DOI: https://doi.org/10.1007/BFb0095273

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65271-7

  • Online ISBN: 978-3-540-49461-4

  • eBook Packages: Springer Book Archive
