Feature Selection Algorithm for Data with Both Nominal and Continuous Features

Tang, Wenyin; Mao, Kezhi

doi:10.1007/11430919_78

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3518)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Wrapper and filter are two commonly used feature selection schemes. Because of its computational efficiency, the filter method is often the first choice when dealing with large datasets. However, most filter methods reported in the literature were developed for continuous feature selection. In this paper, we propose a filter method for mixed data with both continuous and nominal features. The new algorithm includes a novel criterion for mixed feature evaluation and a novel search algorithm for mixed feature subset generation. The proposed method is tested on several real-world benchmark problems.
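
The abstract does not spell out the evaluation criterion or the search procedure, so the sketch below is not the authors' algorithm. It only illustrates what a filter on mixed data can look like, using a generic Relief-style weighting with an assumed per-feature difference: 0/1 mismatch for nominal features and a range-scaled absolute gap for continuous ones. The names `feature_diff`, `mixed_distance` and `relief_scores`, and the toy data, are hypothetical.

```python
def feature_diff(j, a, b, nominal, ranges):
    """Difference on feature j: 0/1 mismatch if nominal, range-scaled gap if continuous."""
    if j in nominal:
        return 0.0 if a[j] == b[j] else 1.0
    return abs(a[j] - b[j]) / ranges[j] if ranges[j] > 0 else 0.0


def mixed_distance(a, b, nominal, ranges):
    """Sum of per-feature differences over all features (an assumed, simple choice)."""
    return sum(feature_diff(j, a, b, nominal, ranges) for j in range(len(a)))


def relief_scores(X, y, nominal):
    """Relief-style score per feature, visiting every instance once (no sampling)."""
    n, d = len(X), len(X[0])
    # Value range of each continuous feature, used to scale its differences.
    ranges = {j: max(r[j] for r in X) - min(r[j] for r in X)
              for j in range(d) if j not in nominal}
    scores = [0.0] * d
    for i in range(n):
        others = [k for k in range(n) if k != i]
        hits = [k for k in others if y[k] == y[i]]
        misses = [k for k in others if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour to compare against
        dist = lambda k: mixed_distance(X[i], X[k], nominal, ranges)
        hit = min(hits, key=dist)      # nearest instance of the same class
        miss = min(misses, key=dist)   # nearest instance of a different class
        for j in range(d):
            # Reward features that differ on the miss and agree on the hit.
            scores[j] += (feature_diff(j, X[i], X[miss], nominal, ranges)
                          - feature_diff(j, X[i], X[hit], nominal, ranges)) / n
    return scores


# Toy data: feature 0 is nominal and matches the class; feature 1 is continuous noise.
X = [("a", 0.1), ("a", 0.9), ("a", 0.5), ("b", 0.2), ("b", 0.8), ("b", 0.4)]
y = [0, 0, 0, 1, 1, 1]
scores = relief_scores(X, y, nominal={0})
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

On this toy data the nominal feature ranks above the continuous one, because it always differs on the nearest miss and never on the nearest hit. A subset search, which the abstract lists as a separate contribution, is not sketched here.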

Author information

Authors and Affiliations

School of Electrical & Electronic Engineering, Nanyang Technological University, Nanyang Avenue, 639798, Singapore
Wenyin Tang & Kezhi Mao

Editor information

Editors and Affiliations

Japan Advanced Institute of Science and Technology, Asahidai 1-1, 923-12292, Nomi, Japan
Tu Bao Ho
University of Hong Kong, Pokfulam Road, Hong Kong, China
David Cheung
Department of Computer Science and Engineering, Arizona State University, Tempe, Arizona, USA
Huan Liu

About this paper

Cite this paper

Tang, W., Mao, K. (2005). Feature Selection Algorithm for Data with Both Nominal and Continuous Features. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_78

DOI: https://doi.org/10.1007/11430919_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)