Automatic Feature Selection for Classification of Health Data

He, Hongxing; Jin, Huidong; Chen, Jie

doi:10.1007/11589990_108

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3809)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

We propose FIEBIT (Feature Inclusion and Exclusion Based on Information Theory), a fast and accurate feature selection method for the classification of health data. FIEBIT includes the most relevant and non-redundant features using Conditional Mutual Information (CMI), and excludes irrelevant and redundant features by comparing Individual Symmetrical Uncertainty (ISU) with Combined Symmetrical Uncertainty (CSU). Small feature subsets are selected before classification without compromising classification accuracy, and the size of the feature subset is determined automatically. Our preliminary empirical results on health data with hundreds of features suggest that FIEBIT is efficient and effective in comparison with representative feature selection methods.
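The two information-theoretic quantities named in the abstract can be illustrated with their standard textbook definitions for discrete variables. The sketch below is not the FIEBIT procedure itself: the paper's exact ISU/CSU comparison and stopping rule are not given in this abstract, and the function names and toy data here are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of the given columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def symmetrical_uncertainty(x, y):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)); defined as 0 when both entropies are 0."""
    denom = entropy(x) + entropy(y)
    return 0.0 if denom == 0 else 2.0 * mutual_information(x, y) / denom


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Hypothetical toy data: f2 duplicates f1, f3 is unrelated to the label.
label = [0, 0, 1, 1, 0, 0, 1, 1]
f1 = [0, 0, 1, 1, 0, 0, 1, 1]
f2 = [0, 0, 1, 1, 0, 0, 1, 1]
f3 = [0, 1, 0, 1, 0, 1, 0, 1]

print(symmetrical_uncertainty(f1, label))             # relevance of f1 to the class
print(conditional_mutual_information(f2, label, f1))  # what f2 adds once f1 is known
print(symmetrical_uncertainty(f3, label))             # relevance of f3 to the class
```

In this toy setting, a feature that duplicates an already selected one contributes no conditional mutual information with the class, which is the intuition behind treating it as redundant; a feature with near-zero symmetrical uncertainty against the class would be a candidate for exclusion as irrelevant.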

The authors would like to acknowledge Dr H. Altay Guvenir of Bilkent University for donating the Cardiac Arrhythmia Database for public usage.

Author information

Authors and Affiliations

CSIRO Mathematical and Information Sciences, GPO Box 664, Canberra, ACT, 2601, Australia
Hongxing He, Huidong Jin & Jie Chen

Editor information

Editors and Affiliations

Guangxi Normal University, College of CS and IT, Guilin, China, and University of Technology, Faculty of Engineering and Information Technology, Sydney, Australia
Shichao Zhang
Department of Electrical and Computer Systems Engineering, Monash University, 3800, Melbourne, Victoria, Australia
Ray Jarvis

About this paper

Cite this paper

He, H., Jin, H., Chen, J. (2005). Automatic Feature Selection for Classification of Health Data. In: Zhang, S., Jarvis, R. (eds) AI 2005: Advances in Artificial Intelligence. AI 2005. Lecture Notes in Computer Science, vol 3809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11589990_108

DOI: https://doi.org/10.1007/11589990_108
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30462-3
Online ISBN: 978-3-540-31652-7
eBook Packages: Computer Science, Computer Science (R0)
