
A Feature Selection Approach Based on Information Theory for Classification Tasks

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2017 (ICANN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10614)


Abstract

This paper proposes the use of an Information Theory measure in a dynamic feature selection approach. We tested the approach using elements of Information Theory, such as Mutual Information, and compared it with classical dimensionality reduction methods such as PCA and LDA, as well as with existing Mutual Information-based algorithms. Results showed that the proposed method achieved better performance than the other methods in most cases, indicating that it is a promising alternative to well-established dimensionality reduction techniques.
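As a concrete illustration of the measure the abstract relies on, the sketch below ranks features by their estimated Mutual Information with the class label, I(X; Y) = H(Y) - H(Y | X), and keeps the top-scoring subset. This is a generic MI filter, not the authors' dynamic selection algorithm (which is not reproduced on this page); the dataset and classifier (scikit-learn's breast-cancer data and k-NN) are assumptions made purely for the example.

# Hedged sketch: a generic Mutual Information feature-ranking filter.
# NOT the paper's dynamic selection method; dataset/estimator choices
# below are illustrative assumptions only.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Estimate I(X_i; Y) for every feature and keep the k highest-scoring ones.
mi_scores = mutual_info_classif(X, y, random_state=0)
k = 10
top_k = np.argsort(mi_scores)[::-1][:k]

# Compare a simple classifier on all features vs. the MI-selected subset.
clf = KNeighborsClassifier()
acc_all = cross_val_score(clf, X, y, cv=5).mean()
acc_sel = cross_val_score(clf, X[:, top_k], y, cv=5).mean()
print(f"all {X.shape[1]} features: {acc_all:.3f} | top-{k} by MI: {acc_sel:.3f}")

Unlike PCA, which projects onto variance-maximizing directions without consulting the labels, an MI filter keeps original features that are individually informative about the class, which is the contrast the abstract's comparison draws on.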



Author information


Corresponding author

Correspondence to Daniel Araújo.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jesus, J., Canuto, A., Araújo, D. (2017). A Feature Selection Approach Based on Information Theory for Classification Tasks. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science (LNTCS), vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_41


  • DOI: https://doi.org/10.1007/978-3-319-68612-7_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68611-0

  • Online ISBN: 978-3-319-68612-7

  • eBook Packages: Computer Science (R0)
