Generalized feature similarity measure

Kamalov, Firuz

doi:10.1007/s10472-020-09700-8

Firuz Kamalov (ORCID: orcid.org/0000-0003-3946-0920)


Abstract

Quantifying the degree of relation between a feature and the target class is one of the key aspects of machine learning. In this regard, information gain (IG) and χ² are two of the most widely used measures in feature evaluation. In this paper, we discuss a novel approach to unifying these and other existing feature evaluation measures under a common framework. In particular, we introduce a new generalized family of measures to estimate the similarity between features. We show that the proposed family of measures satisfies all the general criteria for quantifying the relationship between features. We demonstrate that IG and χ² are special cases of the generalized measure. We also analyze some of the topological and set-theoretic aspects of the family of functions that satisfy the criteria of our generalized measure. Finally, we produce novel feature evaluation measures using our approach and analyze their performance through numerical experiments. We show that a diverse array of measures can be created under our framework, which can be used in applications such as fusion-based feature selection.
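As a point of reference for the two measures named in the abstract, the following is a minimal sketch of how IG and χ² are conventionally computed for a discrete feature and a class label. It is not the paper's generalized measure, whose definition is not available in this preview. The function names (`joint_distribution`, `information_gain`, `chi_squared`) and the toy data are assumptions made for illustration. Both measures are written here as a sum over contingency-table cells that compares the observed joint probability with the product of the marginals, which is one way to see why a common framework is plausible.

```python
import math
from collections import Counter


def joint_distribution(feature, target):
    """Empirical joint and marginal probabilities of two discrete sequences."""
    n = len(feature)
    joint = {k: c / n for k, c in Counter(zip(feature, target)).items()}
    p_x = {k: c / n for k, c in Counter(feature).items()}
    p_y = {k: c / n for k, c in Counter(target).items()}
    return joint, p_x, p_y


def information_gain(feature, target):
    """Information gain (mutual information) in bits: sum of p * log2(p / (p_x * p_y))."""
    joint, p_x, p_y = joint_distribution(feature, target)
    # Cells with zero observed probability contribute nothing and are skipped.
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items()
    )


def chi_squared(feature, target):
    """Pearson chi-squared statistic: N * sum of (p - p_x * p_y)^2 / (p_x * p_y)."""
    n = len(feature)
    joint, p_x, p_y = joint_distribution(feature, target)
    total = 0.0
    for x in p_x:
        for y in p_y:
            expected = p_x[x] * p_y[y]          # probability under independence
            observed = joint.get((x, y), 0.0)   # empirical joint probability
            total += (observed - expected) ** 2 / expected
    return n * total


# Toy data (hypothetical): one feature that determines the class, one that is independent of it.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

print(information_gain(informative, labels), chi_squared(informative, labels))
print(information_gain(uninformative, labels), chi_squared(uninformative, labels))
```

Both functions score the informative feature above the independent one, which scores zero under either measure. They differ only in the per-cell function applied to the observed and expected probabilities.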



Author information

Authors and Affiliations

Department of Electrical Engineering, Canadian University Dubai, Dubai, UAE
Firuz Kamalov

Corresponding author

Correspondence to Firuz Kamalov.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kamalov, F. Generalized feature similarity measure. Ann Math Artif Intell 88, 987–1002 (2020). https://doi.org/10.1007/s10472-020-09700-8

Published: 20 May 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s10472-020-09700-8
