Abstract
Quantifying the degree of relation between a feature and the target class is one of the key aspects of machine learning. In this regard, information gain (IG) and χ² are two of the most widely used measures in feature evaluation. In this paper, we discuss a novel approach to unifying these and other existing feature evaluation measures under a common framework. In particular, we introduce a new generalized family of measures to estimate the similarity between features. We show that the proposed set of measures satisfies all the general criteria for quantifying the relationship between features. We demonstrate that IG and χ² are special cases of the generalized measure. We also analyze some of the topological and set-theoretic aspects of the family of functions that satisfy the criteria of our generalized measure. Finally, we produce novel feature evaluation measures using our approach and analyze their performance through numerical experiments. We show that a diverse array of measures can be created under our framework, which can be used in applications such as fusion-based feature selection.
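As a point of reference for the two measures named in the abstract, the sketch below computes IG (taken here as the mutual information between a discrete feature and the class) and the Pearson χ² statistic from a contingency table of counts. It uses the standard textbook definitions only; it is not the generalized measure proposed in the paper, and the function names and table layout (rows are feature values, columns are classes) are assumptions made for illustration.

```python
import math


def joint_and_marginals(table):
    """Turn a contingency table of counts into joint and marginal probabilities.

    Assumed layout: table[i][j] is the number of samples with
    feature value i and class j.
    """
    n = sum(sum(row) for row in table)
    joint = [[count / n for count in row] for row in table]
    p_feature = [sum(row) for row in joint]
    p_class = [sum(col) for col in zip(*joint)]
    return n, joint, p_feature, p_class


def information_gain(table):
    """Mutual information (in bits) between the feature and the class."""
    _, joint, p_feature, p_class = joint_and_marginals(table)
    ig = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # cells with zero probability contribute nothing
                ig += p * math.log2(p / (p_feature[i] * p_class[j]))
    return ig


def chi_squared(table):
    """Pearson chi-squared statistic of the contingency table."""
    n, joint, p_feature, p_class = joint_and_marginals(table)
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            expected = p_feature[i] * p_class[j]  # probability under independence
            if expected > 0:
                total += (p - expected) ** 2 / expected
    return n * total


if __name__ == "__main__":
    independent = [[10, 10], [10, 10]]
    dependent = [[20, 0], [0, 20]]
    for name, table in (("independent", independent), ("dependent", dependent)):
        print(name, information_gain(table), chi_squared(table))
```

Both functions compare the joint distribution of feature and class with the product of its marginals, and both return 0 when the feature and the class are independent. That shared structure is what makes it natural to look for a single family containing the two.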
Cite this article
Kamalov, F. Generalized feature similarity measure. Ann Math Artif Intell 88, 987–1002 (2020). https://doi.org/10.1007/s10472-020-09700-8