Abstract
Previous studies have shown that information entropy and its variants are useful for reducing data dimensionality. However, most existing entropy-based approaches exploit only the correlations between features and labels and fail to take into account the relevance between features. In this paper, we propose a new index for feature selection, named fuzzy conditional distinction degree (FDD), which is based on the fuzzy similarity relation and combines feature correlations with the relationship between features and labels. Unlike existing entropy-based approaches, FDD considers the cardinality of the relation matrix instead of the similarity classes. Meanwhile, we encode the feature correlations as distances to measure the relevance between any two features. Some useful properties are discussed. Based on FDD, a greedy forward algorithm for feature selection is presented. Experimental results on benchmark data sets demonstrate the feasibility and effectiveness of the proposed approach.
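The abstract does not state the FDD formula, so the following is only a minimal sketch of the general idea of a cardinality-based index driving a greedy forward search. The score used here (the share of fuzzy similarity mass that falls on pairs of samples with different labels) is a hypothetical stand-in, not the paper's FDD, and all function names are illustrative. The feature-to-feature distance term mentioned above is omitted.

```python
from typing import List, Sequence


def similarity_relation(column: Sequence[float]) -> List[List[float]]:
    """Fuzzy similarity matrix of one feature: r_ij = 1 - |x_i - x_j| on min-max scaled values."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # constant feature: avoid division by zero
    scaled = [(v - lo) / span for v in column]
    return [[1.0 - abs(a - b) for b in scaled] for a in scaled]


def label_relation(labels: Sequence[int]) -> List[List[float]]:
    """Crisp equivalence relation induced by the labels."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def intersect(r: List[List[float]], s: List[List[float]]) -> List[List[float]]:
    """Element-wise minimum, the usual intersection of two fuzzy relations."""
    return [[min(a, b) for a, b in zip(rr, rs)] for rr, rs in zip(r, s)]


def cardinality(r: List[List[float]]) -> float:
    """Cardinality of a fuzzy relation matrix: the sum of all its entries."""
    return sum(sum(row) for row in r)


def distinction_score(rel_b: List[List[float]], rel_d: List[List[float]]) -> float:
    """ASSUMED stand-in score (not the paper's FDD): similarity mass on
    differently-labelled pairs, normalised by n^2. Lower is better."""
    n = len(rel_b)
    return (cardinality(rel_b) - cardinality(intersect(rel_b, rel_d))) / (n * n)


def greedy_forward_select(X: Sequence[Sequence[float]], y: Sequence[int],
                          eps: float = 1e-6) -> List[int]:
    """Add, one at a time, the feature that lowers the score most; stop when the gain is <= eps."""
    n, n_features = len(X), len(X[0])
    rel_d = label_relation(y)
    feature_rels = [similarity_relation([row[j] for row in X]) for j in range(n_features)]
    current = [[1.0] * n for _ in range(n)]  # empty subset: every pair fully similar
    best = distinction_score(current, rel_d)
    selected: List[int] = []
    remaining = set(range(n_features))
    while remaining:
        scores = {j: distinction_score(intersect(current, feature_rels[j]), rel_d)
                  for j in remaining}
        j_best = min(scores, key=lambda j: (scores[j], j))  # ties broken by index
        if best - scores[j_best] <= eps:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        current = intersect(current, feature_rels[j_best])
        best = scores[j_best]
    return selected


if __name__ == "__main__":
    X = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
    y = [0, 0, 1, 1]
    print(greedy_forward_select(X, y))
```

In this toy data the first feature separates the two classes and the second is constant, so the search keeps only the first. Working with matrix cardinalities keeps each evaluation to element-wise minima and a sum, with no need to enumerate similarity classes.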
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (Nos. 61473259, 61502335, 61070074, 60703038) and the Hunan Provincial Science and Technology Project Foundation (2018TP1018, 2018RS3065).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Q., Dai, J. (2018). Feature Selection Based on Fuzzy Conditional Distinction Degree. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11304. Springer, Cham. https://doi.org/10.1007/978-3-030-04212-7_7
DOI: https://doi.org/10.1007/978-3-030-04212-7_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04211-0
Online ISBN: 978-3-030-04212-7
eBook Packages: Computer Science, Computer Science (R0)