Abstract
As a reliable and valid tool for analyzing uncertain information, fuzzy rough set theory has attracted widespread attention in feature selection. However, the performance of fuzzy rough set models is generally affected by several factors, for instance large differences in data distribution, unreasonable settings for fuzzy information granules, and feature evaluation functions that take only a single perspective. To address these problems, this paper proposes a fitting fuzzy rough set model with relative dependency complement mutual information. First, a relative distance is introduced to eliminate the influence of data distribution on the fuzzy similarity relation. Then, by analyzing the similarity distributions of samples with regard to decisions, a fitting fuzzy neighborhood radius is proposed to improve the fuzzy information granules, and a fitting fuzzy rough set model is built on the relative distance and the fitting fuzzy neighborhood radius. Moreover, considering the complementary characteristics of fuzzy information granularity and fuzzy information entropy, relative complement information entropy and its related definitions are given in the fitting fuzzy rough set model, and a multiview uncertainty measure based on relative dependency complement mutual information is constructed to analyze the uncertainty of information comprehensively. Finally, a heuristic feature selection algorithm is designed. A series of experiments demonstrates the superiority of the proposed method.
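The abstract does not give the definitions of the relative distance, the fitting fuzzy neighborhood radius, or the relative dependency complement mutual information, so the sketch below does not implement the proposed method. It only illustrates the general shape of a heuristic (greedy forward) feature selection driven by a fuzzy rough dependency, under stated assumptions: the range-normalized similarity, the classical lower-approximation dependency, and the names `feature_similarity`, `dependency`, and `greedy_select` are all stand-ins chosen for illustration.

```python
from typing import List, Sequence


def feature_similarity(column: Sequence[float]) -> List[List[float]]:
    """Fuzzy similarity 1 - |x - y| / range on one feature.

    Assumption: this range-normalized relation is a generic stand-in,
    not the relative-distance relation proposed in the paper.
    """
    span = max(column) - min(column)
    if span == 0:
        # A constant feature cannot distinguish any pair of samples.
        return [[1.0] * len(column) for _ in column]
    return [[1.0 - abs(x - y) / span for y in column] for x in column]


def dependency(sims: List[List[List[float]]], subset: Sequence[int],
               labels: Sequence[int]) -> float:
    """Mean lower-approximation membership of each sample to its own class."""
    if not subset:
        return 0.0
    n = len(labels)
    total = 0.0
    for i in range(n):
        lower = 1.0
        for j in range(n):
            if labels[j] != labels[i]:
                # Similarity on a feature subset is the minimum over its features.
                r = min(sims[a][i][j] for a in subset)
                lower = min(lower, 1.0 - r)
        total += lower
    return total / n


def greedy_select(data: Sequence[Sequence[float]], labels: Sequence[int],
                  eps: float = 1e-6) -> List[int]:
    """Add the feature with the largest dependency until the gain is at most eps."""
    n_features = len(data[0])
    sims = [feature_similarity([row[a] for row in data]) for a in range(n_features)]
    selected: List[int] = []
    remaining = list(range(n_features))
    best = 0.0
    while remaining:
        scores = [(dependency(sims, selected + [a], labels), a) for a in remaining]
        score, feat = max(scores, key=lambda t: t[0])
        if score - best <= eps:
            break  # no remaining feature improves the measure
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected


if __name__ == "__main__":
    # Toy data: feature 0 tracks the label, feature 1 is weakly useful, feature 2 is constant.
    toy = [[0.0, 0.3, 1.0], [0.1, 0.9, 1.0], [0.9, 0.2, 1.0], [1.0, 0.8, 1.0]]
    print(greedy_select(toy, [0, 0, 1, 1]))
```

On this toy data the constant third feature is never selected, because adding it cannot change the minimum similarity and therefore yields no gain. Swapping in a different evaluation function only requires replacing `dependency`; the greedy loop stays the same.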








Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants 61976082, 62076089, and 62002103.
Author information
Contributions
Conceptualization: Jiucheng Xu; Methodology: Xiangru Meng; Writing - original draft preparation: Xiangru Meng, Kanglin Qu, Yuanhao Sun; Writing - review and editing: Yuanhao Sun, Qinchen Hou; Funding acquisition: Jiucheng Xu.
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Xu, J., Meng, X., Qu, K. et al. Feature selection using relative dependency complement mutual information in fitting fuzzy rough set model. Appl Intell 53, 18239–18262 (2023). https://doi.org/10.1007/s10489-022-04445-9
DOI: https://doi.org/10.1007/s10489-022-04445-9