Abstract
Recent developments in big data applications have heightened the need to understand and process high-dimensional data. In such data, it is necessary to extract the features that most affect learning performance. Feature selection algorithms based on rough set theory are an important preprocessing method and have been widely used in practical applications. Different attributes, however, have different effects on model evaluation. Existing rough set models for interval-valued information systems nevertheless assign every feature or attribute the same degree of importance, ignoring the imbalance between features. Moreover, the monotonic classification effect of interval-valued data is easily affected by noise. To address these two issues, we introduce different weights into neighborhood relations and propose a novel feature selection approach based on weighted neighborhood rough sets for interval-valued information systems. First, weighted neighborhood relations are defined by considering different attribute weights in the interval-valued information system, and some of their important properties are given. Then, we construct an interval-valued weighted neighborhood rough set (IVWNRS) model to resolve the contradiction between the degree of dependency and the classification ability of an attribute subset. Furthermore, a heuristic algorithm driven by the degree of dependency is designed to select an attribute subset that has both strong correlation and high dependency. Finally, we evaluate the proposed algorithm by comparing it with six other representative feature selection algorithms on fifteen public datasets. Experimental results on different classifiers show that the IVWNRS algorithm achieves higher classification performance and is significantly more effective than the compared algorithms.
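The abstract does not state the formal definitions, so the following is only a rough sketch of the general idea: a weighted neighborhood relation on interval-valued data, a dependency degree computed from the positive region, and a greedy forward search that stops when the dependency no longer increases. The distance formula, the threshold `delta`, the function names, and the externally supplied `weights` are all assumptions made for illustration, not the paper's IVWNRS definitions.

```python
import numpy as np


def weighted_interval_distance(lo, hi, weights):
    """Pairwise weighted distance between interval-valued samples.

    lo, hi  : (n_samples, n_features) arrays of lower/upper interval endpoints.
    weights : (n_features,) array of attribute weights (assumed given).
    The per-attribute distance (mean endpoint gap) is an assumption.
    """
    d_lo = np.abs(lo[:, None, :] - lo[None, :, :])
    d_hi = np.abs(hi[:, None, :] - hi[None, :, :])
    per_feature = 0.5 * (d_lo + d_hi)
    return np.sqrt(((weights * per_feature) ** 2).sum(axis=2))


def dependency(lo, hi, y, subset, weights, delta):
    """Fraction of samples whose delta-neighborhood is label-pure."""
    if not subset:
        return 0.0
    idx = list(subset)
    dist = weighted_interval_distance(lo[:, idx], hi[:, idx], weights[idx])
    neighbors = dist <= delta                 # weighted neighborhood relation
    same_label = y[:, None] == y[None, :]
    # A sample is in the positive region if all its neighbors share its label.
    in_positive_region = np.all(~neighbors | same_label, axis=1)
    return float(in_positive_region.mean())


def greedy_select(lo, hi, y, weights, delta=0.1, eps=1e-9):
    """Forward selection: add the attribute with the largest dependency gain."""
    remaining = list(range(lo.shape[1]))
    selected, best = [], 0.0
    while remaining:
        scored = [
            (dependency(lo, hi, y, selected + [f], weights, delta), f)
            for f in remaining
        ]
        # Highest dependency wins; ties go to the lowest attribute index.
        score, feat = max(scored, key=lambda t: (t[0], -t[1]))
        if score - best <= eps:               # no further improvement
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best
```

In this sketch the weights only rescale each attribute's contribution to the distance; how the weights are obtained, and how the correlation criterion mentioned in the abstract enters the search, is not specified here and is left out.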
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61976245, 61772002).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, X., Jiang, Z. & Xu, W. Feature selection using a weighted method in interval-valued decision information systems. Appl Intell 53, 9858–9877 (2023). https://doi.org/10.1007/s10489-022-03987-2
DOI: https://doi.org/10.1007/s10489-022-03987-2