Abstract
Feature selection is a basic and critical technology for data mining, especially in the current “big data era”. Rough set theory is sensitive to noise in feature selection due to the stringent condition of an equivalence relation, whereas D–S evidence theory measures the uncertainty of information more flexibly. In this paper, we introduce the robust feature evaluation metrics “belief function” and “plausibility function” into feature selection algorithms, so that classification performance is less affected by noise such as missing values and confusing data. Firstly, the similarity between information values in a set-valued information system (SVIS) is introduced, and a variable parameter \(\theta\) that controls the similarity of samples is given. Secondly, \(\theta\)-lower and \(\theta\)-upper approximations in an SVIS are put forward. Then, the concepts of \(\theta\)-belief function, \(\theta\)-plausibility function, \(\theta\)-belief reduction and \(\theta\)-plausibility reduction are given. Moreover, several feature selection algorithms based on D–S evidence theory in an SVIS are proposed. Experimental results and statistical tests show that the proposed metric is insensitive to noise because it comprehensively considers the evidence at all levels, and that the proposed algorithms are more robust than several state-of-the-art feature selection algorithms.
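The chain of notions named in the abstract (similarity of set values, \(\theta\)-approximations, \(\theta\)-belief and \(\theta\)-plausibility) can be illustrated with a minimal Python sketch. The abstract does not state the definitions, so everything here is an assumption: the similarity is taken to be the Jaccard index, the \(\theta\)-class of an object collects all objects at least \(\theta\)-similar to it on every attribute, and belief and plausibility are read in the usual rough-set way as the relative sizes of the lower and upper approximations. The toy data and all function names are hypothetical and are not the paper's algorithms.

```python
from fractions import Fraction

# Hypothetical toy set-valued information system: every attribute value is a set.
SVIS = {
    "x1": {"lang": {"en"}, "skill": {"py", "c"}},
    "x2": {"lang": {"en", "fr"}, "skill": {"py"}},
    "x3": {"lang": {"de"}, "skill": {"java"}},
    "x4": {"lang": {"en"}, "skill": {"py", "c"}},
}


def jaccard(a, b):
    """Assumed similarity between two set values (the paper's may differ)."""
    union = a | b
    return Fraction(len(a & b), len(union)) if union else Fraction(1)


def theta_class(x, attrs, theta):
    """Objects whose similarity to x is at least theta on every attribute."""
    return {
        y for y in SVIS
        if all(jaccard(SVIS[x][a], SVIS[y][a]) >= theta for a in attrs)
    }


def approximations(target, attrs, theta):
    """Assumed theta-lower and theta-upper approximations of a target set."""
    # Lower: objects whose whole theta-class lies inside the target.
    lower = {x for x in SVIS if theta_class(x, attrs, theta) <= target}
    # Upper: objects whose theta-class overlaps the target.
    upper = {x for x in SVIS if theta_class(x, attrs, theta) & target}
    return lower, upper


def belief_plausibility(target, attrs, theta):
    """Assumed reading: Bel = |lower| / |U| and Pl = |upper| / |U|."""
    lower, upper = approximations(target, attrs, theta)
    n = len(SVIS)
    return Fraction(len(lower), n), Fraction(len(upper), n)


if __name__ == "__main__":
    target = {"x1", "x4"}  # hypothetical decision class
    for theta in (Fraction(1), Fraction(1, 2)):
        bel, pl = belief_plausibility(target, ["lang", "skill"], theta)
        print(f"theta={theta}: Bel={bel}, Pl={pl}")
```

Under these assumptions, \(\theta = 1\) groups only objects with identical set values, so belief and plausibility of the target coincide. Lowering \(\theta\) to 1/2 merges x2 into the same class, which widens the gap between the two measures. This is one way to see how the parameter trades strictness against tolerance of imprecise values.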
Acknowledgements
The authors would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions, which have helped immensely in improving the quality of the paper. This work is supported by Guangxi First-class Discipline Applied Economics Construction Project Fund, and Humanities and Social Sciences Fund of Ministry of Education in China.
About this article
Cite this article
Wang, Y., Wang, S. Feature selection for set-valued data based on D–S evidence theory. Artif Intell Rev 56, 2667–2696 (2023). https://doi.org/10.1007/s10462-022-10241-1