Abstract
Uncertainty measurement (UM) provides a new perspective on feature selection in an information system (IS). This paper measures the uncertainty of a set-valued information system (SVIS) from the perspective that the similarity between information values is fed back to the feature set, and considers the application of these measurements to feature selection. First, fuzzy symmetric relations on the object set of an SVIS are established from the similarity between information values. Second, \(\theta\)-information granules based on these fuzzy symmetric relations are obtained. Third, four UMs for an SVIS are proposed: \(\theta\)-information granulation (\(G^\theta\)), \(\theta\)-information entropy (\(H^\theta\)), \(\theta\)-rough entropy (\(E_\mathrm{r}^\theta\)) and \(\theta\)-information amount (\(E^\theta\)). Numerical experiments and statistical tests are then carried out to evaluate the performance of the proposed measurements. Finally, the measurements are applied to feature selection for an SVIS: the corresponding algorithms based on \(G^\theta\) and \(H^\theta\) are presented, and clustering analysis is conducted on the reduced SVIS. The experimental results show that the proposed algorithms are effective according to three evaluation indicators of clustering performance.
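The pipeline outlined in the abstract (similarity, fuzzy relation, \(\theta\)-granules, four measures) can be illustrated with a minimal sketch. The abstract does not state the definitions, so everything below is an assumption borrowed from forms common in the granular computing literature rather than the authors' exact formulas: Jaccard similarity between set values, the minimum over features to combine them, a \(\theta\)-granule taken as the objects whose similarity to \(x\) is at least \(\theta\), and the four measures written as averages over granule sizes. All function names and the toy table are hypothetical.

```python
from math import log2

def jaccard(u, v):
    # ASSUMPTION: similarity of two set values is |u & v| / |u | v|.
    union = u | v
    return len(u & v) / len(union) if union else 1.0

def fuzzy_relation(table, features):
    # ASSUMPTION: R(x, y) is the minimum per-feature similarity,
    # which makes R reflexive and symmetric.
    n = len(table)
    return [[min(jaccard(table[i][a], table[j][a]) for a in features)
             for j in range(n)] for i in range(n)]

def theta_granules(rel, theta):
    # ASSUMPTION: the theta-granule of x collects every y with R(x, y) >= theta.
    n = len(rel)
    return [{j for j in range(n) if rel[i][j] >= theta} for i in range(n)]

def measures(granules):
    # ASSUMPTION: classical granule-size forms of the four measures.
    n = len(granules)
    sizes = [len(g) for g in granules]
    return {
        "G": sum(s / n for s in sizes) / n,          # information granulation
        "H": -sum(log2(s / n) for s in sizes) / n,   # information entropy
        "E_r": sum(log2(s) for s in sizes) / n,      # rough entropy
        "E": sum(1 - s / n for s in sizes) / n,      # information amount
    }

# Hypothetical toy SVIS: three objects, two set-valued features.
table = [
    {"lang": {"en"}, "skill": {"a", "b"}},
    {"lang": {"en", "fr"}, "skill": {"a", "b"}},
    {"lang": {"de"}, "skill": {"c"}},
]
rel = fuzzy_relation(table, ["lang", "skill"])
granules = theta_granules(rel, theta=0.5)
m = measures(granules)
print(granules)
print(m)
```

Under these assumed definitions the measures come in complementary pairs: \(G^\theta + E^\theta = 1\) and \(H^\theta + E_\mathrm{r}^\theta = \log_2 n\), where \(n\) is the number of objects. Coarser granules raise \(G^\theta\) and \(E_\mathrm{r}^\theta\) while lowering \(H^\theta\) and \(E^\theta\).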
Acknowledgements
The authors would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions, which have helped immensely in improving the quality of the paper. This study is supported by a grant from the 2021 High-Level Talent Project of Yulin Normal University (G2021ZK05).
Cite this article
Peng, Y., Zhang, Q. Uncertainty Measurement for Set-Valued Data and Its Application in Feature Selection. Int. J. Fuzzy Syst. 24, 1735–1756 (2022). https://doi.org/10.1007/s40815-021-01230-7
DOI: https://doi.org/10.1007/s40815-021-01230-7