Abstract
In the era of big data, multimedia, hypermedia, and social networks are emerging, and the amount of information is growing rapidly. Massive data processing routinely involves data with different structures; that is, the data are heterogeneous. How to acquire hidden and valuable knowledge from heterogeneous data and measure its uncertainty is an important problem in artificial intelligence. This paper investigates uncertainty measurement for heterogeneous data and gives an application to attribute reduction. First, the concept of a heterogeneous information system (HIS) is proposed. Then, an equivalence relation on the object set is constructed. Next, uncertainty measurement for a HIS is investigated, a numerical experiment is given, and dispersion analysis, correlation analysis, the Friedman test, and the Bonferroni–Dunn test are conducted. Finally, as an application of the proposed measures, attribute reduction in a HIS is studied, and the corresponding algorithms are proposed and analyzed.
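To make the pipeline in the abstract concrete, the following is a minimal sketch and not the construction used in the paper. It assumes the classical indiscernibility relation (two objects are equivalent when they agree on every selected attribute) and uses the Shannon entropy of the induced partition as a stand-in uncertainty measure. The toy table, the attribute names, and the helper names `partition` and `partition_entropy` are all hypothetical.

```python
from collections import defaultdict
from math import log2


def partition(objects, attributes):
    """Group object ids that agree on every attribute in `attributes`.

    Assumption: plain equality of values defines the equivalence relation;
    the paper's own relation for heterogeneous data may differ.
    """
    classes = defaultdict(list)
    for obj_id, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes[key].append(obj_id)
    return list(classes.values())


def partition_entropy(classes):
    """Shannon entropy (in bits) of the class-size distribution."""
    n = sum(len(c) for c in classes)
    return -sum((len(c) / n) * log2(len(c) / n) for c in classes)


# Hypothetical toy table mixing categorical, numeric, and set-valued data.
table = {
    "x1": {"colour": "red", "size": 1.0, "tags": frozenset({"a", "b"})},
    "x2": {"colour": "red", "size": 1.0, "tags": frozenset({"a", "b"})},
    "x3": {"colour": "blue", "size": 2.5, "tags": frozenset({"a"})},
    "x4": {"colour": "blue", "size": 2.5, "tags": frozenset({"c"})},
}

full = partition(table, ["colour", "size", "tags"])
coarse = partition(table, ["colour", "size"])
without_size = partition(table, ["colour", "tags"])

print(partition_entropy(full))    # entropy using all attributes
print(partition_entropy(coarse))  # entropy after dropping "tags"
```

In this toy table, dropping `size` leaves the partition unchanged while dropping `tags` merges two classes, which is the kind of comparison an attribute reduction procedure relies on when deciding whether an attribute is redundant.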
Acknowledgements
The authors would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions, which have helped immensely in improving the quality of the paper. This work is supported by the National Natural Science Foundation of China (11971420), the Special Scientific Research Project of Young Innovative Talents in Guangxi (2019AC20052), the Natural Science Foundation of Guangxi (2019JJA110036, AD19245102, 2018GXNSFDA294003, 2018GXNSFDA294134), the Guangxi Science and Technology Program (2017AD23056), the Key Laboratory of Software Engineering in Guangxi University for Nationalities (2020-18XJSY-03), Guangxi Higher Education Institutions of China (Document No. [2019] 52), the Guangxi Higher Education Reform Project (2020XJJGZD17), the Research Project of Institute of Big Data in Yulin (YJKY03), and the Engineering Project of Undergraduate Teaching Reform of Higher Education in Guangxi (2017JGA179).
Cite this article
Song, Y., Zhang, G., He, J. et al. Uncertainty measurement for heterogeneous data: an application in attribute reduction. Artif Intell Rev 55, 991–1027 (2022). https://doi.org/10.1007/s10462-021-09978-y