Using Z-number to measure the reliability of new information fusion method and its application in pattern recognition

doi:10.1016/j.asoc.2021.107658

Applied Soft Computing

Volume 111, November 2021, 107658

https://doi.org/10.1016/j.asoc.2021.107658

Highlights

• A new information fusion method based on Dempster–Shafer evidence theory (DST) and K-means clustering is proposed.
• A reliability evaluation criterion for fusion results is established based on the Z-number.
• Some critical issues in DST, e.g., conflict management and evidence stuck, are overcome.
• Comparison and discussion show that the proposed method has better robustness and sensitivity than existing methods.
• The examples and applications show the potential of the proposed method in data-driven intelligent systems.

Abstract

Information fusion is a long-standing concern. A central problem in the fusion process is how to handle the ambiguity and uncertainty of data effectively. Dempster–Shafer evidence theory (DST) is a powerful tool for dealing with uncertain information, and the Z-number can comprehensively model both the ambiguity and the reliability of information. Inspired by this, this paper proposes a new information fusion method based on Dempster–Shafer theory and K-means clustering, and establishes a reliability evaluation criterion based on the Z-number. Comparison and discussion verify the rationality of the proposed method and show that it has better robustness and sensitivity than existing methods. Some critical issues in DST, e.g., conflict management and evidence stuck, are investigated and overcome by the proposed method. Numerical examples and an application further show the potential of the proposed method in data-driven intelligent systems.

Introduction

Information fusion has always been seen as the basis for biological environmental perception and behavioral actions. Driven by the practical needs of many fields, how to use an effective representation framework to fuse extracted and abstracted data and convert it into results that better support human decision-making has become an active research direction [1], [2], [3], with applications including emergency alternative selection [4], medical diagnosis [5], [6], [7], [8], [9], complex event processing [10], [11], [12], data clustering [13], [14], failure mode and effects analysis [15], [16], [17], [18], reliability assessment [19], [20], objective optimization [21], [22], [23] and decision making [24], [25], [26], [27].

However, many traditional information fusion methods ignore the uncertainty in the original data [28], [29], even though inaccurate and even erroneous data are inevitable in a complex real environment. The ambiguity and randomness in the data are therefore worth considering. Because it accounts for the ambiguity between focal elements, Dempster–Shafer evidence theory (DST) is powerful in handling and modeling uncertain information [30], [31]. It satisfies an axiom system weaker than that of probability theory, and has become an important means of information fusion in many fields [32], [33], [34], e.g., decision making [35], [36], [37], reliability measurement [38], [39] and uncertain reasoning [40], [41], [42], [43], [44]. To address the defects of the classic DST fusion method [45], [46], Murphy [47] uses the average of the evidence instead of the original information to reduce the interference of conflicting information. Deng et al. [48] constructed a similarity matrix to obtain the credibility of different pieces of information, making the fusion result more reasonable and reliable. Jiang [49] proposed a correlation coefficient between pieces of evidence and improved the information fusion method based on evidence distance. Since then, many works have improved the existing fusion methods to varying degrees, such as Song et al. [50], Pan et al. [51] and our previous work [52]. Recently, Ma et al. [53] proposed a new idea: their method balances averaging and focusing well, and retains more of the supporting information of non-focused elements. This view has inspired this article to some extent. We believe that conflict information should not be overly negated. If conflict information were completely meaningless, deleting it would undoubtedly be the best solution: this would not only strengthen the focusing function of the fusion method, but also save extra computational overhead. But in many cases the conflict information is also important, especially the cause of its occurrence, such as a sensor failure or a sudden change in some attribute of the target under test. Retaining conflicting information prompts further investigation. Yager [54] holds that conflict originates from the unknown, and that the conflict value should be transferred to the universal set. However, in an environment of strongly conflicting information, more than a simple transfer is needed: sensitivity to conflict is also required, which offers an interesting direction for our work.
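To make the combination rules discussed above concrete, the following is a minimal Python sketch of Dempster's rule, Yager's transfer of conflict to the universal set, and Murphy's averaging. It is not taken from the paper: the function names, the representation of a BPA as a dictionary from `frozenset` to mass, and the two-sensor data (Zadeh's well-known high-conflict example) are assumptions made for illustration.

```python
from itertools import product

def conjunctive(m1, m2):
    """Unnormalised conjunctive combination; returns (masses, conflict K)."""
    out, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            out[inter] = out.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    return out, conflict

def dempster(m1, m2):
    """Dempster's rule: discard the conflict and renormalise by 1 - K."""
    out, k = conjunctive(m1, m2)
    if k >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: w / (1.0 - k) for a, w in out.items()}

def yager(m1, m2, frame):
    """Yager's rule: move the conflict K to the universal set (the frame)."""
    out, k = conjunctive(m1, m2)
    out[frame] = out.get(frame, 0.0) + k
    return out

def murphy(bpas):
    """Murphy's averaging: average the BPAs, then combine the average
    with itself n - 1 times using Dempster's rule."""
    n = len(bpas)
    focal = set().union(*bpas)
    avg = {a: sum(m.get(a, 0.0) for m in bpas) / n for a in focal}
    fused = avg
    for _ in range(n - 1):
        fused = dempster(fused, avg)
    return fused

# Illustrative high-conflict data (hypothetical sensors, not from the paper).
frame = frozenset({"A", "B", "C"})
m1 = {frozenset({"A"}): 0.99, frozenset({"B"}): 0.01}
m2 = {frozenset({"C"}): 0.99, frozenset({"B"}): 0.01}

print("Dempster:", dempster(m1, m2))
print("Yager:   ", yager(m1, m2, frame))
print("Murphy:  ", murphy([m1, m2]))
```

On this data Dempster's rule assigns essentially all mass to B, the hypothesis both sources consider nearly impossible, while Yager's rule moves the conflict to the whole frame and Murphy's averaging keeps A and C on an equal footing. This is the kind of behavior that motivates the improved methods surveyed above.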

This paper proposes a new information fusion method based on Dempster–Shafer theory and K-means clustering. The evidence collected from the information sources is clustered, and the basic probability assignment (BPA) values of the finally converged cluster-center evidence are combined to obtain the comprehensive result. The proposed method retains more of the uncertain and conflicting information in the original data. Especially in an environment of strongly conflicting information, it can perceive conflict signals in the original information more effectively and manage the conflicts. In addition, the proposed method can obtain more valuable information than the aforementioned methods. For example, in a voting election system, it can not only fuse a large amount of voting information to obtain a decision result, but also reveal the groups holding different opinions, e.g., the number of people in each group, the opinions they support, and the degree of disagreement among the groups.
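The idea of clustering evidence and then combining the cluster centers can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the squared Euclidean distance between mass vectors, the unweighted use of Dempster's rule on the centers, the function names, and the six toy "votes" are all hypothetical choices, since this excerpt does not give the paper's distance measure or combination details.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule; returns (fused BPA, conflict K between the inputs)."""
    fused, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        if a & b:
            fused[a & b] = fused.get(a & b, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in fused.items()}, conflict

def sq_dist(m1, m2, focal):
    """Squared Euclidean distance between mass vectors (an assumption)."""
    return sum((m1.get(s, 0.0) - m2.get(s, 0.0)) ** 2 for s in focal)

def kmeans_bpas(bpas, init_idx, max_iter=100):
    """Plain K-means over BPAs; K is the number of initial centers."""
    focal = list(set().union(*bpas))
    centers = [dict(bpas[i]) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: sq_dist(m, centers[c], focal))
            for m in bpas
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [m for m, lab in zip(bpas, labels) if lab == c]
            if members:  # the mean of BPAs is again a BPA (masses sum to 1)
                centers[c] = {
                    s: sum(m.get(s, 0.0) for m in members) / len(members)
                    for s in focal
                }
    return centers, labels

# Toy evidence bodies (hypothetical): four lean towards "a", two towards "b".
a, b, frame = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b", "c"})
bpas = [
    {a: 0.80, b: 0.10, frame: 0.10},
    {a: 0.70, b: 0.10, frame: 0.20},
    {a: 0.90, frame: 0.10},
    {a: 0.75, b: 0.15, frame: 0.10},
    {a: 0.10, b: 0.80, frame: 0.10},
    {b: 0.90, frame: 0.10},
]

centers, labels = kmeans_bpas(bpas, init_idx=(0, 4))  # K = 2
sizes = [labels.count(c) for c in range(len(centers))]
fused, conflict = dempster(centers[0], centers[1])     # two centers only

print("cluster sizes:", sizes)
print("conflict between centers:", round(conflict, 4))
print("fused BPA:", {tuple(sorted(s)): round(w, 4) for s, w in fused.items()})
```

The cluster sizes and the conflict between the centers correspond to the group information mentioned in the voting example. Note that in this sketch the centers are combined unweighted, so cluster size does not influence the fused masses; whether and how the paper weights the centers is not stated in this excerpt.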

In addition, the reliability of the fusion results is also a problem worthy of concern. Because of the ambiguity and randomness of the data, whether the final fusion result is reliable directly affects further analysis and processing. None of the aforementioned methods discusses the reliability of the obtained fusion results. Based on the Z-number, we give a method to evaluate the reliability of the fusion result obtained by the proposed fusion method. The Z-number, Z = ($A$, $B$), was first proposed by Zadeh [55]: the first component $A$ is the fuzzy measure of the proposition, and the second component $B$ is a measure of the reliability of $A$; the two are connected by a hidden probability distribution. In previous work, we discussed how to use fuzzy measures and probability information effectively to obtain a reliability assessment [56], [57]. In this paper, combined with the proposed fusion method, we first obtain the fuzzy measure and probability evaluation of the fusion result, then determine the reliability component according to the meaning of the Z-number, and finally convert it into a reliability value. This provides meaningful help for evaluating decisions.
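The link between the two components of a Z-number can be illustrated with a small sketch. In Zadeh's formulation, the probability of the fuzzy event $A$ under a hidden distribution $p$ is the expected membership, and $B$ restricts that probability. The code below assumes triangular membership functions and a discrete hidden distribution; the class and function names and all numbers are hypothetical, and this is not the reliability conversion procedure of the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triangular:
    """Triangular fuzzy number with support [a, c] and peak at b."""
    a: float
    b: float
    c: float

    def mu(self, x):
        """Membership degree of x."""
        if x == self.b:
            return 1.0
        if x <= self.a or x >= self.c:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.c - x) / (self.c - self.b)

def probability_of_fuzzy_event(A, dist):
    """P(A) = sum over x of mu_A(x) * p(x), for a discrete distribution."""
    return sum(A.mu(x) * p for x, p in dist.items())

def compatibility(A, B, dist):
    """Degree to which the candidate hidden distribution satisfies
    'the probability that X is A, is B', i.e. mu_B(P(A))."""
    return B.mu(probability_of_fuzzy_event(A, dist))

# Hypothetical Z-number: ("about 50", "likely").
A = Triangular(40.0, 50.0, 60.0)
B = Triangular(0.6, 0.8, 1.0)

# A candidate hidden probability distribution over the values of X.
dist = {45.0: 0.2, 50.0: 0.6, 55.0: 0.2}

p_A = probability_of_fuzzy_event(A, dist)
print("P(A) =", round(p_A, 4))
print("compatibility with B =", round(compatibility(A, B, dist), 4))
```

A distribution concentrated near the peak of $A$ yields a high probability of the fuzzy event, and $B$ then grades how well that probability matches the stated reliability.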

The rest of the paper is structured as follows. Section 2 introduces some preliminaries. Section 3 proposes a new information fusion method based on Dempster–Shafer theory and K-means clustering and describes how to use the Z-number to measure the reliability of fusion results. A numerical example in Section 4 shows the detailed calculation process of the new method. In Section 5, the obtained results are compared and discussed with other methods to verify the rationality and effectiveness of the method; the ability to process conflict information then shows its robustness and sensitivity, and the superiority of the proposed method is summarized at the end of that section. Section 6 presents a more complex pattern recognition application based on the Iris data set, further illustrating the application potential of the proposed method in fields such as pattern recognition. Finally, the paper ends with the conclusion.

Section snippets

Preliminaries

In this section, some preliminaries are briefly introduced.

The proposed information fusion method and the established reliability criterion

In this section, a new information fusion method based on Dempster–Shafer theory and K-means clustering is proposed, together with a method for using the Z-number to evaluate the reliability of fusion results. Fig. 3 shows the concept diagram. First, we give the overall framework of the two algorithms, and then each step of the process is explained in detail.

The detailed steps of the proposed method are described as follows.

Step 1:

Construct a collection of evidence bodies and

Numerical examples

In this section, we demonstrate the calculation process of the proposed method in detail through a simple numerical example. A set of evidence bodies is generated with a random function, and 10 pieces of evidence are selected as representatives; the detailed data are shown in Table 1.

Step 1:

Assume that the collected evidence bodies need to be divided into three categories, i.e., the input value of $K$ is 3. Then randomly select the evidence bodies $m_{1}$, $m_{5}$, $m_{10}$ as the initial

Rationality analysis

To further verify the effectiveness and rationality of the proposed method, we compare its results with those of other current methods. Besides the classic Dempster’s method [45], [46], these include the following. Murphy [47] proposed first averaging the masses of each subset of a given frame of discernment, and then computing the combined mass by combining the average values. Deng et al. [48] proposed an evidence fusion method based on evidence distance. The similarity measure matrix is

An application of pattern recognition based on Iris data set

This section takes the Iris data set as the test data and applies the proposed method to pattern recognition, to further highlight the practical value of the proposed method.

Conclusion

In this paper, a new information fusion method based on Dempster–Shafer theory and K-means clustering is proposed, and a reliability evaluation criterion based on the Z-number is established. Comparison and discussion show that the proposed method has better robustness and sensitivity than existing methods; some critical issues in DST, e.g., conflict management and evidence stuck, are investigated and overcome by the proposed method. An application based on the Iris data set shows the

CRediT authorship contribution statement

Ye Tian: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Xiangjun Mi: Data curation, Writing - original draft. Huizi Cui: Visualization. Pengdan Zhang: Validation. Bingyi Kang: Supervision, Resources, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The work is partially supported by the Fund of the National Natural Science Foundation of China (Grant No. 61903307), China Postdoctoral Science Foundation (Grant No. 2020M683575), Chinese Universities Scientific Fund (Grant No. 2452018066). We also thank the anonymous reviewers for their valuable suggestions and comments.

References (69)

Xiao F. Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inf. Fusion (2019)
Munir K. et al. Neuroscience patient identification using big data and fuzzy logic–an Alzheimer’s disease case study. Expert Syst. Appl. (2019)
Xiao F. et al. Divergence measure of pythagorean fuzzy sets and its application in medical diagnosis. Appl. Soft Comput. (2019)
Radhakrishnan S. et al. Analysis of parameters affecting blood oxygen saturation and modeling of fuzzy logic system for inspired oxygen prediction. Comput. Methods Programs Biomed. (2019)
Xiao F. A novel multi-criteria decision making method for assessing health-care waste treatment technologies based on d numbers. Eng. Appl. Artif. Intell. (2018)
Li Z. et al. A novel evidential FMEA method by integrating fuzzy belief structure and grey relational projection method. Eng. Appl. Artif. Intell. (2019)
Jiang W. et al. Failure mode and effects analysis based on a novel fuzzy evidential method. Appl. Soft Comput. (2017)
Basir O. et al. Engine fault diagnosis based on multi-sensor information fusion using Dempster–Shafer evidence theory. Inf. Fusion (2007)
Wang Z. et al. Failure mode and effects analysis using Dempster-Shafer theory and TOPSIS method: Application to the gas insulated metal enclosed transmission line (GIL). Appl. Soft Comput. (2018)
Hilletofth P. et al. Three novel fuzzy logic concepts applied to reshoring decision-making. Expert Syst. Appl. (2019)
Xiong N. et al. Multi-sensor management for information fusion: issues and approaches. Inf. Fusion (2002)
Masson M.H. et al. ECM: An evidential version of the fuzzy c-means algorithm. Pattern Recognit. (2008)
Liu Z.G. et al. A new belief-based K-nearest neighbor classification method. Pattern Recognit. (2013)
Zhou X. et al. Dependence assessment in human reliability analysis based on d numbers and AHP. Nucl. Eng. Des. (2017)
Deng X. et al. Dependence assessment in human reliability analysis using an evidential network approach extended by belief rules and uncertainty measures. Ann. Nucl. Energy (2018)
Yang J.B. et al. Evidential reasoning rule for evidence combination. Artificial Intelligence (2013)
Yang Y. et al. An evidential reasoning-based decision support system for handling customer complaints in mobile telecommunications. Knowl.-Based Syst. (2018)
Zhou M. et al. Evidential reasoning approach with multiple kinds of attributes and entropy-based weight assignment. Knowl.-Based Syst. (2019)
Murphy C.K. Combining belief functions when evidence conflicts. Decis. Support Syst. (2000)
Yong D. et al. Combining belief functions based on distance of evidence. Decis. Support Syst. (2004)
Jiang W. A correlation coefficient for belief functions. Internat. J. Approx. Reason. (2018)
Ma W. et al. A flexible rule for evidential combination in Dempster–Shafer theory of evidence. Appl. Soft Comput. (2019)
Yager R.R. On the Dempster-Shafer framework and new combination rules. Inform. Sci. (1987)
Zadeh L.A. A note on Z-numbers. Inform. Sci. (2011)
Liu Q. et al. Derive knowledge of Z-number from the perspective of Dempster-Shafer evidence theory. Eng. Appl. Artif. Intell. (2019)
Jousselme A.L. et al. A new distance between two bodies of evidence. Inf. Fusion (2001)
Fan X. et al. Fault diagnosis of machines based on D–S evidence theory. Part 1: D–S evidence theory and its improvement. Pattern Recognit. Lett. (2006)
Senouci M.R. et al. Fusion-based surveillance WSN deployment using Dempster–Shafer theory. J. Netw. Comput. Appl. (2016)
Dong Y. et al. Combination of evidential sensor reports with distance function and belief entropy in fault diagnosis. Int. J. Comput. Commun. Control (2019)
Yuan K. et al. Conflict evidence management in fault diagnosis. Int. J. Mach. Learn. Cybern. (2019)
Chen L. et al. Emergency alternative selection based on an E-IFWA approach. IEEE Access (2019)
Wu D. et al. A new medical diagnosis method based on Z-numbers. Appl. Intell. (2018)
Romero D. et al. Applying fuzzy logic to assess the biogeographical risk of dengue in South America. Parasites & Vectors (2019)
Xiao F. An intelligent complex event processing with numbers under fuzzy environment. Math. Probl. Eng. (2016)
