Approaches to knowledge reduction of covering decision systems based on information theory
Introduction
It is estimated that the amount of information in the world doubles roughly every 20 months and that 30% of it is redundant [1]. To improve the speed, storage requirements and accuracy of applications, information processing techniques must be developed to cope with this growth. One of the key issues in information processing is knowledge reduction, and different methods and tools have been proposed to perform it effectively and efficiently. Among these paradigms, Pawlak’s rough set theory [16], a new mathematical approach to inexact and uncertain knowledge, has made a significant contribution to the field [9], [10], [12], [15], [17], [18], [19], [20], [27]. The aim of attribute reduction is to find a reduct, that is, a minimal attribute subset of the original dataset that is still the most informative, so that all other attributes can be deleted from the database with minimal information loss. Over the past 20 years, a number of attribute reduction algorithms based on rough set theory have been proposed. The discernibility matrix [22], consistency of data [13], dependency of attributes [24] and mutual information [25] have been employed to find reducts of an information system.
Equivalence relations play an important role in traditional rough sets. The above algorithms are applicable only to databases whose attributes induce equivalence relations, which makes traditional rough set theory too restrictive for many applications. To address this issue, some interesting extensions of equivalence relations have been proposed, such as similarity relations [23], [26], [28], [29], tolerance relations [4], [21] and others [3], [5], [7], [11], [14], [30]. Since Zakowski [30] used coverings of a universe to establish covering generalized rough set theory, much additional literature on covering rough sets has been published [5], [32], [33], [34]. Bonikowski et al. [3] studied the structures of coverings, and Mordeson [14] examined the relationship between the approximations of sets defined with respect to coverings and some axioms satisfied by traditional rough sets. Chen et al. [5] discussed covering rough sets within the framework of a complete completely distributive lattice. Zhu and Wang [33] investigated some basic properties of covering generalized rough sets and proved that the reduct of a covering is the minimal covering that generates the same covering lower and upper approximations. However, most attention has been paid to set approximation by coverings, while little work has been done on attribute reduction in covering rough sets. Recently, Chen et al. [6] proposed a new method to reduce redundant coverings in covering decision systems by defining the intersection of coverings, and used a discernibility matrix to compute all the reducts. Their study established a theoretical foundation for attribute reduction of covering rough sets, and our research builds on their results. In this paper, we study attribute reduction of covering information systems from the viewpoint of information theory.
First, we introduce the entropy, conditional entropy, limitary entropy and limitary conditional entropy of coverings, and define attribute reduction of covering decision systems by means of conditional entropy and limitary conditional entropy. The equivalence between the attribute reduction in [6] and the proposed attribute reduction is analyzed. Then we define the significance of coverings in terms of conditional entropy and limitary conditional entropy. Finally, some algorithms are given to calculate reducts of covering decision systems. In addition, in traditional rough set theory, Shannon’s entropy [8] and mutual information have been employed to find reducts of decision systems, and a heuristic algorithm, the MIBARK algorithm, has been proposed. However, Wang et al. [25] proved that the MIBARK algorithm cannot ensure that the reduct is a minimal attribute subset keeping the decision rules invariant in inconsistent decision systems. In this paper, we solve this problem for inconsistent covering decision systems.
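As a point of reference for the entropy-based reduction described above, the following is a minimal sketch of the classical, partition-based Shannon conditional entropy H(D | B) used in traditional rough set theory. It is not the covering entropy or limitary entropy defined in this paper. The function names, the list-of-dicts table layout and the toy data are assumptions made for illustration, and the significance measure H(D | B) - H(D | B ∪ {a}) is one common convention that is assumed here.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, cond_attrs, dec_attr):
    """Shannon conditional entropy H(dec_attr | cond_attrs) of a decision table."""
    n = len(rows)
    # Group the decision values by the tuple of condition-attribute values,
    # i.e. by the equivalence class each object falls into.
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        blocks[key][row[dec_attr]] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (size / n) * (c / size) * log2(c / size)
    return h


def significance(rows, attr, cond_attrs, dec_attr):
    """Drop in H(D | B) when attr is added to B (assumed convention)."""
    base = conditional_entropy(rows, cond_attrs, dec_attr)
    extended = conditional_entropy(rows, list(cond_attrs) + [attr], dec_attr)
    return base - extended


# Hypothetical toy decision table with condition attributes a, b and decision d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
]

h_a = conditional_entropy(rows, ["a"], "d")
h_ab = conditional_entropy(rows, ["a", "b"], "d")
sig_b = significance(rows, "b", ["a"], "d")
```

In this toy table neither attribute alone determines d, while the pair does, so under this measure both attributes carry positive significance.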
This paper is organized as follows. In Section 2, we review set approximations and attribute reduction in traditional rough sets and covering rough sets. Building on these, in Section 3 we introduce the entropy and conditional entropy of a covering and define attribute reduction of covering information systems. In Section 4, we study attribute reduction of consistent covering decision systems by information entropy. In Section 5, we propose limitary entropy and limitary conditional entropy for reducing an inconsistent covering decision system. Finally, experimental results, a comparison with results available in the literature, and a discussion are given.
Preliminaries
First, we review basic concepts related to traditional rough sets, which can be found in [16], [22], [31].
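An information system is a pair $(U, A)$, where $U$ is a nonempty finite set of objects and $A$ a nonempty finite set of attributes. With every subset of attributes $B \subseteq A$ a binary relation $IND(B)$, called the B-indiscernibility relation, is defined by $$IND(B) = \{(x, y) \in U \times U : a(x) = a(y) \ \text{for all } a \in B\};$$ then $IND(B)$ is an equivalence relation and $IND(B) = \bigcap_{a \in B} IND(\{a\})$. By we denote the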
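The indiscernibility relation can be illustrated with a small sketch, assuming the standard definition in which two objects are B-indiscernible when they agree on every attribute in B. The table layout (a dict of dicts), the function name and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(table, attrs):
    """Partition the objects of table into classes that agree on all of attrs."""
    classes = defaultdict(set)
    for obj, values in table.items():
        # Objects with the same value tuple on attrs are indiscernible.
        classes[tuple(values[a] for a in attrs)].add(obj)
    return sorted(classes.values(), key=sorted)


# Hypothetical toy information system with four objects and two attributes.
table = {
    "x1": {"color": "red", "size": "S"},
    "x2": {"color": "red", "size": "L"},
    "x3": {"color": "blue", "size": "S"},
    "x4": {"color": "red", "size": "S"},
}

by_color = indiscernibility_classes(table, ["color"])
by_size = indiscernibility_classes(table, ["size"])
by_both = indiscernibility_classes(table, ["color", "size"])

# The classes for a set of attributes are the nonempty intersections of the
# classes for the single attributes.
intersections = sorted(
    (c & s for c in by_color for s in by_size if c & s), key=sorted
)
```

Here `by_both` and `intersections` coincide, which mirrors the fact that the relation for a set of attributes is the intersection of the relations for its single attributes.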
The information entropy and attribute reduction of covering information systems
In this section, we first introduce some basic concepts of attribute reduction of covering information systems.
Let and be two families of coverings of U. It is easy to see that if and only if for each . if and only if there exists such that . Definition 3.1 Let be a covering information system and be a family of coverings. For , if , then is called dispensable in , otherwise is
Attribute reduction of consistent covering decision systems
Covering decision systems can be divided into consistent and inconsistent covering decision systems.
Let be a covering decision system, a family of coverings of U, D a decision attribute and a decision partition on U. If for every , there is a such that , then decision system is called a consistent covering decision system, denoted by . Otherwise, is called an inconsistent covering decision system [6]. Definition 4.1 Let
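The symbols of the consistency condition were lost in the text above, so the following sketch rests on stated assumptions rather than on the paper's exact formulas. It assumes that the neighborhood of an object x under a family of coverings is the intersection of every block, in every covering, that contains x, and that the system is consistent when each such neighborhood lies inside a single decision class. The function names and the toy data are hypothetical.

```python
def neighborhood(x, coverings):
    """Intersect every block, from every covering, that contains x (assumed definition)."""
    result = None
    for covering in coverings:
        for block in covering:
            if x in block:
                result = set(block) if result is None else result & set(block)
    return result


def is_consistent(universe, coverings, decision_partition):
    """True if every neighborhood fits inside one decision class (assumed condition)."""
    return all(
        any(neighborhood(x, coverings) <= set(d) for d in decision_partition)
        for x in universe
    )


# Hypothetical toy data: two coverings of U and a decision partition.
U = {1, 2, 3, 4}
C1 = [{1, 2}, {2, 3, 4}]
C2 = [{1, 2, 3}, {3, 4}]
D = [{1, 2}, {3, 4}]

consistent_with_both = is_consistent(U, [C1, C2], D)
consistent_with_c1 = is_consistent(U, [C1], D)
```

With both coverings the toy system is consistent, while with C1 alone the neighborhood of object 3 spans two decision classes, so C2 could not be dropped without losing consistency.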
Attribute reduction of inconsistent covering decision systems
In consistent covering decision systems, we can equivalently use the conditional entropy to describe the attribute reduction. However, this equivalence no longer holds in inconsistent covering decision systems. Example 5.1 Now let us consider a house evaluation problem. Suppose be a set of ten houses, and be a set of attributes.
Experimental analysis: a test application
To evaluate the utility of the proposed attribute reduction approach, a series of experiments was conducted to test the proposed algorithms on UCI data [2]. The behavior of the proposed algorithms is compared with that of traditional rough sets on four standard datasets, whose features are summarized in Table 1. Each selected dataset is first split into two parts: a training set composed of a randomly chosen 50% of the patterns, and a testing set made up of the remaining 50%.
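The split described above can be sketched as follows. This is only an illustration: the function name, the fixed seed and the handling of an odd number of patterns (the extra pattern goes to the testing set) are assumptions, since the text does not specify them.

```python
import random


def split_half(patterns, seed=0):
    """Randomly split patterns into a training half and a testing half."""
    rng = random.Random(seed)  # seed is an assumption, used for repeatability
    shuffled = list(patterns)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 2
    return shuffled[:cut], shuffled[cut:]


train, test = split_half(range(10))
```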
We can
Conclusions
This paper discusses attribute reduction of covering decision systems. First, by defining the information entropy and limitary entropy of coverings, we introduce a new method to reduce redundant coverings in covering decision systems. The equivalence between the attribute reduction in [6] and the present attribute reduction is analyzed. We also develop several algorithms that compute all the reducts of covering decision systems by means of the significance of coverings. It should be
Acknowledgements
The authors are thankful to Professor Chen Degang, the referees and Professor Witold Pedrycz, Editor-in-Chief, for their valuable comments and suggestions.
References (34)
- et al., Extensions and intentions in the rough set theory, Inform. Sci. (1998)
- et al., Rough set analysis of a general type of fuzzy data using transitive aggregations of fuzzy similarity relations, Fuzzy Sets Syst. (2003)
- Rough set approach to incomplete information systems, Inform. Sci. (1998)
- Rule in incomplete information systems, Inform. Sci. (1999)
- et al., Approaches to knowledge reduction based on variable precision rough set model, Inform. Sci. (2004)
- Rough set theory applied to fuzzy ideal theory, Fuzzy Sets Syst. (2001)
- et al., Decision table reduction based on conditional information entropy, Chinese J. Comput. (2002)
- et al., Knowledge reduction based on the equivalence relations defined on attribute set and its power set, Inform. Sci. (2007)
- Constructive and algebraic method of theory of rough sets, Inform. Sci. (1998)
- Relational interpretations of neighborhood operators and rough set approximation operators, Inform. Sci. (1998)