Attribute reduction and optimal decision rules acquisition for continuous valued information systems
Introduction
Rough set theory [21], [22] describes knowledge through set-theoretic analysis based on an equivalence classification of the universe of discourse. It provides a theoretical foundation for reasoning in data analysis and has extensive applications in artificial intelligence and knowledge acquisition. Attribute reduction and the acquisition of optimal decision rules are two important issues in current research on decision information systems.
For a discrete (symbolic) information system, an attribute value is just a symbol that expresses a feature. For a complete discrete information system, the classical (Pawlak) rough set models [19], [21], [22], [23], [24], [25], [32], [33], [40], [51], [52], dominance-based rough set models [2], [3], and probabilistic rough set models [26], [27], [35], [36], [41], [42], [46], [48], [49], [50] have been proposed to compute reductions and can also be used to derive optimal decision rules effectively; for an incomplete discrete information system, several types of tolerance rough set models [12], [13], [16], [17], the generalized dominance-based rough set models [37], [47], and the generalized probabilistic rough set models [43] have been developed recently. For set-valued discrete information systems, Guan and Wang [4] developed a tolerance rough set model based on maximal tolerance classification, and this model is actually a generalization of the classical rough set model. In practice, many information systems are non-discrete, such as fuzzy information systems, interval valued information systems, and continuous valued (real valued) information systems. They cannot be handled by classical rough set models, so some extended rough set models have been developed to deal with these information systems [1], [6], [14], [29], [30], [31], [34], [38], [39], [44], [45]. In detail, for non-discrete information systems with fuzzy condition attributes and fuzzy decision attributes, Wang et al. in [39] proposed a concept of fuzzy lower and upper approximation by considering the similarity between two objects, and defined knowledge reduction in a fuzzy environment. Based on the proposed concept, they developed a heuristic algorithm to learn fuzzy rules from initial fuzzy data. For non-discrete information systems with crisp condition attributes and fuzzy decision attributes, Yang et al. in [45] defined fuzzy decision rules using different fuzzy lower and upper approximations, and proposed new techniques for attribute reductions of objects. This approach can deduce the optimal fuzzy decision rules. By combining rough set theory with interval valued fuzzy set theory, Gong et al. in [5] developed an interval valued rough fuzzy set model to deal with interval valued fuzzy information systems with crisp condition attributes and fuzzy interval valued decision attributes. For the knowledge discovery problem, they [5] presented an approach to deduce fuzzy decision rules from the initial data, but they did not investigate knowledge reduction or the acquisition of optimal decision rules. Sun et al. in [34] established an interval valued fuzzy rough set model for interval valued fuzzy information systems, and investigated the knowledge reduction problem. Recently, Zhao and Tsang [53] demonstrated how different fuzzy approximate operators in fuzzy rough set models can impact the performance of attribute reduction in information systems with fuzzy condition attributes and symbolic decision attributes. For information systems with heterogeneous data, in which numerical attributes are used to deduce fuzzy relations and symbolic attributes generate crisp relations, Hu et al. in [7] utilized Shannon’s entropy theory to measure the information quality and applied the proposed measure to calculate the uncertainty in fuzzy approximate spaces, and this idea was used for reduction of systems in [8]. In [9], [10], [11], Hu et al. discussed the neighborhood rough set model, which is used to calculate the reduction of information systems with heterogeneous data. However, most of the above-mentioned work focuses on the reduction of information systems, and little attention has been paid to the acquisition of optimal decision rules.
For continuous valued decision information systems, the values that objects take on the same attribute represent not only an ordinal relationship but also the relative distances between objects, and thus very few objects have the same attribute value. In this case, if the classical rough set model [21], [22] is used, the ordinal relationship and closeness of different objects will be neglected. This leads to a loss of information and thus produces a large number of decision rules with weak generality. To the best of our knowledge, many researchers have addressed this problem by using discretization methods to convert the continuous values of attributes into discrete ones [1], [6], [14], [20], [29], [30], [31], [44]. However, they only investigated the reduction problem of information systems, and did not consider the acquisition of optimal decision rules.
The crisp discretization technique for the continuous valued information systems is to select a set of cutting points within the ranges of corresponding attribute values [1], [14], [20], [31]. These cutting points will classify a range of the attribute values into some disjoint intervals, and form a crisp partition of the universe. Such crisp discretization is also called hard discretization [30]. Because this “knife-edge” [30] approach may be too categorical in some situations owing to selection of the cutting points, some new ideas emerged in which some additional “softening” thresholds are introduced. For example, fuzzy discretization approaches were proposed in [29], [38], in which the hard intervals defined by the cutting points were replaced with fuzzy intervals defined by fuzzy numbers with overlapping bounds. Recently, Leung et al. in [18] proposed a rough set approach to discover classification rules for the continuous valued information systems. In [18], the continuous valued information systems were transformed into some interval valued information systems by a statistical method, in which the concept of α-misclassification rates was used to compare different classes with a given threshold value α. By utilizing Boolean reasoning techniques [18], they calculated the α-classification reduction and α-classification core, and thus derived the classification rules accordingly. However, the problem for acquiring the optimal decision rules was not considered in [18].
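The "knife-edge" behaviour of hard discretization described above can be illustrated with a minimal Python sketch. The function name `hard_discretize`, the cutting points, and the sample values are hypothetical and chosen only for illustration; they are not taken from [30] or from any of the cited methods.

```python
from bisect import bisect_right


def hard_discretize(values, cut_points):
    """Map each real value to the index of the disjoint interval it falls into."""
    cuts = sorted(cut_points)
    # bisect_right counts the cut points that are <= v, so the intervals are
    # (-inf, c1), [c1, c2), ..., [ck, +inf).
    return [bisect_right(cuts, v) for v in values]


# Hypothetical attribute values and cutting points.
values = [0.10, 0.29, 0.31, 0.75]
codes = hard_discretize(values, cut_points=[0.3, 0.6])
print(codes)
```

In this sketch the values 0.29 and 0.31 receive different codes although they are closer to each other than 0.10 and 0.29, which share a code. This is the kind of information loss that the softened and similarity-based approaches try to avoid.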
In this paper, instead of utilizing discretization methods, we establish a tolerance rough set model based on similarity of different objects, which is used to compute optimal decision rules and reduce continuous valued decision information systems. This approach can avoid the loss of information, and effectively discover the knowledge hidden in continuous valued decision information systems.
The paper is organized as follows. In Section 2, by using a closeness measure for two objects, we construct a fuzzy similarity matrix, which generates a tolerance relation at a given level. Based on the maximal tolerance classification of the universe, we establish a tolerance rough set model. In Section 3, we define the initial decision rules and optimal decision rules by using the concepts of attribute descriptors and attribute feature descriptions for the maximal tolerance classes. In Section 4, we propose the concepts of approximate discernibility matrix and approximate discernibility function for the maximal tolerance classes, from which all optimal decision rules can be derived. In Section 5, we define the reduction and the core of the system at a given level, and present approaches for computing them using the proposed approximate discernibility function. In Section 6, we investigate the impact of level variations on reductions and on the selection of optimal decision rules, and discuss the relationship between the proposed tolerance rough set model and the classical one. We also compare the maximal tolerance classification approach with the cluster discretization method. Finally, we conclude our work in Section 7.
Section snippets
Continuous valued decision information system
Let (U, C ∪ {d}, F, fd) be a decision information system, where U = {x1, x2, …, xm} is a nonempty finite set called the universe, C = {c1, c2, …, cn} is a conditional attribute set, and d is a decision attribute. Suppose that d is a discrete attribute, which represents a class of objects with specific attribute features. We further assume that there is no order relationship among the attribute values of d. Let the set of attribute values of d be denoted as Vd = {1, 2, …, r}, and let C ∩ {d} = ∅. Assume that fd is a
Conditional attribute descriptors and Bβ-decision rules
For B ⊆ C, let t = ∧_{ci ∈ B} 〈ci, [ai, bi]〉, where [ai, bi] ⊆ [0, 1]. Then t is called a B-conditional attribute descriptor, and 〈ci, [ai, bi]〉 is called an atom of t, denoted as 〈ci, [ai, bi]〉 ∈ t. Let ∥t∥ = {x ∣ x ∈ U, and ∀ci ∈ B, ci(x) ∈ [ai, bi]}, which is called the support set of t. If y ∈ ∥t∥, we say that y supports t. Let d(t) = {d(x) ∣ x ∈ ∥t∥}; we say that t → ∨_{k ∈ d(t)} (d, k) is the decision rule induced by t.
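The support set ∥t∥ and the decision set d(t) of a descriptor can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `support_set` and `induced_decisions`, the dictionary layout, and the sample table are hypothetical, and attribute values are assumed to be already scaled to [0, 1].

```python
def support_set(descriptor, table):
    """Return the objects whose value on every attribute of the descriptor
    lies inside the corresponding closed interval [a_i, b_i]."""
    return {
        name
        for name, row in table.items()
        if all(lo <= row[attr] <= hi for attr, (lo, hi) in descriptor.items())
    }


def induced_decisions(descriptor, table, decision):
    """Return d(t): the decision values of all objects that support t."""
    return {decision[name] for name in support_set(descriptor, table)}


# Hypothetical continuous valued decision table with two condition attributes.
table = {
    "x1": {"c1": 0.20, "c2": 0.50},
    "x2": {"c1": 0.25, "c2": 0.55},
    "x3": {"c1": 0.80, "c2": 0.50},
}
decision = {"x1": 1, "x2": 1, "x3": 2}

# Descriptor t with atoms <c1, [0.1, 0.3]> and <c2, [0.4, 0.6]>.
t = {"c1": (0.1, 0.3), "c2": (0.4, 0.6)}

support = support_set(t, table)
d_t = induced_decisions(t, table, decision)
print(sorted(support), d_t)
```

Here d(t) contains a single decision value, so the induced rule has only one disjunct on its right-hand side.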
For a decision rule t → ∨k∈d(t)(d,k), the smaller the value ∣d(t)∣ is, the higher the definite level of the decision rule will be, where ∣A
Approximate discernibility matrix and computing approach for optimal decision rules
Definition 8 If , then we say that x and y are similar to each other with respect to B under a level β, or approximately indiscernible with respect to B under a level β.
Proposition 4 x and y are approximately indiscernible with respect to B under a level β ⇔ rB(x, y) ⩾ β ⇔ ∀ck ∈ B, ∣ck(x) − ck(y)∣ ⩽ 1 − β.
Proof x and y are approximately indiscernible with respect to B under a level β
Definition 9 For xi, xj ∈ U, 1 ⩽ i, j ⩽ m, denote We call Mβ = (αβ(xi
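The equivalence in Proposition 4 can be explored with a small Python sketch. The definition of rB is not part of this excerpt, so the code assumes rB(x, y) = 1 − max over ck ∈ B of ∣ck(x) − ck(y)∣, which is one closeness measure consistent with the proposition for values in [0, 1]; the paper's own measure may differ. The maximal tolerance classes are enumerated here as maximal sets of pairwise similar objects with a generic Bron-Kerbosch recursion, which is a swapped-in standard technique and not necessarily the authors' procedure. All function names and data are hypothetical.

```python
def similarity(x, y, attrs):
    """Assumed closeness measure: 1 minus the largest attribute-wise distance."""
    return 1.0 - max(abs(x[a] - y[a]) for a in attrs)


def maximal_tolerance_classes(objects, related):
    """Enumerate maximal sets of pairwise related objects (Bron-Kerbosch)."""
    neighbours = {
        x: {y for y in objects if y != x and related(x, y)} for x in objects
    }
    classes = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            classes.append(frozenset(clique))  # clique cannot be extended
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & neighbours[v], excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(objects), set())
    return classes


# Hypothetical objects with two condition attributes scaled to [0, 1].
table = {
    "x1": {"c1": 0.10, "c2": 0.50},
    "x2": {"c1": 0.25, "c2": 0.55},
    "x3": {"c1": 0.40, "c2": 0.60},
    "x4": {"c1": 0.90, "c2": 0.10},
}
B = ["c1", "c2"]
beta = 0.8


def tolerant(x, y):
    # x and y are approximately indiscernible when r_B(x, y) >= beta.
    return similarity(table[x], table[y], B) >= beta


classes = maximal_tolerance_classes(list(table), tolerant)
print(sorted(sorted(c) for c in classes))
```

In this example x2 belongs to two classes, because the tolerance relation is reflexive and symmetric but not transitive: x1 and x3 are each similar to x2 but not to each other. The classes therefore form a cover of the universe and not a partition.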
Reductions of decision information systems
Definition 11 Let B ⊆ C. If B is a minimal conditional attribute set which satisfies the following property: then B is called a general β-reduction of the information system, and the intersection of all general β-reductions is called a general β-core of the information system.
If B is a minimal conditional attribute set which satisfies the following property: then B is called a consistent β-reduction of the information system, and
Impacts of threshold level variations on reductions and optimal decision rules
With threshold level β = 0.5, we will have the cutting matrix and discernibility matrix as follows: And the C0.5-complete cover of the universe U is , where , , , and three initial decision
Conclusions
Instead of using a discretization method, this paper proposes and constructs a tolerance rough set model to compute reductions and optimal decision rules for continuous valued decision information systems. Based on the maximal tolerance classification of the universe under a given level, two kinds of lower and upper approximations and positive fields are defined. By the attribute feature description of the maximal tolerance classes, concepts of reductions of the maximal
Acknowledgements
The authors would like to thank the anonymous reviewers for their constructive comments and suggestions. This research is supported by the National Natural Science Foundation of China (60774100), the Scientific Research and Development Project of Shandong Provincial Education Department, China (J06P01), and Doctoral Foundation of University of Jinan, China (B0616).
References (53)
- et al. Global discretization of continuous attributes as preprocessing for machine learning. International Journal of Approximate Reasoning (1996)
- et al. Set-valued information systems. Information Sciences (2006)
- et al. Rough set theory for the interval-valued fuzzy information systems. Information Sciences (2008)
- et al. Information-preserving hybrid data reduction based on fuzzy-rough techniques. Pattern Recognition Letters (2006)
- et al. Neighborhood classifiers. Expert Systems with Applications (2008)
- et al. Mixed feature selection based on granulation and application. Knowledge-Based Systems (2008)
- et al. Neighborhood rough set based heterogeneous feature subset selection. Information Sciences (2008)
- Rough set approach to incomplete information systems. Information Sciences (1998)
- Rules in incomplete information systems. Information Sciences (1999)
- et al. Maximal consistent block technique for rule acquisition in incomplete information systems. Information Sciences (2003)
- Knowledge acquisition in incomplete information systems: a rough set approach. European Journal of Operational Research
- A rough set approach for the discovery of classification rules in interval-valued information systems. International Journal of Approximate Reasoning
- Approaches to knowledge reduction based on variable precision rough sets model. Information Sciences
- Rudiments of rough sets. Information Sciences
- Rough set: some extensions. Information Sciences
- Rough sets and Boolean reasoning. Information Sciences
- Rough sets decision algorithms and Bayes’ theorem. European Journal of Operational Research
- A comparative assessment of measures of similarity of fuzzy values. Fuzzy Sets and Systems
- Fuzzy discretization of feature space for a rough set classifier. Pattern Recognition Letters
- Analyzing discretizations of continuous attributes given a monotonic discrimination function. Intelligent Data Analysis
- Fuzzy rough set theory for the interval-valued fuzzy information systems. Information Sciences
- The investigation of the Bayesian rough set model. International Journal of Approximate Reasoning
- Degrees of conditional (in)dependence: a framework for approximate Bayesian networks and examples related to the rough set-based feature selection. Information Sciences
- Entropy-based fuzzy rough classification approach for extracting classification rules. Expert Systems with Applications
- Learning fuzzy rules from fuzzy samples based on rough set technique. Information Sciences
- A systematic study on attribute reduction with rough sets based on general binary relations. Information Sciences