
Information Sciences

Volume 179, Issue 17, 5 August 2009, Pages 2974-2984

Attribute reduction and optimal decision rules acquisition for continuous valued information systems

https://doi.org/10.1016/j.ins.2009.04.017

Abstract

For continuous valued information systems, the attribute values of objects for the same attribute represent not only their ordinal relationship but also their relative distances. Therefore, the classical rough set model is not suitable for deducing attribute reductions and optimal decision rules for continuous valued information systems. Although some discretization methods have been proposed to transform continuous valued information systems into discrete ones, those methods are too categorical and may lead to loss of information in some cases. To solve this information loss problem, we propose a tolerance rough set model in this paper. Under a given level, the proposed model divides the universe into maximal tolerance classes, and two types of lower and upper approximations are defined accordingly. Then the reductions of the maximal tolerance classes and optimal decision rules based on the proposed attribute descriptors are defined, and the approximate discernibility function for the maximal tolerance class is constructed and used to compute all the corresponding optimal decision rules via Boolean reasoning techniques. Finally, the general reductions and consistent reductions for continuous valued information systems are discussed.

Introduction

Rough set theory [21], [22] describes knowledge via set-theoretic analysis based on equivalence classification of the universe of discourse. It provides a theoretical foundation for reasoning about data and has extensive applications in artificial intelligence and knowledge acquisition. Attribute reduction and optimal decision rules acquisition are two important issues in current research on decision information systems.

For a discrete (symbolic) information system, an attribute value is just a symbol that expresses a feature. For a complete discrete information system, the classical (Pawlak) rough set models [19], [21], [22], [23], [24], [25], [32], [33], [40], [51], [52], dominance-based rough set models [2], [3], and probabilistic rough set models [26], [27], [35], [36], [41], [42], [46], [48], [49], [50] have been proposed to compute reductions and can also be used to derive optimal decision rules effectively; for an incomplete discrete information system, several types of tolerance rough set models [12], [13], [16], [17], generalized dominance-based rough set models [37], [47], and generalized probabilistic rough set models [43] have been developed recently. For set-valued discrete information systems, Guan and Wang [4] developed a tolerance rough set model based on maximal tolerance classification, which is actually a generalization of the classical rough set model. In practice, there are many non-discrete information systems, such as fuzzy information systems, interval valued information systems, and continuous valued (real valued) information systems. They cannot be handled by classical rough set models, so extended rough set models have been developed to deal with them [1], [6], [14], [29], [30], [31], [34], [38], [39], [44], [45]. In detail, for non-discrete information systems with fuzzy condition attributes and fuzzy decision attributes, Wang et al. in [39] proposed a concept of fuzzy lower and upper approximations by considering the similarity between two objects, and defined knowledge reduction in a fuzzy environment. Based on this concept, they developed a heuristic algorithm to learn fuzzy rules from initial fuzzy data. For non-discrete information systems with crisp condition attributes and fuzzy decision attributes, Yang et al. in [45] defined fuzzy decision rules using different fuzzy lower and upper approximations, and proposed new techniques for attribute reduction; this approach can deduce the optimal fuzzy decision rules. By combining rough set theory with interval valued fuzzy set theory, Gong et al. in [5] developed an interval valued rough fuzzy set model to deal with interval valued fuzzy information systems with crisp condition attributes and fuzzy interval valued decision attributes. For the knowledge discovery problem, they [5] presented an approach to deduce fuzzy decision rules from the initial data, but they did not investigate knowledge reduction and optimal decision rules acquisition. Sun et al. in [34] established an interval valued fuzzy rough set model for interval valued fuzzy information systems, and investigated the knowledge reduction problem. Recently, Zhao and Tsang [53] demonstrated how different fuzzy approximation operators in fuzzy rough set models can impact the performance of attribute reduction in information systems with fuzzy condition attributes and symbolic decision attributes. For information systems with heterogeneous data, in which numerical attributes are used to deduce fuzzy relations and symbolic attributes generate crisp relations, Hu et al. in [7] utilized Shannon’s entropy theory to measure information quality and applied the proposed measure to calculate the uncertainty in fuzzy approximation spaces; this idea was used for the reduction of such systems in [8]. In [9], [10], [11], Hu et al. discussed the neighborhood rough set model, which is used to calculate reductions of information systems with heterogeneous data. However, most of the above-mentioned works focus on the reduction of information systems, and little has been investigated on optimal decision rules acquisition.

For continuous valued decision information systems, attribute values of objects for the same attribute represent not only the ordinal relationship but also the relative distances of objects, and thus very few objects have the same attribute value. In this case, if the classical rough set model [21], [22] is used, the ordinal relationship and closeness of different objects will be neglected. This leads to loss of information and thus produces a large number of decision rules with weak generality. To the best of our knowledge, in order to solve this problem, many researchers have utilized discretization methods to convert the continuous values of attributes into discrete ones [1], [6], [14], [20], [29], [30], [31], [44]. However, they only investigated the reduction problem of information systems and did not consider the acquisition of optimal decision rules.

The crisp discretization technique for continuous valued information systems selects a set of cutting points within the ranges of the corresponding attribute values [1], [14], [20], [31]. These cutting points divide the range of attribute values into disjoint intervals and form a crisp partition of the universe. Such crisp discretization is also called hard discretization [30]. Because this “knife-edge” [30] approach may be too categorical in some situations, owing to the selection of the cutting points, new ideas have emerged in which additional “softening” thresholds are introduced. For example, fuzzy discretization approaches were proposed in [29], [38], in which the hard intervals defined by the cutting points were replaced with fuzzy intervals defined by fuzzy numbers with overlapping bounds. Recently, Leung et al. in [18] proposed a rough set approach to discover classification rules for continuous valued information systems. In [18], the continuous valued information systems were transformed into interval valued information systems by a statistical method, in which the concept of α-misclassification rates was used to compare different classes with a given threshold value α. By utilizing Boolean reasoning techniques [18], they calculated the α-classification reduction and α-classification core, and thus derived the classification rules accordingly. However, the problem of acquiring optimal decision rules was not considered in [18].
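For illustration only, the following minimal sketch shows the hard discretization described above: a handful of cutting points (the values here are hypothetical, not taken from any cited method) map each continuous attribute value to the index of the interval it falls into, yielding the “knife-edge” partition that the softer approaches try to avoid.

```python
# A minimal sketch of hard (crisp) discretization with cutting points.
# The cut positions are illustrative, not taken from the paper.

def discretize(value, cuts):
    """Map a continuous value to the index of the interval defined by sorted cut points."""
    interval = 0
    for cut in sorted(cuts):
        if value >= cut:
            interval += 1
    return interval

# Example: cutting points 0.3 and 0.7 split [0, 1] into three disjoint intervals.
cuts = [0.3, 0.7]
print([discretize(v, cuts) for v in (0.10, 0.35, 0.90)])  # -> [0, 1, 2]
```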

In this paper, instead of utilizing discretization methods, we establish a tolerance rough set model based on the similarity of different objects, which is used to compute optimal decision rules and to reduce continuous valued decision information systems. This approach avoids the loss of information and can effectively discover the knowledge hidden in continuous valued decision information systems.

The paper is organized as follows. In Section 2, by using a closeness measure for two objects, we will construct a fuzzy similarity matrix, which generates a tolerance relation at a given level. Based on the maximal tolerance classification of the universe, we establish a tolerance rough set model. In Section 3, we will define the initial decision rules and optimal decision rules by using the concepts of attribute descriptors and attribute feature descriptions for the maximal tolerance classes. In Section 4, we will propose the concepts of the approximate discernibility matrix and the approximate discernibility function for the maximal tolerance classes, from which all optimal decision rules can be derived. In Section 5, we will define the reduction and the core of the system with a given level, and present their computational approaches using the proposed approximate discernibility function. In Section 6, we will investigate the impact of level variations on reductions and on the selection of the optimal decision rules, and discuss the relationship between the proposed tolerance rough set model and the classical one. We will also compare the maximal tolerance classification approach with the cluster discretization method. Finally, we conclude our work in Section 7.

Section snippets

Continuous valued decision information system

Let (U, C ∪ {d}, F, fd) be a decision information system, where U = {x1, x2, …, xm} is a nonempty finite set called the universe, C = {c1, c2, …, cn} is a conditional attribute set, and d is a decision attribute. Suppose that d is a discrete attribute, which represents a class of objects with specific attribute features. We further assume that there is no order relationship between the attribute values of d. Let the set of attribute values of d be denoted as Vd = {1, 2, …, r}, and C ∩ {d} = ∅. Assume that fd is a
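As a concrete illustration of the above definition, a continuous valued decision information system can be held as a table of condition attribute values together with a discrete decision value per object. The objects, attributes, and values below are invented for this sketch and assume the condition attributes are normalized into [0, 1].

```python
# A minimal sketch of a continuous valued decision information system (U, C ∪ {d}, F, fd).
# All values are hypothetical; condition attributes are assumed normalized into [0, 1].

U = ["x1", "x2", "x3"]                     # universe of objects
C = ["c1", "c2"]                           # conditional attributes
table = {                                  # condition attribute values fc(x)
    "x1": {"c1": 0.12, "c2": 0.80},
    "x2": {"c1": 0.18, "c2": 0.75},
    "x3": {"c1": 0.90, "c2": 0.10},
}
d = {"x1": 1, "x2": 1, "x3": 2}            # discrete decision values fd(x), Vd = {1, 2}
```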

Conditional attribute descriptors and Bβ-decision rules

For B ⊆ C, let t = ∧ci∈B 〈ci, [ai, bi]〉, where [ai, bi] ⊆ [0, 1]; t is called a B-conditional attribute descriptor, and 〈ci, [ai, bi]〉 is called an atom of t, denoted as 〈ci, [ai, bi]〉 ∈ t. Let ‖t‖ = {x ∣ x ∈ U, and ∀ci ∈ B, ci(x) ∈ [ai, bi]}, which is called the support set of t. If y ∈ ‖t‖, we say y supports t. Let d(t) = {d(x) ∣ x ∈ ‖t‖}; we say t → ∨k∈d(t)(d, k) is a decision rule induced by t.

For a decision rule t → ∨k∈d(t)(d, k), the smaller the value ∣d(t)∣ is, the higher the definite level of the decision rule will be, where ∣A
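The sketch below, reusing the hypothetical table from the Section 2 sketch, evaluates a B-conditional attribute descriptor t: it collects the support set ‖t‖ and the decision set d(t), which together induce the rule t → ∨k∈d(t)(d, k). The descriptor intervals are made up for illustration.

```python
# Sketch: support set and induced decision set of a conditional attribute descriptor.
# 'table' and 'd' are the hypothetical system sketched in Section 2 above.

def support_set(t, table):
    """‖t‖: objects whose value of every attribute ci in t falls inside [ai, bi]."""
    return [x for x, row in table.items()
            if all(a <= row[c] <= b for c, (a, b) in t.items())]

def decision_set(t, table, d):
    """d(t): decision values taken by the objects that support t."""
    return {d[x] for x in support_set(t, table)}

t = {"c1": (0.10, 0.20), "c2": (0.70, 0.85)}   # descriptor 〈c1,[0.10,0.20]〉 ∧ 〈c2,[0.70,0.85]〉
print(support_set(t, table), decision_set(t, table, d))   # -> ['x1', 'x2'] {1}
```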

Approximate discernibility matrix and computing approach for optimal decision rules

Definition 8

If (x, y) ∈ RβB, then we say that x and y are similar to each other with respect to B under the level β, or approximately indiscernible with respect to B under the level β.

Proposition 4

x and y are approximately indiscernible with respect to B under the level β ⇔ rB(x, y) ≥ β ⇔ ∀ck ∈ B, ∣ck(x) − ck(y)∣ ≤ 1 − β.

Proof

x and y are approximately indiscernible with respect to B under the level β ⇔ (x, y) ∈ RβB ⇔ rB(x, y) ≥ β ⇔ max{∣ck(x) − ck(y)∣ ∣ ck ∈ B} ≤ 1 − β ⇔ ∀ck ∈ B, ∣ck(x) − ck(y)∣ ≤ 1 − β.
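A direct transcription of Proposition 4 over the hypothetical table from the Section 2 sketch; as in the proof, approximate indiscernibility reduces to checking ∣ck(x) − ck(y)∣ ≤ 1 − β attribute by attribute.

```python
# Sketch of the approximate-indiscernibility test of Proposition 4.
# 'table' and 'C' are the hypothetical system sketched in Section 2 above.

def indiscernible(x, y, B, table, beta):
    """True iff |ck(x) - ck(y)| <= 1 - beta for every attribute ck in B."""
    return all(abs(table[x][c] - table[y][c]) <= 1 - beta for c in B)

print(indiscernible("x1", "x2", C, table, beta=0.5))   # True for the sample values
print(indiscernible("x1", "x3", C, table, beta=0.5))   # False: |0.12 - 0.90| > 0.5
```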

Definition 9

For xi, xj ∈ U, 1 ≤ i, j ≤ m, denote αβ(xi, xj) = {ck ∣ ck ∈ C, and ∣ck(xi) − ck(xj)∣ > 1 − β}. We call Mβ = (αβ(xi
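A sketch of the approximate discernibility matrix of Definition 9 for the same hypothetical table; each cell αβ(xi, xj) collects the condition attributes on which the two objects differ by more than 1 − β.

```python
# Sketch: approximate discernibility matrix M_beta = (alpha_beta(xi, xj)).
# 'U', 'C', and 'table' are the hypothetical system sketched in Section 2 above.

def discernibility_matrix(U, C, table, beta):
    return {(x, y): {c for c in C if abs(table[x][c] - table[y][c]) > 1 - beta}
            for x in U for y in U}

M = discernibility_matrix(U, C, table, beta=0.5)
print(M[("x1", "x3")])   # {'c1', 'c2'} for the sample values
```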

Reductions of decision information systems

Definition 11

Let B ⊆ C. If B is a minimal conditional attribute set which satisfies the following property: ∀KβC ∈ CCRβC(U), ∃KβB ∈ CCRβB(U), KβB ⊇ KβC and d(KβB) = d(KβC), then B is called a general β-reduction of the information system, and the intersection of all general β-reductions is called the general β-core of the information system.

If B is a minimal conditional attribute set which satisfies the following property: for every consistent KβC ∈ CCRβC(U), there exists KβB ∈ CCRβB(U) such that KβB ⊇ KβC and d(KβB) = d(KβC), then B is called a consistent β-reduction of the information system, and
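Reductions are typically extracted from a discernibility function by Boolean reasoning: form the conjunction of the disjunctions given by the non-empty discernibility sets, expand into disjunctive normal form, and apply the absorption law; the surviving minimal terms are the candidate reductions. The sketch below is a generic version of this standard technique, not the paper's exact algorithm, and its input sets are hypothetical.

```python
# Generic Boolean-reasoning sketch: minimal attribute sets of a discernibility function.
# Each non-empty cell of a discernibility matrix contributes one disjunction (a set).

from itertools import product

def prime_implicants(disjunctions):
    """Expand the CNF into DNF and keep only minimal attribute sets (absorption law)."""
    clauses = [frozenset(s) for s in disjunctions if s]
    terms = {frozenset(choice) for choice in product(*clauses)}   # distribute AND over OR
    return [set(t) for t in terms
            if not any(other < t for other in terms)]             # drop absorbed terms

# Hypothetical discernibility sets, e.g. collected from a matrix such as M above:
print(prime_implicants([{"c1", "c4"}, {"c1"}, {"c2"}]))   # -> [{'c1', 'c2'}]
```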

Impacts of threshold level variations on reductions and optimal decision rules

With threshold level β = 0.5, we will have the cutting matrix and the discernibility matrix as follows (both matrices are symmetric; only the entries on and above the diagonal are listed, row xi giving the entries for (xi, xi), …, (xi, x10)):

R0.5:
x1: 1 1 0 0 0 1 1 0 1 1
x2: 1 0 0 0 1 1 0 0 1
x3: 1 1 1 0 0 0 0 0
x4: 1 1 0 0 0 0 0
x5: 1 0 0 0 0 0
x6: 1 1 1 1 1
x7: 1 1 1 1
x8: 1 1 1
x9: 1 1
x10: 1

M0.5:
x1: ∅ ∅ {c1, c4} {c1} {c4} ∅ ∅ {c2} ∅ ∅
x2: ∅ {c4} {c2} {c4} ∅ ∅ {c2} {c2} ∅
x3: ∅ ∅ ∅ {c1, c4} {c1} {c1} {c1} {c1, c4}
x4: ∅ ∅ {c1, c3} {c1, c3} {c1} {c1} {c1}
x5: ∅ {c3} {c1} {c1} {c1} {c1}
x6: ∅ ∅ ∅ ∅ ∅
x7: ∅ ∅ ∅ ∅
x8: ∅ ∅ ∅
x9: ∅ ∅
x10: ∅

And the C0.5-complete cover of the universe U is CCR0.5C(U) = {K0.5C(1), K0.5C(2), K0.5C(3)}, where K0.5C(1) = {x1, x2, x6, x7, x10}, K0.5C(2) = {x3, x4, x5}, K0.5C(3) = {x6, x7, x8, x9, x10}, and three initial decision
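For reference, the maximal tolerance classes forming the complete cover are the maximal cliques of the β-cut tolerance relation. The sketch below enumerates them with a plain Bron–Kerbosch recursion over an arbitrary reflexive, symmetric 0/1 relation; the miniature four-object relation used in the example is hypothetical, not the ten-object system above.

```python
# Sketch: maximal tolerance classes as maximal cliques of the beta-cut tolerance relation.

def maximal_tolerance_classes(objects, similar):
    """Bron-Kerbosch enumeration; 'similar(x, y)' is a reflexive, symmetric 0/1 relation."""
    classes = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            classes.append(set(clique))
            return
        for x in list(candidates):
            expand(clique | {x},
                   {y for y in candidates if similar(x, y) and y != x},
                   {y for y in excluded if similar(x, y)})
            candidates.remove(x)
            excluded.add(x)

    expand(set(), set(objects), set())
    return classes

# Hypothetical 4-object relation: x1~x2, x2~x3, x3~x4 (plus reflexive pairs).
edges = {("x1", "x2"), ("x2", "x3"), ("x3", "x4")}
sim = lambda a, b: a == b or (a, b) in edges or (b, a) in edges
print(maximal_tolerance_classes(["x1", "x2", "x3", "x4"], sim))
# -> [{'x1', 'x2'}, {'x2', 'x3'}, {'x3', 'x4'}] (in some order)
```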

Conclusions

Instead of utilizing discretization methods, a tolerance rough set model is proposed in this paper to compute reductions and optimal decision rules for continuous valued decision information systems. Based on the maximal tolerance classification of the universe under a given level, two kinds of lower and upper approximations and positive regions are defined. By the attribute feature description of the maximal tolerance classes, concepts of reductions of the maximal

Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments and suggestions. This research is supported by the National Natural Science Foundation of China (60774100), the Scientific Research and Development Project of Shandong Provincial Education Department, China (J06P01), and Doctoral Foundation of University of Jinan, China (B0616).

References (53)

  • Y. Leung et al.

    Knowledge acquisition in incomplete information systems: a rough set approach

    European Journal of Operational Research

    (2006)
  • Y. Leung et al.

    A rough set approach for the discovery of classification rules in interval-valued information systems

    International Journal of Approximate Reasoning

    (2008)
  • J.S. Mi et al.

    Approaches to knowledge reduction based on variable precision rough sets model

    Information Sciences

    (2004)
  • Z. Pawlak et al.

    Rudiments of rough sets

    Information Sciences

    (2007)
  • Z. Pawlak et al.

    Rough sets: some extensions

    Information Sciences

    (2007)
  • Z. Pawlak et al.

    Rough sets and Boolean reasoning

    Information Sciences

    (2007)
  • Z. Pawlak

    Rough sets, decision algorithms and Bayes’ theorem

    European Journal of Operational Research

    (2002)
  • C. Pappis et al.

    A comparative assessment of measures of similarity of fuzzy values

    Fuzzy Sets and Systems

    (1993)
  • A. Roy et al.

    Fuzzy discretization of feature space for a rough set classifier

    Pattern Recognition Letters

    (2003)
  • R. Susmaga

    Analyzing discretizations of continuous attributes given a monotonic discrimination function

    Intelligent Data Analysis

    (1997)
  • B.Z. Sun et al.

    Fuzzy rough set theory for the interval-valued fuzzy information systems

    Information Sciences

    (2008)
  • D. Slezak et al.

    The investigation of the Bayesian rough set model

    International Journal of Approximate Reasoning

    (2005)
  • D. Slezak

    Degrees of conditional (in)dependence: a framework for approximate Bayesian networks and examples related to the rough set-based feature selection

    Information Sciences

    (2009)
  • Y.C. Tsai et al.

    Entropy-based fuzzy rough classification approach for extracting classification rules

    Expert Systems with Applications

    (2006)
  • X.Z. Wang et al.

    Learning fuzzy rules from fuzzy samples based on rough set technique

    Information Sciences

    (2007)
  • C.Z. Wang et al.

    A systematic study on attribute reduction with rough sets based on general binary relations

    Information Sciences

    (2008)