Decision Support
Rough set approach to multiple criteria classification with imprecise evaluations and assignments

https://doi.org/10.1016/j.ejor.2008.09.033

Abstract

The Dominance-based Rough Set Approach (DRSA) has been introduced to deal with multiple criteria classification (also called multiple criteria sorting, or ordinal classification with monotonicity constraints), where assignments of objects may be inconsistent with respect to the dominance principle. In this paper, we consider an extension of DRSA to the context of imprecise evaluations of objects on condition criteria and imprecise assignments of objects to decision classes. The imprecisions are given in the form of intervals of possible values. In order to solve the problem, we reformulate the dominance principle and introduce second-order rough approximations. The presented methodology preserves well-known properties of rough approximations, such as rough inclusion, complementarity, identity of boundaries and precisiation. Moreover, the meaning of the precisiation property is extended to the considered case. The paper also presents a way to reduce decision tables and to induce decision rules from rough approximations.

Introduction

In decision analysis, the multiple criteria classification is the most frequently considered decision problem. This problem is also referred to as multiple criteria sorting, or ordinal classification with monotonicity constraints. It consists in assignment of objects evaluated on a set of criteria (i.e. attributes with preference-ordered value sets) to pre-defined and preference-ordered decision classes. It is assumed that there exists a semantic correlation between evaluation on criteria and assignment to decision class, i.e. a better evaluation of an object on a criterion should not worsen its assignment to a decision class. Multiple criteria classification problems may arise in many real-world situations. Let us consider, for example, the problem of credit scoring, in which customers applying for a credit may be assigned to some decision classes representing various levels of risk, e.g. low, medium, high. Customers can be described by attributes such as: personal status and gender, education degree, income, status of checking account, credit history, purpose and duration of a credit, amount of requested credit, savings account, duration of present employment, and so on. Some of these attributes could be treated as criteria, for example: income, status of checking accounts, duration of present employment, etc.

In order to support multiple criteria classification, one must construct a preference model of the Decision Maker (DM). The construction of the preference model requires some preference information from the DM. Classically, this information concerns substitution rates among criteria, importance weights, or comparisons of lotteries. Acquiring this preference information from the DM is not easy and, moreover, the resulting preference model is not intelligible to the DM. Another possible way is to induce the preference model from a set of exemplary decisions (assignments of objects to decision classes) made on a set of selected objects called reference objects. Those objects are relatively well-known to the DM, who is able to assign them to decision classes. In other words, the preference information comes from observation of the DM’s acts. Such an approach is concordant with the paradigm of artificial intelligence and, in particular, of inductive learning. Moreover, the induced model can be represented in an intelligible way by a set of decision rules.

There is, however, a problem with inconsistency, which is often present in the set of decision examples. Two decision examples are inconsistent with respect to the so-called dominance principle if one object is not worse than the other on all considered criteria but has been assigned to a worse decision class. These inconsistencies follow from hesitation of the DM, the unstable character of his/her preferences, incomplete information and/or weak determination of the family of criteria. They can convey important information that should be taken into account in the construction of the DM’s preference model. Rather than correcting or ignoring these inconsistencies, it has been proposed to take them into account in the preference model construction using rough set theory. For this purpose, the original rough set theory (Pawlak, 1982, Pawlak, 1991, Słowiński, 1992, Polkowski, 2002, Pawlak and Skowron, 2007b, Pawlak and Skowron, 2007a) has been extended by Greco et al., 1999, Greco et al., 2001, Greco et al., 2002 by replacing the classical indiscernibility relation with a dominance relation, which permits taking into account the preference order in value sets (scales) of criteria. The extended rough set approach is called the Dominance-based Rough Set Approach (DRSA); a complete overview of this methodology is presented in (Greco et al., 2005, Słowiński et al., 2005).
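The notion of inconsistency with respect to the dominance principle can be illustrated by a minimal sketch for the basic case of precise evaluations and assignments. The object names, criteria and values below are hypothetical and serve only as an illustration; all criteria and class indices are assumed to be of the gain type (higher is better).

```python
from itertools import permutations

# Hypothetical reference objects: precise criterion values and an
# assigned class index (assumption: higher is better everywhere).
objects = {
    "a": {"income": 3, "employment": 2, "cls": 2},
    "b": {"income": 2, "employment": 2, "cls": 3},
    "c": {"income": 1, "employment": 1, "cls": 1},
}
criteria = ["income", "employment"]


def dominates(x, y, criteria):
    """True if x is at least as good as y on every criterion."""
    return all(x[q] >= y[q] for q in criteria)


def inconsistent_pairs(objects, criteria):
    """Ordered pairs (x, y) where x dominates y but has a worse class."""
    return [
        (nx, ny)
        for nx, ny in permutations(objects, 2)
        if dominates(objects[nx], objects[ny], criteria)
        and objects[nx]["cls"] < objects[ny]["cls"]
    ]


pairs = inconsistent_pairs(objects, criteria)
print(pairs)
```

In this toy table, object a is not worse than b on both criteria but is assigned to a worse class, so the pair (a, b) violates the dominance principle.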

Using the rough set approach to the analysis of preference information, we obtain the lower and the upper (rough) approximations of unions of decision classes. The difference between the upper and lower approximations contains the objects that are inconsistent with respect to the dominance principle. The level of consistency is measured by the quality of approximation, which is the ratio of the number of consistent objects to the number of all reference objects. The rough approximations are then used in induction of decision rules representing, respectively, certain and possible patterns of the DM’s preferences. The preference model in the form of decision rules explains the decision policy of the DM and permits classifying new objects in line with the DM’s preferences. It is worth underlining that decision rules constitute a preference model that is more general than the most general utility function or the outranking relation (Słowiński et al., 2002, Greco et al., 2004).
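As an illustration of these notions, the following sketch computes lower and upper approximations of upward unions of classes and the quality of approximation for the basic DRSA, i.e. with precise evaluations and assignments. It is not the second-order construction introduced in this paper; the decision table and all function names are hypothetical.

```python
# Hypothetical decision table: name -> (criterion values, class index).
# Assumption: all criteria and class indices are of the gain type.
table = {
    "a": ((3, 2), 2),
    "b": ((2, 2), 3),
    "c": ((1, 1), 1),
    "d": ((3, 3), 3),
}


def dominates(x, y):
    """True if x is at least as good as y on every criterion."""
    return all(xv >= yv for xv, yv in zip(table[x][0], table[y][0]))


def dominating_set(x):
    """Objects that dominate x."""
    return {y for y in table if dominates(y, x)}


def dominated_set(x):
    """Objects that are dominated by x."""
    return {y for y in table if dominates(x, y)}


def upward_union(t):
    """Objects assigned to class t or better."""
    return {x for x in table if table[x][1] >= t}


def lower_approx(t):
    """Objects whose dominating set lies entirely in the upward union."""
    union = upward_union(t)
    return {x for x in table if dominating_set(x) <= union}


def upper_approx(t):
    """Objects whose dominated set intersects the upward union."""
    union = upward_union(t)
    return {x for x in table if dominated_set(x) & union}


def quality():
    """Ratio of objects outside every boundary to all objects."""
    classes = sorted({cls for _, cls in table.values()})
    boundary = set()
    # Upward unions suffice here: in the basic DRSA the boundaries of
    # complementary upward and downward unions coincide.
    for t in classes[1:]:
        boundary |= upper_approx(t) - lower_approx(t)
    return (len(table) - len(boundary)) / len(table)


print(lower_approx(3), upper_approx(3), quality())
```

For the union of classes "at least 3", only d belongs to the lower approximation, while a and b fall into the boundary because a dominates b but is assigned to a worse class; the quality of approximation of this toy table is therefore 0.5.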

In this paper, we consider an instance of the multiple criteria classification problem, in which objects are described and assigned imprecisely. We assume that the assignments and the evaluations on criteria are represented by intervals. The non-univocal (interval) assignment of an object is defined through the lowest and the highest class to which an object could belong. The interval evaluation is defined similarly, through the highest and the lowest value that an object may obtain on a given criterion. Let us remark that the notion of interval evaluation is a generalization of the missing value concept. Indeed, an interval evaluation on a criterion could be seen as a “partially missing value” because a non-univocal evaluation, spanned over an interval, is like a value partially unknown. In the extreme case, an interval equal to the whole value set of a criterion is a completely missing value. The problem of missing values was already considered within DRSA (Greco et al., 2000a). Here, we present some more general results. The considered problem is also related to the concept of incomplete or multi-valued information systems considered in many places (Lipski, 2001, Orłowska and Pawlak, 1987, Kryszkiewicz, 1998, Düntsch et al., 2001). In Düntsch et al. (2001), different semantic interpretations of such systems are given. In the case of multiple criteria classification, the most intuitive interpretation of an interval is disjunctive and exclusive, i.e. only one value inside the interval is the right one, but it is not known which one it is.
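The relation between interval evaluations and missing values can be sketched as follows. The scale bounds and helper names are hypothetical, and the check `possibly_not_worse` is only an illustration of the disjunctive and exclusive reading of intervals; it is not claimed to be one of the dominance relations defined later in the paper.

```python
from typing import Optional, Tuple, Union

Interval = Tuple[int, int]

# Hypothetical ordinal scale of a criterion: 1 (worst) to 5 (best).
SCALE: Interval = (1, 5)


def as_interval(value: Optional[Union[int, Interval]]) -> Interval:
    """Normalize a precise value, an interval, or a missing value (None)."""
    if value is None:
        # A completely missing value spans the whole value set.
        return SCALE
    if isinstance(value, tuple):
        low, high = value
        if not SCALE[0] <= low <= high <= SCALE[1]:
            raise ValueError(f"invalid interval: {value}")
        return (low, high)
    # A precise value is a degenerate interval.
    return (value, value)


def possibly_not_worse(x: Interval, y: Interval) -> bool:
    """True if some value in x is at least as good as some value in y.

    Under the disjunctive reading (exactly one value in each interval
    is the true one), this holds iff the upper end of x reaches the
    lower end of y.
    """
    return x[1] >= y[0]


print(as_interval(3), as_interval((2, 4)), as_interval(None))
print(possibly_not_worse((1, 2), (2, 5)), possibly_not_worse((1, 2), (3, 5)))
```

Treating a missing value as the full-scale interval lets precise, imprecise and missing evaluations be handled by the same comparison.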

Handling of interval evaluations and assignments requires, however, the dominance principle to be revised and the DRSA methodology to be adapted accordingly (Dembczyński et al., 2003, Dembczyński et al., 2005). A possible solution to the problem consists in introducing second-order rough approximations that result from both the imprecision of interval assignments and the inconsistencies with respect to the dominance principle. These approximations satisfy the usual properties considered in rough set theory, such as rough inclusion, complementarity, identity of boundaries and precisiation (the last property is sometimes referred to as monotonicity).

Any new information about reference objects is referred to in this paper as precisiation of data. This precisiation means either a new attribute (criterion) or information narrowing the interval evaluation or assignment of a reference object. As mentioned above, rough approximations are characterized by the precisiation property, which can be informally explained in the following way: if we had more precise information about objects (precisiation of data), then our knowledge would be no less consistent. For example, in rough set approaches dealing with univocal evaluations and assignments of objects (here called basic rough set approaches), the quality of approximation does not decrease if new attributes (criteria) are added. In this paper, we define the generalized precisiation property concerning precisiation of evaluations and assignments of reference objects. This property seems to be crucial in the analysis of incomplete information systems.

On the other hand, removing some attributes results in a decrease of the quality of approximation, unless the remaining attributes still contain a reduct. A reduct is a minimal subset of attributes that preserves the same quality of approximation as the complete set of attributes. This subset may be used in further analysis instead of all attributes. In this paper, we give a generalized definition of the quality of approximation and of a reduct. Let us also remark that precisiation and reduction of data are strictly related. On the one hand, data are precisiated to become more consistent (the quality of approximation increases); on the other hand, it is desirable to reduce the set of attributes without decreasing the quality of approximation.
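The notion of a reduct can be illustrated by a brute-force sketch for the basic case of precise evaluations and assignments. The generalized definitions given later in the paper are not reproduced here; the decision table, criterion names and function names are hypothetical, and the exhaustive search is chosen only for clarity.

```python
from itertools import combinations

CRITERIA = ("q1", "q2", "q3")

# Hypothetical decision table: name -> (criterion values, class index).
# Assumption: all criteria and class indices are of the gain type.
table = {
    "a": ({"q1": 3, "q2": 2, "q3": 1}, 3),
    "b": ({"q1": 2, "q2": 3, "q3": 1}, 2),
    "c": ({"q1": 2, "q2": 1, "q3": 3}, 1),
    "d": ({"q1": 1, "q2": 1, "q3": 2}, 1),
}


def quality(subset):
    """Share of objects not involved in any violation of the
    dominance principle with respect to the given criteria."""

    def dominates(x, y):
        return all(table[x][0][q] >= table[y][0][q] for q in subset)

    inconsistent = set()
    for x in table:
        for y in table:
            if dominates(x, y) and table[x][1] < table[y][1]:
                inconsistent |= {x, y}
    return (len(table) - len(inconsistent)) / len(table)


def reducts():
    """Minimal non-empty subsets preserving the quality of the full set."""
    full = quality(CRITERIA)
    found = []
    # Subsets are visited by increasing size, so any subset that contains
    # an already found reduct is not minimal and is skipped.
    for size in range(1, len(CRITERIA) + 1):
        for subset in combinations(CRITERIA, size):
            if quality(subset) != full:
                continue
            if any(set(r) <= set(subset) for r in found):
                continue
            found.append(subset)
    return found


print(quality(CRITERIA), reducts())
```

In this toy table no single criterion preserves the quality of the full set, while the pair q1, q2 does, so q3 can be dropped without any loss of consistency. The exhaustive search is exponential in the number of criteria and is practical only for small sets.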

Finally, in this paper, we consider induction of decision rules from second-order rough approximations. It is desirable that a set of decision rules be complete. This means that the classification procedure based on the induced set of decision rules restores the original assignments of reference objects given by the DM. More precisely, consistent objects should be reassigned to exactly the same classes, and inconsistent objects should be reclassified to unions of classes that reflect the inconsistencies. It is also required that the rules remain true when reference objects are precisiated. This last requirement can be seen as a specific extension of the generalized precisiation property. We show that the methodology introduced in this paper satisfies these two requirements, i.e., the methodology is able to deliver a complete set of rules that remain true after precisiation of data.

The methodology presented here generalizes and improves the authors’ proposal given in (Dembczyński et al., 2002) concerning hierarchical multiple attribute and multiple criteria classification. While assignments of objects were considered univocal in that study, the evaluations on particular criteria were possibly imprecise (in the form of intervals). Some elements of the present generalization have already been reported in (Dembczyński et al., 2003). This paper is a revised and extended version of (Dembczyński et al., 2005), containing additional sections on the precisiation property, reduction of criteria and induction of decision rules.

The article is organized in the following way. Section 2 contains the problem statement and basic definitions. Section 3 describes several types of dominance relations used further in definitions of rough approximations. In Section 4, the dominance principle is extended to the considered case. Section 5 presents definitions of decision and condition granules. Section 6 describes the generalization of DRSA into second-order rough approximations. Section 7 discusses the precisiation property and its generalization. Quality of approximation and reduction of criteria are presented in Section 8. Section 9 presents a decision rule model induced from second-order rough approximations. The last section concludes the paper.

Section snippets

Problem statement

Multiple criteria classification consists in an assignment of objects from a set $A$ to pre-defined decision classes $Cl_t$, $t \in T = \{1, \ldots, n\}$. It is assumed that the classes are preference-ordered according to an increasing order of class indices, i.e. for all $r, s \in T$ such that $r > s$, the objects from $Cl_r$ are strictly preferred to the objects from $Cl_s$. The objects are described by condition criteria, i.e. attributes with preference-ordered value sets.

The preference model is induced from classification

Dominance relations

Within the basic DRSA, the notions of weak preference relation $\succeq_q$ and $P$-dominance relation $D_P$ are defined as follows. For any $x, y \in A$ and $q \in Q$, $x \succeq_q y$ means that $x$ is at least as good as (is weakly preferred to) $y$ with respect to criterion $q$. Moreover, taking into account more than one criterion, we say that $x$ dominates $y$ with respect to $P \subseteq Q$ (shortly, $x$ $P$-dominates $y$), if $x \succeq_q y$ for all $q \in P$. The weak preference relation $\succeq_q$ is supposed to be a complete preorder, i.e. complete, reflexive, transitive, and

Dominance principle

In decision analysis, the dominance principle requires that an object having not worse evaluations on condition criteria than another object should be assigned to a class not worse than the other object. Formally, the dominance principle can be expressed for $x, y \in A$ as follows:
$$x D_P y \Rightarrow x D_d y, \quad \text{for any } P \subseteq C.$$
In other words, if $x$ $P$-dominates $y$ ($x$ is not worse than $y$ with respect to all criteria from $P$), then the assignment of $x$ to a decision class should be not worse than the assignment of $y$. This also

Decision and condition granules

Granules of information concern objects from $U$ and are induced by the dominance relations defined above. We consider two types of granules: decision granules defined on the decision criterion and condition granules defined on condition criteria.

Definition 4

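Decision granules are defined as follows:
$$D_d^{l+}(x) = \{y \in U : y D_d^{l} x\}, \qquad D_d^{u-}(x) = \{y \in U : x D_d^{u} y\},$$
$$\overline{D}_d^{+}(x) = \{y \in U : y \overline{D}_d x\}, \qquad \overline{D}_d^{-}(x) = \{y \in U : x \overline{D}_d y\},$$
and referred to as lower-end $d$-dominating sets, upper-end $d$-dominated sets, possible $d$-dominating sets and possible $d$-dominated sets,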

Dominance-based rough approximations

Dominance-based rough approximations in the case of imprecise (interval) evaluations and assignments are derived from the generalized dominance principle.

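From Definition 2, we have that $x \in U$ is $P$-consistent if
$$\forall_{y \in U} \; (x \overline{D}_P y \Rightarrow x D_d^{u} y) \wedge (y \overline{D}_P x \Rightarrow y D_d^{l} x).$$
It is easy to see that
$$\forall_{y \in U} \; (x \overline{D}_P y \Rightarrow x D_d^{u} y) \Leftrightarrow \left[\overline{D}_P^{-}(x) \subseteq Cl^{\leq}_{u(x,d)}\right], \tag{8}$$
and
$$\forall_{y \in U} \; (y \overline{D}_P x \Rightarrow y D_d^{l} x) \Leftrightarrow \left[\overline{D}_P^{+}(x) \subseteq Cl^{\geq}_{l(x,d)}\right]. \tag{9}$$
Let $s \geq t$, and $u(x,d) \leq s$, and $l(x,d) \geq t$. Inserting $s$ and $t$ into (8), (9), one obtains
$$\overline{D}_P^{-}(x) \subseteq Cl^{\leq}_{s} \quad \text{and} \quad \overline{D}_P^{+}(x) \subseteq Cl^{\geq}_{t}. \tag{10}$$
Expression (10) can be treated, analogously to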

Precisiation property

We define precisiation of data as any new information about reference objects that is added to the decision table. This can be either a new attribute (criterion) or information narrowing an interval evaluation or assignment of a reference object. In rough set approaches, it is usually required that more precise information about objects should not decrease lower approximations of decision classes. In the basic rough set approaches, this property corresponds to the situation where new

Quality of approximation and reduction of criteria

In rough set theory, the quality of approximation measures how consistent the decision table is. In the simplest case, this is the ratio of the number of consistent objects to the number of all objects in the decision table. In the context of multiple criteria classification with imprecise (interval) assignments and evaluations, this definition takes the following form.

Definition 15

Quality of $P$-approximation, for $P \subseteq C$, is defined as
$$\gamma(P) = \frac{\operatorname{card}(\{x \in U : l_P(x) = u_P(x)\})}{\operatorname{card}(U)}.$$

In the definition, $\{x \in U : u_P(x) = l_P(x)\}$ is the set of totally P

Decision rule model

A decision rule is a logical expression of the form: if [condition], then [decision], where the condition part (or antecedent, or if-part) is a conjunction of elementary conditions, and the decision part (or consequent, or then-part) contains a suggestion of an assignment. An object that satisfies (or is covered by) the condition part of the rule is assigned to the decision classes suggested by the decision part of this rule. Decision rules induced from the set of reference objects U can explain DM’s

Conclusions

The presented results extend the basic DRSA and allow analyzing decision tables with imprecise (interval) evaluations of objects on condition criteria, as well as imprecise (interval) assignments of objects to decision classes. To solve this problem, we introduced specific definitions of dominance relations and revised the dominance principle accordingly. This led us to a definition of new decision and condition granules used for second-order rough approximations. It is worth underlining that

Acknowledgement

The first and the third author wish to acknowledge financial support from the Polish Ministry of Education and Science.

References (25)

  • K. Dembczyński et al.

    Second-order rough approximations in multi-criteria classification with imprecise evaluations and assignments

  • K. Dembczyński et al.

    Quality of rough approximation in multi-criteria classification problems
