Decision Support
Rough set approach to multiple criteria classification with imprecise evaluations and assignments
Introduction
In decision analysis, multiple criteria classification is the most frequently considered decision problem. This problem is also referred to as multiple criteria sorting, or ordinal classification with monotonicity constraints. It consists in assigning objects evaluated on a set of criteria (i.e. attributes with preference-ordered value sets) to pre-defined and preference-ordered decision classes. It is assumed that there exists a semantic correlation between evaluations on criteria and assignment to a decision class, i.e. a better evaluation of an object on a criterion should not worsen its assignment to a decision class. Multiple criteria classification problems arise in many real-world situations. Consider, for example, the problem of credit scoring, in which customers applying for credit are assigned to decision classes representing various levels of risk, e.g. low, medium, high. Customers can be described by attributes such as personal status and gender, education degree, income, status of checking account, credit history, purpose and duration of the credit, amount of requested credit, savings account, duration of present employment, and so on. Some of these attributes could be treated as criteria, for example income, status of checking account, and duration of present employment.
In order to support multiple criteria classification, one must construct a preference model of the Decision Maker (DM). The construction of the preference model requires some preference information from the DM. Classically, this information concerns substitution rates among criteria, importance weights, or comparisons of lotteries. Acquiring this preference information from the DM is not easy and, moreover, the resulting preference model is not intelligible to the DM. Another possible way is to induce the preference model from a set of exemplary decisions (assignments of objects to decision classes) made on a set of selected objects called reference objects. Those objects are relatively well known to the DM, who is able to assign them to decision classes. In other words, the preference information comes from observation of the DM's acts. Such an approach is concordant with the paradigm of artificial intelligence and, in particular, of inductive learning. Moreover, the induced model can be represented in an intelligible way by a set of decision rules.
There is, however, a problem with inconsistency, which is often present in the set of decision examples. Two decision examples are inconsistent with respect to the so-called dominance principle if one object is not worse than the other on all considered criteria but has been assigned to a worse decision class. These inconsistencies follow from hesitation of the DM, the unstable character of his/her preferences, incomplete information and/or weak determination of the family of criteria. They can convey important information that should be taken into account in the construction of the DM's preference model. Rather than correcting or ignoring these inconsistencies, it has been proposed to take them into account in the preference model construction using rough set theory. For this purpose, original rough set theory (Pawlak, 1982, Pawlak, 1991, Słowiński, 1992, Polkowski, 2002, Pawlak and Skowron, 2007b, Pawlak and Skowron, 2007a) has been extended by Greco et al. (1999, 2001, 2002) by replacing the classical indiscernibility relation with a dominance relation, which permits taking into account the preference order in value sets (scales) of criteria. The extended rough set approach is called the Dominance-based Rough Set Approach (DRSA); a complete overview of this methodology is presented in (Greco et al., 2005, Słowiński et al., 2005).
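The inconsistency described above can be made concrete with a minimal sketch. It assumes precise (non-interval) evaluations, gain-type criteria (a higher value is better), and classes indexed so that a higher index is a better class. The function names and the applicant data are hypothetical and serve only as an illustration of the basic dominance principle, not of the generalized one developed in this paper.

```python
from itertools import permutations


def dominates(x, y):
    """True if x is at least as good as y on every criterion (gain-type scales)."""
    return all(a >= b for a, b in zip(x, y))


def inconsistent_pairs(objects):
    """Return (name_x, name_y) pairs where x dominates y but is assigned to a worse class.

    `objects` maps a name to (evaluations, class_index); a higher index is a better class.
    """
    pairs = []
    for (nx, (ex, cx)), (ny, (ey, cy)) in permutations(objects.items(), 2):
        if dominates(ex, ey) and cx < cy:
            pairs.append((nx, ny))
    return pairs


# Hypothetical applicants: (income level, years employed) -> class (3 = lowest risk)
applicants = {
    "a": ((3, 5), 3),
    "b": ((2, 4), 2),
    "c": ((3, 6), 1),  # not worse than a and b on both criteria, yet in the worst class
}
violations = inconsistent_pairs(applicants)
```

Here object c is involved in every violation, which is the kind of information the rough set analysis keeps rather than discards.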
Using the rough set approach to the analysis of preference information, we obtain the lower and the upper (rough) approximations of unions of decision classes. The difference between the upper and lower approximations contains the objects that are inconsistent with respect to the dominance principle. The level of consistency is measured by the quality of approximation, i.e. the ratio of the number of consistent objects to the number of all reference objects. The rough approximations are then used in the induction of decision rules representing, respectively, certain and possible patterns of the DM's preferences. The preference model in the form of decision rules explains the decision policy of the DM and permits classifying new objects in line with the DM's preferences. It is worth underlining that decision rules constitute a preference model that is more general than the most general utility function or the outranking relation (Słowiński et al., 2002, Greco et al., 2004).
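As an illustration of the approximations mentioned above, the following sketch computes the lower and upper approximation of an upward union of classes ("class at least t") in the basic DRSA setting with precise evaluations. It follows the usual basic DRSA definitions: an object belongs to the lower approximation if all objects dominating it are in the union, and to the upper approximation if at least one object it dominates is in the union. The data and function names are hypothetical, and gain-type criteria are assumed.

```python
def dominates(x, y):
    """True if x is at least as good as y on every criterion (gain-type scales)."""
    return all(a >= b for a, b in zip(x, y))


def approximations_at_least(evals, classes, t):
    """Lower and upper approximation of the upward union "class >= t" (basic DRSA)."""
    names = list(evals)
    union = {n for n in names if classes[n] >= t}
    lower, upper = set(), set()
    for x in names:
        dominating = {y for y in names if dominates(evals[y], evals[x])}  # objects dominating x
        dominated = {y for y in names if dominates(evals[x], evals[y])}   # objects dominated by x
        if dominating <= union:
            lower.add(x)
        if dominated & union:
            upper.add(x)
    return lower, upper


# Hypothetical reference objects: two criteria, classes 1 (worst) to 3 (best)
evals = {"a": (3, 5), "b": (2, 4), "c": (3, 6), "d": (1, 1), "e": (4, 7)}
classes = {"a": 3, "b": 2, "c": 1, "d": 1, "e": 3}
lower, upper = approximations_at_least(evals, classes, t=2)
boundary = upper - lower  # objects inconsistent with respect to the dominance principle
```

In this toy table only e certainly belongs to "class at least 2"; a, b and c end up in the boundary because c dominates a and b while being assigned to a worse class.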
In this paper, we consider an instance of the multiple criteria classification problem, in which objects are described and assigned imprecisely. We assume that the assignments and the evaluations on criteria are represented by intervals. The non-univocal (interval) assignment of an object is defined through the lowest and the highest class to which an object could belong. The interval evaluation is defined similarly, through the highest and the lowest value that an object may obtain on a given criterion. Let us remark that the notion of interval evaluation is a generalization of the missing value concept. Indeed, an interval evaluation on a criterion could be seen as a “partially missing value” because a non-univocal evaluation, spanned over an interval, is like a value partially unknown. In the extreme case, an interval equal to the whole value set of a criterion is a completely missing value. The problem of missing values was already considered within DRSA (Greco et al., 2000a). Here, we present some more general results. The considered problem is also related to the concept of incomplete or multi-valued information systems considered in many places (Lipski, 2001, Orłowska and Pawlak, 1987, Kryszkiewicz, 1998, Düntsch et al., 2001). In Düntsch et al. (2001), different semantic interpretations of such systems are given. In the case of multiple criteria classification, the most intuitive interpretation of an interval is disjunctive and exclusive, i.e. only one value inside the interval is the right one, but it is not known which one it is.
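The view of an interval as a "partially missing value" can be sketched as follows. The `Interval` class and the helper names are hypothetical. The two comparison functions are not the dominance relations defined later in the paper; they only illustrate what the disjunctive and exclusive reading implies when two interval evaluations are compared on a gain-type criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """An imprecise evaluation: exactly one value in [low, high] is the true one."""
    low: int
    high: int

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low must not exceed high")

    def is_precise(self):
        return self.low == self.high


def missing(scale_min, scale_max):
    """A completely missing value spans the whole value set of the criterion."""
    return Interval(scale_min, scale_max)


def certainly_at_least(x, y):
    """x >= y holds for every pair of values the two intervals may take."""
    return x.low >= y.high


def possibly_at_least(x, y):
    """x >= y holds for at least one pair of values the two intervals may take."""
    return x.high >= y.low


income_x = Interval(2, 4)  # partially missing: known to lie between 2 and 4
income_y = Interval(3, 3)  # precise evaluation
income_z = missing(1, 5)   # nothing known, assuming a 1..5 scale
```

With these values, `income_x` is possibly but not certainly at least as good as `income_y`, which is exactly the kind of hesitation that the revised dominance principle has to handle.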
Handling interval evaluations and assignments requires, however, the dominance principle to be revised and the DRSA methodology to be adapted accordingly (Dembczyński et al., 2003, Dembczyński et al., 2005). A possible solution to the problem consists in introducing second-order rough approximations that result from both the imprecision of interval assignments and the inconsistencies with respect to the dominance principle. These approximations satisfy the usual properties considered in rough set theory, such as rough inclusion, complementarity, identity of boundaries and precisiation (the last property is sometimes referred to as monotonicity).
Any new information about reference objects is referred to in this paper as precisiation of data. This precisiation means either a new attribute (criterion) or information narrowing the interval evaluation or assignment of a reference object. As mentioned above, rough approximations are characterized by the precisiation property, which can be informally explained in the following way: if we had more precise information about objects (precisiation of data), our knowledge would be no less consistent. For example, in rough set approaches dealing with univocal evaluations and assignments of objects (here called basic rough set approaches), the quality of approximation does not decrease when new attributes (criteria) are added. In this paper, we define the generalized precisiation property concerning precisiation of evaluations and assignments of reference objects. This property seems to be crucial in the analysis of incomplete information systems.
On the other hand, removing some attributes decreases the quality of approximation, unless the remaining attributes contain a reduct. A reduct is a minimal subset of attributes that preserves the same quality of approximation as the complete set of attributes. This subset may be used in further analysis instead of all attributes. In this paper, we give a generalized definition of the quality of approximation and of the reduct. Let us also remark that precisiation and reduction of data are strictly related. On the one hand, data are precisiated to become consistent (the quality of approximation increases); on the other hand, it is desirable to reduce the set of attributes without decreasing the quality of approximation.
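The notions of quality of approximation and reduct can be illustrated for the basic case of precise evaluations and assignments. The sketch below counts an object as consistent when it is not involved in any violation of the dominance principle, and then searches by brute force for minimal subsets of criteria that preserve the quality obtained with all criteria. The function names and the table are hypothetical; the exhaustive search is only practical for a handful of criteria, and the generalized definitions for interval data given in the paper are not reproduced here.

```python
from itertools import combinations


def quality(table, criteria):
    """Share of objects not involved in any violation of the dominance principle."""
    def dom(x, y):
        return all(x[q] >= y[q] for q in criteria)

    consistent = 0
    for ex, cx in table:
        ok = all(
            not (dom(ey, ex) and cy < cx) and not (dom(ex, ey) and cx < cy)
            for ey, cy in table
        )
        consistent += ok
    return consistent / len(table)


def reducts(table, criteria):
    """Minimal subsets of criteria preserving the quality of the full set (brute force)."""
    full = quality(table, criteria)
    found = []
    for k in range(1, len(criteria) + 1):
        for subset in combinations(criteria, k):
            if any(set(r) <= set(subset) for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if quality(table, subset) == full:
                found.append(subset)
    return found


# Hypothetical decision table: (evaluations, class), higher class index = better class
table = [
    ({"income": 3, "employment": 2, "savings": 1}, 3),
    ({"income": 2, "employment": 2, "savings": 3}, 2),
    ({"income": 1, "employment": 1, "savings": 2}, 1),
]
criteria = ("income", "employment", "savings")
result = reducts(table, criteria)
```

In this toy table the single criterion income already preserves full consistency, whereas employment or savings alone introduce violations and lower the quality.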
Finally, in this paper, we consider the induction of decision rules from second-order rough approximations. It is desirable that a set of decision rules be complete. This means that the classification procedure based on the induced set of decision rules restores the original assignments of reference objects given by the DM. More precisely, consistent objects should be reassigned to exactly the same classes, and inconsistent objects should be reclassified to unions of classes that reflect the inconsistencies. It is also required that the rules remain true when reference objects are precisiated. This last requirement can be seen as a specific extension of the generalized precisiation property. We show that the methodology introduced in this paper satisfies these two requirements, i.e., it is able to deliver a complete set of rules that remain true after precisiation of data.
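The requirement that rules remain true after precisiation can be illustrated with a small sketch. It assumes "at least" rules on gain-type criteria and interval evaluations stored as (low, high) pairs, and it treats an object as covered only when the lower end of each interval meets the threshold. This is one plausible reading chosen for illustration, not the rule syntax defined in the paper; all names and values are hypothetical.

```python
def certainly_covers(conditions, obj):
    """True if every value the object may take satisfies all "at least" conditions.

    `conditions` maps criterion -> threshold; `obj` maps criterion -> (low, high).
    """
    return all(obj[q][0] >= threshold for q, threshold in conditions.items())


def precisiate(obj, criterion, low, high):
    """Return a copy of obj whose interval on `criterion` is narrowed to [low, high]."""
    old_low, old_high = obj[criterion]
    if not (old_low <= low <= high <= old_high):
        raise ValueError("a precisiation must stay inside the original interval")
    return {**obj, criterion: (low, high)}


# Hypothetical rule: if income >= 2 and employment >= 3, then class at least "medium"
rule = {"income": 2, "employment": 3}
applicant = {"income": (2, 4), "employment": (3, 3)}
narrowed = precisiate(applicant, "income", 3, 3)
```

Because coverage is tested on the lower end of each interval, any narrowing of the applicant's evaluations keeps the applicant covered, so the rule cannot be invalidated by more precise data about that object.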
The methodology presented here generalizes and improves the authors' proposal given in (Dembczyński et al., 2002) concerning hierarchical multiple attribute and multiple criteria classification. While assignments of objects were considered univocal in that study, the evaluations on particular criteria were possibly imprecise (in the form of intervals). Some elements of the present generalization have already been reported in (Dembczyński et al., 2003). This paper is a revised and extended version of (Dembczyński et al., 2005), containing additional sections on the precisiation property, reduction of criteria and induction of decision rules.
The article is organized in the following way. Section 2 contains the problem statement and basic definitions. Section 3 describes several types of dominance relations used further in definitions of rough approximations. In Section 4, the dominance principle is extended to the considered case. Section 5 presents definitions of decision and condition granules. Section 6 describes the generalization of DRSA into second-order rough approximations. Section 7 discusses the precisiation property and its generalization. Quality of approximation and reduction of criteria are presented in Section 8. Section 9 presents a decision rule model induced from second-order rough approximations. The last section concludes the paper.
Section snippets
Problem statement
Multiple criteria classification consists in an assignment of objects from a set A to pre-defined decision classes $Cl_t$, $t \in T = \{1, \ldots, n\}$. It is assumed that the classes are preference-ordered according to an increasing order of class indices, i.e. for all $r, s \in T$ such that $r > s$, the objects from $Cl_r$ are strictly preferred to the objects from $Cl_s$. The objects are described by condition criteria, i.e. attributes with preference-ordered value sets.
The preference model is induced from classification
Dominance relations
Within the basic DRSA, the notions of weak preference relation and P-dominance relation are defined as follows. For any objects x, y and any criterion q, $x \succeq_q y$ means that x is at least as good as (is weakly preferred to) y with respect to criterion q. Moreover, taking into account more than one criterion, we say that x dominates y with respect to a set of criteria P (shortly x P-dominates y), if $x \succeq_q y$ for all $q \in P$. The weak preference relation $\succeq_q$ is supposed to be a complete preorder, i.e. complete, reflexive, transitive, and
Dominance principle
In decision analysis, the dominance principle requires that an object having not worse evaluations on condition criteria than another object should be assigned to a class not worse than the other object. Formally, the dominance principle can be expressed for as follows:In other words, if x P-dominates y (x is not worse than y with respect to all criteria from P), then the assignment of x to a decision class should be not worse than the assignment of y. This also
Decision and condition granules
Granules of information concern objects from U and are induced by dominance relations defined above. We consider two types of granules: decision granules defined on decision criterion and condition granules defined on condition criteria. Definition 4 Decision granules are defined as follows:and referred to as lower-end d-dominating sets, upper-end d-dominated sets, possible d-dominating sets and possible d-dominated sets,
Dominance-based rough approximations
Dominance-based rough approximations in the case of imprecise (interval) evaluations and assignments are derived from the generalized dominance principle.
From Definition 2, we have that is P-consistent ifIt is easy to see thatandLet , and , and . Inserting s and t into (8), (9), one obtainsExpression (10) can be treated, analogously to
Precisiation property
We define precisiation of data as any new information about reference objects that is delivered to decision table. This can be either a new attribute (criterion) or an information confining an interval evaluation or assignment of a reference object. In rough set approaches, it is usually required that more precise information about objects should not decrease lower approximations of decision classes. In the basic rough set approaches, this property corresponds to the situation where new
Quality of approximation and reduction of criteria
In rough set theory, quality of approximation measures how inconsistent the decision table is. In the simplest case, this is the ratio of consistent objects to all objects in the decision table. In the context of multiple criteria classification with imprecise (interval) assignments and evaluations, this definition takes the following form. Definition 15 Quality of P-approximation, for , is defined as
In the definition, is the set of totally P
Decision rule model
Decision rule is a logical expression in the form: if [condition], then [decision], where the condition part (or antecedent, or if-part) is a conjunction of elementary conditions, and the decision part (or consequent, or then-part) contains a suggestion of an assignment. An object that satisfies (or is covered by) the condition part of the rule is assigned to decision classes suggested by the decision part of this rule. Decision rules induced from the set of reference objects U can explain DM’s
Conclusions
The presented results extend the basic DRSA and allow the analysis of decision tables with imprecise (interval) evaluations of objects on condition criteria, as well as imprecise (interval) assignments of objects to decision classes. To solve this problem, we introduced specific definitions of dominance relations and revised the dominance principle accordingly. This led us to a definition of new decision and condition granules used for second-order rough approximations. It is worth underlining that
Acknowledgement
The first and the third author wish to acknowledge financial support from the Polish Ministry of Education and Science.
References (25)
- Düntsch et al. (2001). Relational attribute systems. International Journal of Human-Computer Studies.
- Greco et al. (1999). Rough approximation of a preference relation by dominance relations. European Journal of Operational Research.
- Greco et al. (2001). Rough sets theory for multicriteria decision analysis. European Journal of Operational Research.
- Greco et al. (2002). Rough sets methodology for sorting problems in presence of multiple attributes and criteria. European Journal of Operational Research.
- Greco et al. (2004). Axiomatic characterization of a general utility function and its particular cases in terms of conjoint measurement and rough-set decision rules. European Journal of Operational Research.
- Kryszkiewicz (1998). Rough set approach to incomplete information systems. Information Sciences.
- Pawlak and Skowron (2007). Rough sets: Some extensions. Information Sciences.
- Pawlak and Skowron (2007). Rudiments of rough sets. Information Sciences.
- Dembczyński, K., Greco, S., Słowiński, R., 2003. Dominance-based rough set approach to multicriteria classification...
- Dembczyński et al. (2002). Methodology of rough-set-based classification and sorting with hierarchical structure of attributes and criteria. Control & Cybernetics.
- Second-order rough approximations in multi-criteria classification with imprecise evaluations and assignments
- Quality of rough approximation in multi-criteria classification problems