Decision Support
Rough set-based logics for multicriteria decision analysis

https://doi.org/10.1016/j.ejor.2006.08.029

Abstract

In this paper, we propose some decision logic languages for rule representation in rough set-based multicriteria analysis. The semantic models of these logics are data tables, each of which comprises a finite set of objects described by a finite set of criteria/attributes. The domains of the criteria may have ordinal properties expressing preference scales, while the domains of the attributes may not. The validity, support, and confidence of a rule are defined via its satisfaction in the data table.

Introduction

The theory of knowledge has long been an important topic in many academic disciplines, such as philosophy, psychology, economics, and artificial intelligence, whereas the storage and retrieval of data is the main concern of information science. In modern experimental science, knowledge is usually acquired from observed data, which is a valuable resource for researchers and decision-makers. However, when the amount of data is large, it is difficult to analyze the data and extract knowledge from it. With the aid of computers, the vast amount of data stored in relational data tables can be transformed into symbolic knowledge automatically. Thus, intelligent data analysis has received a great deal of attention in recent years.

While data mining research concentrates on the design of efficient algorithms for extracting knowledge from data, bridging the semantic gap between structured data and human-comprehensible concepts has been a long-standing challenge for the research community. Kruse et al. (1999) called this the interpretability problem of intelligent data analysis. Since discovered knowledge is only useful to human users when they can understand its meaning, the knowledge representation formalism plays an important role in the utilization of the induced rules. A good representation formalism should have clear semantics so that a rule can be effectively validated with respect to the given data tables. In this regard, logic is one of the best choices. As noted by Zadeh (1996), humans usually compute with words instead of numbers, so if we can incorporate linguistically meaningful terms into the representation formalism, the induced rules may be more useful to human decision-makers.

The rough set theory proposed by Pawlak (1982) provides an effective tool for extracting knowledge from data tables. To represent and reason about the extracted knowledge, a decision logic (DL) was proposed in Pawlak (1991). The semantics of the logic is defined in a Tarskian style through the notions of models and satisfaction. While DL can be considered an example of classical logic in the context of data tables, different generalizations of DL corresponding to some non-classical logics are also desirable from the viewpoint of knowledge representation. For example, to deal with uncertain or incomplete information, some generalized decision logics have been proposed (Fan et al., 2001, Liau and Liu, 1999, Liau and Liu, 2001, Yao and Liau, 2002, Yao and Liu, 1999).

When rough set theory is applied to multi-criteria decision analysis (MCDA), it is crucial that preference-ordered attribute domains and decision classes be dealt with (Greco et al., 1997, Greco et al., 1998, Greco et al., 1999a, Greco et al., 2000, Greco et al., 2001a, Greco et al., 2002, Greco et al., 2004, Slowinski et al., 2002b). The original rough set theory cannot handle inconsistencies arising from violations of the dominance principle due to its use of the indiscernibility relation. Therefore, in the above-mentioned works, the indiscernibility relation is replaced by a dominance relation to solve the multi-criteria sorting problem, and the data table is replaced by a pairwise comparison table to solve multi-criteria choice and ranking problems. The approach is called the dominance-based rough set approach (DRSA). For MCDA problems, DRSA can induce a set of decision rules from sample decisions provided by decision-makers. The induced rules form a comprehensive preference model and can provide recommendations about a new decision-making environment.

A strong assumption about data tables is that each object takes exactly one value with respect to an attribute. However, in practice, we may only have incomplete information about the values of an object’s attributes. Thus, more general data tables and decision logics are needed to represent and reason about incomplete information. For example, set-valued and interval set-valued data tables have been introduced to represent incomplete information (Kryszkiewicz, 1998, Kryszkiewicz and Rybiński, 1996a, Kryszkiewicz and Rybiński, 1996b, Lipski, 1981, Yao and Liu, 1999). A generalized decision logic based on interval set-valued data tables is also proposed in Yao and Liu (1999). In these formalisms, the attribute values of an object may be a subset or an interval set in the domain. Since crisp subsets and interval sets are both special cases of fuzzy sets, further generalization of data tables is desirable to represent uncertain information. In data tables containing such information, an object can take a fuzzy subset of values for each attribute. To represent knowledge induced from uncertain data tables, the decision logic also needs to be generalized.
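As an informal illustration of the set-valued case described above, the following Python sketch stores, for each object, the set of values it might take for an attribute. The table contents and the helper names `certainly` and `possibly` are hypothetical, and the two readings of a query (all possible values match versus at least one possible value matches) are a common convention for incomplete information that we assume here rather than take from the cited works.

```python
# Hypothetical set-valued data table: each entry is the set of values the
# object might take; a singleton means the value is known precisely.
table = {
    "x1": {"language": {"English"}},
    "x2": {"language": {"English", "French"}},
    "x3": {"language": {"German"}},
}


def certainly(attr, values):
    """Objects whose every possible value for attr lies in values."""
    return {x for x, row in table.items() if row[attr] <= values}


def possibly(attr, values):
    """Objects with at least one possible value for attr in values."""
    return {x for x, row in table.items() if row[attr] & values}


print(certainly("language", {"English"}))  # only x1 qualifies
print(possibly("language", {"English"}))   # x1 and x2 qualify
```

A classical data table is recovered when every entry is a singleton, in which case the two readings coincide.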

DRSA has also been extended to deal with missing values in MCDA problems (Greco et al., 2001a, Slowinski et al., 2002b). A data table with missing values is a special case of uncertain data tables. Therefore, we propose further extending DRSA to uncertain data tables and fuzzy data tables. In this paper, we present a logical treatment of DRSA in precise data tables, as well as uncertain and fuzzy data tables. Our approach is concerned with variants of DL for data tables.

The remainder of the paper is organized as follows. In Section 2, we review the decision logic proposed by Pawlak. In Sections 3, 4, 5, and 6, we respectively present generalized DL for preference-ordered data tables, preference-ordered uncertain data tables, preference-ordered fuzzy data tables, and pairwise comparison tables. For each logic, the syntax and semantics are described, and some quantitative measures for the rules of the logics are defined. Finally, in Section 7, we discuss the main contribution of this paper and indicate the direction of future research.


Classical data tables

In data mining problems, data is usually provided in the form of data tables (DT). A formal definition of a data table is given in Pawlak (1991).

Definition 1

A data table is a tuple $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$, where $U$ is a non-empty finite set, called the universe; $A$ is a non-empty finite set of primitive attributes; for each $i \in A$, $V_i$ is the domain of values for $i$; and for each $i \in A$, $f_i : U \to V_i$ is a total function.
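To make the definition concrete, the following Python sketch builds a small data table and checks the totality condition on each attribute function. All objects, attributes, and values are invented for illustration. The helper `support_and_confidence` uses one common reading of these measures (the number of objects satisfying both sides of a rule, and that number divided by the number satisfying the condition); this is an assumption on our part and may differ in detail from the definitions adopted for the logics in this paper.

```python
from typing import Dict, Hashable, List, Set

# Hypothetical toy data table T = (U, A, {V_i}, {f_i}); all values are made up.
U: List[str] = ["x1", "x2", "x3", "x4"]
A: List[str] = ["colour", "size"]
V: Dict[str, Set[Hashable]] = {
    "colour": {"red", "blue"},
    "size": {"small", "large"},
}
# Each f_i is stored as a dict from objects to values.
f: Dict[str, Dict[str, Hashable]] = {
    "colour": {"x1": "red", "x2": "red", "x3": "blue", "x4": "red"},
    "size": {"x1": "small", "x2": "large", "x3": "large", "x4": "small"},
}


def is_data_table(U, A, V, f) -> bool:
    """Check that U and A are non-empty and each f_i is total from U into V_i."""
    if not U or not A:
        return False
    return all(
        set(f[i]) == set(U) and all(f[i][x] in V[i] for x in U)
        for i in A
    )


def meaning(i, v) -> Set[str]:
    """Set of objects satisfying the descriptor (i, v), i.e. f_i(x) = v."""
    return {x for x in U if f[i][x] == v}


def support_and_confidence(cond, dec):
    """Support and confidence of the rule cond -> dec (assumed definitions)."""
    cond_set = meaning(*cond)
    both = cond_set & meaning(*dec)
    confidence = len(both) / len(cond_set) if cond_set else 0.0
    return len(both), confidence


print(is_data_table(U, A, V, f))  # True
print(support_and_confidence(("colour", "red"), ("size", "small")))
```

In this toy table, three objects are red and two of them are small, so the rule "colour is red implies size is small" has support 2 and confidence 2/3.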

Given a

Preference-ordered data tables

For MCDA problems, each object in a data table or decision table can be seen as a sample decision, and each condition attribute is a criterion for the decision. Since the domain of values of a criterion is usually ordered according to the decision-maker’s preferences, we define a preference-ordered data table (PODT) as a tuple $T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f_i \mid i \in A\})$, where $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ is a classical data table; and for each $i \in A$, $\succeq_i \subseteq V_i \times V_i$ is a binary relation over $V_i$. The relation $\succeq_i$ is
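The role of the preference relations can be illustrated with a short Python sketch that tests dominance between objects and lists violations of the dominance principle. The table, the criterion names, and the choice of numeric scales on which larger values are preferred are all assumptions made for this example; the definition above allows any binary relation on each domain.

```python
from itertools import permutations

# Hypothetical PODT: two criteria and one decision, all on numeric scales
# where larger is better (an assumption made only for this sketch).
criteria = ["math", "physics"]
table = {
    "s1": {"math": 3, "physics": 3, "grade": 2},
    "s2": {"math": 2, "physics": 2, "grade": 3},
    "s3": {"math": 1, "physics": 2, "grade": 1},
}


def weakly_preferred(i, v, w):
    """v is at least as good as w on criterion i; here simply numeric >=."""
    return v >= w


def dominates(x, y, crits=criteria):
    """x is at least as good as y on every criterion in crits."""
    return all(weakly_preferred(i, table[x][i], table[y][i]) for i in crits)


def dominance_violations(decision="grade"):
    """Pairs (x, y) where x dominates y but receives a strictly worse decision."""
    return [
        (x, y)
        for x, y in permutations(table, 2)
        if dominates(x, y) and table[x][decision] < table[y][decision]
    ]


print(dominance_violations())  # [('s1', 's2')]
```

Here s1 is at least as good as s2 on both criteria yet is assigned a worse grade. An indiscernibility-based analysis would not flag this pair, since the two objects differ on every criterion.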

Preference-ordered uncertain data tables

PODL is suitable for the representation of rules induced from a PODT. However, the latter inherits the restriction of classical DT so that uncertain information cannot be represented. An uncertain data table is a generalization of DT such that the values of some or all attributes are imprecise (Fan et al., 2001, Dembczynski et al., 2002). An analogous generalization can be applied to PODT to define preference-ordered uncertain data tables (POUDT). Formally, a POUDT is a tuple $T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}$

Preference-ordered fuzzy data tables

The preference-ordered fuzzy data table (POFDT) is a further generalization of POUDT. An approach for dealing with fuzzy information in PODT has been proposed in Greco et al. (1999b). In this section, we propose an alternative based on our logical formalism. For any domain $V$, let $NF(V)$ denote the set of all normalized fuzzy subsets of $V$. Recall that a fuzzy subset of domain $V$ is normalized if $\sup_{x \in V} \mu(x) = 1$, where $\mu$ is the membership function of the fuzzy subset. A POFDT is a tuple $T = (U, A, \{(V_i, \succeq_i) \mid$
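The normalization condition can be checked directly when the domain is finite, because the supremum is then attained as a maximum. The Python sketch below is a minimal illustration under that finiteness assumption; representing a membership function as a dictionary, and the names `is_normalized` and `crisp`, are our own choices for the example.

```python
import math


def is_normalized(mu):
    """Check sup mu(x) = 1 for a fuzzy subset of a finite domain.

    mu maps each domain element to a membership degree in [0, 1].
    On a finite domain the supremum equals the maximum.
    """
    if not mu:
        return False
    if any(not 0.0 <= degree <= 1.0 for degree in mu.values()):
        raise ValueError("membership degrees must lie in [0, 1]")
    return math.isclose(max(mu.values()), 1.0)


def crisp(subset, domain):
    """View an ordinary subset of the domain as a fuzzy subset."""
    return {v: 1.0 if v in subset else 0.0 for v in domain}


domain = ["low", "medium", "high"]
print(is_normalized({"low": 0.2, "medium": 1.0, "high": 0.5}))  # True
print(is_normalized({"low": 0.2, "medium": 0.7, "high": 0.0}))  # False
print(is_normalized(crisp({"high"}, domain)))                    # True
```

The last call shows that any non-empty crisp subset is normalized when viewed as a fuzzy subset.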

Pairwise comparison decision logic

Greco et al. (1997, 1998, 1999a) proposed the pairwise comparison table (PCT) to handle multicriteria choice or ranking problems. In a PCT, the strength of preferences between objects, instead of the evaluation scores of objects, is stored with respect to each criterion. Formally, a PCT is a tuple $T = (U, A, \{H_i \mid i \in A\}, \{f_i \mid i \in A\})$, where $U$ and $A$ are defined as above; and for each $i \in A$, $H_i$ is a finite set of integers, and $f_i : U \times U \to H_i$ encodes the preferential information
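A PCT can be pictured with the following Python sketch, in which each attribute function is a dictionary keyed by ordered pairs of objects. The objects, criteria, and scores are invented, and deriving the preference strength as a score difference clipped to the range -2 to 2 is only one hypothetical way of filling such a table; in practice the strengths may be elicited directly from the decision-maker.

```python
from itertools import product

# Hypothetical universe, criteria, and preference-strength scales H_i.
U = ["a", "b", "c"]
A = ["price", "comfort"]
H = {i: set(range(-2, 3)) for i in A}  # {-2, -1, 0, 1, 2}

# Made-up evaluation scores, used here only to generate the comparisons.
scores = {
    "a": {"price": 5, "comfort": 1},
    "b": {"price": 3, "comfort": 2},
    "c": {"price": 1, "comfort": 4},
}


def strength(i, x, y):
    """Preference strength of x over y on criterion i: clipped score difference."""
    diff = scores[x][i] - scores[y][i]
    return max(-2, min(2, diff))


# f_i : U x U -> H_i, stored as one dict per criterion.
pct = {
    i: {(x, y): strength(i, x, y) for x, y in product(U, repeat=2)}
    for i in A
}

print(pct["price"][("a", "c")])    # 2
print(pct["comfort"][("a", "c")])  # -2
print(pct["price"][("b", "b")])    # 0
```

Each function in `pct` is total on the pairs of objects and takes values in the corresponding scale, as the definition requires.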

Conclusions

In this paper, we present some logics that are useful in the representation of rules induced from preference-ordered data tables. Such data tables are commonly used in MCDA. The main advantage of using logic is that its syntax and semantics are precise. As DL is a precise way to represent decision rules induced from classical data tables, we use PODL and PCDL to reformulate the decision rules induced from PODT and PCDT in DRSA, respectively. Though this seems a trivial step, it maps the decision

Acknowledgement

We would like to thank two anonymous referees for their helpful comments.

References

  • S. Greco et al. The use of rough sets and fuzzy sets in MCDM.
  • S. Greco et al. Extension of the rough set approach to multicriteria decision support. INFOR Journal: Information Systems and Operational Research (2000).
  • S. Greco et al. Variable consistency model of dominance-based rough set approach.
  • P. Hájek. Metamathematics of Fuzzy Logic (1998).
  • R. Kruse et al. Fuzzy data analysis: Challenges and perspectives.
  • M. Kryszkiewicz. Properties of incomplete information systems in the framework of rough sets.
