Elsevier

Information Sciences

Volumes 346–347, 10 June 2016, Pages 442-462

Rough-set concept analysis: Interpreting RS-definable concepts based on ideas from formal concept analysis

https://doi.org/10.1016/j.ins.2016.01.091

Abstract

Based on ideas from formal concept analysis, this paper interprets the notions of RS-definable concepts (i.e., rough-set definable concepts) and the Boolean algebra of RS-definable concepts. We explicitly represent an RS-definable concept as a pair of an extension and an intension, where the extension is a set of objects and the intension is a family of sets of attribute-value pairs called avp-sets. An object in the extension satisfies at least one avp-set in the intension, and each avp-set in the intension is satisfied only by objects in the extension. The two-directional connections produce an atomic Boolean algebra of RS-definable concepts, corresponding to the lattice of formal concepts in formal concept analysis. The Boolean algebra of RS-definable concepts is used to define and interpret a subset of objects through a pair of lower and upper approximations. The new formulation emphasizes an in-depth conceptual understanding of rough-set concept analysis.

Introduction

There are many views and theories for studying concepts [35], [36], [52]. Following ideas from the Port-Royal Logic of Arnauld and Nicole [1], [3], one may understand a concept jointly as a pair of an intension (or comprehension) and an extension (or denotation). The intension of a concept consists of all properties or attributes (more generally, some formulas of a language) that are valid for all those objects to which the concept applies. The extension of a concept is the set of objects that are instances of the concept. The intension–extension view serves as a basis for studying data analysis, concept formation, and concept learning [52].

There are two crucial tasks in concept formation and concept learning. One task is to find a good description (i.e., intension) of a concept extensionally described by a set of instances. Ideally, there is an inverse relationship between the intension and the extension of a concept. The other task is to derive relationships between concepts based on the corresponding relationships between their extensions and/or intensions. For concept formation and concept learning, we need to consider several issues. One issue is to design a scheme for representing and interpreting concepts. Different schemes may lead to different descriptions of concepts, for example, conjunctive concepts versus disjunctive concepts [23]. A related issue is the selection of a set of meaningful basic or elementary concepts, from which other concepts can be expressed and interpreted. Another issue is to build strategies for searching a concept space efficiently [40].

With respect to the intension–extension view of concepts, formal concept analysis (FCA), proposed by Wille [55], and rough set analysis (RSA), proposed by Pawlak [43], are two related theories for concept analysis based on data in a tabular form. In a data table, rows represent a set of objects, columns represent a set of attributes or properties, and each cell, if not empty, represents the value of an object on an attribute. A data table is a context within which concepts are defined and interpreted. A table in which each attribute takes only one value is called a (one-valued) formal context in FCA. One can also conveniently represent a formal context as a binary relation between the set of objects and the set of attributes. When each attribute can take more than one value, the table is called a many-valued context in FCA and a data table in RSA. Although an arbitrary subset of objects in a data table may be viewed as the extension of a concept, it may not always be possible to find an intension, with some specified characteristics, for this subset of objects. FCA and RSA share a common interest in subsets of objects that are intensionally definable, but they differ in their choices in forming intensions, leading to two different families of concepts [65].

Two fundamental notions of FCA are formal concepts and the concept lattice of all formal concepts. A formal concept is a pair of a set of objects and a set of attributes that mutually define each other. The family of all formal concepts derived from a formal context is a lattice, showing a hierarchical organization of concepts. One important feature of FCA is an explicit representation of a concept in terms of an extent and an intent. Two special families of formal concepts, called object concepts and attribute concepts, respectively, can be used to construct other concepts through set-intersection. In some sense, FCA focuses on conjunctive concepts.
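As an illustration of how an extent and an intent mutually define each other, the following minimal Python sketch enumerates the formal concepts of a small one-valued context. The context, the attribute names, and the helper functions `common_attributes`, `common_objects`, and `formal_concepts` are hypothetical and chosen only for this example; the brute-force enumeration over attribute subsets is a convenience for tiny contexts, not an algorithm taken from the FCA literature cited here.

```python
from itertools import combinations

# Hypothetical one-valued formal context: object -> set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
ATTRIBUTES = {"a", "b", "c"}


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return result


def common_objects(attributes):
    """Objects that possess every attribute in `attributes`."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for subset in combinations(sorted(ATTRIBUTES), size):
            extent = common_objects(set(subset))
            intent = common_attributes(extent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

For this toy context the sketch yields four concepts. Each extent is exactly the set of objects sharing the intent, and each intent is exactly the set of attributes shared by the extent.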

Corresponding to the two notions of FCA, RSA introduces the notions of definable sets, called rough-set definable sets or RS-definable sets in this paper, and an atomic Boolean algebra of all RS-definable sets. An RS-definable set is a subset of objects that can be described by a formula in a description language in a data table [34], [67]. The Boolean algebra gives a hierarchical organization of RS-definable sets. A set of elementary RS-definable sets, i.e., the set of atoms of the Boolean algebra, serves as the building blocks for constructing all RS-definable sets through set union. Elementary RS-definable sets correspond to conjunctive concepts, while RS-definable sets correspond to disjunctions of conjunctive concepts. Thus, RSA covers both conjunctive and disjunctive concepts. RSA introduces an additional notion of approximations. An RS-undefinable set is approximated by a pair of RS-definable sets.
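The construction of RS-definable sets from atoms can be sketched in a few lines of Python. The data table and the function names `elementary_sets` and `definable_sets` are assumptions made for illustration. The atoms are computed in the standard Pawlak manner, by grouping objects that have identical values on all attributes.

```python
from itertools import combinations

# Hypothetical data table: object -> {attribute: value}.
TABLE = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "red", "size": "large"},
    "o4": {"color": "blue", "size": "large"},
}


def elementary_sets(table):
    """Group objects with identical descriptions; each group is one atom."""
    blocks = {}
    for obj, row in table.items():
        key = tuple(sorted(row.items()))
        blocks.setdefault(key, set()).add(obj)
    return [frozenset(block) for block in blocks.values()]


def definable_sets(table):
    """All unions of atoms, including the empty union."""
    atoms = elementary_sets(table)
    result = set()
    for size in range(len(atoms) + 1):
        for chosen in combinations(atoms, size):
            result.add(frozenset().union(*chosen))
    return result


print(len(elementary_sets(TABLE)), len(definable_sets(TABLE)))
```

With three atoms the toy table has 2^3 = 8 RS-definable sets. The singleton {o1} is not among them, because o1 and o2 cannot be distinguished by their attribute values.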

Many proposals have been made to integrate FCA and RSA in order to take advantage of the two theories and to enrich each other. These studies reveal close connections between the two theories [5], [7], [17], [26], [44], [49], [58], [60], [66], introduce new concept lattices for RSA [11], [12], [25], [29], [59], [63], [65], produce rough-set-like approximations for FCA [12], [22], [60], [69], give rise to FCA interpretations of rough data tables [17] and FCA understanding of definable sets [16], and suggest models of fuzzy and/or rough FCA [2], [47].

From the viewpoint of FCA, a review of literature of RSA shows that two important issues remain to be addressed thoroughly. The first issue is related to the fact that RSA typically equates a subset of objects to a concept, focusing only on extensions of concepts without an explicit reference to intensions. The second issue is that, when intensions are explicitly used, RSA uses a one-directional inference from an intension to an extension, but not from an extension to an intension. It is therefore necessary in RSA to design an appropriate, operable and explicit representation of intensions of RS-definable sets, as well as two-directional inferences between the extension and the intension. By explicitly considering together extension and intension, we make a shift from RS-definable sets into RS-definable concepts. This leads to a new interpretation of rough-set concept analysis or rough concept analysis (RCA). Consequently, the approximations of a subset of objects by a pair of RS-definable sets become the approximations of a concept by a pair of RS-definable concepts. This new reformulated notion of approximations is semantically superior for interpreting approximations and deriving decision/classification rules from approximations.

To explicitly consider intensions of concepts in RSA, several studies used modal-style data operators [11], [12], [61] to define new concept lattices for RSA [11], [59], [65], including object-oriented concept lattices [64], [65], property-oriented concept lattices [11], and lower and upper concept lattices [59]. One difficulty with these studies is that most of them are restricted to a binary table or a one-valued formal context. To resolve this difficulty, one may transform the table into a formal context by applying a conceptual scaling technique introduced by Ganter and Wille [18], [19], or, more generally, logical scaling introduced by Prediger [48] and techniques of logical concept analysis proposed by Ferré and Ridoux [14]. Alternatively, one may apply a more general notion of pattern structures introduced by Ganter and Kuznetsov [15]. Several authors adopted FCA approaches to RSA with conceptual scaling. Jiang et al. [24] and Kang et al. [25] studied new formal concept lattices in RSA based on different conceptual scalings. Ganter and Meschke [17] used the interordinal scaling to transform a rough table (i.e., a summarization of a data table) into a formal context. The concept lattices produced in these studies do not produce the same Boolean algebra used in RSA.

To derive the Boolean algebra of RSA, Düntsch and Gediga [12] suggested a simple transformation method for interpreting rough set approximations, which was also used by Wolski [59]. In their studies, an attribute of the derived formal context is a tuple, or a vector, of values of an object on the set of attributes in the original table. A vector is interpreted as a conjunctive combination of attribute values, which corresponds to a minimal conjunctive concept. By using a family of minimal conjunctive concepts, the formulation has an advantage of simplicity, but does not fully explore the rich structures given by all conjunctive subconcepts of a concept.

The main objective of this paper is to introduce a new interpretation of RS-definable concepts for rough-set concept analysis (RCA) by drawing two fundamental ideas from formal concept analysis (FCA): (a) a set-theoretic interpretation of a concept as “a unity of extension and intension” [19], and (b) conceptual scaling for transforming a data table into a one-valued formal context [18], [19]. Motivated by ideas from logical scaling proposed by Prediger [48], logical concept analysis proposed by Ferré and Ridoux [14], and pattern structures proposed by Ganter and Kuznetsov [15], we develop a scaling method by adopting ideas from the rough-set based learning system LERS developed by Grzymala-Busse [20], [21], which is related to, but different from, a scaling method used by Düntsch and Gediga [12]. We use a family of subsets of attribute-value pairs, called avp-sets, as the set of attributes in the derived formal context. Instead of using the standard derivation operators of FCA, we use a pair of lower and upper intension operators/mappings and a pair of lower and upper extension operators/mappings [5], [7], [9], [11], [64]. A constructive definition of RS-definable concepts is given by applying these operators. We show that the family of all RS-definable concepts is an atomic Boolean algebra. If one only considers extensions of all RS-definable concepts, one would have the atomic Boolean algebra constructed from a partition of an equivalence relation. Finally, we introduce a reformulation of rough set approximations of a set of objects by using a pair of RS-definable concepts, with an explicit reference to the intensions of the pair of RS-definable concepts.

Our articulation of rough-set concept analysis aims at explaining existing results for a conceptual understanding of rough set theory [68]. The family of avp-sets, used to represent the intension of a concept, contains redundancy and the formulation is therefore not intended to be a computational model. At the computational level, one may still use the standard formulation with a partition of the universe of objects, without explicitly referring to the intensions of RS-definable concepts in the process of computation. Intensions of concepts may be brought in later, after the computation.

Regarding the terminology used in the paper, we use concepts, extension and intension in a general discussion of concepts; we use formal concepts, extent and intent for FCA; we use RS-definable concepts, extension and intension for RCA. The rest of the paper is organized as follows. Section 2 recalls basic notions of formal concepts and conceptual scaling. Section 3 introduces a new scaling method for transforming a data table into a one-valued formal context. Sections 4 and 5 investigate, respectively, RS-definable concepts and the structure of the set of all RS-definable concepts. Section 6 examines the relationships between our formulation and the standard Pawlak formulation. Section 7 introduces rough-set approximations of concepts.

Section snippets

Formal concepts and conceptual scaling

We review two basic notions of formal concepts and conceptual scaling [19], [55]. The ideas of the two notions are used in the rest of this paper to formulate an interpretation of rough-set concept analysis.

Transforming a data table into a one-valued formal context

We introduce a scaling method for transforming a data table into a one-valued formal context. In the derived formal context, each “derived attribute” is a set of attribute-value pairs of the original data table.
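The idea of using sets of attribute-value pairs as derived attributes can be illustrated with a small Python sketch. It rests on assumptions that this snippet does not confirm: the family of avp-sets is taken to be every nonempty set of attribute-value pairs that assigns at most one value to an attribute, and an object satisfies an avp-set when it takes every listed value. The table and the names `avp_sets`, `satisfies`, and `derived_context` are hypothetical.

```python
from itertools import combinations, product

# Hypothetical data table: object -> {attribute: value}.
TABLE = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "red", "size": "large"},
    "o4": {"color": "blue", "size": "large"},
}


def avp_sets(table):
    """Assumed family: one value per attribute over each nonempty attribute subset."""
    attributes = sorted({a for row in table.values() for a in row})
    values = {a: sorted({row[a] for row in table.values()}) for a in attributes}
    family = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            for choice in product(*(values[a] for a in subset)):
                family.append(frozenset(zip(subset, choice)))
    return family


def satisfies(row, avp_set):
    """An object satisfies an avp-set if it takes every listed value."""
    return all(row.get(a) == v for a, v in avp_set)


def derived_context(table):
    """One-valued context: object -> set of avp-sets the object satisfies."""
    family = avp_sets(table)
    return {
        obj: {p for p in family if satisfies(row, p)}
        for obj, row in table.items()
    }


for obj, derived in derived_context(TABLE).items():
    print(obj, sorted(sorted(p) for p in derived))
```

In the toy table there are eight candidate avp-sets, and each object satisfies three of them: one per single attribute value and one for its full description.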

Rough-set definable concepts in a data table

For rough-set concept analysis, we introduce the notion of a rough-set definable concept, or RS-definable concept, as a pair of an extension and an intension that mutually define each other, where the extension is a set of objects and the intension is a set of avp-sets.
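The two conditions stated in the abstract (every object in the extension satisfies at least one avp-set in the intension, and every avp-set in the intension is satisfied only by objects in the extension) can be checked mechanically. The Python sketch below tests only these two conditions; it is not the constructive definition based on intension and extension operators, which this snippet does not reproduce. The table and the function names are hypothetical.

```python
# Hypothetical data table: object -> {attribute: value}.
TABLE = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "red", "size": "large"},
    "o4": {"color": "blue", "size": "large"},
}


def satisfies(row, avp_set):
    """An object satisfies an avp-set if it takes every listed value."""
    return all(row.get(a) == v for a, v in avp_set)


def meets_abstract_conditions(table, extension, intension):
    """Check the two conditions on an (extension, intension) pair."""
    # Condition 1: each object in the extension satisfies some avp-set.
    covered = all(
        any(satisfies(table[obj], p) for p in intension) for obj in extension
    )
    # Condition 2: each avp-set is satisfied only by objects in the extension.
    sound = all(
        obj in extension
        for p in intension
        for obj, row in table.items()
        if satisfies(row, p)
    )
    return covered and sound


red = frozenset({("color", "red")})
small = frozenset({("size", "small")})
blue = frozenset({("color", "blue")})

print(meets_abstract_conditions(TABLE, {"o1", "o2", "o3"}, {red}))
print(meets_abstract_conditions(TABLE, {"o1"}, {small}))
print(meets_abstract_conditions(TABLE, {"o1", "o2", "o4"}, {small, blue}))
```

The second pair fails because o2 also satisfies the avp-set for small size but lies outside the proposed extension. The third pair shows a disjunctive description: the extension is the union of the objects satisfying either avp-set.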

Structures of the family of all RS-definable concepts

In this section, we investigate sub-families of conjunctively RS-definable concepts and the structure of the family of all RS-definable concepts.

Relations to standard Pawlak formulation

In our formulation, we use the family of conjunctively RS-definable concepts induced by $W$ as the building blocks for constructing all RS-definable concepts. In other words, the extension of an RS-definable concept can be expressed as the union of a family of subsets of $OB$ from the covering $C$ of $OB$ formed by the extensions of the conjunctively RS-definable concepts induced by the avp-sets $p \in W$. Thus, our formulation provides a covering-based framework. In the standard Pawlak formulation, one uses the family of elementary RS-definable sets induced by $W_{\pi} \subseteq W$ as the building

Concept approximations

Concept approximations are another important notion of rough set analysis. We reformulate this notion by explicitly considering the intensions of concepts.
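A minimal Python sketch of approximations that carry an intension alongside each extension is given below. It makes a simplifying assumption: the intension of each approximation is represented only by the full descriptions of the equivalence classes it contains, whereas the family of avp-sets used in the paper is richer. The table and the function name `approximations` are hypothetical.

```python
# Hypothetical data table: object -> {attribute: value}.
TABLE = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "red", "size": "large"},
    "o4": {"color": "blue", "size": "large"},
}


def approximations(table, target):
    """Return ((lower_ext, lower_int), (upper_ext, upper_int)) for `target`."""
    # Group objects by their full description (the equivalence classes).
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(frozenset(row.items()), set()).add(obj)

    lower_ext, lower_int = set(), set()
    upper_ext, upper_int = set(), set()
    for description, block in blocks.items():
        if block <= target:  # class lies entirely inside the target
            lower_ext |= block
            lower_int.add(description)
        if block & target:  # class overlaps the target
            upper_ext |= block
            upper_int.add(description)
    return (lower_ext, lower_int), (upper_ext, upper_int)


(lower_ext, lower_int), (upper_ext, upper_int) = approximations(
    TABLE, {"o1", "o3"}
)
print(sorted(lower_ext), sorted(upper_ext))
```

For the target {o1, o3}, the lower approximation has extension {o3} and the upper approximation has extension {o1, o2, o3}. The accompanying descriptions can be read as the conditions of classification rules.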

Conclusion

In this paper, we have presented an interpretation of rough-set concept analysis in a set-theoretic setting by adopting ideas from formal concept analysis. A data table is transformed into a formal context, in which the scale attributes are avp-sets (i.e., sets of attribute-value pairs). Instead of using the derivation operators of formal concept analysis, a pair of intension operators and another pair of extension operators are used. Corresponding to the notion of formal concepts, we have

Acknowledgements

This work was supported in part by a Discovery Grant from NSERC, Canada. The author is grateful for the detailed and constructive comments from the reviewers. The author thanks Mengjun Hu for assistance in preparing the examples used in the paper.

References (70)

  • L.Y. Yang et al.

    On rough concept lattices

    Electron. Notes Theor. Comput. Sci.

    (2009)
  • Y.Y. Yao

    The two sides of the theory of rough sets

    Knowledge Based Syst.

    (2015)
  • A. Arnauld et al.

    Logic or the Art of Thinking (Jill Vance Buroker, trans.)

    (1996)
  • J. Buroker, Port royal logic, Stanford Encyclopedia of Philosophy,...
  • C. Burgmann et al.

    The basic theorem on preconcept lattices

  • K.J. Cios et al.

    Machine learning algorithms inspired by the work of Ryszard Spencer Michalski

  • D. Ciucci et al.

    The structure of oppositions in rough set theory and formal concept analysis – toward a new bridge between the two settings

  • A.P. Dempster

    Upper and lower probabilities induced by a multivalued mapping

    Ann. Math. Stat.

    (1967)
  • D. Dubois et al.

    A possibility-theoretic view of formal concept analysis

    Fund. Inform.

    (2007)
  • I. Düntsch et al.

    Modal-style operators in qualitative data analysis

    Proceedings of the 2002 IEEE International Conference on Data Mining

    (2002)
  • I. Düntsch et al.

    Approximation operators in qualitative data analysis

  • M. Erné et al.

    A primer on Galois connections

    Ann. N.Y. Acad. Sci.

    (1993)
  • S. Ferré et al.

    A logical generalization of formal concept analysis

  • B. Ganter et al.

    Pattern structures and their projections

  • B. Ganter et al.

    Scale coarsening as feature selection

  • B. Ganter et al.

    A formal concept analysis approach to rough data tables

  • B. Ganter et al.

    Conceptual scaling

  • B. Ganter et al.

    Formal Concept Analysis, Mathematical Foundations

    (1999)
  • J.W. Grzymala-Busse

    LERS – a system for learning from examples based on rough sets

  • J.W. Grzymala-Busse

    A new version of the rule induction system LERS

    Fund. Inform.

    (1997)
  • K. Hu et al.

    Concept approximation in concept lattice

  • E.B. Hunt

    Concept Learning: An Information Processing Problem

    (1962)
  • F. Jiang et al.

    Formal concept analysis in relational database and rough relational database

    Fund. Inform.

    (2007)
  • R.E. Kent

    Rough concept analysis: a synthesis of rough sets and formal concept analysis

    Fund. Inform.

    (1996)
  • S.O. Kuznetsov

    Stability as an estimate of the degree of substantiation of hypotheses derived on the basis of operational similarity

    Automatic Documentation and Mathematical Linguistics (Nauch. Tekh. Inf. Ser. 2)

    (1990)