Elsevier

Information Sciences

Volume 163, Issue 4, 18 June 2004, Pages 253-262

A multi-level conceptual data reduction approach based on the Lukasiewicz implication

https://doi.org/10.1016/j.ins.2003.06.013

Abstract

Starting from fuzzy binary data represented as tables in a fuzzy relational database, we use fuzzy formal concept analysis to reduce the size of each table, keeping only a minimal set of rows without losing knowledge (i.e., the association rules extracted from the reduced database are identical to those of the original at a given precision level). More specifically, we develop a fuzzy extension of a previously proposed algorithm for crisp data reduction without loss of knowledge. The fuzzy Galois connection based on the Lukasiewicz implication is used to define the closure operator with respect to a precision level, which makes the data reduction sensitive to variations of this precision level.

Introduction

As computer use spreads all over the world, data become abundant and easy to find. The major problem is that a search for pertinent information generally returns a huge number of references. More than ever, we need to abstract data automatically. The main objective of data reduction methods is to minimize the size of the data, keeping only significant objects. Unfortunately, most of these methods are based on heuristics and are not accurate. Moreover, reducing fuzzy data is an even more difficult problem, since the handling of imprecision and uncertainty may cause information loss and/or distortion.

In this work, we develop a fuzzy extension of a previously proposed algorithm for crisp data reduction without loss of knowledge [7], [14]. The method is based on fuzzy formal concept analysis, which has recently been developed by several researchers and applied to learning, knowledge acquisition, information retrieval, etc. [1], [10], [11], [12], [13], [16], [18].

The Lukasiewicz-based definition of the fuzzy Galois connection [10] is the main tool used in this work. It allows different precision levels to be considered in the definition of fuzzy formal concepts. Hence, for each precision level, we try to preserve the same knowledge as that extracted from the initial database.
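For reference, the Lukasiewicz implication on truth degrees in [0, 1] has a standard closed form. The worked value below uses degrees chosen purely for illustration, and the symbol δ for the precision level is an assumed notation, not taken from the text above.

```latex
\[
  a \rightarrow_{L} b \;=\; \min(1,\; 1 - a + b), \qquad a, b \in [0,1],
\]
\[
  0.8 \rightarrow_{L} 0.5 \;=\; \min(1,\; 1 - 0.8 + 0.5) \;=\; 0.7 .
\]
```

If a precision level δ is read as a threshold on such degrees (an assumption here), this pair would be accepted for any δ ≤ 0.7 and rejected at any higher level.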

The paper is structured as follows. In Section 2, we introduce some basic definitions of fuzzy formal concept analysis and explain how precision levels can be used to reduce the number of formal concepts. In Section 3, we propose an algorithm for fuzzy data reduction and illustrate how the size of the reduced database varies as the precision level is reduced. In Section 4, we present the reduced sizes of several well-known databases [19]. Finally, in Section 5, we conclude the paper and point out perspectives for future work.

Section snippets

Mathematical foundations

Among the mathematical theories that have recently found important applications in computer science, lattice theory has a special place in data organization, information engineering, data mining, and reasoning. It may be considered the mathematical tool that unifies data and knowledge, or information retrieval and reasoning [2], [4], [5], [8]. In this section, we define the fuzzy binary context, the fuzzy formal concept, and the fuzzy Galois connection associated with a fuzzy binary context.
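As a minimal sketch of these notions (not the paper's own code), the Python below models a fuzzy binary context as rows of membership degrees and implements two derivation operators of a Lukasiewicz-based fuzzy Galois connection. The names `lukasiewicz`, `up`, `down`, `CONTEXT`, and the three-row example are hypothetical. Using the precision level `delta` as a threshold in `down` is also an assumption, since the snippet does not reproduce the exact definition.

```python
from fractions import Fraction as F  # exact degrees, so threshold tests have no rounding issues


def lukasiewicz(a, b):
    """Lukasiewicz implication: a -> b = min(1, 1 - a + b)."""
    return min(F(1), 1 - a + b)


# Hypothetical fuzzy binary context: object -> {attribute: membership degree}.
CONTEXT = {
    "o1": {"a": F(1, 2), "b": F(1)},
    "o2": {"a": F(1), "b": F(1, 2)},
    "o3": {"a": F(1, 2), "b": F(1, 2)},
}
ATTRS = ("a", "b")


def up(objects, context=CONTEXT):
    """Fuzzy set of attributes shared by a crisp set of objects (column-wise minimum)."""
    return {d: min((context[g][d] for g in objects), default=F(1)) for d in ATTRS}


def down(fuzzy_attrs, delta, context=CONTEXT):
    """Objects whose row implies fuzzy_attrs to degree >= delta on every attribute."""
    return {
        g
        for g, row in context.items()
        if all(lukasiewicz(fuzzy_attrs[d], row[d]) >= delta for d in ATTRS)
    }


intent = up({"o1"})
print(sorted(down(intent, F(1))))     # only o1 matches at full precision
print(sorted(down(intent, F(1, 2))))  # all three rows match at precision 1/2
```

In this toy context, lowering the precision level makes more rows indistinguishable from one another, which is the effect the reduction relies on.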

Data reduction in fuzzy binary relations

The fundamental question for data reduction is the following: how can we minimize data without losing any knowledge? Knowledge is defined as the set of association rules that can be extracted from the initial explicit data. The advantage of reduced data is that they can be used directly as a prototype for decision making, for supervised learning, or for reasoning. For that purpose, we prove that some rows may be removed from the initial fuzzy binary context at a given precision level without
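The idea of dropping rows at a given precision level can be illustrated with a brute-force sketch. It is not the algorithm of the paper, which is not reproduced in this snippet. The sketch assumes that "knowledge" can be stood in for by the set of intents of pairs (A, B) with `up(A) == B` and `down(B, delta) == A`, and that `delta` acts as a threshold on the Lukasiewicz implication. All function names and the example context are hypothetical, and the enumeration is exponential in the number of rows, so it is only suitable for toy data.

```python
from fractions import Fraction as F
from itertools import combinations


def lukasiewicz(a, b):
    """Lukasiewicz implication: a -> b = min(1, 1 - a + b)."""
    return min(F(1), 1 - a + b)


def up(objects, context, attrs):
    """Column-wise minimum of the selected rows, as a tuple ordered like attrs."""
    return tuple(min((context[g][d] for g in objects), default=F(1)) for d in attrs)


def down(degrees, delta, context, attrs):
    """Rows that imply the given degrees to at least delta on every attribute."""
    return frozenset(
        g
        for g, row in context.items()
        if all(lukasiewicz(deg, row[d]) >= delta for deg, d in zip(degrees, attrs))
    )


def intents(context, attrs, delta):
    """Intents B of all pairs (A, B) with up(A) == B and down(B, delta) == A."""
    found = set()
    names = list(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            b = up(subset, context, attrs)
            if down(b, delta, context, attrs) == frozenset(subset):
                found.add(b)
    return found


def reduce_rows(context, attrs, delta):
    """Greedily drop rows whose removal leaves the set of intents unchanged."""
    target = intents(context, attrs, delta)
    kept = dict(context)
    for g in list(context):
        trial = {k: v for k, v in kept.items() if k != g}
        if intents(trial, attrs, delta) == target:
            kept = trial
    return kept


# Hypothetical three-row context over two attributes.
ATTRS = ("a", "b")
CTX = {
    "o1": {"a": F(1, 2), "b": F(1)},
    "o2": {"a": F(1), "b": F(1, 2)},
    "o3": {"a": F(1, 2), "b": F(1, 2)},
}

print(sorted(reduce_rows(CTX, ATTRS, F(1))))     # two rows survive at full precision
print(sorted(reduce_rows(CTX, ATTRS, F(1, 2))))  # one row survives at precision 1/2
```

The greedy pass depends on row order, so it returns one irredundant subset rather than a guaranteed smallest one. In this toy case the reduced context shrinks as the precision level is lowered.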

Experimental results

Here, we present the reduced database sizes for different precision levels. The experiments concern the Iris Plants Database, the EV 2 Database, and the Heart-diseases database [19].

  • The Iris Plants Database [19] was created by R.A. Fisher; the database is dated July 1988. It has 150 instances in three classes of 50 instances each, where each class refers to a type of iris plant, and four attributes (Table 10).

  • The EV 2 Database “ALCOOLS SUPERIEURS DANS LES

Conclusion

In this paper, we have presented a fuzzy data reduction approach founded on fuzzy formal concept analysis. Fuzzy formal concepts were defined at a given precision level and can be used to express knowledge as association rules. For a given precision level, our approach generates the same association rules using only a reduced fuzzy formal context. We have also illustrated experimentally to what extent the database reduction is significant if the precision level is reduced

References (18)

  • A. Jaoua et al., Galois connection, formal concept and Galois lattice in real binary relation, J. Syst. Software (2002)
  • L.A. Zadeh, Fuzzy sets, Inform. Control J. (1965)
  • R. Bělohlávek, Fuzzy Galois connections, Math. Logic J. (1999)
  • B.A. Davey et al., Introduction to Lattices and Order (1990)
  • B. Ganter et al., Formal concept analysis (1999)
  • G. Schmidt et al., Relations and graphs (1999)
  • A. Jaoua, K. Bsaies, W. Consmtini, May reasoning be reduced to an information retrieval problem, Relational Methods in...
  • A. Jaoua, A. Al-Rashdi, H. AL-Muraikhi, M. Al-Subaiey, N. Al-Ghanim, S. Al-Misaifri, Conceptual data reduction,...
  • G.W. Mineau et al., Automatic structuring of knowledge bases by conceptual clustering, IEEE Trans. Knowl. Data Eng. (1995)
There are more references available in the full text version of this article.
