Elsevier

Information Systems

Volume 29, Issue 1, March 2004, Pages 47-58

Ontology-based distributed autonomous knowledge systems

https://doi.org/10.1016/S0306-4379(03)00033-4

Abstract

Traditional query processing usually requires that users fully understand the database structure and content to issue a query. Due to the complexity of database applications and the variety of user needs, so-called global queries are introduced, which traditional query answering systems cannot handle. A query posed to a database D is global if at least one of its attributes is missing in D while it occurs in other databases. Definitions of an attribute missing in D can be extracted from other databases and shared with D. To handle semantic inconsistencies between the same attributes used at different sites, task ontologies are used as a communication bridge between them. These inconsistencies can be caused either by different granularity levels or by different interpretations of the same attribute. As the final outcome of this research, a rough query answering system based on distributed data mining is presented.

Introduction

In many fields, such as medicine, banking and education, similar databases are kept at many sites. An attribute may be missing in one database while it occurs in many others. Databases are often incomplete, which means that queries can either be answered approximately (for instance in a rough sets framework) or be answered by assigning weights to the retrieved objects. Each query can be answered in many different ways, depending on the interpretation of incomplete values. In the most general scenario, an answer to a query submitted to an information system (see [1], [2]) is a set of objects certainly satisfying the query, and a set of objects possibly satisfying the query. Different interpretations may assign different values to possible objects, showing the confidence of the query answering system in each object returned as an answer to the query. These values can be calculated either using information on the frequency of appearance of some attribute values in the information system or using local and global KDD methods, which can predict what attribute values should replace the missing values of incomplete attributes and what weights should be assigned to them.
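The split between certain and possible answers can be illustrated with a minimal sketch. The attribute names, the data, and the function `answer` are hypothetical, and treating a missing value (`None`) as compatible with any queried value is only one of the possible interpretations of incompleteness mentioned above. In this sketch the possible set excludes the certain objects.

```python
# Hypothetical incomplete information system: None marks a missing value.
objects = {
    "x1": {"fever": "high", "cough": "yes"},
    "x2": {"fever": None, "cough": "yes"},
    "x3": {"fever": "low", "cough": "yes"},
    "x4": {"fever": "high", "cough": None},
}


def answer(objects, query):
    """Return (certain, possible) object sets for a conjunctive query.

    `query` maps attribute -> required value. An object is certain if every
    queried attribute is known and matches. It is possible if no known value
    contradicts the query but at least one queried value is missing.
    """
    certain, possible = set(), set()
    for name, row in objects.items():
        values = {attr: row.get(attr) for attr in query}
        if any(v is not None and v != query[attr] for attr, v in values.items()):
            continue  # a known value contradicts the query
        if all(v is not None for v in values.values()):
            certain.add(name)
        else:
            possible.add(name)
    return certain, possible


certain, possible = answer(objects, {"fever": "high", "cough": "yes"})
print(sorted(certain), sorted(possible))
```

A weighting scheme, such as one based on value frequencies, could then rank the objects in the possible set; that step is left out here.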

Missing or incomplete attributes lead to problems when answering queries. For example, a user may issue a query to a local database S in search of objects that match a desired description, only to realize that attribute a1 used in that description is either missing or that the vast majority of its values in S are incomplete, so that the query cannot be answered. However, definitions of a1 may be extracted from databases at remote sites and used to identify objects in S having properties related to a1, as proposed in [14], [15], [16]. This approach is no longer simple when the semantics of terms used to describe objects at the client and remote sites differ. Sometimes such a difference in semantics can be repaired quite easily. For instance, if “Temperature in Celsius” is used at one site and “Temperature in Fahrenheit” at the other, a simple mapping will fix the problem. If databases are complete and two attributes have the same name and differ only in their granularity level, a new hierarchical attribute can be formed to fix the problem. If databases are incomplete, the problem is more complex because of the number of options available to interpret incomplete values (including null values). The problem is especially difficult in a distributed framework when rule-based chase techniques, driven by rules extracted at remote sites, are used by a client site to replace null values by values which are less incomplete.
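As a rough illustration of how definitions of a1 extracted elsewhere might be used locally, the sketch below applies a few rules to a local system in which a1 does not occur. The rules, the attribute names b and c, their values, and the helper `objects_with` are all hypothetical, and the sketch assumes that the client and remote sites already share the same semantics for b and c.

```python
# Hypothetical rules extracted at remote sites: (conditions, value of a1).
remote_rules = [
    ({"b": "b1", "c": "c2"}, "low"),
    ({"b": "b2"}, "high"),
]

# Hypothetical local system S, in which attribute a1 does not occur at all.
local_S = {
    "x1": {"b": "b1", "c": "c2"},
    "x2": {"b": "b2", "c": "c1"},
    "x3": {"b": "b3", "c": "c1"},
}


def objects_with(local, rules, wanted):
    """Return the objects of `local` that some rule labels with a1 == wanted."""
    hits = set()
    for name, row in local.items():
        for conditions, value in rules:
            matches = all(row.get(a) == v for a, v in conditions.items())
            if value == wanted and matches:
                hits.add(name)
                break  # one supporting rule is enough
    return hits


high = objects_with(local_S, remote_rules, "high")
low = objects_with(local_S, remote_rules, "low")
print(sorted(high), sorted(low))
```

Object x3 is matched by no rule, so nothing can be said about its a1 value from these definitions alone.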

The notion of an intermediate model, proposed by Maluf and Wiederhold [3], is very useful for dealing with the heterogeneity problem, because it describes the database content at a relatively high level of abstraction, sufficient to guarantee a homogeneous representation of all databases. Knowledgebases built jointly with the task ontologies proposed in this paper can be used for a similar purpose. They contain rules extracted from databases at remote sites. These rules are seen as definitions of their decision values. Sometimes these definitions are inconsistent, so some form of consensus has to be reached. An algorithm addressing this issue was proposed by Ras [4].

In this paper, the heterogeneity problem [13], [17] is introduced from the query answering point of view. A query answering system linked with a client site transforms so-called global queries into local queries for that client site, using task ontologies and definitions extracted at remote sites. These definitions may have as many different interpretations as the number of remote sites used to extract them. This paper focuses mainly on different interpretations of queries due to different interpretations of null values, and also due to the granularity levels of a given attribute, which may easily vary from site to site. Information stored in task ontologies is used to find so-called rough interpretations representing the consensus of all involved sites.

Section snippets

Distributed information systems

In this section, we recall the notion of a distributed information system and of a knowledgebase for a client site formed from rules extracted at remote sites. We introduce the notion of local queries and give an example of their local semantics.

By an information system we mean S=(X,A,V), where X is a finite set of objects, A is a finite set of attributes, and V=⋃{Va:a∈A} is a set of their values. The set Va is called the domain of attribute a. We assume that

  • Va,Vb are disjoint for any a,b∈A such that

Semantic inconsistencies and distributed autonomous knowledge systems

In this section, we introduce the notion of a distributed autonomous knowledge system (DAKS) and then present problems related to the construction of its query answering system QAS. We discuss the process of handling semantic inconsistencies in knowledge extracted at different sites of DAKS. Next, we outline the transformation steps for global queries, so that some of their atomic terms can be replaced by subterms extracted at different sites of DAKS by knowledge discovery techniques. The goal of

Query processing based on reducts

In this section we recall the notion of a reduct (see [2]) and show how the partially ordered set of semantics (Ω,≼) can be used to improve the query answering process in DAKS as introduced in [8].

Let us assume that S=(X,A,V) is an information system and V=⋃{Va:a∈A}. Let B⊆A. We say that x,y∈X are indiscernible by B, denoted [x≈By], if (∀a∈B)[a(x)=a(y)].
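The indiscernibility relation partitions X into classes of objects that agree on every attribute in B. A minimal sketch of that partition follows; the system `S`, its attribute names and values, and the function `indiscernibility_classes` are hypothetical, and the system is assumed to be complete.

```python
from collections import defaultdict

# Hypothetical complete information system: object -> {attribute: value}.
S = {
    "x1": {"a": 1, "b": 0, "c": 2},
    "x2": {"a": 1, "b": 0, "c": 3},
    "x3": {"a": 2, "b": 0, "c": 2},
    "x4": {"a": 1, "b": 1, "c": 2},
}


def indiscernibility_classes(system, B):
    """Group objects that have equal values on every attribute in B."""
    groups = defaultdict(set)
    attrs = sorted(B)  # fix an attribute order so the signatures are comparable
    for name, row in system.items():
        groups[tuple(row[a] for a in attrs)].add(name)
    # Return the classes in a deterministic order.
    return sorted(groups.values(), key=sorted)


classes_ab = indiscernibility_classes(S, {"a", "b"})
print(classes_ab)
```

Here x1 and x2 are indiscernible by {a, b} because they differ only on c, while x3 and x4 each form a class of their own.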

Now, assume that both B1,B2 are subsets of A. We say that B1 depends on B2 if ≈B2⊂≈B1. Also, we say that B1 is a covering of B2 if B2 depends on B1
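The dependency test can be sketched by representing each relation ≈B as a set of unordered object pairs and checking inclusion. The data and function names below are hypothetical, and the symbol ⊂ is read here as inclusion that need not be proper, which is an assumption about the notation.

```python
from itertools import combinations

# Hypothetical complete information system in which c is determined by a.
S = {
    "x1": {"a": 1, "b": 0, "c": 10},
    "x2": {"a": 1, "b": 1, "c": 10},
    "x3": {"a": 2, "b": 0, "c": 20},
    "x4": {"a": 2, "b": 1, "c": 20},
}


def indiscernible_pairs(system, B):
    """The relation ≈B as a set of unordered pairs of distinct objects."""
    return {
        frozenset((x, y))
        for x, y in combinations(sorted(system), 2)
        if all(system[x][a] == system[y][a] for a in B)
    }


def depends_on(system, B1, B2):
    """B1 depends on B2 when ≈B2 is contained in ≈B1 (assumed non-strict)."""
    return indiscernible_pairs(system, B2) <= indiscernible_pairs(system, B1)


print(depends_on(S, {"c"}, {"a"}))  # objects equal on a are also equal on c
print(depends_on(S, {"b"}, {"a"}))  # x1 and x2 agree on a but differ on b
```

Reflexive pairs are omitted because they belong to every such relation and do not affect the inclusion test.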

Conclusion

Clearly, the easiest way to solve the semantic inconsistency problem is to apply the same local semantics at all remote sites. However, when databases are incomplete and we replace their null values using rule-based chase algorithms driven by locally extracted rules, we are already committed to the semantics used by these algorithms. If we do not keep track of which null values have been replaced by rule-based chase algorithms and how, there is no way back for us. Also, it sounds rather

References (18)

  • Z. Pawlak, Rough classification, Int. J. Man–Machine Studies (1984)
  • Z. Ras, A. Dardzinska, Handling semantics inconsistencies in query answering based on distributed knowledge mining, in:...
  • D. Maluf, G. Wiederhold, Abstraction of representation for interoperation, in: Foundations of Intelligent Systems,...
  • Z. Ras, Dictionaries in a distributed knowledge-based system, in: Concurrent Engineering: Research and Applications,...
  • Z. Ras, Resolving queries through cooperation in multi-agent systems
  • Z. Ras et al., Query approximate answering system for an incomplete DKBS, Fund. Inform. (1997)
  • R. Mizoguchi, Ontological engineering: foundation of the next generation knowledge processing, in: Proceedings of Web...
  • Z. Ras, Query answering based on distributed knowledge mining
  • T. Andreasen, J.F. Nilsson, H.E. Thomsen, Ontology-based querying, in: Proceedings of the Flexible Query Answering...
