Detection and resolution of semantic inconsistency and redundancy in an automatic ontology merging system

Fahad, Muhammad; Moalla, Nejib; Bouras, Abdelaziz

doi:10.1007/s10844-012-0202-y

Published: 29 April 2012

Volume 39, pages 535–557, (2012)

Journal of Intelligent Information Systems

Abstract

In recent years, researchers have been developing algorithms for the automatic mapping and merging of ontologies to meet the demands of interoperability between heterogeneous and distributed information systems. Still, state-of-the-art ontology mapping and merging systems are semi-automatic: they reduce the burden of manually creating and maintaining mappings, but need human intervention for their validation. The contribution presented in this paper reduces human intervention one step further by automatically identifying semantic inconsistencies in the early stages of ontology merging. We detect semantic heterogeneities that occur due to conflicts among the set of Generalized Concept Inclusions, Property Subsumption Criteria, and Constraint Satisfaction Mechanism in local heterogeneous ontologies, which become obstacles to the generation of a semantically consistent global merged ontology. We present several algorithms to detect such semantic inconsistencies based on subsumption analysis of concepts and properties in local ontologies from the list of initial mappings. We provide ontological patterns for resolving these inconsistencies automatically. This results in a global merged ontology free from ‘circulatory error in class/property hierarchy’, ‘common class between disjoint classes/properties’, ‘redundancy of subclass/subproperty of relations’ and other types of ‘semantic inconsistency’ errors. Experiments on real ontologies show that our algorithms save the time and cost of traversing local ontologies, improve the system’s performance by producing only consistent, accurate mappings, and reduce the dependence on users for ensuring the satisfiability of the merged ontology.

1 Introduction

Recent years have witnessed the great development and successful use of ontologies for knowledge representation, data annotation and information exchange between people, organizations, autonomous agents, web services or groups in open environments such as the Semantic Web. But, as they are developed for multiple purposes, needs, and requirements, different ontologies share overlapping domain knowledge and can be used for the annotation of multiple data sources, such as web pages, XML repositories, relational databases, multimedia data, etc. (Klein 2001). Such use of ontologies, as a means of providing a shared understanding of various domains, enables a certain degree of interoperation between these data sources (Bouquet et al. 2006). Therefore, Ontology Merging is proposed as one of the solutions to meet the demands of interoperability. It is the process of generating a single ontology from different heterogeneous source ontologies, and it comprises two primary steps. First, the source ontologies are searched for similarities between them. Second, a duplicate-free union of the source ontologies is achieved based on the established similarities. The source ontologies contain overlapping domain knowledge, and can contain different types of semantic heterogeneities that create conflicts when they are merged. The new merged ontology, which is the result of the union of the source ontologies, should provide a unified, consistent and coherent view of the source ontologies. As a result, ontology alignment, mapping and merging systems have appeared to fulfill these demands, and are discussed from different aspects by Euzenat and Shvaiko (2007). They act as pillars in a wide range of application domains and in building collaborations that involve sharing of data, knowledge and resources among modern companies (Euzenat and Shvaiko 2007). They also aid in developing a new ontology by reusing existing open ontologies, and significantly reduce the cost of building an ontology from scratch (Klein 2001).

The last decade has seen researchers both in academia and industry developing efficient algorithms for Automatic Ontology Merging (AOM), because it is very hard to perform this task manually beyond a certain complexity, size or number of ontologies (Ehrig and Staab 2004). Despite this effort, state-of-the-art ontology mapping and merging systems are still semi-automatic: they reduce the burden of manually creating and maintaining mappings, but need expensive human intervention for their validation. In addition, they use different aids, such as a common vocabulary, a reference ontology, basic initial alignments by a human, etc., each of which might be appropriate for some tasks under a given set of circumstances, but is not feasible for a dynamic environment such as the Semantic Web (Euzenat and Shvaiko 2007). Recent studies on ontology merging show that, due to semantic heterogeneities and mismatches, fully automatic merging is unattainable (Kotis et al. 2006). However, effective algorithms for computing semantic correspondences help us reach a position where ontology merging can be carried out with minimum human intervention.

The most primitive unit of any ontology is the set of Generalized Concept Inclusion axioms (GCIs) by which the ontology hierarchy is built. Complex ontologies also comprise a set of properties on which constraints are imposed to represent real-world situations. It is necessary for the properties to meet the Property Subsumption Criteria (PSC) and Constraint Satisfaction Mechanism (CSM) imposed on them. Therefore, we formulate our consistency checking mechanism based on GCIs, PSC and CSM to ensure the consistency of the merged ontology. In this way, the contribution presented in this paper reduces human involvement one step further during the ontology merging process and presents novel methodologies for the detection of semantic inconsistencies in the initial stages of ontology merging. Our ultimate goal is to check the semantic correctness and consistency of mappings, and achieve the satisfiability of the merged ontology. In addition, we aim to ensure the soundness of the merged ontology, i.e., that it includes nothing but the truth (i.e., all the axioms from the source ontologies). To achieve this, we have developed algorithms for the detection of several types of inconsistencies and reported them with a system evaluation in Fahad et al. (2011). However, the resolution criterion for the potential problems in the automatic merging system was left unaddressed. In addition, how inconsistency detection on the basis of the depth of ontological concepts helps to reduce the time complexity of the merging system was not reported. This paper focuses on these points, by contributing Ontological Patterns as a solution to the potential problems and an analysis of the time complexity of the proposed algorithms on real ontologies.

First, this paper discusses our methodology that detects semantic inconsistencies from the list of initial mappings by exploiting the GCIs, PSC and CSM present in local ontologies. It checks that lexically identical concepts within the local source ontologies do not contradict each other with respect to these mechanisms. Initial mappings between concepts of local ontologies are flagged based on the degree of difference in their depth. The consistency checker module acts as a filter at the initial merging stage, checking a set of basic conditions before allowing axioms to be added to the global ontology. Second, we present ontological patterns for the resolution of such problems in the automatic merging of ontologies. By using more of the semantics present in the source ontologies and employing test criteria for the initial mappings found, our approach enhances the accuracy of mapping and merging ontologies, and produces a consistent, complete and coherent global ontology from local heterogeneous ontologies. In this way, it forms a global layer from which several heterogeneous local ontologies can be accessed and hence exchange information in a semantically sound manner. Finally, we analyze the time complexity of the proposed algorithms and assess their efficiency on real ontologies.

The rest of the paper is organized as follows. Section 2 discusses the state of the art in ontology mapping and merging approaches and systems. Section 3 gives an overview of our semantic ontology merger system, DKP-AOM. Section 4 presents semantic inconsistencies due to conflicts in GCIs, PSC and CSM. It also contributes the detection criteria for such inconsistencies from the list of initial mappings. Section 5 contributes the ontological patterns for the resolution of potential problems in the merged ontology, so that only accurate mappings and axioms constitute a consistent, complete and coherent global merged ontology. Section 6 sheds light on the complexity analysis of the proposed algorithms. Section 7 concludes the paper and outlines our future directions on this topic.

2 Ontology mapping and merging systems

In the research literature, there are many diverse approaches, techniques and systems for the merging of heterogeneous ontologies. In general, they are classified into two broad categories based on the approach they follow (Bruijn et al. 2006). In the first approach, the merging process results in a single output ontology that contains the individual source ontologies. Examples of this approach are Prompt, Chimaera, etc. In the second approach, the merging process results in a bridge ontology that imports the source ontologies and comprises bridge axioms or articulation rules that represent the mappings between the concepts of the source ontologies. Examples of this approach are OntoMorph, ONION, etc.

Besides this, ontology merging approaches, techniques and systems are based on various features, such as instances, labels, attributes, structures, axioms, etc., and additional auxiliary information such as a reference ontology, an instance document, etc. An instance-based methodology, FCA-Merge, takes as input two ontologies and a set of natural language documents describing the instances of the ontologies. It employs formal concept analysis to build the concept lattice, and considers concepts having identical instances as candidates for merging (Stumme and Mädche 2001). IF-Map also uses formal concept analysis and gets aid from a common reference ontology that comprises a common vocabulary about the local subject ontologies for their mapping (Kalfoglou and Schorlemmer 2003). GLUE follows a hybrid strategy, making use of instance and taxonomic structure matching techniques with a machine learning approach to determine the probabilities of concepts for ontology integration (Doan et al. 2004). Instance-based ontology mapping and merging techniques show a drawback when semantically distinct concepts having a common instance are considered to be the same and candidates for merging (Klein 2001).

The interactive ontology merging tools, PROMPT suite (Noy and Musen 2003) and Chimaera (McGuinness et al. 2000), exploit syntactic concept label matching techniques and, to some extent, the structure of local ontologies for the initial comparisons. They generate a list of suggestions and get user feedback for the ontology integration. However, these systems leave out concepts that are semantically equivalent but modelled with different names. HCONE-merge makes use of latent semantic indexing mechanisms for computing possible mappings and then requires human intervention for their validation, to ensure that concepts are mapped to their intended meanings. Then, it employs the reasoning services of DL for the automatic merging of local ontologies (Kotis et al. 2006). OLA exploits distance-based algorithms and the Alignment API for finding correspondences between OWL-Lite ontologies by making use of all the elementary matching techniques (Euzenat and Valtchev 2004). QOM aims at gaining efficiency through a dynamic programming approach rather than effectiveness through matching algorithms. It avoids the complete pair-wise comparison of concepts and employs many heuristics for choosing only the best candidate mappings, and thus reduces the runtime complexity of the matching process (Ehrig and Staab 2004).

OntoMorph proposes a set of transformation operators for enabling interoperability among heterogeneous ontologies (Chalupsky 2000). The list of suggestions containing candidate merge concepts is identified and provided to the end user. The goal of this approach is not to produce a complete merged ontology, as in the above systems, but a bridge ontology. The bridge ontology has the bridge axioms that represent the mappings between the concepts of the source ontologies. During the merging of heterogeneous ontologies, the end user receives no guidance except for the initial list of matches. Similarly, ONION uses the structure of the taxonomy, local definitions and the formalization of articulation ontologies for mapping different source ontologies (Mitra and Wiederhold 2002).

Besides the above-mentioned techniques and systems, there are a few works, such as CtxMatch (Bouquet et al. 2006) and S-Match (Giunchiglia et al. 2004), devoted to determining semantic matching between the concepts of ontologies. First, they transform concept descriptions into Description Logic (DL) axioms, i.e., they change the matching problem into a propositional unsatisfiability problem. Then, they make use of available open source DL reasoners for finding semantic relations (e.g., equivalence, subsumption) between concepts that correspond semantically with each other. The ASMOV matching algorithm exploits lexical and structural analysis for determining correspondences between ontologies. Then, it verifies the determined correspondences by the use of formal semantics to compute whether they comply with the desired characteristics (Jean-Marya et al. 2009). Mascardi et al. use upper ontologies as semantic bridges for matching the source ontologies and analyze the relationships between different features of local ontologies (Mascardi et al. 2010).

Our initial effort towards ontology merging is a semi-automatic system, Disjoint Knowledge Preservation based Ontology Merging (DKP-OM), that follows a hybrid approach making use of linguistic matching, formal analysis of concepts, and various heuristics for computing semantic correspondences between concepts (Fahad et al. 2007; Fahad et al. 2010; Fahad and Qadir 2009). These correspondences at first act as a list of suggestions to the end user, and the system then requires human intervention for the generation of the global merged ontology. This paper extends the methodology of the DKP-OM system to handle more structural and semantic conflicts, and contributes a more optimized solution for assessing and resolving semantic inconsistencies in ontology merging. In our previous approach, we detected such semantic inconsistencies with the help of a hidden global merged ontology that is formed by following the initial mappings. Our system then applied consistency criteria to the hidden merged ontology to ensure its satisfiability based on traversal. Building the hidden ontology and traversing it for validation consume time and resources, and add overhead to the overall performance of the system. The algorithms presented in this paper detect inconsistent mappings from the initial mapping list, and save the time and resources otherwise spent on generating the hidden merged ontology. Our hybrid strategy makes it possible to find all possible mappings, and the semantic validation of mappings gives very promising final results by ignoring the incorrect correspondences that do not satisfy the test criteria, hence increasing the precision of the final results.

3 Overview of semantic ontology merging system (DKP-AOM)

Our semantic ontology merging system, DKP-AOM, takes local source ontologies and performs an ontology matching operation to find semantic similarities between the source ontologies based on various features, in order to produce a consistent merged ontology in a semantically sound manner. Figure 1 presents the three main steps of our ontology merging methodology with its sub-components.

The DKP-AOM system first generates the intermediate models (OWL Graphs) of the source ontologies using the Jena API. Using these graphs, MatchManager, which comprises a set of individual matching algorithms, performs the first-level task of finding the initial linguistic, synonym and axiomatic based mappings between concepts. It propagates the initial mappings to ConsistencyChecker for validation. ConsistencyChecker has many detectors that validate each mapping found in the initial stage, so that the merged ontology stays consistent with reference to the source ontologies. When the initial mappings pass the consistency test, ConsistencyChecker passes the mappings to the Reasoner. The Reasoner then aggregates the output of the different similarity measures, resolves conflicts and merges the initial mappings to generate a global merged ontology. Finally, it compiles the output as a merged global ontology automatically, or as a final list of mappings, as required by the end user. In this step, it ensures the ultimate goal of achieving the satisfiability of the merged ontology by checking the correctness and consistency of the concepts, axioms and instances of the generated global ontology. For semi-automatic generation of the global ontology from the mapping list, it shows the semantically consistent mappings to the user as a list of initial suggestions, and asks for feedback. In this case, it follows a cyclic approach like other semi-automatic merging systems (e.g., Prompt) to generate the global merged ontology.

4 Ensuring satisfiability of global merged ontology

The main goal of the ConsistencyChecker module of DKP-AOM is to ensure the satisfiability of the global merged ontology that is generated by following the initial mappings, so that the Tbox (Terminological box) and Abox (Assertional box) of the global ontology comprise a consistent set of Generalized Concept Inclusions (e.g., the GCI $C \sqsubseteq D$ stays consistent according to the local ontologies). In addition, all the properties that belong to classes, and the various constraints that are imposed on them via restrictions, should meet the subsumption criteria. Formally, the satisfiability of the global merged ontology is expressed as follows.

Definition 1

A Global Ontology GO is a pair $GO=\left( {{\cal T},{\cal A}} \right)$ where ${\cal T}$ is a Tbox and ${\cal A}$ is an Abox generated from the Mapping_list {Mapping(C₁,D₁), Mapping(C₂,D₂), ..., Mapping(C_n,D_n)}, where C₁, C₂, ..., C_n belong to local ontology O₁ and D₁, D₂, ..., D_n belong to local ontology O₂. An interpretation ${\cal I}$ is a model for GO if it is a model for both ${\cal A}$ and ${\cal T}$. A Global Ontology GO logically implies $\alpha$, where $\alpha$ is either an Abox statement (i.e., concept or role instantiation) or a Tbox statement (i.e., concept introduction), written $GO \models \alpha$, iff $\alpha$ is satisfied by every model of GO.

Several inconsistency detectors inside the ConsistencyChecker component are responsible for finding semantic inconsistencies in the initial mappings found. Each detector works independently, employing a specific algorithm, so that only consistent and accurate mappings are forwarded to the user to build a satisfiable merged ontology. When the detectors discover an inconsistent mapping, they report it to the ConsistencyChecker, which warns the user about the inconsistent situations that would occur in the merged global ontology by following the inconsistent initial mapping. Hence, it reduces human intervention by validating the merged ontology automatically. There are various types of inconsistency, incompleteness and redundancy errors that may occur in a single ontology (as discussed in Fahad and Qadir (2008)), and in a merged ontology due to differences in the modeling and semantics of domain concepts, but in this paper we address only semantic inconsistency errors.

4.1 Types of semantic inconsistencies that can occur in global merged ontology

Semantic inconsistency in a merged ontology occurs when the merging system builds an incorrect class hierarchy by classifying a concept as a subclass of a concept to which it does not really (or only partially) belong. This can happen due to GCI conflicts, property subsumption violation or constraint dissatisfaction. Finding such types of semantic inconsistencies automatically during the ontology merging process is a difficult task and needs a special elaboration of these errors.

In general, there are three main reasons why incorrect semantic classification originates in a merged ontology. Firstly, the ontology merging system has placed a concept with ‘the weaker domain’ under a concept that comprises ‘the stronger domain’ in the merged class hierarchy. In practice, when building class hierarchies, ontologists use their intelligence to move from generalization towards specialization of the domain while going down the class hierarchy. This means a child concept should possess an is-a relationship with its parent, and the parent concept with its ancestor concepts. This process goes on up to the root concept of the ontology that possesses everything, as denoted by the Thing concept. But an automatic merging system, due to a lack of embedded rules and semantics, ignores this and composes the class hierarchy on similarities, which may lead to generalization instead of specialization when building the merged class hierarchy. The merging system should be aware of the semantic rule that a subclass should always specialize the superclass concept by specifying a stronger domain as one goes down the taxonomy. Secondly, the merging system has placed a subclass in the merged class hierarchy that possesses only a few features of its superclass, and breaches the superclass domain by allowing more features that may not be present in its superclass or that violate some of its features. Usually this type of error occurs when the merging system has not carefully observed all the features of the concepts from both local ontologies and has classified as a subclass a concept that only partially meets the properties of the superclass. Ideally, a subclass should possess all the features of its superclass and may add new features that do not conflict with its superclass features. Thirdly, the merging system has placed a concept as a subclass of a concept that occupies a disjoint domain. Figure 2 shows the conformance relationships between the superclass and subclass concepts for establishing consistency in the merged class hierarchy. The same classification holds for the properties of concepts, as they build property hierarchies in an ontology.

In general, all these kinds of errors occur when concepts in ontologies partially overlap with one another, and an automatic (or semi-automatic) merging system merges concepts (or suggests that the user merge concepts) that are partially close according to some similarity criteria. For a well-formed and accurate merged class hierarchy, it is necessary to ensure that a subclass concept does not belong to a disjoint domain, and does not weaken or breach the properties and features of any superclass concept. The subclass may add new features and properties, but they should specialize the features and properties of its superclasses. The consistency should be analyzed based on Generalized Concept Inclusion axioms, Property Subsumption Criteria and Constraint Satisfaction Mechanisms during the construction of the merged ontology.

4.2 Semantic inconsistency due to conflicts in GCIs or PSC

The Global Merged Ontology, GO, is said to be semantically consistent when its TBox satisfies the whole set of Generalized Concept Inclusions (GCIs) and the whole set of Property Subsumption Criteria (PSC) of the local heterogeneous ontologies. When a conflict occurs in the set of GCIs or PSC of the local heterogeneous ontologies, the global merged ontology suffers from various types of semantic inconsistencies. Formally, the following definitions express the subsumption rule for the global ontology and semantically inconsistent mappings.

Definition 2

For Global Ontology GO, an interpretation ${\cal I}$ satisfies a GCI $C\sqsubseteq D$, if $C^{\cal I}\subseteq D^{\cal I}$, and an interpretation is a model of a Tbox ${\cal T}$, if it satisfies all GCIs in the TBox ${\cal T}$. Likewise, it must hold the Property Subsumption Criteria.
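Definition 2 can be illustrated on a finite interpretation. The following is a minimal sketch, assuming an interpretation is represented as a dictionary from concept names to sets of individuals; the function names and the individuals `ann` and `bob` are hypothetical and not part of DKP-AOM.

```python
def satisfies_gci(interpretation, c, d):
    """An interpretation satisfies C ⊑ D if the extension of C is a subset of that of D."""
    return interpretation[c] <= interpretation[d]


def is_model(interpretation, tbox):
    """An interpretation is a model of a Tbox if it satisfies every GCI in it."""
    return all(satisfies_gci(interpretation, c, d) for c, d in tbox)


# Hypothetical finite interpretation over two individuals.
interpretation = {
    "SoftwareEngineer": {"ann", "bob"},
    "Designer": {"ann"},
}

# Tbox given as a list of (subconcept, superconcept) pairs.
tbox = [("Designer", "SoftwareEngineer")]
inverted_tbox = [("SoftwareEngineer", "Designer")]

print(is_model(interpretation, tbox))           # True
print(is_model(interpretation, inverted_tbox))  # False
```

The inverted Tbox fails because `bob` belongs to SoftwareEngineer but not to Designer, which is the subset violation the definition rules out.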

Definition 3

Mapping(C,C′) and Mapping(D,D′) are semantically inconsistent if they violate the subsumption rule with respect to the GCIs or PSC of local ontologies O₁ and O₂, such that in O₁, $C\sqsubseteq D$ w.r.t. local Tbox ${\cal T}_1$ and in all models of ${\cal T}_1$, $C^{\cal I}\subseteq D^{\cal I}$, while in O₂, $D'\sqsubseteq C'$ w.r.t. local Tbox ${\cal T}_2$ and in all models of ${\cal T}_2$, $D'^{\cal I}\subseteq C'^{\cal I}$.

There are several possibilities that give rise to semantic inconsistencies in the global merged ontology. For example, let O₁ and O₂ be two source ontologies (as shown in Fig. 3) of the software engineering domain, comprising workers who follow the Object Oriented (OO) approach or the Structured Engineering approach for developing programs for companies. Semantically inconsistent mappings due to a GCI conflict describe a situation where a parent concept in O₁ maps onto a child concept of O₂ and a child concept of O₁ maps onto a parent concept of O₂. In this case, the initial mappings by the MatchManager subcomponent, i.e., Mapping_O1,O2(Designer, Designer) and Mapping_O1,O2(OOKnowledge_Associate, OOKnowledge_Associate), create a GCI conflict.

Our system works on these source ontologies as follows. It gets the source ontologies, builds the intermediate models for similarity computation, and annotates each concept in the ontology hierarchy with its depth. MatchManager computes correspondences between ontology concepts and produces the initial mappings between the source ontologies as follows: Mappings_O1,O2 = {(SoftwareEngineer₁, Software-Engineer₁), (Programmer₂, Programmer₃), (Designer₂, Designer₃), (StructKnow_Associate₃, StructKnow_Associate₂), (OOKnowledge_Associate₃, OOKnowledge_Associate₂)}. Each mapping contains the names of the classes from O₁ and O₂ and the depth of each concept in its ontology (shown here as a subscript in Mappings_O1,O2). It propagates these mappings to ConsistencyChecker, which checks their correctness and consistency.
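The depth annotation and the lexical part of the initial matching can be sketched as follows. This is an illustrative sketch only: Fig. 3 is not reproduced here, so the child-to-parent links in `O1_PARENT` and `O2_PARENT` are assumptions chosen to agree with the depths in the mapping list above, and `depths`, `normalise` and `initial_mappings` are hypothetical names rather than MatchManager's actual API.

```python
def depths(parent):
    """Depth of every concept (root = 1) from a child -> parent map."""
    def depth(concept):
        d = 1
        while parent.get(concept) is not None:
            concept = parent[concept]
            d += 1
        return d
    concepts = set(parent) | set(parent.values())
    return {c: depth(c) for c in concepts}


def normalise(label):
    """Crude lexical normalisation: drop '-' and '_' and ignore case."""
    return label.replace("-", "").replace("_", "").lower()


def initial_mappings(parent1, parent2):
    """Return (concept1, depth1, concept2, depth2) for lexically equal labels."""
    d1, d2 = depths(parent1), depths(parent2)
    by_label = {normalise(c): c for c in d2}
    result = []
    for c in d1:
        match = by_label.get(normalise(c))
        if match is not None:
            result.append((c, d1[c], match, d2[match]))
    return sorted(result)


# Assumed hierarchies (child -> parent), consistent with the depths in the text.
O1_PARENT = {
    "Programmer": "SoftwareEngineer",
    "Designer": "SoftwareEngineer",
    "StructKnow_Associate": "Designer",
    "OOKnowledge_Associate": "Designer",
}
O2_PARENT = {
    "StructKnow_Associate": "Software-Engineer",
    "OOKnowledge_Associate": "Software-Engineer",
    "Programmer": "StructKnow_Associate",
    "Designer": "OOKnowledge_Associate",
}

mappings = initial_mappings(O1_PARENT, O2_PARENT)
for m in mappings:
    print(m)
```

Carrying the depth inside each mapping tuple is what lets the later consistency check start from the mapping list instead of re-traversing both ontologies.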

ConsistencyChecker applies the test criterion to ensure the semantic correctness of the mappings, as shown in Fig. 4. To reduce the time complexity of validating the initial mappings, it detects the GCI conflicts that lead to inconsistency from the depths associated with the mapping list and not by traversing the source ontologies. Traversing the ontologies again to detect inconsistencies would be very costly and would compromise the performance of the system. First, it extracts the mappings with unequal depth from the list of initial mappings, as they are ambiguous due to the depth differences; these may or may not create inconsistencies in the merged ontology. Then, it validates whether the selected mappings lead to inconsistency, i.e., whether they have a parent-child relationship between them. We call this proposed mechanism of inconsistency detection the Depth Analysis & Pruning based Approach (DAPbA).

It is a necessary condition for a semantic inconsistency caused by a GCI conflict that the mappings with unequal depth have a parent-child relationship that leads to cycles in the merged ontology. In this example, ConsistencyChecker detects Mapping_O1,O2(Designer₂, Designer₃) and Mapping_O1,O2(OOKnowledge_Associate₃, OOKnowledge_Associate₂) as inconsistent mappings that create the semantic inconsistency, because Designer in O₁ lexically corresponds to Designer in O₂, OOKnowledge_Associate in O₁ corresponds to OOKnowledge_Associate in O₂, and these mappings fulfill the condition of a parent-child relationship. Note that mapping pairs (3 and 4), (2 and 4), and (2 and 5) in Mappings_O1,O2 have unequal depth, but they do not fulfill the condition of a parent-child relationship between them. Therefore, they cannot create an inconsistency in the merged ontology. This type of inconsistency in the initial mappings leads to redundancies and various other types of inconsistencies (e.g., ‘circulatory error in class hierarchy’) in the merged global ontology.
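The two DAPbA steps described above (prune by depth, then test the parent-child condition) can be sketched as follows. This is not the algorithm of Fig. 4, which is not reproduced here, but a simplified reading of the text; the hierarchies are the same assumed child-to-parent maps used for illustration, and all function names are hypothetical.

```python
from itertools import combinations


def is_ancestor(parent, ancestor, node):
    """True if `ancestor` is a proper ancestor of `node` in a child -> parent map."""
    seen = set()
    current = parent.get(node)
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = parent.get(current)
    return False


def gci_conflicts(mappings, parent1, parent2):
    """Pairs of mappings whose parent-child order is inverted between O1 and O2."""
    # Step 1 (pruning): only mappings with unequal depth can be involved.
    flagged = [m for m in mappings if m[1] != m[3]]
    conflicts = []
    for m, n in combinations(flagged, 2):
        (c, c_depth1, c2, c_depth2), (d, d_depth1, d2, d_depth2) = m, n
        # An inversion requires the depth order to flip between the ontologies.
        if (c_depth1 - d_depth1) * (c_depth2 - d_depth2) >= 0:
            continue
        # Step 2: confirm the inverted parent-child relationship.
        if (is_ancestor(parent1, c, d) and is_ancestor(parent2, d2, c2)) or (
            is_ancestor(parent1, d, c) and is_ancestor(parent2, c2, d2)
        ):
            conflicts.append((m, n))
    return conflicts


# Assumed hierarchies (child -> parent); Fig. 3 itself is not available here.
O1_PARENT = {
    "Programmer": "SoftwareEngineer",
    "Designer": "SoftwareEngineer",
    "StructKnow_Associate": "Designer",
    "OOKnowledge_Associate": "Designer",
}
O2_PARENT = {
    "StructKnow_Associate": "Software-Engineer",
    "OOKnowledge_Associate": "Software-Engineer",
    "Programmer": "StructKnow_Associate",
    "Designer": "OOKnowledge_Associate",
}

# Mapping list from the running example: (concept1, depth1, concept2, depth2).
MAPPINGS = [
    ("SoftwareEngineer", 1, "Software-Engineer", 1),
    ("Programmer", 2, "Programmer", 3),
    ("Designer", 2, "Designer", 3),
    ("StructKnow_Associate", 3, "StructKnow_Associate", 2),
    ("OOKnowledge_Associate", 3, "OOKnowledge_Associate", 2),
]

conflicts = gci_conflicts(MAPPINGS, O1_PARENT, O2_PARENT)
print(conflicts)
```

Under the assumed hierarchies, only the pair formed by the Designer and OOKnowledge_Associate mappings is reported; pairs (3 and 4), (2 and 4) and (2 and 5) pass the depth test but fail the parent-child test, matching the discussion above.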

Table 1 lists several rules for the semantic inconsistencies that occur during the mapping of concepts (cases 1 and 2) and properties (cases 3 and 4). Their occurrences in property hierarchies (datatype and object property hierarchies) are observed far less often in real ontologies than in class hierarchies. In all these scenarios, the system does not automatically follow the initial mappings to build the global merged ontology under the parent-child constraints of both or either of the source ontologies. The system detects the situation and warns the user, and does not execute the mappings to build an inconsistent merged ontology without telling the user about the inconsistencies. The system is nevertheless flexible enough to proceed with building the merged ontology if the user wants, but only after generating warnings.

Table 1 Several cases for semantic inconsistency due to GCIs in ontology hierarchy

In the running example, the semantic inconsistency has raised the issues of circular inconsistency and redundancy of the subclass-of axiom. The redundancy of subclass-of error (case 2 in Table 1) can be observed in O₃ by analyzing the subclass-of axiom between (StructKnow_Associate, SoftwareEngineer), which already has other subclass-of axioms between (StructKnow_Associate, Designer) and (Designer, SoftwareEngineer), as indicated by the is-a relationships (a, and (b, c)) in Fig. 3. Therefore, it is crucial to identify all these scenarios of semantic inconsistency before building the global merged ontology. The algorithm in Fig. 4 helps to identify these situations. However, redundant subsumptions can also occur in other cases without such semantically inconsistent situations. Therefore, the system has to employ a detection mechanism for such redundancies to produce a concise merged ontology.

The detection of redundant subsumption has to analyze the list of mappings; for instance, suppose Mapping(A₁, M₂), Mapping(B₁, N₂) and Mapping(C₁, L₂) exist, where in ontology O₁ there exists the subsumption $C_1 \sqsubseteq B_1 \sqsubseteq A_1$, and in ontology O₂ there exist subsumptions like $N_2 \sqsubseteq M_2$ and $L_2 \sqsubseteq M_2$, but $L_2 \sqsubseteq \neg N_2$. Simply, this means that B is a direct child of A by the subclass-of axiom in O₁, but in O₂, N is an indirect child of M by the inference mechanism. Therefore, the detection algorithm for redundant subsumption has to find, from the list of mappings, the pairs having direct and indirect parent-child relationships between their respective elements, as depicted in the algorithm shown in Fig. 5.
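The redundancy of subclass-of in the running example can be checked with a small reachability test. This is a minimal sketch, not the algorithm of Fig. 5: it works on the subclass-of axioms of an already merged hierarchy rather than on the mapping list, and it assumes axioms are given as (child, parent) pairs. An axiom is reported as redundant when the parent is still reachable from the child without using that axiom.

```python
def redundant_subclass_axioms(axioms):
    """Return the (child, parent) axioms that are implied by a longer is-a path."""
    parents = {}
    for child, parent in axioms:
        parents.setdefault(child, set()).add(parent)

    def reachable_without(edge):
        start, target = edge
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            for p in parents.get(node, ()):
                if (node, p) == edge:
                    continue  # ignore the axiom under test
                if p == target:
                    return True
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return False

    return sorted(edge for edge in set(axioms) if reachable_without(edge))


# Subclass-of axioms of the merged ontology from the running example:
# relationship (a) plus the chain (b, c).
MERGED_AXIOMS = [
    ("StructKnow_Associate", "SoftwareEngineer"),  # (a)
    ("StructKnow_Associate", "Designer"),          # (b)
    ("Designer", "SoftwareEngineer"),              # (c)
]

redundant = redundant_subclass_axioms(MERGED_AXIOMS)
print(redundant)  # [('StructKnow_Associate', 'SoftwareEngineer')]
```

Removing the reported axiom leaves the entailed hierarchy unchanged, which is the sense in which the merged ontology becomes more concise.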

Another type of semantic inconsistency can occur due to conflicts between property subsumptions. In general, there are two types of properties, datatype properties and object properties. Both hold the domain concept to which they belong. A datatype property is like an attribute that takes some value (e.g., string, integer, etc.) in its range, whereas an object property creates a link or relationship between concepts and therefore holds concepts in its range. For consistent mappings between concepts, it is necessary for the mapped concepts to possess properties that do not contradict their domain concepts. A Property Subsumption Criteria (PSC) violation occurs when some properties violate the subsumption criteria, i.e., their range takes the value of some other concept that does not fulfill the subsumption rule with respect to its domain concept. For example, consider the local ontologies in Fig. 6. In ontology Oa, there are many persons that hold the has_Responsibility object property, such as Tester with range concepts (Object Oriented (OO) and Structure Approach (SA)), Web_Analyzer with range concepts (html pages, web artifacts, etc.), and Graphic Examiner with range concepts (images, posters, etc.), who are disjointly responsible for the activities that they control in the software house. But in ontology Ob, there exists only one Tester, who manages all the artifacts and deliverables of the whole company. Therefore, the property has_Responsibility takes range concepts (OO Sw or Web artifacts, etc.). When merging Tester₁ with Tester₂, a semantic inconsistency occurs due to property subsumption violation, as the property takes the value ‘web artifacts’ from Ob, for which Oa has specified disjointness, whereas has_Responsibility can only take values that are not disjoint with the Tester according to its subsumption criteria. Therefore, it breaches the domain of the Tester by incorporating features of other concepts (here, Web_Analyzer) that are disjoint in one (or both) of the local ontologies.

Detecting such a violation requires verifying that the properties of mapped concepts do not hold disjoint concepts (directly or indirectly) as their range concepts (as shown in Fig. 7). If they do, there is an inconsistency, because a concept cannot be subsumed by a concept with which it is disjoint.
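A minimal Python sketch of this verification follows. It is an illustration under stated assumptions rather than the algorithm of Fig. 7: property ranges are modelled as plain sets of range-concept names keyed by (domain concept, property), range concepts are assumed to be matched by name across ontologies, and only directly asserted disjointness is checked. The function name `psc_violations` and the data layout are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumed layout: (domain concept, property name) -> set of range concepts.
Ranges = Dict[Tuple[str, str], Set[str]]


def psc_violations(
    mappings: List[Tuple[str, str]],
    ranges_a: Ranges,
    ranges_b: Ranges,
    disjoint_a: Set[FrozenSet[str]],
) -> List[Tuple[str, str, str, str, str]]:
    """Return (concept_a, concept_b, property, range value, disjoint owner)
    for every range value that the mapped concept in Ob would bring into the
    merge although, in Oa, it belongs to a concept disjoint with concept_a."""
    violations = []
    for ca, cb in mappings:
        for (owner_b, prop), range_b in ranges_b.items():
            if owner_b != cb:
                continue
            for (owner_a, prop_a), range_a in ranges_a.items():
                if prop_a != prop or owner_a == ca:
                    continue
                if frozenset((ca, owner_a)) not in disjoint_a:
                    continue
                for value in sorted(range_b & range_a):
                    violations.append((ca, cb, prop, value, owner_a))
    return violations


# Toy data modelled on the Tester / Web_Analyzer example of Fig. 6.
ranges_a = {
    ("Tester", "has_Responsibility"): {"OO", "SA"},
    ("Web_Analyzer", "has_Responsibility"): {"html_pages", "web_artifacts"},
    ("Graphic_Examiner", "has_Responsibility"): {"images", "posters"},
}
ranges_b = {("Tester", "has_Responsibility"): {"OO", "web_artifacts"}}
disjoint_a = {
    frozenset(("Tester", "Web_Analyzer")),
    frozenset(("Tester", "Graphic_Examiner")),
    frozenset(("Web_Analyzer", "Graphic_Examiner")),
}
found = psc_violations([("Tester", "Tester")], ranges_a, ranges_b, disjoint_a)
```

Extending the sketch to indirect disjointness would mean also checking the ancestors of each concept against the disjointness set.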

4.3 Semantic inconsistency due to constraint dissatisfaction

The global merged ontology GO is free from constraint dissatisfaction when all the concepts in its TBox satisfy the full set of axioms present in the local heterogeneous ontologies, and all the concepts follow the satisfiability criteria with respect to all the constraints imposed on classes or properties, especially the disjoint or overlapping constraints in the local ontologies. Formally, the following definitions express the satisfiability of concepts in the global ontology and semantically inconsistent mappings.

Definition 4

A concept C in the global merged ontology GO, generated by Mapping(C, C′) from local ontologies O₁ and O₂, is satisfiable w.r.t. a TBox ${\cal T}$ if there exists a model ${\cal I}$ of ${\cal T}$ such that $C^{\cal I} \ne \emptyset$; the TBox of the global merged ontology is satisfiable if it admits a model.

Definition 5

Mapping(C, C′) and Mapping(D, D′) suffer from constraint dissatisfaction when they violate the satisfiability rule with respect to the knowledge in local ontologies O₁ and O₂, given that in O₁, C has a model w.r.t. the local TBox ${\cal T}_1$ and in all models of ${\cal T}_1$, $C^{\cal I}$, but in O₂, C′ has a different model w.r.t. the local TBox ${\cal T}_2$ and in all models of ${\cal T}_2$, $C^{\cal I}$.

Definition 6

Mapping(C, C′) and Mapping(D, D′) suffer from an alignment conflict among disjoint relations when they violate the satisfiability rule with respect to the knowledge in local ontologies O₁ and O₂: in O₁, C disjointWith D w.r.t. the local TBox ${\cal T}_1$, and in all models of ${\cal T}_1$, $C^{\cal I}$ is disjoint with $D^{\cal I}$; but in O₂, C′ overlaps D′ w.r.t. the local TBox ${\cal T}_2$, and in all models of ${\cal T}_2$, $C'^{\cal I}$ overlaps $D'^{\cal I}$.

For instance, consider the local ontologies in Fig. 8, where Designer and Programmer are disjoint concepts in the software engineer ontology O₁, to avoid the situation where the same person designs and programs in a wrong way, but in ontology O₂ they are overlapping concepts with a common class Tester between them.

In this scenario, MatchManager suggests the following three mappings: Mappings$_{O1,O2}$ = {(SoftwareEngineer₁, SoftwareEngineer₂), (Programmer₁, Programmer₂), (Designer₁, Designer₂)}. ConsistencyChecker then applies the test criteria for the consistency analysis of the initial mappings and employs the algorithm for the detection of alignment conflict among disjoint relations. In this example, the initial Mapping$_{O1,O2}$(Designer, Designer) and Mapping$_{O1,O2}$(Programmer, Programmer) create an alignment conflict among disjoint relations, as Designer and Programmer are disjoint in O₁ but have a common class in O₂. If the merging system merges these mappings without checking constraint satisfaction, the merged ontology suffers from an inconsistent common class between two disjoint classes. To avoid such a case, the merging system must apply and verify constraints on the concepts and properties so that they stay consistent in the merged ontology.

We identified several possible alignment conflicts among disjoint relations (reported in Table 2) that occur when concepts are disjoint in source ontology O₁ but overlapping in ontology O₂, i.e., when there is a common class (case 1), an equivalence relation (case 2), a parent-child relationship (case 3), or a common instance (case 4) between them.

Table 2 Several cases for the Semantic Inconsistencies due to alignment conflict among disjoint relations in Ontology Hierarchy


The detection criterion for disjoint conflicts analyzes the mappings of concepts that have disjoint axioms among them (see Fig. 9). When it reaches the mapping (e.g., mapping 2) of a concept (e.g., O₁:Programmer) that has a disjoint axiom, it retrieves that concept's disjoint concepts (Designer in this case) and finds their mappings (e.g., O₁:Designer, O₂:Designer) in the mapping list Mappings$_{O1,O2}$. It then checks whether the concepts that are disjoint in O₁ are overlapping in ontology O₂. In case of inconsistency, the system issues a warning and does not proceed with merging these mappings, as they lead to the 'common class between disjoint classes' error. It then follows the repair mechanism by either placing the common class or preserving the disjoint knowledge in the merged ontology.
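The detection step and the four overlap cases can be illustrated with the following Python sketch. It is not the algorithm of Fig. 9: the data structures, the one-to-one mapping dictionary, and the names `overlap_reason` and `disjoint_conflicts` are assumptions introduced for illustration, and the order in which the cases are tested is arbitrary.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Hierarchy = Dict[str, Set[str]]  # concept -> directly asserted parents


def ancestors_or_self(parents: Hierarchy, concept: str) -> Set[str]:
    """Return `concept` together with all of its ancestors."""
    seen, stack = {concept}, [concept]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def overlap_reason(c: str, d: str, parents: Hierarchy,
                   equivalents: Set[FrozenSet[str]],
                   instances: Dict[str, Set[str]]) -> Optional[str]:
    """Name the way in which c and d overlap in O2, or None if they do not."""
    if frozenset((c, d)) in equivalents:
        return "equivalence"
    if d in ancestors_or_self(parents, c) or c in ancestors_or_self(parents, d):
        return "parent-child"
    for x in parents:  # every concept that has at least one parent
        if x not in (c, d):
            up = ancestors_or_self(parents, x)
            if c in up and d in up:
                return "common class"
    if instances.get(c, set()) & instances.get(d, set()):
        return "common instance"
    return None


def disjoint_conflicts(mappings: List[Tuple[str, str]],
                       disjoint1: List[Tuple[str, str]],
                       parents2: Hierarchy,
                       equivalents2: Set[FrozenSet[str]] = frozenset(),
                       instances2: Optional[Dict[str, Set[str]]] = None):
    """Return (c, d, reason) for O1-disjoint pairs whose images overlap in O2."""
    image = dict(mappings)  # assumes each O1 concept is mapped at most once
    conflicts = []
    for c, d in disjoint1:
        if c in image and d in image:
            reason = overlap_reason(image[c], image[d], parents2,
                                    equivalents2, instances2 or {})
            if reason:
                conflicts.append((c, d, reason))
    return conflicts


# Toy data modelled on Fig. 8: Tester is a common subclass in O2.
mappings = [("SoftwareEngineer", "SoftwareEngineer"),
            ("Programmer", "Programmer"), ("Designer", "Designer")]
parents2 = {"Programmer": {"SoftwareEngineer"},
            "Designer": {"SoftwareEngineer"},
            "Tester": {"Programmer", "Designer"}}
conflicts = disjoint_conflicts(mappings, [("Designer", "Programmer")], parents2)
```

A merging system built along these lines would withhold the flagged mappings and hand them to the repair step instead of merging them.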

5 Patterns for resolving semantic inconsistency and redundancy

This section presents several patterns for resolving semantic inconsistency and redundancy in the ontology merging of heterogeneous ontologies.

5.1 Resolving circular inconsistency in the ontology hierarchy

Circular inconsistency can occur during the merging of source ontologies. For example, consider the ontologies in Fig. 10, where the concept Committee is defined as a child of the concept User in ontology O1, but in ontology O2, User is defined as a Committee member. This creates a circular inconsistency between User and Committee in the merged ontology. Two solutions can be applied for its automatic resolution. The first resolves the conflict on the basis of the ontology preference that the user gives as an input to the merging process. The second preserves the direct subclass-of axioms in the merged ontology (i.e., Person with both children, User as in O1 and Committee as in O2) and removes the subclass-of axioms between the child concepts that cause the cycle. This results in the Person concept having User and Committee as subconcepts.

When a GCI conflict creates a circular inconsistency in the merged ontology, our approach identifies the situation in the initial stages of the mapping detection phase. The solution is to apply a pattern in such a way that the sub-hierarchies of both ontologies are not disturbed and remain concise. The system resolves this situation automatically, either by applying conflict resolution based on the preferences or by applying the pattern below (see Fig. 11). Such a pattern is more suitable for querying instances of ontologies, as it does not disturb the huge repositories underlying the ontological concepts.

The following are the steps of this pattern. It is described in three points, but in practice, steps 1 and 3 are applied to resolve the inconsistency.

1. Merge concept C1:level1 with C1:level2, and C2:level1 with C2:level2. Then delete the subclass-of axioms between C1 and C2 that create the circular inconsistency. After step 1, the ontologies look as in Fig. 11a.
2. The problem with the ontologies after step 1 is that they lack connectivity of S1 with C2 and of S2 with C1. To restore it, subclass-of axioms must be established between corresponding classes, or instance-of axioms between instances and classes, as in Fig. 11b. The number of corresponding entities can be large, so adding such relations creates another problem, namely inconciseness in the merged ontology.
3. Introduce a new intermediate concept as a subconcept of C1 and C2, and attach S1 and S2 to it with appropriate axioms (i.e., subclass-of or instance-of), as in Fig. 11c. It is hard to choose an appropriate name for a new concept without a human expert, so human intervention is required to ensure the overall semantics of the new concept in the merged ontology. Otherwise, as described in step 2, S1 must be attached to C2 and S2 to C1 with the respective axioms to ensure the completeness of the knowledge modelled in the merged ontology. The expert can thus decide how to execute this step, depending on the source ontologies and the corresponding instance repository.
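The detection of the cycle and the axiom deletion of step 1 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that mapped concepts already share one name in both hierarchies, it removes only directly asserted subclass-of axioms between the two conflicting concepts, and it leaves steps 2 and 3 (reconnecting S1 and S2) to the expert. The names `is_below`, `circular_pairs` and `merge_without_cycles` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Hierarchy = Dict[str, Set[str]]  # concept -> directly asserted parents


def is_below(parents: Hierarchy, concept: str, target: str) -> bool:
    """True if `target` is a strict ancestor of `concept`."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        p = stack.pop()
        if p == target:
            return True
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return False


def circular_pairs(parents1: Hierarchy, parents2: Hierarchy) -> List[Tuple[str, str]]:
    """Pairs (a, b) where a is below b in O1 but b is below a in O2."""
    concepts = sorted(set(parents1) | set(parents2))
    return [(a, b) for a in concepts for b in concepts
            if a != b and is_below(parents1, a, b) and is_below(parents2, b, a)]


def merge_without_cycles(parents1: Hierarchy, parents2: Hierarchy) -> Hierarchy:
    """Union both hierarchies, then drop the direct subclass-of axioms
    between each pair of concepts that would form a cycle."""
    merged: Hierarchy = {}
    for parents in (parents1, parents2):
        for child, ps in parents.items():
            merged.setdefault(child, set()).update(ps)
    for a, b in circular_pairs(parents1, parents2):
        merged.get(a, set()).discard(b)
        merged.get(b, set()).discard(a)
    return merged


# Toy data modelled on Fig. 10: Committee is under User in O1,
# while User is under Committee in O2.
o1 = {"User": {"Person"}, "Committee": {"User"}}
o2 = {"Committee": {"Person"}, "User": {"Committee"}}
pairs = circular_pairs(o1, o2)
merged = merge_without_cycles(o1, o2)
```

On this toy input the merged hierarchy keeps Person as the parent of both User and Committee, which matches the second resolution option described for Fig. 10.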

5.2 Resolving redundancy of subclass-of in the ontology hierarchy

Inconciseness (or redundancy) in the merged ontology can occur during the merging of source ontologies. For example, consider the ontologies in Fig. 12, where merging the concept Ca:Administrator with Cb:Administrator produces a redundant subclass-of axiom in the class hierarchy of the merged ontology. In the source ontologies, Ca:Administrator is a subclass of Person, whereas Cb:Administrator is a subclass of User. In such a situation, our system applies the reasoning mechanisms, preserves the semantics of the second ontology, and ignores the subclass axiom of the first ontology. The reason is that, when a concept A is-a B and B is-a C, reasoning can deduce that A is-a C. This rule holds here, as Administrator is-a User is-a Person, and it preserves the conciseness of information under the semantics of the O2 ontology. Similarly, in complex ontologies, datatype and object properties can have hierarchies. The same criterion therefore applies when building them in the merged ontology, avoiding redundant sub-property axioms.

Due to semantic heterogeneities, this situation can occur with the children of both source ontologies (see Fig. 13). The same rule is applied to both concepts, i.e., Administrator and Author. Step 2 applies to the concept Author, which is copied into the merged ontology under the concept User. This results in a hierarchy that contains the Author, Administrator, and Committee concepts as children of the User concept in the merged ontology.

The system follows a pattern to resolve this situation in the automatic merging of heterogeneous ontologies. The pattern is shown diagrammatically in Fig. 14 and explained in the three steps below.

1. Merge the corresponding concepts (C1 and C2), (C3 and C4) and (C5 and C6) found by the initial mappings to form the merged ontology. Copy all other unmapped concepts to the merged ontology.
2. After step 1, apply the reasoning mechanisms to the merged ontology and analyse the transitivity of subclass-of axioms. When axioms are found such that (i) A is-a B and B is-a C and (ii) A is-a C, deduce that axiom (ii) is redundant and is covered by the axioms in (i).
3. Remove the redundant axioms that are covered by the transitive property of subclass-of axioms.
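Steps 2 and 3 amount to a transitive reduction of the asserted subclass-of axioms. The Python sketch below illustrates that idea under the assumption that the merged hierarchy is acyclic and is stored as a dictionary from each concept to its directly asserted parents; the function names are illustrative and are not taken from the system.

```python
from typing import Dict, Set

Hierarchy = Dict[str, Set[str]]  # concept -> directly asserted parents


def ancestors(parents: Hierarchy, concept: str) -> Set[str]:
    """Return every strict ancestor of `concept` (transitive subclass-of)."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen


def remove_redundant_subclass_axioms(parents: Hierarchy) -> Hierarchy:
    """Drop every axiom 'A is-a C' that is already implied by
    'A is-a B' and 'B is-a ... C' (assumes an acyclic hierarchy)."""
    reduced = {c: set(ps) for c, ps in parents.items()}
    for child, ps in parents.items():
        for p in ps:
            # p is redundant if some other direct parent q already reaches p.
            if any(q != p and p in ancestors(parents, q) for q in ps):
                reduced[child].discard(p)
    return reduced


# Merged hierarchy from the Administrator example (Fig. 12):
# 'Administrator is-a Person' comes from one source ontology,
# 'Administrator is-a User is-a Person' from the other.
merged = {"Administrator": {"Person", "User"}, "User": {"Person"}}
concise = remove_redundant_subclass_axioms(merged)
```

The reduced hierarchy keeps Administrator under User only, and Person remains reachable by inference.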

5.3 Resolving redundant disjoints axioms in the ontology hierarchy

During the merging of heterogeneous ontologies, disjoint axioms in source ontologies need special attention. On the one hand, omitting them creates incompleteness in the global ontology; on the other hand, they can create inconciseness and redundancy. Redundant disjointness can arise in two ways, directly or indirectly among the disjoint concepts. Redundancy of direct or indirect disjointness occurs when the merged ontology contains disjoint axioms between concepts that are already disjoint by themselves or through inheritance from their parents. For example, the disjoint axiom (C $\sqsubseteq \neg$ D) in O1 and the disjoint axiom (C′ $\sqsubseteq \neg$ E′) in O2 create redundant disjointness in the merged ontology when C is merged with C′ and D is merged with D′, where D′ subsumes E′. This is the case of direct redundancy of disjointness, as the parent concept is involved in creating the inconciseness. An example of this situation is shown in Fig. 15.

The other situation is indirect disjointness in the merged ontology. For example, the disjoint axiom (C $\sqsubseteq \neg$ D) in O1 and the disjoint axiom (E′ $\sqsubseteq \neg$ F′) in O2 create redundant disjointness in GO when C is merged with C′ and D is merged with D′, where C′ subsumes E′ and D′ subsumes F′. This is the case of indirect redundancy of disjointness, as the child concepts create redundancy when their parents are already disjoint. Such a situation is depicted in Fig. 16.

Our system follows a pattern to resolve this situation in the automatic merging of heterogeneous ontologies. The pattern is shown diagrammatically in Fig. 17 and explained in the three steps below.

1. Merge the corresponding concepts (C and C′) and (D and D′) found by the initial mappings to form the merged ontology. Copy all other unmapped concepts to the merged ontology.
2. After step 1, apply the reasoning mechanisms to the merged ontology and analyse the inference of disjoint-with axioms. When axioms are found such that (i) C is disjoint with D and (ii) A is disjoint with D, where A is a subconcept of C, deduce that axiom (ii) is redundant and is covered by axiom (i), because A is indirectly disjoint with D through its parent's disjoint axiom.
3. Remove the redundant disjoint axioms that are covered by the inheritance of disjointness from parent concepts.
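Steps 2 and 3 can be illustrated with a short Python sketch that treats a disjoint axiom as redundant when another asserted disjoint axiom between ancestors (or the concepts themselves) already implies it. The representation and the function names are assumptions made for illustration; the sketch covers both the direct and the indirect case described above and assumes a consistent, acyclic hierarchy.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Hierarchy = Dict[str, Set[str]]  # concept -> directly asserted parents


def ancestors_or_self(parents: Hierarchy, concept: str) -> Set[str]:
    """Return `concept` together with all of its ancestors."""
    seen, stack = {concept}, [concept]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def remove_redundant_disjoints(
    disjoints: Iterable[Tuple[str, str]], parents: Hierarchy
) -> Set[FrozenSet[str]]:
    """Keep only the disjoint axioms that are not inherited from another
    asserted disjoint axiom higher up in the hierarchy."""
    axioms = {frozenset(pair) for pair in disjoints}
    kept = set()
    for axiom in axioms:
        a, d = tuple(axiom)
        covered = any(
            frozenset((x, y)) in axioms and frozenset((x, y)) != axiom
            for x in ancestors_or_self(parents, a)
            for y in ancestors_or_self(parents, d)
        )
        if not covered:
            kept.add(axiom)
    return kept


# Direct case: C disjoint D and C disjoint E, where D subsumes E.
direct = remove_redundant_disjoints([("C", "D"), ("C", "E")], {"E": {"D"}})

# Indirect case: C disjoint D and E disjoint F,
# where C subsumes E and D subsumes F.
indirect = remove_redundant_disjoints(
    [("C", "D"), ("E", "F")], {"E": {"C"}, "F": {"D"}}
)
```

In both toy cases only the axiom between C and D survives, since the other axiom follows from it by inheritance.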

6 Complexity analysis and discussion

These types of semantic inconsistencies due to structural differences arise when the same information is modeled differently during the classification of knowledge into concepts and properties in the local ontology hierarchies, owing to the different pragmatics of ontologists, differences in domain scope, and different levels of knowledge granularity. The time complexity of the presented algorithms is O(n), where n is the size of the mapping list, which gives good performance on real-world corpora. For the experiment we chose large taxonomies: Mouse_Anatomy (2744 classes), NCI_Anatomy (3304 classes), and Gene ontologies (29,534 concepts). We inserted circular inconsistencies and subsumption redundancies at various concepts and at various depth levels in the ontology pairs. Table 3 shows the measured time in minutes for both approaches, i.e., the Traversal based approach (TbA), as implemented in the previous research prototype, and the Depth Analysis and Pruning based approach (DAPbA), the current prototype implementation. Compared with the anatomy ontologies, erroneous mappings in the Gene Ontology took more time, because it has many concepts with multiple parent subsumptions. From this we also conclude that the more complex the ontology structure, the more time the mapping identification takes. The traversal-based algorithm has to search each path for cycles or redundant subsumptions. Because our algorithm is based on depth analysis and pruning, it initially takes some time to compute the depths of all concepts while it creates the ontology trees in memory. During mapping validation, pruning based on depth analysis then reduces the number of suspected erroneous mappings, which saves time and cost for the overall system.

In addition, compared with our previous approach (without concept depth analysis), where we detected such semantic inconsistencies with the help of a hidden global merged ontology formed from the initial mappings, this methodology reduces the system's time and memory consumption by 30 to 45%. The main reason is that it avoids building an intermediate ontology and applies the test criteria for inconsistency detection directly to the mapping list. We have observed that these inconsistent situations are very common in real-world ontologies, due to the inherent semantic conflicts between ontologies produced by different communities over the internet. Instant validation of initial mappings is highly useful for users who are not experts in building ontologies but want to build ontologies for their domains by reusing several existing domain ontologies. Otherwise, such users may face various consequences, as they may not be familiar with ontological errors of these kinds. Our reliable mapping and merging algorithms produce consistent mappings that provide a common, accurate, consistent and coherent global layer through which several heterogeneous local ontologies can be accessed and can exchange information in a semantically sound manner.

Table 3 Measuring efficiency of proposed algorithms based on depth analysis and pruning based approach (DAPbA)


7 Conclusion and future direction

The challenging task of automatic ontology merging has received great attention in recent years. This paper presents the DKP automatic ontology merging system, which exploits linguistic matching and semantic-based formal definition analysis to find correspondences between concepts, so that a larger pool of knowledge and information can be integrated to support reliable communication and reuse. The main contribution of the paper is the algorithm for the detection of semantic inconsistency that originates when concepts in local ontologies contradict each other according to their subsumption criteria. Detecting conflicts among Generalized Concept Inclusions, Property Subsumption Criteria and the Constraint Satisfaction Mechanism between local ontologies results in a global ontology free from 'circulatory error in class/property hierarchy', 'common class/instance between disjoint classes', 'redundancy of disjoint relations' and 'redundancy of subclass/subproperty of relations' errors, and from other types of 'semantic inconsistency' that occur due to the wrong placement of concepts from local heterogeneous ontologies in the merged ontology. Our algorithms detect inconsistencies from the initial mappings in the early stages of ontology merging, so that only consistent mappings contribute to the TBox and ABox of the global merged ontology, with a consistent set of generalized concept inclusion axioms. Early automatic detection of inconsistency not only saves time and resources, but also lessens the user intervention needed to ensure the consistency of the merged ontology. We have implemented the algorithms for detecting such inconsistencies and evaluated the system on real-world ontologies. We have also provided ontological patterns for the resolution of semantic inconsistency in the automatic merging of ontologies. Embedding the inconsistency detection algorithms inside the ontology merging system gives encouraging outcomes in terms of precision of results, reduced dependence on human experts, computational efficiency, and a good level of automatic consistency checking.

There are several future directions for our research. One ongoing line of work is to integrate algorithms for the consistency checking of the instance repository, so that the ABox of the generated merged ontology is free from errors. Another is to apply a variety of architectural, optimization, and design principles to improve the performance of our system, and to enhance it further based on other test criteria (as presented in Fahad and Qadir 2008) that lead towards a consistent merged ontology while avoiding a high level of user intervention in the merging process. A further research direction is to integrate a multilingual ontology translation technique to bootstrap semantic interoperability between multilingual ontologies. Nowadays, merging multilingual ontologies developed in different well-known languages, such as Chinese, Spanish, French, and German, requires tools that overcome language-level barriers in achieving interoperability among heterogeneous multi-vendor systems. There is some research going on in this direction, such as the development of EuroWordNet for the most popular languages of Europe. In the near future, we hope to integrate this feature into our ontology merging system so that one can manage and integrate multilingual ontologies and produce output in one's native language.

References

Bouquet, P., Serafini, L., Zanobini, S., & Sceffer, S. (2006). Bootstrapping semantics on the web: Meaning elicitation from schemas. In Proc. of 15th international world wide web conference (pp. 505–512).
Bruijn, J.d., Ehrig, M., Feier, C., Martín-Recuerda, F., Scharffe, F., & Weiten, M. (2006). Ontology mediation, merging and aligning. In Semantic web technologies. Wiley.
Chalupsky, H. (2000). OntoMorph: A translation system for symbolic knowledge. In A. G. Cohn, F. Giunchiglia, & B. Selman (Eds.), 7th international conference on knowledge representation and reasoning (KR’00). Breckenridge, Colorado (pp. 471–482). San Francisco: Morgan Kaufmann.
Doan, A., Madhavan, J., Domingos, P., & Halevy, A. (2004). Ontology matching: A machine learning approach. Handbook on ontologies in info. systems (pp. 397–416). Springer.
Ehrig, M., & Staab, S. (2004). QOM—Quick Ontology Mapping. In Proc. of the third international semantic web conference, LNCS 3298 (pp. 683–696). Springer.
Euzenat, J., & Shvaiko, P. (2007). Ontology matching. Berlin: Springer.
Euzenat, J., & Valtchev, P. (2004). Similarity-based ontology alignment in OWL-Lite. In Proc. of 16th ECAI-04, Valencia, Spain (pp. 333–337).
Fahad, M., Moalla, N., & Bouras, A. (2011). Towards ensuring satisfiability of merged ontology. In Proceedings of 10th int’l conference on computation science, Singapore. ICCS 2011, Procedia Computer Science (Vol. 4, pp. 2216–2225).
Fahad, M., Moalla, N., Bouras, A., Qadir, M. A., & Farukh, M. (2010). Disjoint-knowledge analysis and preservation in ontology merging process. In ICSEA, fifth international conference on software engineering advances (pp. 422–428).
Fahad, M., & Qadir, M. A. (2008). A framework for ontology evaluation. In Proceedings of 16th int’l conference on conceptual structures, France. ICCS Supplement, Ceur-ws (Vol. 354, pp. 149–158).
Fahad, M., & Qadir, M. A. (2009). Similarity computation by ontology merging system: DKP-OM. In Proc. of 2nd international conference on computer, control and communication, Karachi, Pakistan (pp. 1–6). IEEE Press.
Fahad, M., Qadir, M. A., Noshairwan, M. W., & Iftakhir, N. (2007). DKP-OM: A semantic based ontology merger. In Proceedings of 3rd international conference I-Semantics 2007, Graz, Austria (pp. 313–322). J.UCS.
Giunchiglia, F., Shvaiko, P., & Yatskevich, M. (2004). S-Match: An algorithm and implementation of semantic matching. In Proc. of 1st European semantic web symposium, LNCS 3053 (pp. 61–75). Springer.
Jean-Mary, Y. R., Shironoshita, E. P., & Kabuka, M. R. (2009). Ontology matching with semantic verification. Web Semantics: Science, Services and Agents on the WWW, 7(1), 235–251 (Elsevier).
Kalfoglou, Y., & Schorlemmer, M. (2003). If-map: An ontology mapping method based on information flow theory. In Journal of Data Semantics, LNCS, 2800 (pp. 98–127). Springer.
Klein, M. (2001). Combining and relating ontologies: An analysis of problems and solution. In Proc. of workshop on ontologies and information sharing, Seattle, USA (pp. 53–62).
Kotis, K., Vouros, G. A., & Stergiou, K. (2006). Towards automatic merging of domain ontologies: The HCONE-merge approach. In Web semantics: Science, services and agents on the world wide web (Vol. 4(1), pp. 60–79). Elsevier.
Mascardi, V., Locoro, A., & Rosso, P. (2010). Automatic ontology matching via upper ontologies: A systematic evaluation. IEEE Transactions on Knowledge and Data Engineering, 2(5), 609–623 (IEEE Press).
McGuinness, D. L., Fikes, R., Rice, J., & Wilder, S. (2000). An environment for merging and testing large ontologies. In Proc. of the 7th international conference on principles of knowledge representation and reasoning, Colorado, USA (pp. 483–493).
Mitra, P., & Wiederhold, G. (2002). Resolving terminological heterogeneity in ontologies. In Proc. of workshop on ontologies and semantic interoperability at the 15th ECAI, Lyon, France (pp. 45–50).
Noy, N. F., & Musen, M. A. (2003). The PROMPT suite: Interactive tools for ontology merging and mapping. International Journal of Human-Computer Studies, 59(6), 983–1024 (Elsevier).
Stumme, G., & Mädche, A. (2001). FCA-merge: Bottom-up merging of ontologies. In Proc. of 7th int’l. joint conference on artificial intelligence, Seattle, USA (pp. 225–230).


Author information

Authors and Affiliations

Decision & Information Sciences for Production Systems (DISP), CERRAL CENTER, University of Lyon2, Bron, 69676, France
Muhammad Fahad, Nejib Moalla & Abdelaziz Bouras

Authors

Muhammad Fahad
Nejib Moalla
Abdelaziz Bouras

Corresponding author

Correspondence to Muhammad Fahad.

About this article

Cite this article

Fahad, M., Moalla, N. & Bouras, A. Detection and resolution of semantic inconsistency and redundancy in an automatic ontology merging system. J Intell Inf Syst 39, 535–557 (2012). https://doi.org/10.1007/s10844-012-0202-y

Received: 14 July 2011
Revised: 04 February 2012
Accepted: 20 March 2012
Published: 29 April 2012
Issue Date: October 2012
DOI: https://doi.org/10.1007/s10844-012-0202-y
