Defining and Detecting Complex Changes on RDF(S) Knowledge Bases

Galani, Theodora; Papastefanatos, George; Stavrakas, Yannis; Vassiliou, Yannis

doi:10.1007/s13740-021-00136-9

Original Article
Published: 02 November 2021

Volume 10, pages 367–398, (2021)

Journal on Data Semantics

Theodora Galani ORCID: orcid.org/0000-0001-6063-5487¹,
George Papastefanatos²,
Yannis Stavrakas² &
…
Yannis Vassiliou¹

Abstract

The dynamic nature of web data brings forward the need for maintaining data versions as well as identifying changes between them. In this paper, we deal with problems regarding understanding evolution, focusing on RDF(S) knowledge bases, as RDF is a de facto standard for representing data on the web. We argue that revisiting past snapshots or the differences between them is not enough for understanding how and why data evolved. Instead, changes should be treated as first-class citizens. In our view, this involves supporting semantically rich, user-defined changes, called complex changes, as well as identifying the relations between them. We present our perspective regarding complex changes, formally define a declarative language for defining complex changes on RDF(S) knowledge bases, and show how this language is used to detect complex change instances among dataset versions, which can then be queried for analyzing evolution. The approach has been extensively evaluated in terms of language expressivity and detection performance on both artificial and real data.

Availability of data and material

DBpedia datasets analyzed during this study are publicly available, while the artificial datasets can be generated by the EvoGen tool (GNU General Public License).

Code availability

The software implemented during this study is not publicly available.

References

Antoniazzi F, Viola F (2018) RDF graph visualization tools: a survey. In: 23rd conference of open innovations association (FRUCT).
Auer S, Herre H (2007) A versioning and evolution framework for RDF knowledge bases. In: Perspectives of systems informatics
Berners-Lee T, Connolly D (2004) Delta: an ontology for the distribution of differences between RDF graphs. http://www.w3.org/DesignIssues/Diff (version: 2006-05-12)
Bobed C, Maillot P, Cellier P, Ferré S (2020) Data-driven assessment of structural evolution of RDF graphs. Semantic Web 11(5):831–853
Franconi E, Meyer T, Varzinczak I (2010) Semantic diff as the basis for knowledge base versioning. In: NMR.
Galani T, Papastefanatos G, Stavrakas Y (2016) A language for defining and detecting interrelated complex changes on RDF(S) knowledge bases. In: ICEIS
Galani T, Stavrakas Y, Papastefanatos G, Flouris G (2015) Supporting complex changes in RDF(S) knowledge bases. In: MEPDaW-15
Gonzalez L, Hogan A (2018) Modeling dynamics in semantic web knowledge graphs with formal concept analysis. In: WWW
Harris S, Seaborne A (2013) SPARQL query language for RDF. W3C recommendation. W3C.
Kaminski M, Kostylev EV, Cuenca Grau B (2017) Query nesting, assignment, and aggregation in SPARQL 1.1. ACM TODS 42(3).
Klein M (2004) Change management for distributed ontologies. Ph.D. thesis, Vrije University
Maillot P, Bobed C (2018) Measuring structural similarity between RDF graphs. In: SIGAPP
Meimaris M (2016) EvoGen: a generator for synthetic versioned RDF. In: EDBT/ICDT workshops.
Meimaris M, Papastefanatos G (2016) The EvoGen benchmark suite for evolving RDF data. In: MEPDaW/LDQ in ESWC
Noy NF, Musen M (2002) PromptDiff: a fixed-point algorithm for comparing ontology versions. In: AAAI
Papastefanatos G, Stavrakas Y, Galani T (2013) Capturing the history and change structure of evolving data. In: DBKDA
Papavasileiou V, Flouris G, Fundulaki I, Kotzinos D, Christophides V (2013) High-level change detection in RDF(S) KBs. ACM Trans Database Syst 38(1):1–42
Perez J, Arenas M, Gutierrez C (2009) Semantics and complexity of SPARQL. ACM TODS 34(3):1–45
Plessers P, De Troyer O, Casteleyn S (2007) Understanding ontology evolution: a change detection approach. J Web Sem 5(1):39–49
Roussakis Y, Chrysakis I, Stefanidis K, Flouris G, Stavrakas Y (2015) A flexible framework for understanding the dynamics of evolving RDF datasets. In: ISWC.
Singh A, Brennan R, O’Sullivan D (2019) DELTA-LD: a change detection approach for linked datasets. In: MEPDAW in ESWC
Stojanovic L (2004) Methods and tools for ontology evolution. Ph.D. thesis, University of Karlsruhe
Troullinou G, Roussakis G, Kondylakis H, Stefanidis K, Flouris G (2016) Understanding ontology evolution beyond deltas. In: MEPDAW in EDBT/ICDT
Völkel M, Winkler W, Sure Y, Kruk S, Synak M (2005) SemVersion: a versioning system for RDF and ontologies. In: ESWC.
Guo Y, Pan Z, Heflin J (2005) LUBM: a benchmark for OWL knowledge base systems. J Web Semant 3(2–3):158–182
Zeginis D, Tzitzikas Y, Christophides V (2011) On computing deltas of RDF/S knowledge bases. ACM Trans Web 5:1–36

Funding

This research is partially funded by the H2020 NEANIAS project (No. 863448).

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, NTUA, Zografou, Athens, Greece
Theodora Galani & Yannis Vassiliou
RC ATHENA, Artemidos 6 & Epidavrou, Marousi, Greece
George Papastefanatos & Yannis Stavrakas

Corresponding author

Correspondence to Theodora Galani.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Simple Changes

Add_Type_Class(a)	Add object a of type rdfs:Class
Delete_Type_Class(a)	Delete object a of type rdfs:Class
Rename_Class(a,b)	Rename class a to b
Merge_Classes(A,b)	Merge classes contained in A into b
Merge_Classes_Into_Existing(A,b)	Merge classes in A into b, b ∈ A
Split_Class(a,B)	Split class a into classes contained in B
Split_Class_Into_Existing(a,B)	Split class a into classes in B, a ∈ B
Add_Type_Property(a)	Add object a of type rdf:Property
Delete_Type_Property(a)	Delete object a of type rdf:Property
Rename_Property(a,b)	Rename property a to b
Merge_Properties(A,b)	Merge properties contained in A into b
Merge_Properties_Into_Existing(A,b)	Merge properties in A into b, b ∈ A
Split_Property(a,B)	Split property a into properties contained in B
Split_Property_Into_Existing(a,B)	Split a into properties in B, a ∈ B
Add_Type_Individual(a)	Add object a of type rdfs:Resource
Delete_Type_Individual(a)	Delete object a of type rdfs:Resource
Merge_Individuals(A,b)	Merge individuals contained in A into b
Merge_Individuals_Into_Existing(A,b)	Merge A into b, b ∈ A
Split_Individual(a,B)	Split individual a into individuals in B
Split_Individual_Into_Existing(a,B)	Split a into individuals in B, a ∈ B
Add_Superclass(a,b)	Parent b of class a is added
Delete_Superclass(a,b)	Parent b of class a is deleted
Add_Superproperty(a,b)	Parent b of property a is added
Delete_Superproperty(a,b)	Parent b of property a is deleted
Add_Type_To_Individual(a,b)	Type b of individual a is added
Delete_Type_From_Individual(a,b)	Type b of individual a is deleted
Add_Property_Instance(a1,a2,b)	Add property instance of property b
Delete_Property_Instance(a1,a2,b)	Delete instance of property b
Add_Domain(a,b)	Domain b of property a is added
Delete_Domain(a,b)	Domain b of property a is deleted
Add_Range(a,b)	Range b of property a is added
Delete_Range(a,b)	Range b of property a is deleted
Add_Comment(a,b)	Comment b of object a is added
Delete_Comment(a,b)	Comment b of object a is deleted
Change_Comment(u,a,b)	Change comment of resource u from a to b
Add_Label(a,b)	Label b of object a is added
Delete_Label(a,b)	Label b of object a is deleted
Change_Label(u,a,b)	Change label of resource u from a to b

Appendix 2: Complex Change Detection Correctness

Below we prove the correctness of the detection algorithm in Sect. 5 with respect to the semantics of the complex change language. First, a subset of the proposed language is shown to have semantics equivalent to those of a subset of SPARQL, whose semantics are defined in Perez et al. [18] and Kaminski et al. [10]. Then the remaining features are added step by step; their semantics are implemented by applying Algorithm 3 to the result mappings of a SPARQL graph pattern.

Step 1. Consider the subset of the proposed complex change language which involves only changes with cardinalities one and "?", scalar parameters and filter expressions on scalar parameters. Complex change semantics are defined given a set of change instances \(I\) and SPARQL semantics given an RDF graph \(D\). Let \(D\) contain the RDF representation of \(I\) based on the vocabulary presented in Sect. 5.2.

(1) The abstract syntax of the proposed language is by definition equivalent to the one proposed for SPARQL in Perez et al. [18], assuming that a graph pattern involves triples for changes, except that: (a) the UNION operator is not considered; (b) the right operand of OPT must be a graph pattern corresponding to a primitive change pattern, a filter primitive change pattern, or an optional change pattern involving only primitive change patterns, filter primitive change patterns, or optional change patterns with operands of these same types; (c) the right operand of OPT may be a triple that involves an optional variable \(xOPT\) (recall that if \(xOPT \in dom(\mu_{c})\), then either \(\mu_{c}(xOPT) = \varnothing\) or \(\mu_{c}(xOPT) \ne \varnothing\)). All built-in filter expressions of the complex change language are SPARQL built-in filter expressions as well. For a complete SPARQL feature list, see Harris and Seaborne [9].

(2) The semantics of the proposed language are by definition equal to SPARQL semantics as in Perez et al. [18] for the syntax in (1), since they are made up of semantically equivalent operators applied on equivalent data in the same sequence.

Algorithm 3 (with the change variables as grouping variables) materializes the change instances by performing a trivial grouping, in which each SPARQL result mapping forms its own group and yields a new complex change instance. Overall, \([\![\text{change pattern}]\!]_{I} = \mathit{Algorithm3}\left([\![\text{graph pattern}]\!]_{D}\right)\).

Step 2. Augment step 1 with set parameters. Consider a change pattern with a set variable \(X\), evaluated to a set of mappings \(\Omega_{c}\) with \(\mu_{c} \in \Omega_{c}\). Since SPARQL does not support set variables, the graph pattern corresponding to the change pattern involves a scalar variable \(x\) in place of \(X\). Evaluating the graph pattern results in a set of mappings \(\Omega\) with \(\mu \in \Omega\). It holds that \(dom(\mu_{c}) - \{X\} = dom(\mu) - \{x\}\). Based on step 1, for each \(\mu_{c} \in \Omega_{c}\) there is a \(\mu \in \Omega\) such that \(\mu_{c}(y) = \mu(y)\) for every \(y \in dom(\mu_{c}) - \{X\}\). By the definition of \(\mu_{c}\) for a set parameter, \(\mu_{c}(X) = \bigcup_{i=1,\dots,n} \mu_{i}(x)\), taken over all \(\mu_{i}\) with \(\mu_{c}(y) = \mu_{i}(y)\) for every \(y \in dom(\mu_{c}) - \{X\}\), or simply for every such \(y\) that is a change variable. Optional set variables are handled similarly. Therefore, the complex change semantics equal the SPARQL semantics of step 1 plus Algorithm 3, which implements the set variable semantics: \([\![\text{change pattern}]\!]_{I} = \mathit{Algorithm3}\left([\![\text{graph pattern}]\!]_{D}\right)\).
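
The merging of mappings described in step 2 can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 3, which is defined in Sect. 5 and not reproduced here. The function name `collect_set_parameter`, the representation of a mapping as a dictionary, and the sample values are assumptions made purely for illustration. Mappings that agree on every variable other than the scalar variable are merged, and the values bound to the scalar variable are unioned into the set parameter.

```python
from collections import defaultdict


def collect_set_parameter(mappings, scalar_var, set_var):
    """Merge mappings that agree on all variables except `scalar_var`.

    Each mapping is a dict from variable name to value (a hypothetical
    stand-in for a SPARQL result mapping). The values bound to `scalar_var`
    are unioned into a frozenset bound to `set_var`.
    """
    groups = defaultdict(set)
    for mu in mappings:
        # Group key: the bindings of every variable other than the scalar one.
        key = frozenset((v, val) for v, val in mu.items() if v != scalar_var)
        values = groups[key]  # creates the group even if scalar_var is unbound
        if scalar_var in mu:
            values.add(mu[scalar_var])
    merged = []
    for key, values in groups.items():
        mu_c = dict(key)
        mu_c[set_var] = frozenset(values)
        merged.append(mu_c)
    return merged


# Hypothetical result mappings of a graph pattern with scalar variable x.
omega = [
    {"b": "ex:Person", "x": "ex:Student"},
    {"b": "ex:Person", "x": "ex:Employee"},
    {"b": "ex:Agent", "x": "ex:Robot"},
]
omega_c = collect_set_parameter(omega, "x", "X")
for mu_c in omega_c:
    print(mu_c["b"], sorted(mu_c["X"]))
```

In this sketch a mapping in which the scalar variable is unbound yields an empty set, which is one possible reading of how an optional set variable could be treated.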

Step 3. Augment step 2 with filter expressions on set parameters. These expressions are not SPARQL built-in expressions. Thus, each such expression \(R\) is mapped to an equivalent \(R'\) in SPARQL, based on built-in features (FILTER EXISTS/NOT EXISTS, MINUS and subqueries). The exact mapping of each filter expression into SPARQL is not discussed in further detail. Also, \(R\) may combine primitive filter expressions with logical connectives. In this case, there is always an equivalent DNF expression \(DNF(R) = R_{1} \vee R_{2} \vee \dots \vee R_{n}\). Since \([\![P\ \text{FILTER}\ R]\!]_{I} = \{\mu \in [\![P]\!]_{I} \mid \mu \vDash R\} = \{\mu \in [\![P]\!]_{I} \mid \mu \vDash R_{1} \vee R_{2} \vee \dots \vee R_{n}\}\) and \([\![P\ \text{FILTER}\ R_{i}]\!]_{I} = \{\mu \in [\![P]\!]_{I} \mid \mu \vDash R_{i}\}\) for \(i = 1, \dots, n\), it follows that \([\![P\ \text{FILTER}\ R]\!]_{I} = [\![P\ \text{FILTER}\ R_{1}]\!]_{I} \cup \dots \cup [\![P\ \text{FILTER}\ R_{n}]\!]_{I}\). Thus, \(P\ \text{FILTER}\ R\) can be mapped in SPARQL to the union of graph patterns, each consisting of \(P\) and one \(R_{i}\).
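
The set identity used in step 3 can be checked on a small example. The following Python sketch assumes mappings are dictionaries and filter expressions are Boolean functions over a mapping; the helper names `evaluate_filter` and `evaluate_dnf_as_union` and the sample data are hypothetical and do not come from the paper. It compares filtering by the disjunction directly with taking the union of the per-disjunct results.

```python
def evaluate_filter(mappings, predicate):
    """Keep the mappings that satisfy `predicate` (P FILTER R)."""
    return [mu for mu in mappings if predicate(mu)]


def evaluate_dnf_as_union(mappings, disjuncts):
    """Evaluate P FILTER R_i for each disjunct and union the results."""
    result = []
    for r_i in disjuncts:
        for mu in evaluate_filter(mappings, r_i):
            if mu not in result:  # set union: drop duplicates
                result.append(mu)
    return result


# Hypothetical mappings with a set parameter X.
omega = [
    {"b": "ex:Person", "X": frozenset({"ex:Student", "ex:Employee"})},
    {"b": "ex:Agent", "X": frozenset({"ex:Robot"})},
    {"b": "ex:Place", "X": frozenset({"ex:City"})},
]


def r1(mu):
    """First disjunct: the set parameter has more than one member."""
    return len(mu["X"]) > 1


def r2(mu):
    """Second disjunct: the set parameter contains ex:Robot."""
    return "ex:Robot" in mu["X"]


direct = evaluate_filter(omega, lambda mu: r1(mu) or r2(mu))
via_union = evaluate_dnf_as_union(omega, [r1, r2])
print(direct == via_union)
```

Both evaluations keep the same two mappings here; the third mapping satisfies neither disjunct and is dropped either way.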

Overall, the complex change semantics are equal to the semantics of an equivalent SPARQL graph pattern plus Algorithm 3 for implementing the semantics of set variables (as in step 2). Again, \([\![\text{change pattern}]\!]_{I} = \mathit{Algorithm3}\left([\![\text{equivalent graph pattern}]\!]_{D}\right)\).

Step 4. Augment step 3 with cardinalities "+" and "*" and with the union aggregation function. The change pattern is now in extended form, including groups and aggregation. In Definition 12, a group \(\Gamma = Group(V_{r}^{g}, P)\) is defined over a change pattern \(P\) and a list of grouping variables \(V_{r}^{g}\). In Definition 13, an aggregate is a construct of the form \(A = Aggregate(v_{r}, union, \Gamma)\), where \(v_{r}\) is a variable over which the \(union\) aggregate function is computed for each group \(\Gamma\). Based on the previous steps, \(P\) is mapped to a SPARQL graph pattern \(P'\) such that \([\![P]\!]_{I} = \mathit{Algorithm3}\left([\![P']\!]_{D}\right)\) (3). Group and aggregate computation is based on the variables in \(V_{r}^{g}\), which is by definition a superset of the variables used by Algorithm 3 in (3), since in the previous steps the grouping variables are the change variables. Thus, \([\![A]\!]_{I} = \mathit{Algorithm3}\left([\![P']\!]_{D}\right)\), where the grouping variables are those in \(V_{r}^{g}\). The union aggregation function is implemented by Algorithm 3, which also implements the set variable semantics for computing set grouping variables.
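
To make the Group and Aggregate constructs of step 4 more concrete, here is a small Python sketch under stated assumptions. The functions `group` and `aggregate_union` are hypothetical names that loosely mirror \(Group(V_{r}^{g}, P)\) and \(Aggregate(v_{r}, union, \Gamma)\); they are not taken from the paper, and the dictionary representation of mappings and the sample data are illustrative only. Unlike a grouping over all remaining variables, the grouping variables are given explicitly, so variables that are neither grouped nor aggregated are dropped from the result.

```python
from collections import defaultdict


def group(grouping_vars, mappings):
    """Partition mappings by their bindings for the grouping variables."""
    groups = defaultdict(list)
    for mu in mappings:
        key = tuple(mu.get(v) for v in grouping_vars)
        groups[key].append(mu)
    return dict(groups)


def aggregate_union(var, groups, grouping_vars):
    """For each group, union the values bound to `var` into one set."""
    instances = []
    for key, members in groups.items():
        instance = dict(zip(grouping_vars, key))
        instance[var] = frozenset(mu[var] for mu in members if var in mu)
        instances.append(instance)
    return instances


# Hypothetical result mappings; "version" is neither grouped nor aggregated.
omega = [
    {"parent": "ex:Agent", "child": "ex:Person", "version": 1},
    {"parent": "ex:Agent", "child": "ex:Robot", "version": 2},
    {"parent": "ex:Place", "child": "ex:City", "version": 2},
]
gamma = group(["parent"], omega)
instances = aggregate_union("child", gamma, ["parent"])
for instance in instances:
    print(instance["parent"], sorted(instance["child"]))
```

Each resulting dictionary plays the role of one detected instance: one per distinct binding of the grouping variables, with the aggregated variable bound to a set.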

About this article

Cite this article

Galani, T., Papastefanatos, G., Stavrakas, Y. et al. Defining and Detecting Complex Changes on RDF(S) Knowledge Bases. J Data Semant 10, 367–398 (2021). https://doi.org/10.1007/s13740-021-00136-9

Received: 05 August 2020
Revised: 08 August 2021
Accepted: 14 August 2021
Published: 02 November 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s13740-021-00136-9
