ISSN: 2577-610X



 

Journal of Data Intelligence  ISSN: 2577-610X      published since 2020
Vol.2 No.1  March, 2021 

Schema-level Index Models for Web Data Search (pp047-063)
        
Ansgar Scherp and Till Blume
         
doi: https://doi.org/10.26421/JDI2.1-3
Abstract: Indexing the Web of Data offers many opportunities, in particular for finding and exploring data sources. A major design decision when indexing the Web of Data is the choice of a suitable index model, i.e., how to index and summarize the data. Many index models have been developed, each for a specific task. Because each index model has been designed, implemented, and evaluated independently, it remains difficult to judge whether an approach generalizes to another task, set of queries, or dataset. In this work, we empirically evaluate six representative index models with unique feature combinations. Among them is a new index model that incorporates inferencing over RDFS and owl:sameAs. For the first time, we implement all index models in a single, stream-based framework. We evaluate variations of the index models that consider sub-graphs of size 0, 1, and 2 hops on two large, real-world datasets. We evaluate the quality of the indices in terms of compression ratio, summarization ratio, and F1-score, where the F1-score denotes the approximation quality of the stream-based index computation. The experiments reveal large variations in compression ratio, summarization ratio, and approximation quality across index models, queries, and datasets. However, we observe meaningful correlations in the results that help to determine the right index model for a given task, type of query, and dataset.
Key words:
Graph Summarization; Schema-level Graph Indices; Data Search
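
Illustrative sketch (not taken from the paper): to make the idea of a schema-level index concrete, the Python snippet below groups the subjects of a toy set of RDF triples by a schema element, taken here to be the set of types plus the set of outgoing properties. This is only one of many possible index models and is not claimed to be any of the six evaluated ones. The function names, the toy triples, and the ratio printed at the end are assumptions made for illustration; in particular, the ratio is a simple stand-in and not the paper's definition of summarization ratio. The F1-score uses the standard definition as the harmonic mean of precision and recall.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # abbreviated IRI, used only in this toy example


def build_schema_index(triples):
    """Group subjects by (set of types, set of outgoing properties).

    Hypothetical helper: one simple schema-level index model,
    not the implementation evaluated in the paper.
    """
    types = defaultdict(set)
    props = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types[subject].add(obj)
        else:
            props[subject].add(predicate)

    index = defaultdict(set)
    for subject in set(types) | set(props):
        schema_element = (frozenset(types[subject]), frozenset(props[subject]))
        index[schema_element].add(subject)
    return dict(index)


def f1_score(predicted, expected):
    """Standard F1: harmonic mean of precision and recall over two sets."""
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed, not from the evaluated datasets).
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", '"Bob"'),
    ("ex:doc1", RDF_TYPE, "bibo:Document"),
    ("ex:doc1", "dc:creator", "ex:alice"),
]

index = build_schema_index(triples)
instances = {s for subjects in index.values() for s in subjects}

# Illustrative ratio only: schema elements per indexed instance.
elements_per_instance = len(index) / len(instances)
print(len(index), len(instances), elements_per_instance)

# Comparing an approximate result set against the exact one.
person_key = (frozenset({"foaf:Person"}), frozenset({"foaf:name"}))
approximate = {"ex:alice"}  # e.g. a result that missed ex:bob
print(f1_score(approximate, index[person_key]))
```

In this toy graph, ex:alice and ex:bob share one schema element while ex:doc1 gets its own, so a query for "persons with a name" can be answered from the index without scanning every triple.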