SHACTOR: Improving the Quality of Large-Scale Knowledge Graphs with Validating Shapes

Authors:
Kashif Rabbani, Aalborg University, Aalborg, Denmark (ORCID 0000-0002-6984-2121)
Matteo Lissandrini, Aalborg University, Aalborg, Denmark (ORCID 0000-0001-7922-5998)
Katja Hose, Aalborg University & TU Wien, Wien, Austria (ORCID 0000-0001-7025-8099)

SIGMOD '23: Companion of the 2023 International Conference on Management of Data, June 2023, Pages 151–154
https://doi.org/10.1145/3555041.3589723

Published: 05 June 2023

ABSTRACT

We demonstrate SHACTOR, a system for extracting and analyzing validating shapes from very large Knowledge Graphs (KGs). Shapes represent a specific form of data pattern, akin to schemas for entities. Standard shape extraction approaches are likely to produce thousands of shapes, some of which represent spurious constraints extracted because of erroneous data in the KG. Given a KG with tens of millions of triples and thousands of classes, SHACTOR parses the KG using our efficient and scalable shapes extraction algorithm and outputs SHACL shape constraints. The extracted shapes are further annotated with statistical information about their support in the graph, which makes it possible to identify both erroneous and missing triples in the KG. Hence, SHACTOR can be used to extract, analyze, and clean shape constraints from very large KGs. Furthermore, it enables the user to find and correct errors by automatically generating SPARQL queries over the graph that retrieve the nodes and facts that are the source of the spurious shapes, and to intervene by amending the data.
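
To make the idea of support-annotated shapes concrete, the following is a minimal Python sketch, not SHACTOR's implementation. It assumes that the support of a (class, property) constraint is the number of instances of the class that use the property, and that confidence is that count divided by the number of instances of the class; the abstract does not spell out these definitions. The toy triples, the `ex:` prefix and its IRI, and all function names are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
# Hypothetical prefix declaration so the generated queries are self-contained.
PREFIX = "PREFIX ex: <http://example.org/> "

# Toy graph as (subject, predicate, object) triples; all names are made up.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:bob", RDF_TYPE, "ex:Student"),
    ("ex:carol", RDF_TYPE, "ex:Student"),
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:bob", "ex:name", '"Bob"'),
    ("ex:carol", "ex:name", '"Carol"'),
    ("ex:alice", "ex:advisor", "ex:dave"),
    ("ex:bob", "ex:advisor", "ex:dave"),
    ("ex:carol", "ex:shoeSize", '"38"'),
]


def property_support(triples):
    """Return {(class, property): (support, confidence)} under the assumed definitions."""
    instances = defaultdict(set)  # class -> entities typed with that class
    types_of = defaultdict(set)   # entity -> classes it is typed with
    for s, p, o in triples:
        if p == RDF_TYPE:
            instances[o].add(s)
            types_of[s].add(o)

    users = defaultdict(set)      # (class, property) -> entities using the property
    for s, p, o in triples:
        if p != RDF_TYPE:
            for cls in types_of.get(s, ()):
                users[(cls, p)].add(s)

    return {
        (cls, p): (len(ents), len(ents) / len(instances[cls]))
        for (cls, p), ents in users.items()
    }


def entities_using_query(cls, prop):
    """SPARQL text retrieving the facts behind a (possibly spurious) constraint."""
    return PREFIX + f"SELECT ?entity ?value WHERE {{ ?entity a {cls} ; {prop} ?value . }}"


def entities_missing_query(cls, prop):
    """SPARQL text retrieving instances of cls that lack prop (candidate missing triples)."""
    return PREFIX + (
        f"SELECT ?entity WHERE {{ ?entity a {cls} . "
        f"FILTER NOT EXISTS {{ ?entity {prop} ?value }} }}"
    )


if __name__ == "__main__":
    for (cls, prop), (support, confidence) in sorted(property_support(TRIPLES).items()):
        print(cls, prop, support, round(confidence, 2))
    print(entities_using_query("ex:Student", "ex:shoeSize"))
    print(entities_missing_query("ex:Student", "ex:advisor"))
```

In this toy graph, a constraint used by a single instance (`ex:shoeSize`) would be a candidate spurious shape, and the first query lists the facts that produced it. A constraint used by most but not all instances (`ex:advisor`) points at possibly missing triples, which the second query retrieves. Where to place the thresholds is a choice left to the user.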



Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Ontology engineering
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Data cleaning
  2. World Wide Web
    1. Web data description languages
      1. Semantic web description languages


Published in
SIGMOD '23: Companion of the 2023 International Conference on Management of Data
June 2023
330 pages
ISBN: 9781450395076
DOI: 10.1145/3555041
General Chairs: Sudipto Das (Amazon Web Services, USA), Ippokratis Pandis (Amazon Web Services, USA)
Program Chairs: K. Selçuk Candan (Arizona State University, USA), Sihem Amer-Yahia (CNRS, Université Grenoble Alpes, France)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
SHACL
knowledge graphs
quality assessment
shapes extraction
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%