Sieve: linked data quality assessment and fusion

Published: 30 March 2012

Abstract

The Web of Linked Data grows rapidly and already contains data originating from hundreds of data sources. The quality of data from those sources is very diverse, as values may be out of date, incomplete or incorrect. Moreover, data sources may provide conflicting values for a single real-world object.
In order for Linked Data applications to consume data from this global data space in an integrated fashion, a number of challenges have to be overcome. One of these challenges is to rate and to integrate data based on their quality. However, quality is a very subjective matter, and finding a canonic judgement that is suitable for each and every task is not feasible.
To simplify the task of consuming high-quality data, we present Sieve, a framework for flexibly expressing quality assessment methods as well as fusion methods. Sieve is integrated into the Linked Data Integration Framework (LDIF), which handles Data Access, Schema Mapping and Identity Resolution, all crucial preliminaries for quality assessment and fusion.
We demonstrate Sieve in a data integration scenario importing data from the English and Portuguese versions of DBpedia, and discuss how we increase completeness, conciseness and consistency through the use of our framework.
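The abstract's two-step idea, scoring data by a quality indicator and then fusing conflicting values, can be sketched roughly as follows. This is a minimal illustration only and is not Sieve's actual API or configuration format: the `Claim` type, `recency_score`, `fuse_keep_best`, the `ex:CityA` identifier, and all sample values are hypothetical, and recency is used as a stand-in for whatever quality metric a task calls for.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Claim:
    """One property value asserted by one source (hypothetical model)."""
    subject: str
    prop: str
    value: str
    source: str
    last_updated: date


def recency_score(claim: Claim, today: date, max_age_days: int = 365) -> float:
    """Assumed quality metric: 1.0 for fresh data, falling linearly to 0.0."""
    age = (today - claim.last_updated).days
    return max(0.0, 1.0 - age / max_age_days)


def fuse_keep_best(
    claims: Iterable[Claim], score: Callable[[Claim], float]
) -> Dict[Tuple[str, str], Claim]:
    """Assumed fusion rule: per (subject, property), keep the top-scored claim."""
    best: Dict[Tuple[str, str], Claim] = {}
    for claim in claims:
        key = (claim.subject, claim.prop)
        if key not in best or score(claim) > score(best[key]):
            best[key] = claim
    return best


# Made-up sample data: two sources disagree on one property.
claims = [
    Claim("ex:CityA", "population", "1000", "en", date(2011, 1, 1)),
    Claim("ex:CityA", "population", "1200", "pt", date(2011, 12, 1)),
    Claim("ex:CityA", "areaKm2", "50", "en", date(2011, 1, 1)),
]
today = date(2012, 1, 1)
fused = fuse_keep_best(claims, lambda c: recency_score(c, today))

for (subject, prop), claim in fused.items():
    print(subject, prop, claim.value, "from", claim.source)
```

In this sketch the conflicting `population` values collapse to one (conciseness and consistency), while `areaKm2`, which only one source provides, is retained (completeness). Swapping in a different scoring function changes the outcome without touching the fusion step, which mirrors the abstract's point that quality is task-dependent.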

Published In

EDBT-ICDT '12: Proceedings of the 2012 Joint EDBT/ICDT Workshops
March 2012
265 pages
ISBN:9781450311434
DOI:10.1145/2320765

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RDF
  2. data fusion
  3. data integration
  4. data quality
  5. linked data
  6. semantic web

Qualifiers

  • Research-article

Conference

ICDT '12
