Evaluating the Gap Between an RDF Dataset and Its Schema

  • Conference paper
Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9382))

Abstract

An increasing number of linked datasets are published on the Web using RDF(S)/OWL. The availability of a schema describing these datasets is crucial for their meaningful usage. A dataset may contain schema-related information; however, these languages do not impose any constraint on the datasets' structure, and a gap may therefore exist between the schema and the actual instances. In this paper, we tackle the problem of evaluating this gap. We present an approach that relies on both type and class profiles, as well as on a set of quality metrics. We also present experimental evaluations that illustrate the use of the proposed metrics.

Notes

  1. BNF: http://datahub.io/fr/dataset/data-bnf-fr.

  2. Conference: http://data.semanticweb.org/dumps/conferences/dc-2010-complete.rdf.

  3. BNF: http://datahub.io/fr/dataset/data-bnf-fr.

Acknowledgements

This work was partially funded by the French National Research Agency through the CAIR ANR-14-CE23-0006 project.

Author information

Corresponding author

Correspondence to Kenza Kellou-Menouer.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kellou-Menouer, K., Kedad, Z. (2015). Evaluating the Gap Between an RDF Dataset and Its Schema. In: Jeusfeld, M., Karlapalem, K. (eds) Advances in Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9382. Springer, Cham. https://doi.org/10.1007/978-3-319-25747-1_28

  • DOI: https://doi.org/10.1007/978-3-319-25747-1_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25746-4

  • Online ISBN: 978-3-319-25747-1

  • eBook Packages: Computer Science, Computer Science (R0)
