Toward RDF Normalization

  • Conference paper
Conceptual Modeling (ER 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9381)

Abstract

Billions of RDF triples are currently available on the Web through the Linked Open Data cloud (e.g., DBpedia, LinkedGeoData and New York Times). Governments, universities, and companies (e.g., BBC, CNN) are also producing huge collections of RDF triples and exchanging them in different serialization formats (e.g., RDF/XML, Turtle, N-Triples). However, RDF descriptions (i.e., graphs and serializations) are verbose in syntax, often contain redundancies, and can be generated differently even when describing the same resources, which has a negative impact on their processing. Hence, we propose an approach to clean and eliminate redundancies from such RDF descriptions as a means of transforming different descriptions of the same information into one representation, which can then be tuned depending on the target application (information retrieval, compression, etc.). Experimental tests show significant improvements, namely in reducing RDF description loading time and file size.
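
As a rough illustration of the kind of cleaning described above, the following Python sketch removes duplicate statements and unused namespace declarations from a small in-memory description, then emits the remaining statements in sorted order so that differently ordered inputs yield the same output. It is not the authors' algorithm: the function name `normalize`, the tuple-based triple representation, the use of sorting as the canonical order, and the example data are all assumptions made for illustration, and blank nodes are not handled.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), prefixed names


def normalize(
    namespaces: Dict[str, str], triples: List[Triple]
) -> Tuple[Dict[str, str], List[Triple]]:
    """Hypothetical helper: drop duplicate triples and unused prefixes."""
    # A set removes repeated statements; sorting fixes one canonical order.
    unique = sorted(set(triples))

    # Collect every prefix that actually occurs in a remaining term.
    used = set()
    for triple in unique:
        for term in triple:
            # Literals are written in double quotes in this sketch; skip them.
            if not term.startswith('"') and ":" in term:
                used.add(term.split(":", 1)[0])

    # Keep only namespace declarations referenced by some statement.
    kept = {p: iri for p, iri in sorted(namespaces.items()) if p in used}
    return kept, unique


# Illustrative input (assumed data, not taken from the paper).
namespaces = {
    "ex": "http://example.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",  # declared but never used
}
triples = [
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),  # duplicate statement
]

kept_namespaces, kept_triples = normalize(namespaces, triples)
```

In this sketch, two inputs that differ only in statement order, repeated statements, or extra unused prefixes produce identical results, which is the sense in which different descriptions collapse into one representation.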

Notes

  1. http://linkedgeodata.org, http://data.nytimes.com/, http://dbpedia.org.

  2. We use disparities to designate different serializations of the same information.

  3. http://www.w3.org/TR/REC-xml-names/.

  4. Following the W3C Recommendation, we consider that all the prefixes have to be unique for each namespace.

  5. DT is a set of datatypes: string, number, date, etc.

  6. Lang is a set of language tags: @fr, @en, etc.

  7. \(st_{i}^{+}\), \(u_{i}\), \(p_{i}\), \(bn_{i}\), and \(l_{i}\) represent corresponding extended statements, IRIs, predicates, blank nodes, and literals.

  8. An unused namespace is a namespace that is mentioned in the serialization file but is not used in any of the statements, i.e., it will not appear in the graph.

  9. This is comparable to the notion of map function in [4] except that the authors do not consider namespaces.

  10. http://www.w3.org/TR/xml-entity-names/.

  11. Available at http://rdfn.sigappfr.org/.

References

  1. Belleau, F., et al.: Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J. Biomed. Inform. 41(5), 706–716 (2008)

  2. Fernández, J.D., et al.: Binary RDF representation for publication and exchange (HDT). J. Web Semant. 19, 22–41 (2013)

  3. Gutierrez, C., et al.: Foundations of semantic web databases. In: PODS 2004, pp. 95–106. ACM (2004)

  4. Gutierrez, C., et al.: Foundations of semantic web databases. J. Comput. Syst. Sci. 77(3), 520–541 (2011)

  5. Hayes, J., Gutierrez, C.: Bipartite graphs as intermediate model for RDF. In: McIlraith, S.A., Plexousakis, D., van Harmelen, F. (eds.) ISWC 2004. LNCS, vol. 3298, pp. 47–61. Springer, Heidelberg (2004)

  6. Jiang, G., et al.: Using semantic web technology to support ICD-11 textual definitions authoring. J. Biomed. Semant. 4, 11 (2013)

  7. Kerzazi, A., et al.: A model-based mediator system for biological data integration. In: Journées Scientifiques en Bio-Informatique, pp. 70–77 (2007)

  8. Kerzazi, A., et al.: A semantic mediation architecture for RDF data integration. In: SWAP, p. 3 (2008)

  9. Longley, D.: RDF dataset normalization (2015). http://json-ld.org/spec/latest/rdf-dataset-normalization/

  10. Nolin, M.-A., et al.: Building an HIV data mashup using Bio2RDF. Briefings Bioinform. 13(1), 98–106 (2012)

  11. Pathak, J., et al.: LexGrid: a framework for representing, storing, and querying biomedical terminologies from simple to sublime. J. Am. Med. Inform. Assoc. 16(3), 305–315 (2009)

  12. Salameh, K., Tekli, J., Chbeir, R.: SVG-to-RDF image semantization. In: Traina, A.J.M., Traina Jr., C., Cordeiro, R.L.F. (eds.) SISAP 2014. LNCS, vol. 8821, pp. 214–228. Springer, Heidelberg (2014)

  13. Sporny, M., Longley, D.: RDF graph normalization (2013). http://json-ld.org/spec/ED/rdf-graph-normalization/20111016/

  14. Tao, C., et al.: A RDF-base normalized model for biomedical lexical grid. In: The 8th International Semantic Web Conference, p. 2 (2009)

  15. Ticona-Herrera, R., et al.: RDF similarity. Technical report (2015). http://rdfn.sigappfr.org/RDFN-TR-15.pdf

  16. Vrandecic, D., et al.: RDF syntax normalization using XML validation. In: Proceedings of the SemRUs, p. 11 (2009)

  17. Weiss, C., Karras, P., Bernstein, A.: Hexastore: sextuple indexing for semantic web data management. Proc. VLDB Endow. 1(1), 1008–1019 (2008)

Acknowledgments

This work has been partly supported by FINCyT (Fund for Innovation, Science and Technology) of Peru.

Author information

Corresponding author

Correspondence to Regina Ticona-Herrera.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ticona-Herrera, R., Tekli, J., Chbeir, R., Laborie, S., Dongo, I., Guzman, R. (2015). Toward RDF Normalization. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_19

  • DOI: https://doi.org/10.1007/978-3-319-25264-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25263-6

  • Online ISBN: 978-3-319-25264-3

  • eBook Packages: Computer Science, Computer Science (R0)
