OFR: An Efficient Representation of RDF Datasets

  • Conference paper
Languages, Applications and Technologies (SLATE 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 563)

Abstract

The constant growth of structured data, often in the form of RDF, demands efficient compression methods to facilitate its storage and transmission. We propose an RDF compression algorithm that produces a succinct representation of RDF datasets. It consists of two stages. The first splits the input triples into multiple streams and applies tailored compaction techniques to each stream. In the second, a general-purpose compressor is applied. Experiments on a number of datasets show that the proposed algorithm achieves compression ratios significantly better than those of the RDF compressors known from the literature.
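The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' OFR implementation: the stream layout (subjects, predicates, objects, plus a predicate dictionary), the "empty line means same subject as before" rule, and the use of LZMA from the Python standard library as the general-purpose compressor are all assumptions made for illustration, as are the function names. Terms are assumed to contain no raw newline characters, as is the case in N-Triples serialization.

```python
import lzma


def split_streams(triples):
    """Stage 1 (hypothetical): split (s, p, o) triples into separate streams.

    Illustrative compaction only: a subject equal to the previous one is
    stored as an empty line, and each predicate is replaced by its index in
    a dictionary built in order of first appearance.
    """
    subjects, predicates, objects = [], [], []
    pred_ids = {}
    prev_subject = None
    for s, p, o in triples:
        subjects.append("" if s == prev_subject else s)
        prev_subject = s
        predicates.append(str(pred_ids.setdefault(p, len(pred_ids))))
        objects.append(o)
    return {
        "subjects": subjects,
        "predicates": predicates,
        "objects": objects,
        "pred_dict": list(pred_ids),  # index in this list == predicate id
    }


def compress_streams(streams):
    """Stage 2: compress each stream independently with a general-purpose codec."""
    return {
        name: lzma.compress("\n".join(lines).encode("utf-8"))
        for name, lines in streams.items()
    }


def decompress_streams(blobs):
    """Invert compress_streams, returning lists of lines per stream."""
    streams = {}
    for name, blob in blobs.items():
        text = lzma.decompress(blob).decode("utf-8")
        streams[name] = text.split("\n") if text else []
    return streams


def join_streams(streams):
    """Invert split_streams, rebuilding the original list of triples."""
    triples = []
    prev_subject = None
    for s, p, o in zip(streams["subjects"], streams["predicates"], streams["objects"]):
        subject = prev_subject if s == "" else s
        prev_subject = subject
        triples.append((subject, streams["pred_dict"][int(p)], o))
    return triples
```

Keeping each component in its own stream places similar strings next to each other, which generally helps a general-purpose compressor; the concrete compaction techniques used by OFR are described in the full paper and may differ substantially from this sketch.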

Notes

  1. http://www.compression.ru/ds/ppmdj1.rar (PPMd, var. J rev. 1, May 10, 2006, by D. Shkarin).

  2. Z. Tan, ulib. An efficient library for developing high-performance and scalable systems in C and C++, 2012, http://code.google.com/p/ulib/.

Author information

Corresponding author

Correspondence to Jakub Swacha.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Swacha, J., Grabowski, S. (2015). OFR: An Efficient Representation of RDF Datasets. In: Sierra-Rodríguez, JL., Leal, JP., Simões, A. (eds) Languages, Applications and Technologies. SLATE 2015. Communications in Computer and Information Science, vol 563. Springer, Cham. https://doi.org/10.1007/978-3-319-27653-3_22

  • DOI: https://doi.org/10.1007/978-3-319-27653-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27652-6

  • Online ISBN: 978-3-319-27653-3

  • eBook Packages: Computer Science, Computer Science (R0)
