Space-Adaptive and Workload-Aware Replication and Partitioning for Distributed RDF Triple Stores

  • Conference paper
Database and Expert Systems Applications (DEXA 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 903)

Abstract

Efficient distributed processing of big RDF graphs typically requires reducing the communication cost over the network. At the storage level, this calls for both careful partitioning (to keep data that is queried together on the same machine) and a careful replication strategy (to raise the probability that a query finds the data it needs locally). Analyzing trends in the collected workload can highlight the parts of the data set that future queries are most likely to target. However, the outcome of such an analysis depends heavily on the type and diversity of the collected workload and on how well it correlates with the application in use. In addition, the type and amount of replication are limited by the available storage space. Both factors, workload quality and storage space, are highly dynamic in practical systems. In this work we present an adaptable partitioning and replication approach for a distributed RDF triple store. The approach enables the storage layer to adapt to the available storage space and to the quality of the available workload, aiming for the best achievable performance under these constraints.
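
To make the trade-off between workload knowledge and storage space concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes the data set is already split into named fragments, each with a known size and a single home node, and that the workload is a log of which node issued a query and which fragments it touched. The function name `choose_replicas`, the input shapes, and the greedy ranking (remote accesses saved per unit of storage) are all illustrative assumptions.

```python
from collections import Counter


def choose_replicas(workload, fragment_sizes, home, budget):
    """Greedily pick (fragment, node) replicas within a total space budget.

    Hypothetical inputs (assumptions for illustration, not from the paper):
      workload       -- list of (node, [fragment, ...]) pairs, one per query
      fragment_sizes -- dict mapping fragment name to its storage size
      home           -- dict mapping fragment name to the node that stores it
      budget         -- total extra storage available for replicas
    """
    # Count how often each node had to reach a fragment stored elsewhere.
    remote_hits = Counter()
    for node, fragments in workload:
        for frag in fragments:
            if home[frag] != node:
                remote_hits[(frag, node)] += 1

    # Rank candidates by remote accesses saved per unit of storage;
    # the candidate itself breaks ties so the order is deterministic.
    ranked = sorted(
        remote_hits,
        key=lambda cand: (-remote_hits[cand] / fragment_sizes[cand[0]], cand),
    )

    chosen, used = [], 0
    for frag, node in ranked:
        size = fragment_sizes[frag]
        if used + size <= budget:  # skip candidates that no longer fit
            chosen.append((frag, node))
            used += size
    return chosen


# Toy example: three fragments on two nodes, four logged queries.
home = {"A": 0, "B": 1, "C": 1}
sizes = {"A": 10, "B": 5, "C": 20}
workload = [(0, ["A", "B"]), (0, ["B", "C"]), (1, ["A"]), (0, ["B"])]
replicas = choose_replicas(workload, sizes, home, budget=15)
```

With a larger budget the same ranking simply admits more replicas, and with a sparse or unrepresentative workload log the ranking carries less information. These are the two variables the abstract identifies as dynamic.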

Notes

  1. https://code.google.com/archive/p/rdf3x/

  2. https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/

Acknowledgements

The authors would like to thank Deutscher Akademischer Austauschdienst (DAAD) for providing funds for research on this project.

Author information

Correspondence to Ahmed Al-Ghezi.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Al-Ghezi, A., Wiese, L. (2018). Space-Adaptive and Workload-Aware Replication and Partitioning for Distributed RDF Triple Stores. In: Elloumi, M., et al. Database and Expert Systems Applications. DEXA 2018. Communications in Computer and Information Science, vol 903. Springer, Cham. https://doi.org/10.1007/978-3-319-99133-7_5

  • DOI: https://doi.org/10.1007/978-3-319-99133-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99132-0

  • Online ISBN: 978-3-319-99133-7

  • eBook Packages: Computer Science, Computer Science (R0)
