An Empirical Evaluation of RDF Graph Partitioning Techniques

Akhter, Adnan; Ngomo Ngonga, Axel-Cyrille; Saleem, Muhammad

doi:10.1007/978-3-030-03667-6_1

Conference paper
First Online: 31 October 2018

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11313)

Abstract

With the significant growth of RDF data sources in both number and volume comes the need to improve the scalability of RDF storage and querying solutions. Current implementations employ various RDF graph partitioning techniques. However, choosing the most suitable partitioning technique for a given RDF graph and application is not a trivial task. To the best of our knowledge, no detailed empirical evaluation of the performance of these techniques exists. In this work, we present an empirical evaluation of RDF graph partitioning techniques applied to real-world RDF data sets and benchmark queries. We evaluate the selected techniques in terms of their partitioning time, partitioning imbalance (in partition sizes), and query runtime performance, using real-world data sets and queries selected with the FEASIBLE benchmark generation framework.
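
To make the notion of partitioning imbalance concrete, the following is a minimal sketch rather than the method evaluated in the paper. It assumes a simple subject-hash partitioning and measures imbalance as the ratio of the largest partition size to the mean partition size. Both the function names and this particular imbalance measure are illustrative assumptions; the paper may define imbalance differently.

```python
import zlib


def subject_hash_partition(triples, k):
    """Assign each (subject, predicate, object) triple to one of k partitions.

    Hypothetical helper: triples sharing a subject land in the same partition.
    """
    partitions = [[] for _ in range(k)]
    for s, p, o in triples:
        # crc32 is stable across runs, unlike Python's salted built-in hash() for str.
        idx = zlib.crc32(s.encode("utf-8")) % k
        partitions[idx].append((s, p, o))
    return partitions


def size_imbalance(partitions):
    """Largest partition size divided by the mean size (assumed measure).

    1.0 means perfectly balanced; returns 0.0 when all partitions are empty.
    """
    sizes = [len(part) for part in partitions]
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean if mean else 0.0
```

Calling `size_imbalance(subject_hash_partition(triples, k))` on a list of triples gives a single number that can be compared across techniques; values further above 1.0 indicate that some partition holds a disproportionate share of the data.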

Notes

1. TCGA: http://tcga.deri.ie/.
2. UniProt: http://www.uniprot.org/statistics/.
3. http://glaros.dtc.umn.edu/gkhome/metis/metis/download.
4. Please see the T-Test tab of the Excel sheet goo.gl/fxa4cJ.

Acknowledgements

This work was supported by the H2020 project HOBBIT (no. 688227).

Author information

Authors and Affiliations

AKSW, Leipzig, Germany
Adnan Akhter, Axel-Cyrille Ngomo Ngonga & Muhammad Saleem
University of Paderborn, Paderborn, Germany
Axel-Cyrille Ngomo Ngonga

Corresponding author

Correspondence to Adnan Akhter.

Editor information

Editors and Affiliations

Université Côte d’Azur, CNRS, Inria, I3S, Sophia Antipolis, France
Catherine Faron Zucker
Fondazione Bruno Kessler, Trento, Italy
Chiara Ghidini
University of Lorraine, CNRS, Inria, LORIA, Nancy, France
Amedeo Napoli
University of Lorraine, CNRS, Inria, LORIA, Nancy, France
Yannick Toussaint

About this paper

Cite this paper

Akhter, A., Ngomo Ngonga, AC., Saleem, M. (2018). An Empirical Evaluation of RDF Graph Partitioning Techniques. In: Faron Zucker, C., Ghidini, C., Napoli, A., Toussaint, Y. (eds) Knowledge Engineering and Knowledge Management. EKAW 2018. Lecture Notes in Computer Science, vol 11313. Springer, Cham. https://doi.org/10.1007/978-3-030-03667-6_1

DOI: https://doi.org/10.1007/978-3-030-03667-6_1
Published: 31 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03666-9
Online ISBN: 978-3-030-03667-6
eBook Packages: Computer Science, Computer Science (R0)
