Abstract
Distributed processing of RDF data requires partitioning large and complex data sets. The partitioning affects the performance of distributed query processing and the amount of data transferred between network-connected nodes. Static graph partitioning aims to generate partitions with the lowest number of edges between them, but it suffers from high communication cost when a query crosses a partition’s border, because partial results must then be moved across the network. Workload-aware partitioning is an alternative, but it faces complex decisions regarding storage space and workload orientation. In this paper, we present an adaptive partitioning and replication strategy on three levels. We initialize our system with static partitioning, under which it collects and analyzes the received workload; we then let it adapt itself through two levels of dynamic replication, in addition to applying a weighting system to its initial static partitioning to decrease the ratio of border nodes.
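To make the notions of edge cut and border-node ratio concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of the paper: the toy triples, the two-partition assignment, and the function names `edge_cut`, `border_nodes`, and `border_ratio` are all hypothetical, and a border node is assumed here to be any vertex incident to at least one triple whose subject and object lie in different partitions.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "worksAt", "d"),
    ("d", "locatedIn", "e"),
    ("a", "worksAt", "d"),
]

# Hypothetical static assignment of vertices to two partitions.
partition_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}


def edge_cut(triples, partition_of):
    """Count triples whose subject and object live in different partitions."""
    return sum(1 for s, _, o in triples if partition_of[s] != partition_of[o])


def border_nodes(triples, partition_of):
    """Return the vertices incident to at least one cut edge (assumed definition)."""
    border = set()
    for s, _, o in triples:
        if partition_of[s] != partition_of[o]:
            border.update((s, o))
    return border


def border_ratio(triples, partition_of):
    """Fraction of all assigned vertices that are border nodes."""
    return len(border_nodes(triples, partition_of)) / len(partition_of)


if __name__ == "__main__":
    print("edge cut:", edge_cut(triples, partition_of))
    print("border nodes:", sorted(border_nodes(triples, partition_of)))
    print("border ratio:", border_ratio(triples, partition_of))
```

In this toy setting, a query that joins across `c` and `d` would need partial results from both partitions; lowering the border ratio reduces how often that situation can arise.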
Acknowledgements
The authors would like to thank the Deutscher Akademischer Austauschdienst (DAAD) for providing funding for research on this project.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Al-Ghezi, A., Wiese, L. (2018). Adaptive Workload-Based Partitioning and Replication for RDF Graphs. In: Hartmann, S., Ma, H., Hameurlain, A., Pernul, G., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2018. Lecture Notes in Computer Science, vol 11030. Springer, Cham. https://doi.org/10.1007/978-3-319-98812-2_21
DOI: https://doi.org/10.1007/978-3-319-98812-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98811-5
Online ISBN: 978-3-319-98812-2
eBook Packages: Computer Science, Computer Science (R0)