Abstract
We present an RDF data distribution approach that addresses the shortcomings of current solutions in order to scale RDF storage with both the volume of data and the number of query requests. We apply a workload-aware method that identifies the patterns most frequently accessed by queries and keeps the related data in the same partition. To avoid exhaustive analysis of large datasets, we reason over a summarized view of the data and express the outcome as partitioning templates for data items in an RDF structure. An experimental study shows that our method scales well and improves overall performance by reducing message passing among servers, compared to alternative data distribution approaches for RDF.
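As a rough illustration of the workload-aware idea described above, the following sketch counts how often pairs of predicates are accessed by the same query and groups predicates whose co-access count reaches a threshold. The workload encoding (each query as a set of predicate names), the function name `group_predicates`, the `min_support` threshold, and the union-find grouping are all assumptions made for illustration. This is not the paper's partitioning-template algorithm, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def group_predicates(workload, min_support):
    """Group predicates that queries access together at least min_support times.

    workload: iterable of sets of predicate names (hypothetical encoding).
    Returns a sorted list of sorted predicate groups.
    """
    # Count co-occurrences of each unordered predicate pair across queries.
    pair_counts = Counter()
    predicates = set()
    for query in workload:
        predicates.update(query)
        for a, b in combinations(sorted(query), 2):
            pair_counts[(a, b)] += 1

    # Union-find: merge predicates that are linked by a frequent pair.
    parent = {p: p for p in predicates}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    # Collect the members of each merged set.
    groups = {}
    for p in predicates:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical workload: each query is the set of predicates it touches.
workload = [
    {"name", "email"},
    {"name", "email", "worksFor"},
    {"title", "author"},
    {"title", "author"},
    {"worksFor", "title"},
]
groups = group_predicates(workload, min_support=2)
print(groups)
```

Under these assumptions, each resulting group is a candidate for placement in a single partition, so that queries touching frequently co-accessed predicates need fewer messages between servers.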
Acknowledgments
This work was partially supported by CAPES, CNPq, Fundação Araucária and by AWS in Education.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Schroeder, R., Hara, C.S. (2015). Partitioning Templates for RDF. In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_21
DOI: https://doi.org/10.1007/978-3-319-23135-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23134-1
Online ISBN: 978-3-319-23135-8
eBook Packages: Computer Science, Computer Science (R0)