Partitioning Templates for RDF

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9282)

Abstract

In this paper, we present an RDF data distribution approach that overcomes shortcomings of current solutions in order to scale RDF storage with both the volume of data and the volume of query requests. We apply a workload-aware method that identifies frequent patterns accessed by queries so that related data can be kept in the same partition. To avoid exhaustive analysis of large datasets, we reason over a summarized view of the data and express the result as partitioning templates for data items in an RDF structure. An experimental study shows that our method scales well and effectively improves overall performance by reducing the amount of message passing among servers, compared to alternative data distribution approaches for RDF.
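
The workload-aware idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm (the partitioning templates and the summarized view are not reproduced here); it is a generic greedy grouping that assumes each query is reduced to the set of predicates it touches plus a frequency. The function names, the `capacity` limit, and the predicate names in the sample workload are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(workload):
    """Count how often each pair of predicates is accessed by the same query."""
    counts = Counter()
    for predicates, frequency in workload:
        for pair in combinations(sorted(set(predicates)), 2):
            counts[pair] += frequency
    return counts


def group_predicates(workload, capacity):
    """Greedily merge groups linked by the most frequent pairs.

    A merge is skipped when the resulting group would hold more than
    `capacity` predicates (a stand-in for a partition size limit).
    """
    groups = {}  # predicate -> the group (set) it currently belongs to
    for predicates, _ in workload:
        for p in predicates:
            groups.setdefault(p, {p})
    counts = co_access_counts(workload)
    # Most frequent pairs first; ties broken by pair name for determinism.
    for (a, b), _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        ga, gb = groups[a], groups[b]
        if ga is not gb and len(ga) + len(gb) <= capacity:
            merged = ga | gb
            for p in merged:
                groups[p] = merged
    unique = {frozenset(g) for g in groups.values()}
    return sorted(sorted(g) for g in unique)


# Hypothetical workload: (predicates accessed by a query, query frequency).
workload = [
    (["name", "birthPlace"], 50),
    (["name", "birthPlace", "populationTotal"], 5),
    (["title", "author"], 30),
    (["author", "name"], 10),
]
print(group_predicates(workload, capacity=2))
```

With a capacity of two predicates per group, the frequently co-accessed pairs end up together and the rarely accessed predicate stays alone. A real system would place triples or subgraphs rather than bare predicates, and would weigh storage size as well as access frequency.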

Acknowledgments

This work was partially supported by CAPES, CNPq, Fundação Araucária and by AWS in Education.

Author information

Corresponding author

Correspondence to Rebeca Schroeder.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Schroeder, R., Hara, C.S. (2015). Partitioning Templates for RDF. In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_21

  • DOI: https://doi.org/10.1007/978-3-319-23135-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23134-1

  • Online ISBN: 978-3-319-23135-8

  • eBook Packages: Computer Science, Computer Science (R0)
