Abstract
The recent prevalence of Linked Data has attracted research interest in the efficiency of query execution over the web of data. Search and query engines crawl and index triples into a centralized repository, and queries are executed locally. It has been shown in the literature that the performance bottleneck of large-scale query execution lies in joins and unions. Based on the observation that a large share of join operations produce a much smaller binding set, which can be precomputed and stored, we propose to augment RDF indexes to store the bindings of complex patterns and to exploit these patterns to enhance performance. In addition to the index, we introduce two strategies for selecting these patterns: one relies on heuristic rules we developed, and the other uses query history to optimize the time-space ratio. Our empirical study demonstrates that the proposed pattern index outperforms a traditional triple index by up to three orders of magnitude while keeping the overhead low.
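To make the idea of storing precomputed bindings concrete, the following is a minimal sketch, not the chapter's implementation. It assumes a toy in-memory triple list, a naive nested-loop join, and a hypothetical `PatternIndex` class that materializes the bindings of a chosen multi-triple pattern so that a later query with the same pattern skips the join. All data, names, and the exact-match lookup are illustrative assumptions; the pattern selection strategies mentioned above are not modeled.

```python
# Toy triples (subject, predicate, object); all values are made up for illustration.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("acme", "locatedIn", "berlin"),
    ("initech", "locatedIn", "austin"),
]


def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match_triple(pattern, triples):
    """Yield one binding dict per triple that matches a single triple pattern."""
    for triple in triples:
        binding = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                # A variable repeated inside one pattern must bind consistently.
                if binding.get(p, t) != t:
                    break
                binding[p] = t
            elif p != t:
                break
        else:
            yield binding


def join(left, right):
    """Nested-loop join: keep pairs of bindings that agree on shared variables."""
    out = []
    for a in left:
        for b in right:
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out


def evaluate(patterns, triples):
    """Evaluate a conjunctive pattern by joining its triple patterns in order."""
    result = [{}]
    for pattern in patterns:
        result = join(result, list(match_triple(pattern, triples)))
    return result


class PatternIndex:
    """Hypothetical index holding precomputed bindings for selected patterns."""

    def __init__(self, triples):
        self.triples = triples
        self.cache = {}

    def materialize(self, patterns):
        # Pay the join cost once, at index-build time.
        self.cache[tuple(patterns)] = evaluate(patterns, self.triples)

    def query(self, patterns):
        # Exact-match lookup only; fall back to joining when nothing is stored.
        key = tuple(patterns)
        if key in self.cache:
            return self.cache[key]
        return evaluate(patterns, self.triples)


pattern = [("?p", "worksAt", "?c"), ("?c", "locatedIn", "berlin")]
index = PatternIndex(TRIPLES)
index.materialize(pattern)
bindings = index.query(pattern)
```

In this sketch the stored binding set for the two-triple pattern is smaller than either input of the join, which is the situation the abstract describes as worth precomputing. A real index would also need to match stored patterns up to variable renaming and to handle updates, neither of which is attempted here.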
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tian, Y., Wang, H., Jin, W., Ni, Y., Yu, Y. (2012). A Pattern-Based Approach for Efficient Query Processing over RDF Data. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems V. Lecture Notes in Computer Science, vol 7100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28148-8_4
DOI: https://doi.org/10.1007/978-3-642-28148-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28147-1
Online ISBN: 978-3-642-28148-8
eBook Packages: Computer Science, Computer Science (R0)