Redesign of the gStore system

  • Research Article
  • Frontiers of Computer Science

Abstract

gStore is an open-source native Resource Description Framework (RDF) triple store that answers SPARQL queries by subgraph matching over RDF graphs. However, the original system design has some deficiencies, such as inefficient handling of simple queries (including one-triple pattern queries). To improve the efficiency of the system, we reconsider the system design in this paper. Specifically, we propose a new query plan generation module that generates different query plans according to the structures of query graphs. Furthermore, we redesign our vertex encoding strategy to achieve more pruning power and propose a new multi-join algorithm to speed up the subgraph matching process. Extensive experiments on synthetic and real RDF datasets show that our method significantly outperforms the state-of-the-art algorithms.
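
To make the idea of pruning via vertex encoding concrete, the following is a minimal sketch of signature-based candidate filtering in the general style of bitstring vertex signatures. It is not the redesigned encoding proposed in this paper, whose details are not given in the abstract. The signature width `SIG_BITS`, the hashing scheme, the helper names, and the toy graph are all assumptions made for illustration only.

```python
import hashlib

SIG_BITS = 64  # assumed signature width, chosen only for illustration


def _bit(token: str) -> int:
    """Map a token to a single bit position using a stable hash."""
    digest = hashlib.sha1(token.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)


def signature(edges):
    """Encode a vertex's adjacent edges as a bitstring.

    Each edge is (direction, predicate, neighbor). A neighbor of None
    stands for a query variable, so only the predicate bit is set.
    """
    sig = 0
    for direction, predicate, neighbor in edges:
        sig |= _bit(f"{direction}|{predicate}")
        if neighbor is not None:
            sig |= _bit(f"{direction}|{predicate}|{neighbor}")
    return sig


def may_match(query_sig: int, data_sig: int) -> bool:
    """A data vertex survives only if it has every bit the query vertex sets.

    The filter can admit false positives (hash collisions) but never
    rejects a true match, so a later verification step is still needed.
    """
    return (query_sig & data_sig) == query_sig


# Hypothetical toy RDF graph: vertex -> adjacent edges.
data = {
    "alice": [("out", "worksAt", "pku"), ("out", "knows", "bob")],
    "bob": [("out", "worksAt", "mit")],
    "carol": [("out", "knows", "alice")],
}

# Query vertex ?x with patterns: ?x worksAt pku . ?x knows ?y .
query_edges = [("out", "worksAt", "pku"), ("out", "knows", None)]
query_sig = signature(query_edges)

candidates = [v for v, edges in data.items()
              if may_match(query_sig, signature(edges))]
print(candidates)
```

In this sketch the true match `alice` always remains in `candidates`, because every bit set by the query vertex is also set by her signature. Other vertices are filtered out unless their bits happen to collide, which is why a stronger encoding yields more pruning power before the join phase.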

Author information

Corresponding author

Correspondence to Lei Zou.

Additional information

Li Zeng received his BS degree in computer science from Peking University, China in 2016. He is now a master's student at Peking University majoring in computer science. His research interests include graph databases and data management.

Lei Zou received his BS and PhD degrees in computer science from Huazhong University of Science and Technology, China in 2003 and 2009, respectively. He is now an associate professor at Peking University, China. His research interests include graph databases and knowledge graph data management. He was an awardee of the NSFC Excellent Young Scholars Program in 2016.

About this article

Cite this article

Zeng, L., Zou, L. Redesign of the gStore system. Front. Comput. Sci. 12, 623–641 (2018). https://doi.org/10.1007/s11704-018-7212-z

  • DOI: https://doi.org/10.1007/s11704-018-7212-z