Abstract
Graph algorithms (e.g., k-hop queries) are widely used to find deep associations in data in real-world applications such as business recommendation and fraud detection. However, most data are still stored in relational database management systems (RDBMSs), and graph queries perform poorly on an RDBMS because of the inherent cost of complicated table joins. In this paper, we propose SQLG+, a fast interactive engine that can be integrated into any RDBMS to enable it to process k-hop graph queries efficiently. Unlike naive table-join implementations, SQLG+ caches important nodes together with their adjacency lists in memory (the graph cache) and generates a hybrid query plan that combines the capabilities of the graph cache and the RDBMS. SQLG+ also removes duplicates at the end of each hop (using AdaptiveSet) and expands the frontiers in different ways. Furthermore, dynamic BFS/DFS switching is adopted to balance query performance against memory consumption. Extensive experiments show that SQLG+ outperforms state-of-the-art RDBMS-based implementations by up to several orders of magnitude and is even comparable to the fastest graph databases.
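The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the SQLG+ implementation: the `edge(src, dst)` table layout, the function names, the choice of SQLite, and the use of a plain Python `set` in place of AdaptiveSet are all assumptions made for illustration, and the dynamic BFS/DFS switch is omitted. A k-hop query is taken here to mean "all nodes reachable from the source within k hops", which may differ from the paper's exact definition.

```python
import sqlite3


def build_edge_table(edges):
    """Load (src, dst) pairs into an in-memory SQLite table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    conn.execute("CREATE INDEX idx_edge_src ON edge (src)")
    return conn


def build_cache(conn, hot_nodes):
    """Copy the adjacency lists of selected nodes into memory (the 'graph cache')."""
    cache = {}
    for v in hot_nodes:
        rows = conn.execute("SELECT dst FROM edge WHERE src = ?", (v,)).fetchall()
        cache[v] = [r[0] for r in rows]
    return cache


def k_hop(conn, cache, source, k):
    """Return the nodes reachable from `source` within k hops, excluding `source`."""
    visited = {source}
    frontier = {source}
    for _ in range(k):
        next_frontier = set()
        misses = []
        for v in frontier:
            if v in cache:
                # Cache hit: expand from the in-memory adjacency list.
                next_frontier.update(cache[v])
            else:
                misses.append(v)
        if misses:
            # Cache miss: expand the remaining frontier nodes with one SQL query.
            # A real system would batch this to respect the bound-parameter limit.
            placeholders = ",".join("?" * len(misses))
            rows = conn.execute(
                f"SELECT dst FROM edge WHERE src IN ({placeholders})", misses
            )
            next_frontier.update(r[0] for r in rows)
        # Deduplicate at the end of the hop: drop nodes already seen.
        frontier = next_frontier - visited
        visited |= frontier
        if not frontier:
            break
    return visited - {source}
```

For example, with edges `[(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 1)]` and node 1 cached, `k_hop(conn, cache, 1, 2)` expands node 1 from the cache and nodes 2 and 3 through SQL. Which nodes count as "important" enough to cache is left open here; the abstract does not say how SQLG+ selects them.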
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zeng, L., Zhou, J., Qin, S., Cai, H., Zhao, R., Chen, X. (2022). SQLG+: Efficient k-hop Query Processing on RDBMS. In: Bhattacharya, A., et al. Database Systems for Advanced Applications. DASFAA 2022. Lecture Notes in Computer Science, vol 13247. Springer, Cham. https://doi.org/10.1007/978-3-031-00129-1_37
DOI: https://doi.org/10.1007/978-3-031-00129-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-00128-4
Online ISBN: 978-3-031-00129-1
eBook Packages: Computer Science, Computer Science (R0)