Abstract
In this paper, we consider solving the all-pairs regular path problem on large graphs efficiently. Let G be a graph and r be a regular path query, and consider finding the answers of r on G. If G is small enough to fit in main memory, it suffices to load all of G into main memory and traverse it to find paths matching r. However, if G is too large to fit in main memory, another approach is needed. We propose an external memory algorithm for solving the all-pairs regular path problem on large graphs. Our algorithm finds the answers to r by scanning the node list of G sequentially, which avoids random accesses to disk and thus makes regular path query processing I/O efficient.
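To make the problem statement concrete, the following is a minimal sketch of the in-memory baseline mentioned above: G is held entirely in memory and traversed breadth-first from every node while tracking the state of an automaton for r. It is not the external memory algorithm proposed in the paper. The function name `all_pairs_rpq`, the adjacency-dictionary graph encoding, the NFA encoding, and the toy graph are all assumptions made for illustration.

```python
from collections import deque


def all_pairs_rpq(edges, nfa_start, nfa_delta, nfa_accept):
    """Return every (source, target) pair joined by a path whose
    edge-label sequence is accepted by the NFA for the query r.

    edges:      dict mapping node -> list of (label, neighbor)
    nfa_delta:  dict mapping (state, label) -> set of next states
    nfa_accept: set of accepting states
    (Hypothetical encoding, chosen only for this sketch.)
    """
    # Collect all nodes, including those that only appear as targets.
    nodes = set(edges)
    for outs in edges.values():
        nodes.update(neighbor for _, neighbor in outs)

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, NFA state) pairs.
        start = (source, nfa_start)
        seen = {start}
        queue = deque([start])
        while queue:
            node, state = queue.popleft()
            if state in nfa_accept:
                answers.add((source, node))
            for label, neighbor in edges.get(node, []):
                for next_state in nfa_delta.get((state, label), ()):
                    pair = (neighbor, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers


# Toy graph (assumed): 1 -a-> 2, 2 -b-> 3, 3 -b-> 2
EDGES = {1: [("a", 2)], 2: [("b", 3)], 3: [("b", 2)]}
# NFA for the query r = a b* : state 0 is initial, state 1 is accepting.
DELTA = {(0, "a"): {1}, (1, "b"): {1}}
RESULT = all_pairs_rpq(EDGES, 0, DELTA, {1})
print(sorted(RESULT))
```

Because the search follows edges wherever they lead, a disk-resident graph would be read in an essentially random order. That is the access pattern the sequential scan of the node list is meant to avoid.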
Y. Kwon: Currently, the author is with the Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.
Notes
- 6. We currently adopt breadth-first search, but any other search strategy is applicable.
- 8. Neo4j has a declarative query language, Cypher, but it does not fully support regular path queries. We therefore chose Gremlin for this experiment.
- 9. We also tried the same queries on Sparksee 5.1.0 and Apache Jena TDB, but obtained no results because both environments failed with main memory exhaustion errors.
© 2015 Springer International Publishing Switzerland
About this paper
Suzuki, N., Ikeda, K., Kwon, Y. (2015). An External Memory Algorithm for All-Pairs Regular Path Problem. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015. Lecture Notes in Computer Science, vol 9262. Springer, Cham. https://doi.org/10.1007/978-3-319-22852-5_34
DOI: https://doi.org/10.1007/978-3-319-22852-5_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22851-8
Online ISBN: 978-3-319-22852-5
eBook Packages: Computer Science (R0)