Abstract
Subgraph matching is a basic query in graph data management and is used in many domains, such as the semantic web and social network analysis. The classical solution is subgraph isomorphism, which is an NP-complete problem. To speed up matching, graph simulation has been proposed, which matches subgraphs in polynomial time. Unfortunately, simulation often loses the topology of the matched subgraphs. In this paper, we propose an approximate approach to subgraph matching based on twig patterns. First, we transform query graphs into twig patterns and match candidate substructures in the graph data. Second, we present an optimized join strategy with a top-k mechanism, including join order selection based on cost evaluation and optimized pruning based on the maximum and minimum possible scores. Finally, we run experiments on real-life and synthetic graph data to evaluate the performance of our approach. The results show that our approach substantially reduces the time complexity and guarantees the correctness of the answers to subgraph matching queries.
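The remark that simulation loses topology can be made concrete with a small sketch. The code below is not the algorithm proposed in this paper; it is a naive fixpoint computation of plain graph simulation, and the function name `graph_simulation`, the edge-list input format, and the toy graphs are all assumptions made for illustration. The query is a directed triangle, while the data graph is a directed 6-cycle that contains no triangle.

```python
from collections import defaultdict

def graph_simulation(q_edges, q_labels, g_edges, g_labels):
    """Naive fixpoint for graph simulation (illustrative, not optimized).

    Returns {query_node: set of data nodes that simulate it}.
    """
    q_succ = defaultdict(set)
    for u, v in q_edges:
        q_succ[u].add(v)
    g_succ = defaultdict(set)
    for x, y in g_edges:
        g_succ[x].add(y)

    # Start from all label-compatible candidates.
    sim = {u: {x for x, lab in g_labels.items() if lab == q_labels[u]}
           for u in q_labels}

    changed = True
    while changed:
        changed = False
        for u in q_labels:
            for x in list(sim[u]):
                # x survives only if every query edge u->v is mirrored by
                # some data edge x->y with y still a candidate for v.
                if any(not (g_succ[x] & sim[v]) for v in q_succ[u]):
                    sim[u].discard(x)
                    changed = True
    return sim

# Hypothetical query: a directed triangle a -> b -> c -> a.
q_labels = {"a": "A", "b": "B", "c": "C"}
q_edges = [("a", "b"), ("b", "c"), ("c", "a")]

# Hypothetical data graph: a directed 6-cycle with labels A, B, C, A, B, C.
g_labels = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B", 6: "C"}
g_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]

sim = graph_simulation(q_edges, q_labels, g_edges, g_labels)
print(sim)
```

Every data node survives the refinement, so simulation reports a match even though the data graph has no 3-cycle and subgraph isomorphism would return nothing. This is the kind of structural mismatch that motivates approaches that preserve more of the query's topology.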
Acknowledgements
This work is supported by “the Fundamental Research Funds for the Central Universities”, Nankai University (No. 63201207, No. 63201209 and No. 63201166).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, H., Xie, X., Wen, Y., Zhang, Y. (2020). A Twig-Based Algorithm for Top-k Subgraph Matching in Large-Scale Graph Data. In: Wang, G., Lin, X., Hendler, J., Song, W., Xu, Z., Liu, G. (eds) Web Information Systems and Applications. WISA 2020. Lecture Notes in Computer Science, vol 12432. Springer, Cham. https://doi.org/10.1007/978-3-030-60029-7_43
DOI: https://doi.org/10.1007/978-3-030-60029-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60028-0
Online ISBN: 978-3-030-60029-7
eBook Packages: Computer Science, Computer Science (R0)