Abstract
The problem of subgraph matching is one fundamental issue in graph search, which is NP-Complete problem. Recently, subgraph matching has become a popular research topic in the field of knowledge graph analysis, which has a wide range of applications including question answering and semantic search. In this paper, we study the problem of subgraph matching on knowledge graph. Specifically, given a query graph q and a data graph G, the problem of subgraph matching is to conduct all possible subgraph isomorphic mappings of q on G. Knowledge graph is formed as a directed labeled multi-graph having multiple edges between a pair of vertices and it has more dense semantic and structural features than general graph. To accelerate subgraph matching on knowledge graph, we propose a novel subgraph matching algorithm based on subgraph index for knowledge graph, called as FGqT-Match. The subgraph matching algorithm consists of two key designs. One design is a subgraph index of matching-driven flow graph (FGqT), which reduces redundant calculations in advance. Another design is a multi-label weight matrix, which evaluates a near-optimal matching tree for minimizing the intermediate candidates. With the aid of these two key designs, all subgraph isomorphic mappings are quickly conducted only by traversing FGqT. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms.
Similar content being viewed by others
References
Hu S, Zou L, Yu J X, Wang H, Zhao D. Answering natural language questions by subgraph matching over knowledge graphs. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(5): 824–837
Xu Q, Wang X, Li J, Gan Y, Chai L, Wang J. StarMR: an efficient star-decomposition based query processor for SPARQL basic graph patterns using MapReduce. In: proceedings of Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data. 2018, 415–430
Cai T, Li J, Mian A S, Sellis T, Yu J X. Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering, 2020
Shekhar S, Xiong H, Zhou X. Encyclopedia of GIS: Resource Description Framework(RDF). 1st ed. Cham: Springer International Publishing, 2017
Garey M R, Johnson D S. Computers and Intractability: A Guide to the Theory of NP-Completeness. 1st ed. New York: W. H. Freeman, 1979
Kim J, Shin H, Han W H, Hong S, Chafi H. Taming subgraph isomorphism for RDF query processing. Proceedings of the VLDB Endowment, 2015, 8(11): 1238–1249
Ingalalli V, Ienco D, Poncelet P, Villata S. Querying RDF data using a multigraph-based approach. In: Proceedings of the 19th International Conference on Extending Database Technology. 2016, 245–256
Ma H, Langouri M A, Wu Y, Chiang F, Pi J. Ontology-based entity matching in attributed graphs. Proceedings of the VLDB Endowment, 2019, 12(10): 1195–1207
Cordella L P, Foggia P, Sansone C, Vento M. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 26(10): 1367–1372
He H, Singh A K. Graphs-at-a-time: query language and accessmethods for graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2008, 405–418
Zhao P, Han J. On graph query optimization in large networks. Proceedings of the VLDB Endowment, 2010, 3(1): 340–351
Han W, Lee J, Lee J H. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013, 337–348
Bi F, Chang L, Lin X, Qin L, Zhang W. Efficient subgraph matching by postponing cartesian products. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2016, 1199–1214
Shang H, Zhang Y, Lin X, Yu J X. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment, 2008, 1(1): 364–375
Kim K, Seo I, Han W S, Hong S, Chafi H, Shin H, Jeong G. Turboflux: A fast continuous subgraph matching system for streaming graph data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2018, 411–426
Ullmann J R. An algorithm for subgraph isomorphism. Journal of the ACM, 1976, 23(1): 31–42
Jin X, Lai L. MPMatch: A Multi-core Parallel Subgraph Matching Algorithm. In: Proceedings of IEEE 35th International Conference on Data Engineering Workshops. 2019, 241–248
Bhattarai B, Liu H, Huang H. CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2019, 1447–1462
Peng P, Zou L, Du Z, Zhao D. Using partial evaluation in holistic subgraph search. Frontiers of Computer Science, 2017, 12(5): 966–983
Ma Y, Yuan Y, Liu M, Wang G, Wang Y. Graph simulation on large scale temporal graphs. GeoInformatica, 2020, 24(1): 199–220
Lin P, Song Q, Wu Y. Fact checking in knowledge graphs with ontological subgraph patterns. Data Science and Engineering, 2018, 3: 341–358
Xu Y, Tong Y, Shi Y, Tao Q, Xu Ke, Li W. An Efficient Insertion Operator in Dynamic RideSharing Services. In: Proceedings of IEEE 35th International Conference on Data Engineering. 2019, 1022–1033
Zou L, Özsu M T, Chen L, Shen X, Huang R, Zhao D. gStore: a graph-based SPARQL query engine. The VLDB Journal, 2014, 23(4): 565–590
Zeng L, Zou L. Redesign of the gStore system. Frontiers of Computer science, 2018, 12(4): 1–19
Wang X, Chai Le, Xu Q, Yang Y, Li J, Wang J, Chai Y. Efficient subgraph matching on large RDF graphs using MapReduce. Data Science and Engineering, 2019, 4: 24–43
Xu Q, Wang X, Li J, Zhang Q, Chai L. Distributed subgraph matching on big knowledge graphs using pregel. IEEE Access, 2019, 7: 116453–116464
Malewicz G, Austern M H, Bik, A J C, Dehnert J C. Pregel: A system for large-scale graph processing. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2010, 135–146
Li J, Cai T, Deng K, Wang X, Sellis T, Xia F. Community-diversified influence maximization in social networks. Information Systems, 2020, 92: 101522
Ma Y, Yuan Y, Wang G, Bi X, Wang Z, Wang Y. Rising star evaluation based on extreme learning machine in geo-social networks. Cognitive Computation, 2020, 12(1): 296–308
Wang Y, Tong Y, Long C, Xu P, Xu K, Lv W. Adaptive dynamic bipartite graph matching: a reinforcement learning approach. In: Proceedings of IEEE 35th International Conference on Data Engineering, 2019, 1478–1489
Zheng W, Zou L, Peng W, Yan X, Song S, Zhao D. Semantic SPARQL similarity search over RDF knowledge graphs. Proceedings of the VLDB Endowment, 2016, 9(11): 840–851
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61976032, 62002039).
Author information
Authors and Affiliations
Corresponding author
Additional information
Yunhao Sun received the BS degree in Computer Science and Technology from Dalian Maritime University, China in 2013, and he is currently a PhD candidate at Faculty of Information Science and Technology, Dalian Maritime University. His research interests cover the query processing of knowledge graph and semantic data management.
Guanyu Li received the BS degree in Electronic Computer from Dalian Maritime University, China in 1985, the MS degree in Computer Application Technology from Dalian Maritime University, China in 1993, and the PhD degree in Management Science and Engineering from Dalian University of Technology, China in 2010. He is currently a Professor and also a PhD Supervisor with the Faculty of Information Science and Technology, Dalian Maritime University. His major research interests include machine learning, intelligent information processing and knowledge graph completion.
Jingjing Du is a graduate student majoring in Computer Science and Technology in Dalian Maritime University, China. She is currently a postgraduate student in Dalian Maritime University. Her research interest is entity linking on knowlede graph.
Bo Ning received the BS degree in Computer Science from Dalian University of Technology, China in 2003, and the PhD degree in Computer Science from Northeastern University, China in 2009. He is currently an associate professor in Dalian Maritime University, China. His primary research interests are network data management and data privacy preserving. He has published more than 30 papers in refereed journals and conferences. Prior to joining Dalian Maritime University, he was a visiting scholar with Swinburne University of Technology, Australia.
Heng Chen received the BS degree in Computer Science and Technology from Jilin University, China in 2004, and the MS degree in Computer Application Technology from Dalian Jiaotong University, China in 2007. He is currently an associate professor with Research Center for Language Intelligence, Dalian University of Foreign Languages, and pursuing the PhD degree in computer application technology with Dalian Maritime University. His major research interests include machine learning and knowledge graph completion.
Electronic supplementary material
Rights and permissions
About this article
Cite this article
Sun, Y., Li, G., Du, J. et al. A subgraph matching algorithm based on subgraph index for knowledge graph. Front. Comput. Sci. 16, 163606 (2022). https://doi.org/10.1007/s11704-020-0360-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s11704-020-0360-y