Skip to main content
Log in

A subgraph matching algorithm based on subgraph index for knowledge graph

  • Research Article
  • Published:
Frontiers of Computer Science Aims and scope Submit manuscript

Abstract

The problem of subgraph matching is one fundamental issue in graph search, which is NP-Complete problem. Recently, subgraph matching has become a popular research topic in the field of knowledge graph analysis, which has a wide range of applications including question answering and semantic search. In this paper, we study the problem of subgraph matching on knowledge graph. Specifically, given a query graph q and a data graph G, the problem of subgraph matching is to conduct all possible subgraph isomorphic mappings of q on G. Knowledge graph is formed as a directed labeled multi-graph having multiple edges between a pair of vertices and it has more dense semantic and structural features than general graph. To accelerate subgraph matching on knowledge graph, we propose a novel subgraph matching algorithm based on subgraph index for knowledge graph, called as FGqT-Match. The subgraph matching algorithm consists of two key designs. One design is a subgraph index of matching-driven flow graph (FGqT), which reduces redundant calculations in advance. Another design is a multi-label weight matrix, which evaluates a near-optimal matching tree for minimizing the intermediate candidates. With the aid of these two key designs, all subgraph isomorphic mappings are quickly conducted only by traversing FGqT. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Hu S, Zou L, Yu J X, Wang H, Zhao D. Answering natural language questions by subgraph matching over knowledge graphs. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(5): 824–837

    Article  Google Scholar 

  2. Xu Q, Wang X, Li J, Gan Y, Chai L, Wang J. StarMR: an efficient star-decomposition based query processor for SPARQL basic graph patterns using MapReduce. In: proceedings of Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data. 2018, 415–430

  3. Cai T, Li J, Mian A S, Sellis T, Yu J X. Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering, 2020

  4. Shekhar S, Xiong H, Zhou X. Encyclopedia of GIS: Resource Description Framework(RDF). 1st ed. Cham: Springer International Publishing, 2017

    Book  Google Scholar 

  5. Garey M R, Johnson D S. Computers and Intractability: A Guide to the Theory of NP-Completeness. 1st ed. New York: W. H. Freeman, 1979

    MATH  Google Scholar 

  6. Kim J, Shin H, Han W H, Hong S, Chafi H. Taming subgraph isomorphism for RDF query processing. Proceedings of the VLDB Endowment, 2015, 8(11): 1238–1249

    Article  Google Scholar 

  7. Ingalalli V, Ienco D, Poncelet P, Villata S. Querying RDF data using a multigraph-based approach. In: Proceedings of the 19th International Conference on Extending Database Technology. 2016, 245–256

  8. Ma H, Langouri M A, Wu Y, Chiang F, Pi J. Ontology-based entity matching in attributed graphs. Proceedings of the VLDB Endowment, 2019, 12(10): 1195–1207

    Article  Google Scholar 

  9. Cordella L P, Foggia P, Sansone C, Vento M. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 26(10): 1367–1372

    Article  Google Scholar 

  10. He H, Singh A K. Graphs-at-a-time: query language and accessmethods for graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2008, 405–418

  11. Zhao P, Han J. On graph query optimization in large networks. Proceedings of the VLDB Endowment, 2010, 3(1): 340–351

    Article  Google Scholar 

  12. Han W, Lee J, Lee J H. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013, 337–348

  13. Bi F, Chang L, Lin X, Qin L, Zhang W. Efficient subgraph matching by postponing cartesian products. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2016, 1199–1214

  14. Shang H, Zhang Y, Lin X, Yu J X. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment, 2008, 1(1): 364–375

    Article  Google Scholar 

  15. Kim K, Seo I, Han W S, Hong S, Chafi H, Shin H, Jeong G. Turboflux: A fast continuous subgraph matching system for streaming graph data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2018, 411–426

  16. Ullmann J R. An algorithm for subgraph isomorphism. Journal of the ACM, 1976, 23(1): 31–42

    Article  MathSciNet  Google Scholar 

  17. Jin X, Lai L. MPMatch: A Multi-core Parallel Subgraph Matching Algorithm. In: Proceedings of IEEE 35th International Conference on Data Engineering Workshops. 2019, 241–248

  18. Bhattarai B, Liu H, Huang H. CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2019, 1447–1462

  19. Peng P, Zou L, Du Z, Zhao D. Using partial evaluation in holistic subgraph search. Frontiers of Computer Science, 2017, 12(5): 966–983

    Article  Google Scholar 

  20. Ma Y, Yuan Y, Liu M, Wang G, Wang Y. Graph simulation on large scale temporal graphs. GeoInformatica, 2020, 24(1): 199–220

    Article  Google Scholar 

  21. Lin P, Song Q, Wu Y. Fact checking in knowledge graphs with ontological subgraph patterns. Data Science and Engineering, 2018, 3: 341–358

    Article  Google Scholar 

  22. Xu Y, Tong Y, Shi Y, Tao Q, Xu Ke, Li W. An Efficient Insertion Operator in Dynamic RideSharing Services. In: Proceedings of IEEE 35th International Conference on Data Engineering. 2019, 1022–1033

  23. Zou L, Özsu M T, Chen L, Shen X, Huang R, Zhao D. gStore: a graph-based SPARQL query engine. The VLDB Journal, 2014, 23(4): 565–590

    Article  Google Scholar 

  24. Zeng L, Zou L. Redesign of the gStore system. Frontiers of Computer science, 2018, 12(4): 1–19

    Article  Google Scholar 

  25. Wang X, Chai Le, Xu Q, Yang Y, Li J, Wang J, Chai Y. Efficient subgraph matching on large RDF graphs using MapReduce. Data Science and Engineering, 2019, 4: 24–43

    Article  Google Scholar 

  26. Xu Q, Wang X, Li J, Zhang Q, Chai L. Distributed subgraph matching on big knowledge graphs using pregel. IEEE Access, 2019, 7: 116453–116464

    Article  Google Scholar 

  27. Malewicz G, Austern M H, Bik, A J C, Dehnert J C. Pregel: A system for large-scale graph processing. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2010, 135–146

  28. Li J, Cai T, Deng K, Wang X, Sellis T, Xia F. Community-diversified influence maximization in social networks. Information Systems, 2020, 92: 101522

    Article  Google Scholar 

  29. Ma Y, Yuan Y, Wang G, Bi X, Wang Z, Wang Y. Rising star evaluation based on extreme learning machine in geo-social networks. Cognitive Computation, 2020, 12(1): 296–308

    Article  Google Scholar 

  30. Wang Y, Tong Y, Long C, Xu P, Xu K, Lv W. Adaptive dynamic bipartite graph matching: a reinforcement learning approach. In: Proceedings of IEEE 35th International Conference on Data Engineering, 2019, 1478–1489

  31. Zheng W, Zou L, Peng W, Yan X, Song S, Zhao D. Semantic SPARQL similarity search over RDF knowledge graphs. Proceedings of the VLDB Endowment, 2016, 9(11): 840–851

    Article  Google Scholar 

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61976032, 62002039).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Guanyu Li.

Additional information

Yunhao Sun received the BS degree in Computer Science and Technology from Dalian Maritime University, China in 2013, and he is currently a PhD candidate at Faculty of Information Science and Technology, Dalian Maritime University. His research interests cover the query processing of knowledge graph and semantic data management.

Guanyu Li received the BS degree in Electronic Computer from Dalian Maritime University, China in 1985, the MS degree in Computer Application Technology from Dalian Maritime University, China in 1993, and the PhD degree in Management Science and Engineering from Dalian University of Technology, China in 2010. He is currently a Professor and also a PhD Supervisor with the Faculty of Information Science and Technology, Dalian Maritime University. His major research interests include machine learning, intelligent information processing and knowledge graph completion.

Jingjing Du is a graduate student majoring in Computer Science and Technology in Dalian Maritime University, China. She is currently a postgraduate student in Dalian Maritime University. Her research interest is entity linking on knowlede graph.

Bo Ning received the BS degree in Computer Science from Dalian University of Technology, China in 2003, and the PhD degree in Computer Science from Northeastern University, China in 2009. He is currently an associate professor in Dalian Maritime University, China. His primary research interests are network data management and data privacy preserving. He has published more than 30 papers in refereed journals and conferences. Prior to joining Dalian Maritime University, he was a visiting scholar with Swinburne University of Technology, Australia.

Heng Chen received the BS degree in Computer Science and Technology from Jilin University, China in 2004, and the MS degree in Computer Application Technology from Dalian Jiaotong University, China in 2007. He is currently an associate professor with Research Center for Language Intelligence, Dalian University of Foreign Languages, and pursuing the PhD degree in computer application technology with Dalian Maritime University. His major research interests include machine learning and knowledge graph completion.

Electronic supplementary material

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Sun, Y., Li, G., Du, J. et al. A subgraph matching algorithm based on subgraph index for knowledge graph. Front. Comput. Sci. 16, 163606 (2022). https://doi.org/10.1007/s11704-020-0360-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s11704-020-0360-y

Keywords

Navigation