Abstract
Graph pattern matching, which computes the set M(Q, G) of matches of a pattern graph Q in a data graph G, is increasingly used in emerging applications such as social network analysis. Because the matching semantics is typically defined in terms of subgraph isomorphism, two key issues arise: the semantics is often too rigid to identify meaningful matches, and the problem is intractable, which calls for efficient matching methods. Motivated by these issues, this paper extends the matching semantics with regular expressions and investigates the top-k graph pattern matching problem. (1) We introduce regular patterns, which revise traditional pattern graphs by incorporating regular expressions, and extend the traditional matching semantics by allowing a pattern edge to be mapped to a regular path. With this extension, more meaningful matches can be captured. (2) We propose a relevance function, defined in terms of tightness of connectivity, for ranking matches. Based on this ranking function, we introduce the top-k graph pattern matching problem, denoted by \(\mathsf{TopK}\). (3) We show that \(\mathsf{TopK}\) is intractable. Despite this hardness, we develop an algorithm with the early termination property, i.e., it finds the top-k matches without computing the entire match set. (4) Using real-life and synthetic data, we experimentally verify that our top-k matching algorithms are effective and outperform traditional counterparts.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X., Wang, Y., Xu, Y., Zhang, J., Zhong, X. (2020). Extending Graph Pattern Matching with Regular Expressions. In: Hartmann, S., Küng, J., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2020. Lecture Notes in Computer Science, vol 12392. Springer, Cham. https://doi.org/10.1007/978-3-030-59051-2_8
DOI: https://doi.org/10.1007/978-3-030-59051-2_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59050-5
Online ISBN: 978-3-030-59051-2
eBook Packages: Computer Science, Computer Science (R0)