Abstract
Mining top-k frequent patterns is an important operation on graphs: it finds the k interesting subgraphs with the highest frequency. Most existing work assumes a static graph. In practice, however, graphs change over time and are better modeled as streaming graphs. Mining top-k frequent patterns in streaming graphs is challenging because of the streaming nature of the input and the exponential time complexity of the problem. A naive solution is to compute approximate frequencies of the patterns in the streaming graph and then select the top-k answers, which is costly in both memory and time. In this paper, we design a novel auxiliary data structure, FPC, to detect valid subgraph patterns and their frequencies in real time. We first convert each newly produced subgraph into a sequence and then map it into corresponding tracks in FPC based on hash functions. We theoretically prove that FPC provides an unbiased estimate and give an error bound for our algorithm. In addition, we propose a vertical hashing and candidate bucket sampling technique that further improves FPC, yielding higher space utilization and higher accuracy. Extensive experiments confirm that our approach generates high-quality results compared to the baseline method.
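The hash-and-count idea in the abstract can be illustrated with a minimal Python sketch. This is not the FPC structure, whose layout is not described here. It substitutes a standard Count-Min sketch, which can overestimate counts and is therefore not the unbiased estimator claimed for FPC. The names `canonical_sequence`, `CountMinSketch`, and `process` are hypothetical, and sorting labeled edge triples is only a simplified stand-in for converting a subgraph into a sequence.

```python
import hashlib


def canonical_sequence(edges):
    """Serialize a small labeled subgraph into an order-independent string.

    Assumption: each edge is a (src_label, edge_label, dst_label) triple.
    Sorting triples is a simplification, not a full isomorphism-safe form.
    """
    return "|".join(",".join(edge) for edge in sorted(edges))


class CountMinSketch:
    """Fixed-size counter table; estimates never fall below the true count."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += 1

    def estimate(self, key):
        return min(self.rows[row][self._index(row, key)]
                   for row in range(self.depth))


def process(stream, k, sketch):
    """Feed subgraphs into the sketch and keep at most k candidate patterns."""
    candidates = {}
    for edges in stream:
        key = canonical_sequence(edges)
        sketch.add(key)
        candidates[key] = sketch.estimate(key)
        if len(candidates) > k:
            # Evict the candidate with the smallest estimated frequency.
            weakest = min(candidates, key=lambda s: (candidates[s], s))
            del candidates[weakest]
    return sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy stream: three distinct patterns with true counts 5, 3, and 1.
triangle = [("A", "x", "B"), ("B", "x", "C"), ("C", "x", "A")]
path = [("A", "x", "B"), ("B", "y", "C")]
single = [("A", "z", "D")]
stream = [triangle] * 5 + [path] * 3 + [single]

sketch = CountMinSketch()
top2 = process(stream, k=2, sketch=sketch)
print(top2)
```

Memory stays bounded by the table size and the k candidates regardless of stream length. The cost is that hash collisions can inflate individual estimates.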
Acknowledgement
This work is partially supported by the National Natural Science Foundation of China under Grant Nos. U19B2024 and 62272469.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X., Zhang, Q., Guo, D., Zhao, X. (2023). Mining Top-k Frequent Patterns over Streaming Graphs. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13945. Springer, Cham. https://doi.org/10.1007/978-3-031-30675-4_14
DOI: https://doi.org/10.1007/978-3-031-30675-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30674-7
Online ISBN: 978-3-031-30675-4
eBook Packages: Computer Science, Computer Science (R0)