
Mining Top-k Frequent Patterns over Streaming Graphs

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13945)


Abstract

Mining top-k frequent patterns is an important operation on graphs: it finds the k interesting subgraphs with the highest frequency. Most existing work assumes a static graph. However, real-world graphs are dynamic and are naturally modeled as streaming graphs. Mining top-k frequent patterns over streaming graphs is challenging because of the streaming nature of the input and the exponential time complexity of the problem. A naive solution is to compute approximations of the frequent patterns in the streaming graph and then select the top-k answers, which consumes considerable memory and time. In this paper, we design a novel auxiliary data structure, FPC, to detect valid subgraph patterns and their frequencies in real time. We first convert each newly produced subgraph into a sequence and then map it into the corresponding tracks in FPC using hash functions. We prove theoretically that FPC provides an unbiased estimate, and we give an error bound for our algorithm. In addition, we propose vertical hashing and candidate-bucket sampling techniques that further improve FPC with higher space utilization and higher accuracy. Extensive experiments confirm that our approach generates high-quality results compared to the baseline method.
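The general idea of serializing each subgraph into a sequence and hashing it into counters can be illustrated with a minimal sketch. The abstract does not specify the internals of FPC, so the code below is not FPC: it swaps in a standard Count Sketch (rows of signed counters) as a stand-in. The names `pattern_sequence`, `SignedCounterSketch`, and `top_k`, the label-triple encoding of edges, and the default sizes are all assumptions made for illustration.

```python
import hashlib
import heapq
from statistics import median


def pattern_sequence(edges):
    """Serialize a small labeled subgraph into one string.

    `edges` is an iterable of (source_label, edge_label, target_label)
    string triples. Sorting makes the result independent of arrival order.
    Assumption: labels contain no ',' or '|'. This is a simplification,
    not a full canonical form for graph isomorphism.
    """
    return "|".join(",".join(edge) for edge in sorted(edges))


class SignedCounterSketch:
    """Count Sketch with `rows` arrays of `width` signed counters.

    Hypothetical stand-in for the paper's FPC structure.
    """

    def __init__(self, rows=5, width=1024):
        self.rows = rows
        self.width = width
        self.table = [[0] * width for _ in range(rows)]

    def _slot(self, row, sequence):
        # One salted hash per row yields a bucket index and a +1/-1 sign.
        digest = hashlib.blake2b(
            sequence.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        value = int.from_bytes(digest, "little")
        sign = 1 if value & 1 else -1
        return (value >> 1) % self.width, sign

    def add(self, sequence, count=1):
        for row in range(self.rows):
            bucket, sign = self._slot(row, sequence)
            self.table[row][bucket] += sign * count

    def estimate(self, sequence):
        # Random signs make collision noise cancel in expectation within
        # each row (under idealized hashing); the median damps outliers.
        per_row = []
        for row in range(self.rows):
            bucket, sign = self._slot(row, sequence)
            per_row.append(sign * self.table[row][bucket])
        return median(per_row)


def top_k(sketch, candidates, k):
    """Return the k candidate sequences with the largest estimates."""
    return heapq.nlargest(k, candidates, key=sketch.estimate)


# Toy stream of newly produced subgraphs (made-up labels).
stream = [
    [("A", "follows", "B"), ("B", "follows", "C")],
    [("B", "follows", "C"), ("A", "follows", "B")],  # same pattern, other order
    [("A", "likes", "C")],
]
sketch = SignedCounterSketch()
candidates = set()
for subgraph in stream:
    seq = pattern_sequence(subgraph)
    candidates.add(seq)
    sketch.add(seq)
best = top_k(sketch, candidates, k=1)
```

Keeping every candidate sequence in a set, as this toy loop does, would not scale to a real stream; it is used here only to keep the example short.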




Acknowledgement

This work is partially supported by the National Natural Science Foundation of China under Grant Nos. U19B2024 and 62272469.

Author information


Correspondence to Qianzhen Zhang or Deke Guo.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, X., Zhang, Q., Guo, D., Zhao, X. (2023). Mining Top-k Frequent Patterns over Streaming Graphs. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13945. Springer, Cham. https://doi.org/10.1007/978-3-031-30675-4_14


  • DOI: https://doi.org/10.1007/978-3-031-30675-4_14


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30674-7

  • Online ISBN: 978-3-031-30675-4

  • eBook Packages: Computer Science, Computer Science (R0)
