Abstract
Frequent pattern mining (\(\mathsf {FPM}\)) on large graphs has been receiving increasing attention owing to its wide range of applications. The \(\mathsf {FPM}\) problem is defined as mining all the subgraphs (a.k.a. patterns) whose frequency is above a user-defined threshold in a large graph. Though a host of techniques have been developed, most of them suffer from high computational cost and make the results inconvenient to inspect. To tackle these issues, we propose an approach to discover diversified top-k patterns from a large graph G. We formalize the distributed top-k pattern mining problem based on a diversification function. We develop an algorithm with the early termination property to efficiently identify diversified top-k patterns. Using real-life and synthetic graphs, we demonstrate the advantages of our algorithm through extensive experimental studies.
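The abstract does not spell out the diversification function, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a common bi-criteria objective, (1 - lam) times the total support plus lam times the total pairwise distance, with support normalized to [0, 1] and distance taken as the Jaccard distance between the node sets that two patterns match. The names `greedy_diversified_top_k`, `objective`, `jaccard_distance`, `lam`, and the toy data are all hypothetical, and the selection is a plain greedy pass with no early termination.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two match sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def objective(selected, support, matches, lam):
    """Assumed score: (1 - lam) * total support + lam * total pairwise distance."""
    relevance = sum(support[p] for p in selected)
    diversity = sum(
        jaccard_distance(matches[p], matches[q])
        for p, q in combinations(selected, 2)
    )
    return (1.0 - lam) * relevance + lam * diversity


def greedy_diversified_top_k(support, matches, k, min_support, lam=0.5):
    """Greedily pick up to k frequent patterns under the assumed objective."""
    # Keep only patterns whose support reaches the user-defined threshold.
    candidates = sorted(p for p in support if support[p] >= min_support)
    selected = []
    while candidates and len(selected) < k:
        # Add the candidate that yields the highest score for the enlarged set.
        best = max(
            candidates,
            key=lambda p: objective(selected + [p], support, matches, lam),
        )
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy input (hypothetical): normalized support and matched node sets per pattern.
support = {"A": 0.9, "B": 0.85, "C": 0.5, "D": 0.1}
matches = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {7, 8, 9}, "D": {5}}

print(greedy_diversified_top_k(support, matches, k=2, min_support=0.3, lam=0.5))
print(greedy_diversified_top_k(support, matches, k=2, min_support=0.3, lam=0.0))
```

With `lam=0.5` the sketch prefers pattern C over the more frequent B, because B covers exactly the same nodes as A and so adds no diversity; with `lam=0.0` it reduces to a plain top-k by support.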
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X., Tang, L., Liu, Y., Zhan, H., Feng, X. (2021). Diversified Pattern Mining on Large Graphs. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_16
DOI: https://doi.org/10.1007/978-3-030-86472-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86471-2
Online ISBN: 978-3-030-86472-9
eBook Packages: Computer Science, Computer Science (R0)