Abstract
In recent years, graph mining has become a popular research direction in data mining. Frequent subgraph mining is an important graph mining technique that can be used in many fields, such as chemical informatics, bioinformatics, and the social sciences. The increasing size of graph databases is challenging traditional subgraph mining methods. In this paper, we propose a new MapReduce-based approach to mining frequent subgraph patterns from large vertex-classified graph databases. The approach uses two MapReduce rounds. The first round mines the locally frequent subgraphs on each node, then collects the results from all nodes and filters out redundant graphs to obtain a global set of candidate frequent subgraphs. The second round calculates the global frequency of each graph in the candidate set generated by the first round. Some topical frequent subgraphs are then filtered according to special requirements. The experimental results show that this approach reduces execution time when dealing with large graph databases.
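The two-round structure described above can be illustrated with a minimal single-process sketch. It is not the authors' implementation: to stay short it treats only single labelled edges as "subgraphs" (so no subgraph isomorphism test is needed), simulates nodes as in-memory partitions, and uses a relative support threshold. The function names, the graph encoding, and the `min_ratio` parameter are all assumptions made for illustration.

```python
from collections import Counter

# A graph is (labels, edges): labels maps vertex id -> vertex label,
# edges is a list of (u, v) vertex-id pairs. This encoding is an assumption.

def edge_patterns(graph):
    """Distinct single-edge patterns of a graph, as sorted label pairs."""
    labels, edges = graph
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}

def round1_map(partition, min_ratio):
    """Round 1, per node: patterns that are frequent within this partition."""
    counts = Counter(p for g in partition for p in edge_patterns(g))
    threshold = min_ratio * len(partition)
    return {p for p, c in counts.items() if c >= threshold}

def round1_reduce(local_results):
    """Round 1, merge: union removes duplicate candidates across nodes."""
    return set().union(*local_results)

def round2_map(partition, candidates):
    """Round 2, per node: count each candidate once per graph containing it."""
    counts = Counter()
    for g in partition:
        counts.update(edge_patterns(g) & candidates)
    return counts

def round2_reduce(partial_counts, total_graphs, min_ratio):
    """Round 2, merge: sum counts and keep globally frequent patterns."""
    total = Counter()
    for c in partial_counts:
        total.update(c)
    threshold = min_ratio * total_graphs
    return {p: n for p, n in total.items() if n >= threshold}

def mine(partitions, min_ratio):
    """Run both rounds over a list of partitions (lists of graphs)."""
    candidates = round1_reduce([round1_map(p, min_ratio) for p in partitions])
    partials = [round2_map(p, candidates) for p in partitions]
    total_graphs = sum(len(p) for p in partitions)
    return round2_reduce(partials, total_graphs, min_ratio)

# Toy database of four small vertex-labelled graphs split over two "nodes".
g1 = ({1: "C", 2: "O"}, [(1, 2)])
g2 = ({1: "C", 2: "O", 3: "N"}, [(1, 2), (2, 3)])
g3 = ({1: "C", 2: "N"}, [(1, 2)])
g4 = ({1: "O", 2: "C"}, [(1, 2)])
partitions = [[g1, g2], [g3, g4]]

result = mine(partitions, min_ratio=0.5)
print(result)
```

With the same relative threshold in both rounds, any pattern that is frequent over the whole database must be frequent in at least one partition, so the first round cannot miss a globally frequent pattern; the second round only removes false positives. In the toy data, three candidates survive the first round, and only the C-O edge, present in three of the four graphs, passes the global check.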
Acknowledgments
This paper is supported by the NSFC under grant No. 61433019 and by the Science and Technology Project of Guangdong Province (No. 2016B030306003 and 2016B030305002).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wang, K., Xie, X., Jin, H., Yuan, P., Lu, F., Ke, X. (2016). Frequent Subgraph Mining in Graph Databases Based on MapReduce. In: Wang, G., Han, Y., Martínez Pérez, G. (eds) Advances in Services Computing. APSCC 2016. Lecture Notes in Computer Science, vol 10065. Springer, Cham. https://doi.org/10.1007/978-3-319-49178-3_35
DOI: https://doi.org/10.1007/978-3-319-49178-3_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49177-6
Online ISBN: 978-3-319-49178-3
eBook Packages: Computer Science, Computer Science (R0)