Abstract
Mining interesting subgraphs from uncertain graphs has become an important research topic in data mining. This paper proposes a novel algorithm, the Uncertain Maximal Frequent Subgraph Mining Algorithm Based on Adjacency Matrix and Weight (UMFGAMW). The adjacency matrix and the standard matrix coding of an uncertain graph are defined, and the correspondence between the adjacency matrix and the uncertain graph is established. A new vertex ordering policy is designed for computing the standard coding of the adjacency matrix; it reduces the complexity of computing the standard coding and speeds up the matching of uncertain subgraph standard codes. The weight of an uncertain graph and the mean weight of an uncertain edge are then defined, so that the importance of the uncertain subgraphs that meet the minimum support threshold in the graph dataset is fully considered. Finally, a depth-first-search algorithm for mining weighted uncertain maximal frequent subgraphs is presented. The constraints imposed by the uncertain maximal frequent subgraph and the weighted uncertain edge effectively reduce the number of mining results. Experimental results demonstrate that the UMFGAMW algorithm has higher efficiency and better scalability.
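To make the terms in the abstract more concrete, the following is a minimal illustrative sketch and not the paper's actual definitions, which the abstract does not give. It assumes that an uncertain graph is stored as labeled vertices plus a mapping from each edge to its existence probability, that vertices are ordered by label and then by descending degree, that the matrix code is the vertex labels followed by the upper-triangle edge indicators, and that the mean edge weight is the average existence probability. The function names are hypothetical, and the ordering breaks remaining ties by vertex identifier, so it is only a heuristic rather than a true canonical form.

```python
from itertools import combinations


def adjacency_matrix(vertices, edges):
    """Symmetric matrix of edge existence probabilities (0.0 if no edge)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0.0] * n for _ in range(n)]
    for (u, v), p in edges.items():
        matrix[index[u]][index[v]] = p
        matrix[index[v]][index[u]] = p
    return matrix


def order_vertices(labels, edges):
    """Assumed ordering policy: by label, then descending degree, then id."""
    degree = {v: 0 for v in labels}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(labels, key=lambda v: (labels[v], -degree[v], v))


def matrix_code(labels, edges):
    """Assumed code: ordered vertex labels, then upper-triangle edge flags."""
    vertices = order_vertices(labels, edges)
    m = adjacency_matrix(vertices, edges)
    vertex_part = "".join(labels[v] for v in vertices)
    edge_part = "".join(
        "1" if m[i][j] > 0 else "0"
        for i, j in combinations(range(len(vertices)), 2)
    )
    return vertex_part + ":" + edge_part


def mean_edge_weight(edges):
    """Assumed mean weight: average existence probability over all edges."""
    return sum(edges.values()) / len(edges) if edges else 0.0


# Hypothetical example: a path v1 - v2 - v3 with vertex labels a, b, a.
labels = {"v1": "a", "v2": "b", "v3": "a"}
edges = {("v1", "v2"): 0.9, ("v2", "v3"): 0.5}
code = matrix_code(labels, edges)
weight = mean_edge_weight(edges)
```

Fixing the vertex order before reading off the matrix entries is what lets two subgraphs be compared by their code strings; a complete method would also need to resolve ties among vertices with equal label and degree.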
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61170190), the Natural Science Foundation of Hebei Province (Nos. F2015402114, F2015402070, F2015402119) and the Foundation of Hebei Educational Committee (Nos. YQ2014014, QN20131081). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
About this article
Cite this article
Wu, D., Ren, J. & Sheng, L. Uncertain maximal frequent subgraph mining algorithm based on adjacency matrix and weight. Int. J. Mach. Learn. & Cyber. 9, 1445–1455 (2018). https://doi.org/10.1007/s13042-017-0655-y