Abstract
In this paper, we propose a new mining task: mining top-k frequent closed graph patterns without a minimum support threshold. Most previous work on frequent graph pattern mining requires the user to specify a minimum support threshold, but a suitable value is often difficult to choose. We develop an efficient algorithm, called TGP, to mine patterns without a minimum support threshold. A new structure called the Lexicographic Pattern Net is designed to store graph patterns; it makes closed-pattern verification more efficient and speeds up dynamic raising of the support threshold. In addition, the Lexicographic Pattern Net can be serialized to a file, so candidate patterns do not need to be generated again in the next mining run. Preliminary experiments show that TGP finds the top-k frequent closed graph patterns completely and accurately. Furthermore, TGP can be easily extended to mine other kinds of graphs or dynamic graph streams.
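The idea of raising the support threshold dynamically can be illustrated with a minimal sketch. It is not the TGP algorithm and does not model the Lexicographic Pattern Net or closedness checking. It assumes that candidate patterns arrive as hypothetical (pattern, support) pairs and shows only how a min-heap of size k lets the threshold rise as better patterns are found. The function name `top_k_by_support` and the string pattern labels are assumptions made for illustration.

```python
import heapq
from typing import Hashable, Iterable, List, Tuple


def top_k_by_support(
    candidates: Iterable[Tuple[Hashable, int]], k: int
) -> Tuple[List[Tuple[Hashable, int]], int]:
    """Keep the k highest-support patterns and report the final threshold.

    Illustrative only: `candidates` is a hypothetical stream of
    (pattern, support) pairs, not the output of a real graph miner.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    # Min-heap of (support, arrival order, pattern). The arrival order
    # breaks ties so that patterns themselves are never compared.
    heap: List[Tuple[int, int, Hashable]] = []
    threshold = 1  # lowest meaningful support; no user-supplied minimum

    for order, (pattern, support) in enumerate(candidates):
        if support < threshold:
            continue  # cannot enter the top k, so it is pruned
        if len(heap) < k:
            heapq.heappush(heap, (support, order, pattern))
        elif support > heap[0][0]:
            heapq.heapreplace(heap, (support, order, pattern))
        if len(heap) == k:
            # Once k patterns are held, the smallest support among them
            # becomes the new threshold, which never decreases.
            threshold = heap[0][0]

    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [(pattern, support) for support, _, pattern in ranked], threshold


if __name__ == "__main__":
    stream = [("A-B", 5), ("B-C", 3), ("A-B-C", 2), ("C-D", 4), ("A-C", 1)]
    patterns, final_threshold = top_k_by_support(stream, k=3)
    print(patterns, final_threshold)
```

In this sketch the threshold starts at 1 and rises only after k patterns have been collected, so the user supplies k instead of a support value. A candidate that ties with the current k-th support is not admitted once the heap is full.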
This work is supported by the National Natural Science Foundation of China under Grants 70771043, 60873225, and 60773191; the National High Technology Research and Development Program of China under Grant 2007AA01Z403; and the Natural Science Foundation of Hubei Province under Grant 2009CDB298.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Y., Lin, Q., Li, R., Duan, D. (2010). TGP: Mining Top-K Frequent Closed Graph Pattern without Minimum Support. In: Cao, L., Feng, Y., Zhong, J. (eds) Advanced Data Mining and Applications. ADMA 2010. Lecture Notes in Computer Science, vol 6440. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17316-5_51
DOI: https://doi.org/10.1007/978-3-642-17316-5_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17315-8
Online ISBN: 978-3-642-17316-5
eBook Packages: Computer Science, Computer Science (R0)