Frequent Subgraph Summarization with Error Control

Liu, Zheng; Jin, Ruoming; Cheng, Hong; Yu, Jeffrey Xu

doi:10.1007/978-3-642-38562-9_1


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7923)

Included in the following conference series:

International Conference on Web-Age Information Management


Abstract

Frequent subgraph mining is an important and well-studied research problem. However, the huge number of discovered frequent subgraphs becomes a bottleneck for exploring and understanding the generated patterns. In this paper, we propose to summarize frequent subgraphs with an independence probabilistic model, with the goal of restoring the frequent subgraphs and their frequencies accurately from a compact summarization model. To achieve good summarization quality, our framework lets users specify an error tolerance σ; our algorithms then discover k summarization templates in a top-down fashion while keeping the frequency restoration error within σ. Experiments on real graph datasets show that the framework can keep the frequency restoration error within 10% with a concise summarization model.
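
The abstract gives no algorithmic detail, so the following is only a minimal sketch of what error-controlled, top-down summarization under an independence model could look like. It is not the authors' algorithm. Everything in it is an assumption made for illustration: graphs and patterns are reduced to sets of labeled edges (ignoring connectivity and isomorphism), a template is a group of graphs with one occurrence probability per edge, the error is the mean relative restoration error, and the split rule and all function names are hypothetical.

```python
from math import prod

def edge_probs(graphs):
    """Per-edge occurrence probability inside one template (a group of graphs)."""
    edges = set().union(*graphs)
    return {e: sum(e in g for g in graphs) / len(graphs) for e in edges}

def true_freq(pattern, graphs):
    """Exact relative frequency: fraction of graphs containing every edge."""
    return sum(pattern <= g for g in graphs) / len(graphs)

def restore(pattern, templates, n_total):
    """Restored frequency, assuming edges are independent within a template."""
    est = 0.0
    for group in templates:
        p = edge_probs(group)
        est += len(group) / n_total * prod(p.get(e, 0.0) for e in pattern)
    return est

def mean_error(patterns, templates, graphs):
    """Mean relative restoration error (patterns must have nonzero frequency)."""
    errs = []
    for pat in patterns:
        f = true_freq(pat, graphs)
        errs.append(abs(restore(pat, templates, len(graphs)) - f) / f)
    return sum(errs) / len(errs)

def summarize(graphs, patterns, sigma):
    """Top-down: start with one template, split until the error is within sigma."""
    templates = [list(graphs)]
    while mean_error(patterns, templates, graphs) > sigma:
        best = None  # (distance of p_e from 0.5, template index, edge)
        for i, group in enumerate(templates):
            for e, pe in sorted(edge_probs(group).items()):
                if 0.0 < pe < 1.0 and (best is None or abs(pe - 0.5) < best[0]):
                    best = (abs(pe - 0.5), i, e)
        if best is None:  # every template is homogeneous; nothing left to split
            break
        _, i, e = best
        group = templates.pop(i)
        # Hypothetical split rule: separate graphs by presence of the most
        # uncertain edge, so both halves are non-empty.
        templates.append([g for g in group if e in g])
        templates.append([g for g in group if e not in g])
    return templates

# Toy data: two graphs share edges a-b and b-c, two others contain only c-d.
graphs = [frozenset({"a-b", "b-c"}), frozenset({"a-b", "b-c"}),
          frozenset({"c-d"}), frozenset({"c-d"})]
patterns = [frozenset({"a-b", "b-c"}), frozenset({"c-d"}), frozenset({"a-b"})]

templates = summarize(graphs, patterns, sigma=0.1)
print(len(templates), mean_error(patterns, templates, graphs))
```

In this toy setting the number of templates k is not fixed in advance; it is whatever the loop needs to bring the error within σ. A single template underestimates the frequency of the two-edge pattern because a-b and b-c always co-occur, which is exactly the kind of error that splitting into more templates reduces.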



Author information

Authors and Affiliations

The Chinese University of Hong Kong, Hong Kong
Zheng Liu, Hong Cheng & Jeffrey Xu Yu
Kent State University, USA
Ruoming Jin


Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jianyong Wang
Management Science and Information Systems Department, Rutgers, the State University of New Jersey, 1, Washington Park, 07102, Newark, NJ, USA
Hui Xiong
Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Jianliang Xu
School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Junfeng Zhou

About this paper

Cite this paper

Liu, Z., Jin, R., Cheng, H., Yu, J.X. (2013). Frequent Subgraph Summarization with Error Control. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_1


DOI: https://doi.org/10.1007/978-3-642-38562-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38561-2
Online ISBN: 978-3-642-38562-9
eBook Packages: Computer Science, Computer Science (R0)
