Mining Graph Patterns

Cheng, Hong; Yan, Xifeng; Han, Jiawei

doi:10.1007/978-1-4419-6045-0_12

Part of the book series: Advances in Database Systems (ADBS, volume 40)

Abstract

Graph pattern mining has become increasingly crucial to applications in a variety of domains, including bioinformatics, cheminformatics, social network analysis, computer vision, and multimedia. In this chapter, we first examine existing frequent subgraph mining algorithms and discuss their computational bottleneck. We then introduce recent studies on mining significant and representative subgraph patterns. These new mining algorithms represent the state of the art in graph mining: they not only avoid the exponential size of the mining result, but also significantly improve the applicability of graph patterns.

Author information

Authors and Affiliations

Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Hong Kong, The People’s Republic of China
Hong Cheng
Department of Computer Science, University of California at Santa Barbara, Santa Barbara, CA, USA
Xifeng Yan
Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, United States
Jiawei Han

Corresponding author

Correspondence to Hong Cheng.

Editor information

Editors and Affiliations

Thomas J. Watson Research Center, IBM, Skyline Drive 19, Hawthorne, 10532, U.S.A.
Charu C. Aggarwal
Microsoft Research Asia, Zhichun Road 49, Beijing, 100080, People's Republic of China
Haixun Wang

About this chapter

Cite this chapter

Cheng, H., Yan, X., Han, J. (2010). Mining Graph Patterns. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_12

DOI: https://doi.org/10.1007/978-1-4419-6045-0_12
Published: 18 January 2010
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6044-3
Online ISBN: 978-1-4419-6045-0
eBook Packages: Computer Science (R0)
