Abstract
Graph pattern mining becomes increasingly crucial to applications in a variety of domains including bioinformatics, cheminformatics, social network analysis, computer vision and multimedia. In this chapter, we first examine the existing frequent subgraph mining algorithms and discuss their computational bottleneck. Then we introduce recent studies on mining significant and representative subgraph patterns. These new mining algorithms represent the state-of-the-art graph mining techniques: they not only avoid the exponential size of mining result, but also improve the applicability of graph patterns significantly.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Satamoto, and S. Arikawa. Efficient substructure discovery from large semi-structured data. In Proc. 2002 SIAM Int. Conf. Data Mining (SDM’02), pages 158–174, 2002.
C. Borgelt and M. R. Berthold. Mining molecular fragments: Finding relevant substructures of molecules. In Proc. 2002 Int. Conf. Data Mining (ICDM’02), pages 211–218, 2002.
B. Bringmann and S. Nijssen. What is frequent in a single graph? In Proc. 2008 Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD’08), pages 858–863, 2008.
H. Cheng, X. Yan, J. Han, and C.-W. Hsu. Discriminative frequent pattern analysis for effective classification. In Proc. 2007 Int. Conf. Data Engineering (ICDE’07), pages 716–725, 2007.
Y. Chi, Y. Xia, Y. Yang, and R. Muntz. Mining closed and maximal frequent subtrees from databases of labeled rooted trees. IEEE Trans. Knowledge and Data Eng., 17:190–202, 2005.
L. Dehaspe, H. Toivonen, and R. King. Finding frequent substructures in chemical compounds. In Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD’98), pages 30–36, 1998.
M. Deshpande, M. Kuramochi, N. Wale, and G. Karypis. Frequent substructure-based approaches for classifying chemical compounds. IEEE Trans. on Knowledge and Data Engineering, 17:1036–1050, 2005.
M. Fiedler and C. Borgelt. Support computation for mining frequent subgraphs in a single graph. In Proc. 5th Int. Workshop on Mining and Learning with Graphs (MLG’07), 2007.
Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Proc. 2nd European Conf. Computational Learning Theory, pages 23–27, 1995.
M. Al Hasan, V. Chaoji, S. Salem, J. Besson, and M. J. Zaki. ORIGAMI: Mining representative orthogonal graph patterns. In Proc. 2007 Int. Conf. Data Mining (ICDM’07), pages 153–162, 2007.
H. He and A. K. Singh. Efficient algorithms for mining significant substructures in graphs with quality guarantees. In Proc. 2007 Int. Conf. Data Mining (ICDM’07), pages 163–172, 2007.
L. B. Holder, D. J. Cook, and S. Djoko. Substructure discovery in the subdue system. In Proc. AAAI’94 Workshop Knowledge Discovery in Databases (KDD’94), pages 169–180, 1994.
J. Huan, W. Wang, D. Bandyopadhyay, J. Snoeyink, J. Prins, and A. Tropsha. Mining spatial motifs from protein structure graphs. In Proc. 8th Int. Conf. Research in Computational Molecular Biology (RECOMB), pages 308–315, 2004.
J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraph in the presence of isomorphism. In Proc. 2003 Int. Conf. Data Mining (ICDM’03), pages 549–552, 2003.
J. Huan, W. Wang, J. Prins, and J. Yang. SPIN: Mining maximal frequent subgraphs from graph databases. In Proc. 2004 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’04), pages 581–586, 2004.
A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In Proc. 2000 European Symp. Principle of Data Mining and Knowledge Discovery (PKDD’00), pages 13–23, 1998.
R. Jin, C. Wang, D. Polshakov, S. Parthasarathy, and G. Agrawal. Discovering frequent topological structures from graph datasets. In Proc. 2005 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’05), pages 606–611, 2005.
M. Koyuturk, A. Grama, and W. Szpankowski. An efficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics, 20:I200–I207, 2004.
T. Kudo, E. Maeda, and Y. Matsumoto. An application of boosting to graph classification. In Advances in Neural Information Processing Systems 18 (NIPS’04), 2004.
M. Kuramochi and G. Karypis. Frequent subgraph discovery. In Proc. 2001 Int. Conf. Data Mining (ICDM’01), pages 313–320, 2001.
M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. Data Mining and Knowledge Discovery, 11:243–271, 2005.
S. Nijssen and J. Kok. A quickstart in frequent structure mining can make a difference. In Proc. 2004 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’04), pages 647–652, 2004.
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proc. 2001 Int. Conf. Data Engineering (ICDE’01), pages 215–224, 2001.
S. Ranu and A. K. Singh. GraphSig: A scalable approach to mining significant subgraphs in large graph databases. In Proc. 2009 Int. Conf. Data Engineering (ICDE’09), pages 844–855, 2009.
H. Saigo, N. Kramer, and K. Tsuda. Partial least squares regression for graph mining. In Proc. 2008 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’08), pages 578–586, 2008.
L. Thomas, S. Valluri, and K. Karlapalem. MARGIN: Maximal frequent subgraph mining. In Proc. 2006 Int. Conf. on Data Mining (ICDM’06), pages 1097–1101, 2006.
K. Tsuda. Entire regularization paths for graph data. In Proc. 2007 Int. Conf. Machine Learning (ICML’07), pages 919–926, 2007.
N. Vanetik, E. Gudes, and S. E. Shimony. Computing frequent graph patterns from semistructured data. In Proc. 2002 Int. Conf. on Data Mining (ICDM’02), pages 458–465, 2002.
C. Wang, W. Wang, J. Pei, Y. Zhu, and B. Shi. Scalable mining of large disk-base graph databases. In Proc. 2004 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’04), pages 316–325, 2004.
T. Washio and H. Motoda. State of the art of graph-based data mining. SIGKDD Explorations, 5:59–68, 2003.
X. Yan, H. Cheng, J. Han, and P. S. Yu. Mining significant graph patterns by scalable leap search. In Proc. 2008 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD’08), pages 433–444, 2008.
X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. In Proc. 2002 Int. Conf. Data Mining (ICDM’02), pages 721–724, 2002.
X. Yan and J. Han. CloseGraph: Mining closed frequent graph patterns. In Proc. 2003 ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD’03), pages 286–295, 2003.
X. Yan and J. Han. Discovery of frequent substructures. In D. Cook and L. Holder (eds.), Mining Graph Data, pages 99–115, John Wiley Sons, 2007.
X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In Proc. 2004 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’04), pages 335–346, 2004.
X. Yan, X. J. Zhou, and J. Han. Mining closed relational graphs with connectivity constraints. In Proc. 2005 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’05), pages 324–333, 2005.
M. J. Zaki. Efficiently mining frequent trees in a forest. In Proc. 2002 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’02), pages 71–80, 2002.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag US
About this chapter
Cite this chapter
Cheng, H., Yan, X., Han, J. (2010). Mining Graph Patterns. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_12
Download citation
DOI: https://doi.org/10.1007/978-1-4419-6045-0_12
Published:
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6044-3
Online ISBN: 978-1-4419-6045-0
eBook Packages: Computer ScienceComputer Science (R0)