Abstract
Much of today's data is best represented as graphs, e.g., social networks and protein interaction networks. Analyzing these networks is a pressing research problem with significant practical applications. In this paper, we study the problem of finding frequently occurring dense subgraph patterns in a single large connected graph. Because the notion of an occurrence of a pattern in a single graph is ambiguous, we devise a novel frequent pattern model for this setting. Under this model, the widely used Apriori property no longer holds. However, we identify several important properties, namely small diameter, reachability, and fast calculation of automorphism. These properties enable us to employ an index-based method to locate all occurrences of a pattern in a graph and a depth-first search method to find all patterns. Experiments on a large number of real and synthetic data sets show the effectiveness and efficiency of the resulting method, DESSIN.
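To illustrate why an Apriori-style (anti-monotone) property can fail in a single graph, the following minimal sketch counts occurrences of two small patterns in a toy star graph. It is not the frequent pattern model of the paper, which the abstract does not define. The support measure used here (the number of distinct vertex sets hosting an embedding) is an assumption chosen for illustration, and `count_occurrences`, the toy graph, and the labels are hypothetical.

```python
from itertools import permutations


def count_occurrences(graph_labels, graph_edges, pattern_labels, pattern_edges):
    """Count distinct vertex sets of the graph that host a label- and
    edge-preserving embedding of the pattern.

    Brute force over all ordered vertex tuples; suitable for tiny examples only.
    The support definition is an illustrative assumption, not the paper's model.
    """
    edge_set = {frozenset(e) for e in graph_edges}
    pattern_nodes = list(pattern_labels)
    found = set()
    for candidate in permutations(graph_labels, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, candidate))
        # Every pattern vertex must map to a graph vertex with the same label.
        if any(graph_labels[mapping[p]] != pattern_labels[p] for p in pattern_nodes):
            continue
        # Every pattern edge must map to an edge of the graph.
        if all(frozenset((mapping[u], mapping[v])) in edge_set for u, v in pattern_edges):
            found.add(frozenset(candidate))
    return len(found)


# Toy star graph: one hub labeled "A" connected to three leaves labeled "B".
graph_labels = {0: "A", 1: "B", 2: "B", 3: "B"}
graph_edges = [(0, 1), (0, 2), (0, 3)]

# Sub-pattern: a single vertex labeled "A".
vertex_support = count_occurrences(graph_labels, graph_edges, {"x": "A"}, [])

# Super-pattern: an edge between an "A" vertex and a "B" vertex.
edge_support = count_occurrences(
    graph_labels, graph_edges, {"x": "A", "y": "B"}, [("x", "y")]
)

print(vertex_support, edge_support)  # prints "1 3"
```

Under this naive count the larger pattern (the A-B edge) occurs more often than its own sub-pattern (the single A vertex), so a frequency threshold of 2 would accept the edge while rejecting the vertex. Apriori-style pruning, which discards every extension of an infrequent pattern, would therefore miss the edge pattern.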
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, S., Zhang, S., Yang, J. (2010). DESSIN: Mining Dense Subgraph Patterns in a Single Graph. In: Gertz, M., Ludäscher, B. (eds) Scientific and Statistical Database Management. SSDBM 2010. Lecture Notes in Computer Science, vol 6187. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13818-8_15
DOI: https://doi.org/10.1007/978-3-642-13818-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13817-1
Online ISBN: 978-3-642-13818-8
eBook Packages: Computer Science, Computer Science (R0)