Abstract
In graph mining applications, there is a growing demand for imposing user-specified constraints on the mining results. However, unlike most traditional itemset constraints, structural constraints, such as the density and diameter of a graph, are very hard to push deep into the mining process.
In this paper, we give the first comprehensive study of the pruning properties of both traditional and structural constraints, aiming to reduce not only the pattern search space but also the data search space. A new general framework, called gPrune, is proposed to incorporate all the constraints in such a way that they recursively reinforce each other throughout the mining process. A new concept, Pattern-inseparable Data-antimonotonicity, is proposed to handle the structural constraints unique to graphs; combined with known pruning properties, it provides a comprehensive and unified classification framework for structural constraints. The exploration of these antimonotonicities in the context of graph pattern mining significantly extends the known classification of constraints and deepens our understanding of the pruning properties of structural graph constraints.
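The difference between pruning the pattern search space and pruning the data search space can be illustrated with a toy sketch. This is not gPrune and does not implement Pattern-inseparable Data-antimonotonicity; the edge-set graph representation, the function names `violates_max_edges` and `prune_data`, and the two simple size constraints are all assumptions chosen only for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a graph is a set of undirected edges (u, v), u < v.
Edge = Tuple[int, int]
Graph = FrozenSet[Edge]


def violates_max_edges(pattern: Graph, max_edges: int) -> bool:
    """Pattern-space check. An upper bound on size is antimonotone:
    every super-pattern of an oversized pattern is also oversized,
    so the miner can stop growing this pattern."""
    return len(pattern) > max_edges


def prune_data(database: List[Graph], min_edges: int) -> List[Graph]:
    """Data-space check. A data graph with fewer than min_edges edges
    cannot contain any pattern with at least min_edges edges,
    so it can be removed from the database before mining continues."""
    return [g for g in database if len(g) >= min_edges]


database = [
    frozenset({(1, 2), (2, 3), (1, 3)}),
    frozenset({(1, 2)}),
    frozenset({(1, 2), (2, 3), (3, 4), (1, 4)}),
]

# Data-space reduction: the single-edge graph is dropped.
reduced = prune_data(database, min_edges=3)

# Pattern-space reduction: a three-edge pattern exceeds a cap of two edges.
stop_growing = violates_max_edges(frozenset({(1, 2), (2, 3), (1, 3)}), max_edges=2)
```

In this sketch the two checks are independent. The abstract's claim is stronger: in gPrune the constraints are combined so that each reduction can enable further reductions as mining proceeds.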
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Zhu, F., Yan, X., Han, J., Yu, P.S. (2007). gPrune: A Constraint Pushing Framework for Graph Pattern Mining. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_38
DOI: https://doi.org/10.1007/978-3-540-71701-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science (R0)