Top-K Correlation Sub-graph Search in Graph Databases

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5463)

Abstract

Recently, owing to its wide range of applications, (similar) subgraph search has attracted a lot of attention from the database and data mining communities [13,18,19,5]. In [8], Ke et al. first proposed the correlation sub-graph search problem (CGSearch for short) to capture the underlying dependency between sub-graphs in a graph database, and gave the CGS algorithm to solve it. However, the CGS algorithm requires a minimum correlation threshold θ to be specified before it can run. In practice, it may not be trivial for users to provide an appropriate threshold θ, since different graph databases typically have different characteristics. Therefore, we propose an alternative mining task: top-K correlation sub-graph search (TOP-CGSearch for short). The new problem does not require setting a correlation threshold, which makes the previously proposed CGS algorithm inefficient if it is applied directly to the TOP-CGSearch problem. To conduct TOP-CGSearch efficiently, we develop a pattern-growth algorithm (the PG-search algorithm) and utilize graph indexing methods to speed up the mining task. Extensive experimental results evaluate the efficiency of our methods.
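
To make the difference between a threshold-based and a top-K formulation concrete, the following is a minimal sketch of a brute-force top-K baseline. It is not the PG-search algorithm. It assumes that correlation is measured by Pearson's correlation coefficient over the sets of database graphs that contain each sub-graph (the measure we understand CGSearch [8] to use), and that these occurrence sets have already been computed. The names phi_correlation and top_k_correlated and the toy data are hypothetical.

```python
import heapq
from math import sqrt


def phi_correlation(supp_q, supp_g, n):
    """Pearson (phi) correlation of two binary occurrence indicators.

    supp_q, supp_g: sets of ids of the database graphs containing each sub-graph.
    n: total number of graphs in the database.
    """
    p_q = len(supp_q) / n
    p_g = len(supp_g) / n
    p_qg = len(supp_q & supp_g) / n
    denom = sqrt(p_q * (1 - p_q) * p_g * (1 - p_g))
    if denom == 0:
        # Undefined when a sub-graph occurs in no graph or in every graph;
        # this sketch treats that case as uncorrelated (an assumption).
        return 0.0
    return (p_qg - p_q * p_g) / denom


def top_k_correlated(query_support, candidates, n, k):
    """Return the k candidates most correlated with the query, best first.

    No threshold theta is supplied: a size-k min-heap keeps the running
    top-K, and its smallest score acts as a bound that rises during the scan.
    """
    heap = []  # min-heap of (score, name)
    for name, support in candidates.items():
        score = phi_correlation(query_support, support, n)
        if len(heap) < k:
            heapq.heappush(heap, (score, name))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, name))
    return sorted(heap, reverse=True)


# Toy database of 10 graphs (ids 0..9); the query occurs in graphs 0..4.
query_support = {0, 1, 2, 3, 4}
candidates = {
    "a": {0, 1, 2, 3, 4},  # always co-occurs with the query
    "b": {5, 6, 7, 8, 9},  # never co-occurs with the query
    "c": {0, 1, 2, 5, 6},  # partial overlap
}
result = top_k_correlated(query_support, candidates, n=10, k=2)
```

This baseline scores every candidate, which is exactly the cost a pattern-growth search with graph indexing aims to avoid; it only illustrates why no user-supplied θ is needed once the K-th best score found so far can serve as the pruning bound.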

References

  1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB (1994)

  2. Brin, S., Motwani, R., Silverstein, C.: Beyond market baskets: Generalizing association rules to correlations. In: SIGMOD (1997)

  3. Cai, D., Shao, Z., He, X., Yan, X., Han, J.: Community mining from multi-relational networks. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS, vol. 3721, pp. 445–452. Springer, Heidelberg (2005)

  4. Fortin, S.: The graph isomorphism problem. Department of Computing Science, University of Alberta (1996)

  5. He, H., Singh, A.K.: Closure-tree: An index structure for graph queries. In: ICDE (2006)

  6. Ilyas, I.F., Markl, V., Haas, P.J., Brown, P., Aboulnaga, A.: Cords: Automatic discovery of correlations and soft functional dependencies. In: SIGMOD (2004)

  7. Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS, vol. 1910, pp. 13–23. Springer, Heidelberg (2000)

  8. Ke, Y., Cheng, J., Ng, W.: Correlation search in graph databases. In: SIGKDD (2007)

  9. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM (2001)

  10. Omiecinski, E.: Alternative interest measures for mining associations in databases. IEEE TKDE 15(1) (2003)

  11. Pan, J.-Y., Yang, H.-J., Faloutsos, C., Duygulu, P.: Automatic multimedia cross-modal correlation discovery. In: KDD (2004)

  12. Petrakis, E.G.M., Faloutsos, C.: Similarity searching in medical image databases. IEEE Transactions on Knowledge and Data Engineering 9(3) (1997)

  13. Shasha, D., Wang, J.T.-L., Giugno, R.: Algorithmics and applications of tree and graph searching. In: PODS (2002)

  14. Willett, P.: Chemical similarity searching. J. Chem. Inf. Comput. Sci. 38(6) (1998)

  15. Xiong, H., Brodie, M., Ma, S.: Top-cop: Mining top-k strongly correlated pairs in large databases. In: Perner, P. (ed.) ICDM 2006. LNCS, vol. 4065. Springer, Heidelberg (2006)

  16. Xiong, H., Shekhar, S., Tan, P.-N., Kumar, V.: Exploiting a support-based upper bound of Pearson’s correlation coefficient for efficiently identifying strongly correlated pairs. In: KDD (2004)

  17. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: ICDM (2002)

  18. Yan, X., Yu, P.S., Han, J.: Graph indexing: A frequent structure-based approach. In: SIGMOD (2004)

  19. Yan, X., Yu, P.S., Han, J.: Substructure similarity search in graph databases, pp. 766–777 (2005)

  20. Zou, L., Chen, L., Yu, J.X., Lu, Y.: A novel spectral coding in a large graph database. In: EDBT (2008)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zou, L., Chen, L., Lu, Y. (2009). Top-K Correlation Sub-graph Search in Graph Databases. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_14

  • DOI: https://doi.org/10.1007/978-3-642-00887-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00886-3

  • Online ISBN: 978-3-642-00887-0

  • eBook Packages: Computer Science, Computer Science (R0)
