Skip to main content

Exploiting the Block Structure of Link Graph for Efficient Similarity Computation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2009)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5476))

Included in the following conference series:

Abstract

In many real-world domains, link graph is one of the most effective ways to model the relationships between objects. Measuring the similarity of objects in a link graph is studied by many researchers, but an effective and efficient method is still expected. Based on our observation of link graphs from real domains, we find the block structure naturally exists. We propose an algorithm called BlockSimRank, which partitions the link graph into blocks, and obtains similarity of each node-pair in the graph efficiently. Our method is based on random walk on two-layer model, with time complexity as low as O(n 4/3) and less memory need. Experiments show that the accuracy of BlockSimRank is acceptable when the time cost is the lowest.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Getoor, L., Diehl, C.P.: Link mining: A survey. In: SIGKDD 2005 Explorations, vol. 7(2), pp. 3–12 (2005)

    Google Scholar 

  2. Jeh, G., Widom, J.: SimRank: A measure of structural-context similarity. In: SIGKDD 2002, pp. 538–543 (2002)

    Google Scholar 

  3. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. ACM Press/Addison-Wesley (1999)

    Google Scholar 

  4. Xi, W., Fox, E.A., Fan, W., Zhang, B., Chen, Z., Yan, J., Zhuang, D.: SimFusion: measuring similarity using unified relationship matrix. In: SIGIR 2005, pp. 130–137 (2005)

    Google Scholar 

  5. Dean, J., Henzinger, M.R.: Finding Related Pages in the World Wide Web. In: WWW 1999, pp. 1467–1479 (1999)

    Google Scholar 

  6. Shardanand, U., Maes, P.: Social information filtering: Algorithms for automating “word of mouth”. In: Proceedings of the Conference on Human Factors in Computing Systems, Denver, Colorado (1995)

    Google Scholar 

  7. Yin, X., Han, J.: Yu. P.S.: Linkclus: Efficient clustering via heterogeneous semantic links. In: VLDB 2006, pp. 427–438 (2006)

    Google Scholar 

  8. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the Web. Technical report, Stanford University Database Group (1998)

    Google Scholar 

  9. Sun, J., Qu, H., Chakrabarti, D., Faloutsos, C.: Relevance search and anomaly detection in bipartite graphs. SIGKDD Explorations 7(2), 48–55 (2005)

    Article  Google Scholar 

  10. Lovasz, L.: Random walks on graphs: a survey. Combinatorics, Paul Erdos is Eighty, vol. 2, Keszthely (Hungary), pp. 1–46 (1993)

    Google Scholar 

  11. Kamvar, S.D., Haveliwala, T.H., Manning, C.D., Golub, G.H.: Exploiting the Block Structure of the Web for Computing PageRank. Technical Report, Stanford University, Stanford, CA (2003)

    Google Scholar 

  12. Karypis, G., Kumar, V.: Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing 48(1), 96–129 (1998), http://www.cs.umn.edu/~karypis

    Article  MATH  Google Scholar 

  13. ACM Computing Classification System, http://portal.acm.org/ccs.cfm

  14. Kallenberg, O.: Foundations of Modern Probability. Springer, New York (1997)

    MATH  Google Scholar 

  15. Meila, M., Shi, J.: Learning Segmentation by Random Walks. Advances in Neural Information Processing Systems (2001)

    Google Scholar 

  16. Fischer, I., Poland, J.: Amplifying the block matrix structure for spectral clustering. In: Proceedings of the 14th Annual Machine Learning Conference of Belgium and the Netherlands, pp. 21–28 (2005)

    Google Scholar 

  17. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: an Introduction to Cluster Analysis. John Wiley & Sons, Chichester (1990)

    Book  MATH  Google Scholar 

  18. Karypis, G., Kumar, V.: METIS: Unstructured Graph Partitioning and Sparse Matrix Ordering System. Technical Report, Department of Computer Science, University of Minnesota (1995)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, P., Cai, Y., Liu, H., He, J., Du, X. (2009). Exploiting the Block Structure of Link Graph for Efficient Similarity Computation. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science(), vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_36

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-01307-2_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01306-5

  • Online ISBN: 978-3-642-01307-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics