Skip to main content

Taming Computational Complexity: Efficient and Parallel SimRank Optimizations on Undirected Graphs

  • Conference paper
Web-Age Information Management (WAIM 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6184))

Included in the following conference series:

Abstract

SimRank has been considered as one of the promising link-based ranking algorithms to evaluate similarities of web documents in many modern search engines. In this paper, we investigate the optimization problem of SimRank similarity computation on undirected web graphs. We first present a novel algorithm to estimate the SimRank between vertices in \(O\left( {{n}^{3}}+K\cdot {{n}^{2}} \right)\) time, where n is the number of vertices, and K is the number of iterations. In comparison, the most efficient implementation of SimRank algorithm in [1] takes \(O\left( K\cdot {{n}^{3}} \right)\) time in the worst case. To efficiently handle large-scale computations, we also propose a parallel implementation of the SimRank algorithm on multiple processors. The experimental evaluations on both synthetic and real-life data sets demonstrate the better computational time and parallel efficiency of our proposed techniques.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Lizorkin, D., Velikhov, P., Grinev, M., Turdakov, D.: Accuracy estimate and optimization techniques for simrank computation. PVLDB 1(1) (2008)

    Google Scholar 

  2. Jeh, G., Widom, J.: Simrank: a measure of structural-context similarity. In: KDD (2002)

    Google Scholar 

  3. Fogaras, D., Rácz, B.: Scaling link-based similarity search. In: WWW (2005)

    Google Scholar 

  4. Fogaras, D., Rácz, B.: A scalable randomized method to compute link-based similarity rank on the web graph. In: Lindner, W., Mesiti, M., Türker, C., Tzitzikas, Y., Vakali, A.I. (eds.) EDBT 2004. LNCS, vol. 3268, pp. 557–567. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  5. Cai, Y., Li, P., Liu, H., He, J., Du, X.: S-simrank: Combining content and link information to cluster papers effectively and efficiently. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N.J., Li, X. (eds.) ADMA 2008. LNCS (LNAI), vol. 5139, pp. 317–329. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  6. Antonellis, I., Garcia-Molina, H., Chang, C.C.: Simrank++: query rewriting through link analysis of the click graph. PVLDB 1(1) (2008)

    Google Scholar 

  7. Yu, W., Lin, X., Le, J.: A space and time efficient algorithm for simrank computation. In: APWeb (2010)

    Google Scholar 

  8. Weinberg, B.H.: Bibliographic coupling: A review. Information Storage and Retrieval 10(5-6) (1974)

    Google Scholar 

  9. Wijaya, D.T., Bressan, S.: Clustering web documents using co-citation, coupling, incoming, and outgoing hyperlinks: a comparative performance analysis of algorithms. IJWIS 2(2) (2006)

    Google Scholar 

  10. Li, C., Han, J., He, G., Jin, X., Sun, Y., Yu, Y., Wu, T.: Fast computation of simrank for static and dynamic information networks. In: EDBT (2010)

    Google Scholar 

  11. Bhatia, R.: Matrix Analysis. Springer, Heidelberg (1997)

    Google Scholar 

  12. Hernandez, V., Roman, J.E., Vidal, V.: Slepc: A scalable and flexible toolkit for the solution of eigenvalue problems. ACM Trans. Math. Softw. 31(3) (2005)

    Google Scholar 

  13. Maschhoff, K.J., Sorensen, D.C.: A portable implementation of arpack for distributed memory parallel architectures. In: CMCIM (1996)

    Google Scholar 

  14. Wu, K., Simon, H.: A parallel lanczos method for symmetric generalized eigenvalue problems. Technical report, Lawrence Berkeley National Laboratory (1997)

    Google Scholar 

  15. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking bringing order to the web, Technial report (1998)

    Google Scholar 

  16. Mendelzon, A.O.: Review - authoritative sources in a hyperlinked environment. ACM SIGMOD Digital Review 1 (2000)

    Google Scholar 

  17. Wang, J., Zeng, H., Chen, Z., Lu, H., Tao, L., Ma, W.: Recom: reinforcement clustering of multi-type interrelated data objects. In: SIGIR (2003)

    Google Scholar 

  18. Xi, W., Fox, E.A., Fan, W., Zhang, B., Chen, Z., Yan, J., Zhuang, D.: Simfusion: measuring similarity using unified relationship matrix. In: SIGIR (2005)

    Google Scholar 

  19. Zhao, P., Han, J., Sun, Y.: P-rank: a comprehensive structural similarity measure over information networks. In: CIKM 2009: Proceeding of the 18th ACM conference on Information and knowledge management (2009)

    Google Scholar 

  20. Zhou, Y., Cheng, H., Yu, J.X.: Graph clustering based on structural/attribute similarities. PVLDB 2(1) (2009)

    Google Scholar 

  21. Li, P., Cai, Y., Liu, H., He, J., Du, X.: Exploiting the block structure of link graph for efficient similarity computation. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 389–400. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yu, W., Lin, X., Le, J. (2010). Taming Computational Complexity: Efficient and Parallel SimRank Optimizations on Undirected Graphs. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_29

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-14246-8_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14245-1

  • Online ISBN: 978-3-642-14246-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics