Random walk with restart: fast solutions and applications

Tong, Hanghang; Faloutsos, Christos; Pan, Jia-Yu

doi:10.1007/s10115-007-0094-2

Regular Paper
Published: 04 July 2007

Volume 14, pages 327–346, (2008)
Knowledge and Information Systems

Hanghang Tong¹,
Christos Faloutsos¹ &
Jia-Yu Pan¹

Abstract

How closely related are two nodes in a graph? How can this score be computed quickly on huge, disk-resident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been used successfully in numerous settings, such as automatic captioning of images, generalizations of “connection subgraphs”, personalized PageRank, and many more. However, straightforward implementations of RWR do not scale to large graphs: they require either quadratic space and cubic pre-computation time, or slow response time on queries. We propose fast solutions to this problem. The heart of our approach is to exploit two important properties shared by many real graphs: (a) linear correlations and (b) block-wise, community-like structure. We exploit the linearity by using low-rank matrix approximation, and the community structure by graph partitioning, followed by the Sherman–Morrison lemma for matrix inversion. Experimental results on the Corel image and the DBLP datasets demonstrate that our proposed methods achieve significant savings over the straightforward implementations: they can save several orders of magnitude in pre-computation and storage cost, and they achieve up to 150× speed-up with 90%+ quality preservation.
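
To make the trade-off in the abstract concrete, here is a minimal sketch of the two straightforward RWR implementations it alludes to. The abstract does not state the defining equation, so the sketch assumes one common formulation, r = c·W·r + (1 − c)·e, where W is the column-normalized adjacency matrix, e is the indicator vector of the query node, and 1 − c is the restart probability. The function names, the toy graph, and the value c = 0.9 are illustrative assumptions, not taken from the paper, and the paper's own fast methods (low-rank approximation, partitioning) are not shown.

```python
import numpy as np

def column_normalize(adj):
    """Scale each column of a nonnegative adjacency matrix to sum to 1."""
    col_sums = adj.sum(axis=0).astype(float)
    col_sums[col_sums == 0] = 1.0  # isolated nodes keep an all-zero column
    return adj / col_sums

def rwr_iterative(w, seed, c=0.9, tol=1e-10, max_iter=1000):
    """On-line approach: iterate r <- c*W*r + (1-c)*e for each query.

    Needs no pre-computation, but every query pays for the iterations.
    """
    e = np.zeros(w.shape[0])
    e[seed] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (w @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

def rwr_direct(w, seed, c=0.9):
    """Closed form: r = (1-c) * (I - c*W)^{-1} * e.

    Pre-computing and storing the full inverse would give fast queries at
    quadratic space and cubic time; here one linear system is solved instead.
    """
    n = w.shape[0]
    e = np.zeros(n)
    e[seed] = 1.0
    return (1 - c) * np.linalg.solve(np.eye(n) - c * w, e)

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3,
# a tiny stand-in for the block-wise, community-like structure.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

w = column_normalize(adj)
r_iter = rwr_iterative(w, seed=0)
r_direct = rwr_direct(w, seed=0)
print(np.round(r_iter, 4))
```

Under these assumptions both routines return the same relevance vector for query node 0, and the nodes in the query's own triangle together receive more score than the other triangle.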

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA
Hanghang Tong, Christos Faloutsos & Jia-Yu Pan

Corresponding author

Correspondence to Hanghang Tong.

About this article

Cite this article

Tong, H., Faloutsos, C. & Pan, JY. Random walk with restart: fast solutions and applications. Knowl Inf Syst 14, 327–346 (2008). https://doi.org/10.1007/s10115-007-0094-2

Received: 02 March 2007
Accepted: 28 April 2007
Published: 04 July 2007
Issue Date: March 2008
DOI: https://doi.org/10.1007/s10115-007-0094-2
