DOI: 10.1145/1390156.1390269
research-article

Fast incremental proximity search in large graphs

Published: 05 July 2008

Abstract

In this paper we investigate two aspects of ranking problems on large graphs. First, we augment the deterministic pruning algorithm of Sarkar and Moore (2007) with sampling techniques to compute approximately correct rankings with high probability under random-walk-based proximity measures at query time. Second, we prove some surprising locality properties of these proximity measures by examining the short-term behavior of random walks. The proposed algorithm can answer queries on the fly without caching any information about the entire graph. We present empirical results on a 600,000-node author-word-citation graph from the Citeseer domain on a single-CPU machine, where the average query processing time is around 4 seconds. We also present quantifiable link prediction tasks; on most of them our techniques outperform Personalized PageRank, a well-known diffusion-based proximity measure.
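To make the idea of sampling a random-walk-based proximity measure concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes the proximity measure is a truncated hitting time (a choice suggested only by the title of the Sarkar and Moore (2007) reference), it omits the pruning step entirely, and the function names, the adjacency-list graph format, and the default parameters are all hypothetical.

```python
import random


def sampled_truncated_hitting_time(graph, source, target, horizon, num_walks, seed=0):
    """Monte Carlo estimate of the truncated hitting time from source to target.

    Assumptions (not from the paper): `graph` maps each node to a non-empty
    list of neighbors, and a walk that has not reached `target` after
    `horizon` steps contributes exactly `horizon` to the average.
    """
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(num_walks):
        node = source
        steps = 0
        # Walk until the target is hit or the step budget is exhausted.
        while node != target and steps < horizon:
            node = rng.choice(graph[node])
            steps += 1
        total_steps += steps
    return total_steps / num_walks


def rank_by_proximity(graph, query, horizon=5, num_walks=2000, seed=0):
    """Rank all other nodes by estimated truncated hitting time from `query`.

    Smaller estimates mean the node is closer to the query node.
    """
    scores = {
        node: sampled_truncated_hitting_time(graph, query, node, horizon, num_walks, seed)
        for node in graph
        if node != query
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical toy graph: a path a - b - c - d.
    path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(rank_by_proximity(path_graph, "a"))
```

Because every walk is capped at `horizon` steps, the cost of one estimate depends on `horizon` and `num_walks` rather than on the size of the graph, which is the kind of locality the abstract refers to.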

References

[1] Aldous, D., & Fill, J. A. (2001). Reversible Markov chains.
[2] Balmin, A., Hristidis, V., & Papakonstantinou, Y. (2004). ObjectRank: Authority-based keyword search in databases. VLDB, 2004.
[3] Brand, M. (2005). A Random Walks Perspective on Maximizing Satisfaction and Profit. SIAM '05.
[4] Chakrabarti, S. (2007). Dynamic personalized pagerank in entity-relation graphs. WWW '07 (pp. 571-580). New York, NY, USA: ACM Press.
[5] Fogaras, D., Rácz, B., Csalogány, K., & Sarlós, T. (2004). Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments.
[6] Haveliwala, T. (2002). Topic-sensitive pagerank. WWW.
[7] Jeh, G., & Widom, J. (2002a). Scaling personalized web search. Stanford University Technical Report.
[8] Jeh, G., & Widom, J. (2002b). Simrank: A measure of structural-context similarity. KDD.
[9] Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika.
[10] Liben-Nowell, D., & Kleinberg, J. (2003). The link prediction problem for social networks. CIKM '03.
[11] Sarkar, P., & Moore, A. (2007). A tractable approach to finding closest truncated-commute-time neighbors in large graphs. Proc. UAI.
[12] Spielman, D., & Srivastava, N. (2008). Graph sparsification by effective resistances. Proceedings of the STOC'08.
[13] Tong, H., Koren, Y., & Faloutsos, C. (2007). Fast direction-aware proximity for graph mining. Proc. KDD.

Published In

ICML '08: Proceedings of the 25th international conference on Machine learning
July 2008
1310 pages
ISBN:9781605582054
DOI:10.1145/1390156
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research
  • Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICML '08
Sponsor:
  • Microsoft Research
  • Intel
  • IBM

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Article Metrics

  • Downloads (Last 12 months): 18
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025.

Cited By

  • (2024) Efficient and Provable Effective Resistance Computation on Large Graphs: An Index-based Approach. Proceedings of the ACM on Management of Data, 2:3 (1-27). DOI: 10.1145/3654936. Online publication date: 30-May-2024
  • (2024) A Fast Algorithm for Moderating Critical Nodes via Edge Removal. IEEE Transactions on Knowledge and Data Engineering, 36:4 (1385-1398). DOI: 10.1109/TKDE.2023.3309987. Online publication date: Apr-2024
  • (2023) GERWkNN: GPU-accelerated Exact Random Walk-based kNN Query in Large Graphs. Proceedings of the 2023 5th International Conference on Big Data Engineering (47-54). DOI: 10.1145/3640872.3640880. Online publication date: 17-Nov-2023
  • (2023) Efficient Resistance Distance Computation: The Power of Landmark-based Approaches. Proceedings of the ACM on Management of Data, 1:1 (1-27). DOI: 10.1145/3588922. Online publication date: 30-May-2023
  • (2023) Efficient Estimation of Pairwise Effective Resistance. Proceedings of the ACM on Management of Data, 1:1 (1-27). DOI: 10.1145/3588696. Online publication date: 30-May-2023
  • (2023) Opinion Maximization in Social Networks via Leader Selection. Proceedings of the ACM Web Conference 2023 (133-142). DOI: 10.1145/3543507.3583243. Online publication date: 30-Apr-2023
  • (2022) Scalable stream-based recommendations with random walks on incremental graph of sequential interactions with implicit feedback. User Modeling and User-Adapted Interaction, 32:4 (543-573). DOI: 10.1007/s11257-021-09315-6. Online publication date: 13-Jan-2022
  • (2020) An Efficient Approximate Algorithm for Single-Source Discounted Hitting Time Query. Database Systems for Advanced Applications (237-254). DOI: 10.1007/978-3-030-59419-0_15. Online publication date: 22-Sep-2020
  • (2018) Active Search of Connections for Case Building and Combating Human Trafficking. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2120-2129). DOI: 10.1145/3219819.3220103. Online publication date: 19-Jul-2018
  • (2017) Faces selection in images using the spectral graph theory and constraints. 2017 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM) (1-5). DOI: 10.1109/ICIEAM.2017.8076407. Online publication date: May-2017
