Neighborhood and PageRank methods for pairwise link prediction

  • Original Article
  • Social Network Analysis and Mining

Abstract

Link prediction is a common problem in network science that cuts across many disciplines. The goal is to forecast the appearance of new links or to find links missing from the network. Typical methods use the topology of the network to predict the most likely future or missing connections between pairs of nodes. However, network evolution is often mediated by higher-order structures involving more than two nodes; for example, cliques on three nodes (also called triangles) are key to the structure of social networks, but the standard link prediction framework does not directly predict these structures. To address this gap, in recent work we proposed a new task called “pairwise link prediction” that directly targets the prediction of new triangles: given an edge, one is tasked with finding which nodes are most likely to form a triangle with it. In this manuscript we extend that work and evaluate a variety of natural extensions of link prediction methods, including neighborhood-based and PageRank-based methods. A key difference from our previous work is the definition of the neighborhood of an edge, which has a surprisingly large impact on empirical performance. Our experiments on a variety of networks show that diffusion-based methods are less sensitive to the type of graph and give more consistent results. We also show how our pairwise link prediction framework can be used to obtain better predictions within the context of standard link prediction evaluation.
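To make the task concrete, the following Python sketch scores candidate nodes for a query edge in two ways. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the edge neighborhood, so the union and intersection variants here are hypothetical choices, and the function names (`edge_neighborhood`, `common_neighbor_scores`, `seeded_pagerank`), the damping value `alpha=0.85`, and the even split of the seed mass between the two endpoints are likewise assumptions.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected graph as a dict mapping each node to its set of neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def edge_neighborhood(adj, u, v, mode="union"):
    """Assumed definition: union or intersection of the endpoint
    neighborhoods, with the endpoints themselves removed."""
    if mode == "union":
        hood = adj[u] | adj[v]
    elif mode == "intersection":
        hood = adj[u] & adj[v]
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    return hood - {u, v}


def common_neighbor_scores(adj, u, v, mode="union"):
    """Score candidate w by how many of its neighbors lie in the
    neighborhood of the edge (u, v)."""
    hood = edge_neighborhood(adj, u, v, mode)
    scores = {}
    for w in adj:
        if w in (u, v):
            continue
        if w in adj[u] and w in adj[v]:
            continue  # (u, v, w) is already a triangle
        scores[w] = len(adj[w] & hood)
    return scores


def seeded_pagerank(adj, u, v, alpha=0.85, iters=100):
    """Power iteration for PageRank whose teleportation mass is split
    evenly between the two endpoints of the query edge (an assumption)."""
    seed = {u: 0.5, v: 0.5}
    x = {n: seed.get(n, 0.0) for n in adj}
    for _ in range(iters):
        nxt = {n: (1 - alpha) * seed.get(n, 0.0) for n in adj}
        for n, nbrs in adj.items():
            share = alpha * x[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        x = nxt
    return x


# Toy graph; the query edge is (0, 1).
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 4), (3, 4), (2, 3), (4, 5)]
adj = build_graph(edges)
neighborhood_ranking = common_neighbor_scores(adj, 0, 1, mode="union")
diffusion_ranking = seeded_pagerank(adj, 0, 1)
```

Switching `mode` between `"union"` and `"intersection"` changes which candidates receive nonzero scores on this toy graph, which gives a small-scale feel for why the choice of edge neighborhood can matter.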

Notes

  1. https://github.com/nassarhuda/pairseed/blob/master/trpr.jl

Author information

Corresponding author

Correspondence to Huda Nassar.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

NSF IIS-1422918, IIS-1546488, CCF-1909528, NSF Center for Science of Information STC, CCF-0939370, NASA, Sloan Foundation, DARPA SIMPLEX, NSF DMS-1830274, ARO W911NF-19-1-0057, and ARO MURI.

About this article

Cite this article

Nassar, H., Benson, A.R. & Gleich, D.F. Neighborhood and PageRank methods for pairwise link prediction. Soc. Netw. Anal. Min. 10, 63 (2020). https://doi.org/10.1007/s13278-020-00671-6

  • DOI: https://doi.org/10.1007/s13278-020-00671-6
