Abstract
Link prediction is a common problem in network science that cuts across many disciplines. The goal is to forecast the appearance of new links or to identify links that are missing from the network. Typical methods for link prediction use the topology of the network to predict the most likely future or missing connections between pairs of nodes. However, network evolution is often mediated by higher-order structures involving more than pairs of nodes; for example, cliques on three nodes (also called triangles) are key to the structure of social networks, but the standard link prediction framework does not directly predict these structures. To address this gap, in recent work we proposed a new link prediction task called “pairwise link prediction” that directly targets the prediction of new triangles: given an edge, the task is to find the nodes most likely to form a triangle with it. In this manuscript we extend that work and evaluate a variety of natural extensions of link prediction methods, including neighborhood-based and PageRank-based methods. A key difference from our previous work is the definition of the neighborhood of an edge, which has a surprisingly large impact on empirical performance. Our experiments on a variety of networks show that diffusion-based methods are less sensitive to the type of graph and give more consistent results. We also show how our pairwise link prediction framework can be used to obtain better predictions in the context of standard link prediction evaluation.
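To make the task concrete, the following is a minimal Python sketch of how one might score candidate nodes for a given edge (u, v). It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not give the edge-neighborhood definition, so both a union and an intersection of the endpoint neighborhoods are shown as hypothetical options, and the diffusion score is a plain personalized PageRank with the restart mass split evenly between u and v. All function names and the parameter `alpha` are illustrative.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency sets from a list of (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def edge_neighborhood(adj, u, v, mode="union"):
    """Neighborhood of the edge (u, v), excluding u and v themselves.

    ASSUMPTION: 'union' and 'intersection' are two plausible definitions;
    the abstract only says that the choice of definition matters.
    """
    if mode == "union":
        hood = adj[u] | adj[v]
    elif mode == "intersection":
        hood = adj[u] & adj[v]
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    return hood - {u, v}


def common_neighbor_scores(adj, u, v, mode="union"):
    """Score each candidate w by how many neighbors it shares with the edge."""
    hood = edge_neighborhood(adj, u, v, mode)
    return {w: len(adj[w] & hood) for w in adj if w not in (u, v)}


def seeded_pagerank_scores(adj, u, v, alpha=0.85, iters=100):
    """Personalized PageRank by power iteration, restarting at u or v.

    ASSUMPTION: the restart distribution puts mass 1/2 on each endpoint.
    Every node in adj has degree >= 1 because adj is built from edges.
    """
    seed = {n: 0.0 for n in adj}
    seed[u] = seed[v] = 0.5
    x = dict(seed)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * seed[n] for n in adj}
        for n, nbrs in adj.items():
            share = alpha * x[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        x = nxt
    return {w: s for w, s in x.items() if w not in (u, v)}


if __name__ == "__main__":
    # Small hypothetical graph; rank candidates for the edge (0, 1).
    edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
    adj = build_adjacency(edges)
    for name, scores in [
        ("common neighbors (union)", common_neighbor_scores(adj, 0, 1)),
        ("seeded PageRank", seeded_pagerank_scores(adj, 0, 1)),
    ]:
        ranking = sorted(scores, key=scores.get, reverse=True)
        print(name, ranking)
```

In this sketch the two edge-neighborhood definitions can give very different candidate sets on the same graph, whereas the diffusion score assigns a nonzero value to every node reachable from the edge. The sketch does not filter out nodes that already form a triangle with (u, v); a real evaluation would need to do so.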
NSF IIS-1422918, IIS-1546488, CCF-1909528, NSF Center for Science of Information STC, CCF-0939370, NASA, Sloan Foundation, DARPA SIMPLEX, NSF DMS-1830274, ARO W911NF-19-1-0057, and ARO MURI.
Cite this article
Nassar, H., Benson, A.R. & Gleich, D.F. Neighborhood and PageRank methods for pairwise link prediction. Soc. Netw. Anal. Min. 10, 63 (2020). https://doi.org/10.1007/s13278-020-00671-6