Abstract
Link prediction is an important task in data mining, with widespread applications in social network research. Given a social network, the objective of this task is to predict future links that have not yet been observed in the current state of the network. Owing to its importance, the link prediction task has received substantial attention from researchers in diverse disciplines; thus, a large number of methodologies for solving this problem have been proposed in recent decades. However, the existing literature lacks a current and comprehensive analysis of these methodologies. A couple of survey articles on link prediction are available, but they are outdated, as numerous link prediction methods have been proposed since these articles were published. In this paper, we provide a systematic analysis of existing link prediction methodologies. Our analysis is comprehensive: it covers the earliest scoring-based methodologies and extends to the most recent ones, which are based on deep learning. We also categorize the link prediction methods by their technical approach and discuss the strengths and weaknesses of the various methods.
Acknowledgements
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Cite this article
Haghani, S., Keyvanpour, M.R. A systemic analysis of link prediction in social network. Artif Intell Rev 52, 1961–1995 (2019). https://doi.org/10.1007/s10462-017-9590-2
DOI: https://doi.org/10.1007/s10462-017-9590-2