Abstract
In today’s online social networks, it has become essential to help newcomers and existing community members find new social contacts. In the scientific literature, this recommendation task is known as link prediction. Link prediction has important practical applications on social network platforms: it allows platform providers to recommend friends to their users, and it can be used to infer missing links in partially observed networks. A shortcoming of many existing link prediction methods is that they focus on undirected graphs only. This work closes this gap and introduces link prediction methods and metrics for directed graphs. We compare well-known similarity metrics and their suitability for link prediction in directed social networks. We advance existing techniques and propose mining subgraph patterns that are used to predict links in networks such as GitHub, GooglePlus, and Twitter. Our results show that the proposed metrics and techniques yield more accurate predictions than metrics that do not account for the directed nature of the underlying networks.
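To illustrate why edge direction matters for similarity-based link prediction, the following minimal sketch contrasts a direction-aware common-neighbor count with its undirected counterpart on a toy graph. It is an assumption-laden illustration, not the metrics proposed in this article: the function names, the toy edge list, and the choice of counting two-step directed paths are all hypothetical.

```python
from collections import defaultdict

def build_directed_graph(edges):
    """Return successor and predecessor adjacency sets for a directed edge list."""
    succ = defaultdict(set)  # succ[u] = nodes that u points to
    pred = defaultdict(set)  # pred[v] = nodes that point to v
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred

def directed_common_neighbors(succ, pred, u, v):
    """Count intermediaries w with u -> w and w -> v (directed two-step paths).

    Illustrative score for the candidate link u -> v; it is asymmetric in (u, v).
    """
    return len(succ[u] & pred[v])

def undirected_common_neighbors(succ, pred, u, v):
    """Count shared neighbors of u and v when edge direction is ignored."""
    neighbors_u = succ[u] | pred[u]
    neighbors_v = succ[v] | pred[v]
    return len((neighbors_u & neighbors_v) - {u, v})

# Hypothetical toy network: a follows b and d, both of whom follow c.
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
succ, pred = build_directed_graph(edges)

print(directed_common_neighbors(succ, pred, "a", "c"))    # score for a -> c
print(directed_common_neighbors(succ, pred, "c", "a"))    # score for c -> a
print(undirected_common_neighbors(succ, pred, "a", "c"))  # same in both orders
```

In this toy graph the directed score separates the candidate link a -> c from c -> a, while the undirected score assigns both the same value, which is the kind of information loss that direction-aware metrics aim to avoid.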
Notes
The upper bound for which the Data Layer has been tested was a network consisting of approximately 5 × 10^7 nodes and 1.5 × 10^9 edges.
Cite this article
Schall, D. Link prediction in directed social networks. Soc. Netw. Anal. Min. 4, 157 (2014). https://doi.org/10.1007/s13278-014-0157-9
DOI: https://doi.org/10.1007/s13278-014-0157-9