Abstract
Link prediction on a given snapshot of a network topology is a crucial task for studying the evolution of social networks. It predicts missing links in existing community networks and new or terminating links in future ones, and it has attracted much attention in many fields. In the past decade, many methods have been proposed to predict suitable links in a given social network. Evaluating link prediction methods is difficult when the network is very complex, because the computing cost becomes prohibitive, and predicting missing links efficiently and accurately in an incomplete complex network remains a challenging task. The underlying assumption is that nodes sharing a large number of common neighbors are likely to be connected. Numerous similarity indices built on this assumption have achieved considerable accuracy and efficiency on this task. In this paper, we propose one such index, the Clustering Coefficient Index, computed by triangle counting on the GraphX component of Apache Spark. The proposed index uses the formation of triangles in the given network topology together with clustering coefficients. Experimental results show that the proposed method outperforms other existing methods at predicting suitable links.
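To make the idea of a triangle-based index concrete, the following is a minimal single-machine sketch in Python, not the GraphX implementation described in the paper. The abstract does not state the formula of the Clustering Coefficient Index, so the scoring rule used here (summing the local clustering coefficients of the common neighbors of a candidate pair) is an assumption chosen for illustration, and all function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_counts(adj):
    """Count the triangles passing through each node.

    A triangle through a node corresponds to exactly one pair of its
    neighbors that are themselves connected.
    """
    counts = {}
    for node, nbrs in adj.items():
        counts[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return counts


def clustering_coefficients(adj):
    """Local clustering coefficient: 2 * triangles / (k * (k - 1)) for degree k > 1, else 0."""
    tri = triangle_counts(adj)
    cc = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        cc[node] = 2.0 * tri[node] / (k * (k - 1)) if k > 1 else 0.0
    return cc


def score_candidate_links(adj):
    """Score every non-adjacent pair that has at least one common neighbor.

    ASSUMPTION: the score is the sum of the clustering coefficients of the
    common neighbors. This is an illustrative choice, not the paper's formula.
    Node labels must be sortable (for example, integers).
    """
    cc = clustering_coefficients(adj)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        common = adj[u] & adj[v]
        if common:
            scores[(u, v)] = sum(cc[z] for z in common)
    return scores
```

Pairs with higher scores would be ranked as more likely missing links. On large graphs the per-node triangle counts would come from a distributed routine instead of the quadratic neighbor-pair loop used in this sketch.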
Acknowledgements
This work was supported by the Ministry of Human Resource Development, Indian Institute of Technology (ISM), Govt. of India, under Grant Number TEQIP-III/2018. The authors express their gratitude to the Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad, India, for its research support.
Cite this article
Dharavath, R., Arora, N.S. Spark’s GraphX-based link prediction for social communication using triangle counting. Soc. Netw. Anal. Min. 9, 28 (2019). https://doi.org/10.1007/s13278-019-0573-y
DOI: https://doi.org/10.1007/s13278-019-0573-y