Abstract
The link prediction problem has attracted attention across several fields of computer science and has become a popular research challenge. Many approaches have been proposed to address it, each aiming for the most accurate prediction in the shortest time. The difficulty of the problem stems from the sparse nature of most complex networks, such as social networks. This paper presents a parallel metaheuristic framework, called Parallel MFO for Link Prediction (PMFO-LP), that is based on moth-flame optimization (MFO), clustering, and pre-processed datasets to solve the link prediction problem. The framework is implemented and tested on a high-performance computing cluster and evaluated on large and complex networks from different fields such as social, citation, biological, and information and publication networks. PMFO-LP is composed of a data preprocessing stage and a prediction stage. The data preprocessing stage performs dataset division with stratified sampling, feature extraction, data under-sampling, and feature selection. In the prediction stage, MFO based on clustering is used as the prediction optimizer. PMFO-LP provides more accurate prediction results within a reasonable amount of time. Experimental results show that the PMFO-LP algorithm outperforms other well-regarded algorithms in terms of error rate, area under the curve, and speedup. The source code of the PMFO-LP algorithm is available at https://github.com/RehamBarham/PMFO_MPI.cpp.
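To make the preprocessing stage more concrete, the following is a minimal Python sketch of two of its steps, feature extraction and under-sampling, on a toy undirected graph. It is an illustration under stated assumptions and not the authors' implementation (which is the C++/MPI code linked above). The abstract does not name the extracted features, so the three similarity indices used here (common neighbors, Jaccard, Adamic-Adar) are assumed, as are the helper names `build_adjacency`, `pair_features`, and `build_dataset` and the 1:1 class ratio after under-sampling.

```python
import itertools
import math
import random


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Return [common neighbors, Jaccard, Adamic-Adar] for the pair (u, v).

    Assumed feature set: the abstract does not list the actual features.
    Assumes u != v and a graph without self-loops.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    jaccard = len(common) / len(union) if union else 0.0
    # Every common neighbor z touches both u and v, so deg(z) >= 2
    # and log(deg(z)) > 0.
    adamic_adar = sum(1.0 / math.log(len(adj[z])) for z in common)
    return [float(len(common)), jaccard, adamic_adar]


def build_dataset(edges, seed=0):
    """Featurize all node pairs and under-sample the majority class.

    Simplification: features are computed on the full graph; a real
    experiment would hide the test edges first.
    """
    adj = build_adjacency(edges)
    existing = {frozenset(e) for e in edges}
    positives, negatives = [], []
    for u, v in itertools.combinations(sorted(adj), 2):
        row = pair_features(adj, u, v)
        if frozenset((u, v)) in existing:
            positives.append(row)
        else:
            negatives.append(row)
    # Sparse networks have far more non-links than links, so randomly
    # drop non-links until both classes have the same size (assumed ratio).
    rng = random.Random(seed)
    if len(negatives) > len(positives):
        negatives = rng.sample(negatives, len(positives))
    return positives, negatives


if __name__ == "__main__":
    path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    pos, neg = build_dataset(path_edges)
    print(len(pos), "link rows,", len(neg), "non-link rows")
```

On a five-node path graph there are four links and six non-links, so under-sampling keeps four rows of each class. The same imbalance, at a much larger scale, is what makes sparse real-world networks difficult for a classifier or optimizer.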
Acknowledgements
The authors would like to express their deep gratitude to IMAN1 Authority and the University of Jordan for their support in using their facilities.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics declarations
Conflict of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
About this article
Cite this article
Barham, R., Sharieh, A. & Sleit, A. Multi-moth flame optimization for solving the link prediction problem in complex networks. Evol. Intel. 12, 563–591 (2019). https://doi.org/10.1007/s12065-019-00257-y
DOI: https://doi.org/10.1007/s12065-019-00257-y