Abstract
Relationship extraction and network fusion are active research topics in social network mining. Because data come in many types, researchers can use multi-type data to construct multiple networks. In academic social network mining, most existing studies rely on a single type of data, e.g., the co-authorship network constructed from academic co-authorship records. However, the relationships captured by single-type data are not sufficient to characterize the complex relationships of the real world. To address this problem, we use acknowledgment text to construct a semantic information-based academic social network; to the best of our knowledge, we are the first to do so. First, we extract named entities from multi-type data and perform network optimization and alignment. Second, we propose a semi-supervised fusion framework for multiple networks (SFMN), which uses the gradient boosting decision tree algorithm to fuse the information of multiple networks into a single network. Third, we implement a parallel version of SFMN on Spark to improve the performance of large-scale social network analysis. Experiments show that our framework outperforms several state-of-the-art methods and that our method can effectively integrate network information.







Acknowledgements
This work is supported by the National Key Research and Development Program of China (2018YFC0831500) and the Big Data Research Foundation of PICC.
About this article
Cite this article
Long, F., Ning, N., Zhang, Y. et al. Mining latent academic social relationships by network fusion of multi-type data. Soc. Netw. Anal. Min. 10, 52 (2020). https://doi.org/10.1007/s13278-020-00663-6
DOI: https://doi.org/10.1007/s13278-020-00663-6