Mining latent academic social relationships by network fusion of multi-type data

  • Original Article
Social Network Analysis and Mining

Abstract

Relationship extraction and network fusion are active research topics in social network mining. Because data come in many types, researchers can use multi-type data to construct multiple networks. In academic social network mining, existing studies are mostly based on single-type data, e.g., the co-authorship network constructed from co-authorship records. However, the relationships captured by single-type data are not sufficient to characterize the complex relationships of the real world. To address this problem, we use acknowledgment text to construct a semantic information-based academic social network; to the best of our knowledge, we are the first to do so. First, we extract named entities from multi-type data and perform network optimization and alignment. Second, we propose a semi-supervised fusion framework for multiple networks (SFMN), which uses the gradient boosting decision tree algorithm to fuse the information of multiple networks into a single network. Third, we implement a parallel version of SFMN on Spark to improve the performance of large-scale social network analysis. Experiments show that our framework outperforms several state-of-the-art methods and that our method can effectively integrate network information.
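The fusion step can be illustrated with a small sketch. It is not the SFMN implementation: the abstract does not specify the features, labels, or tree settings, so everything below is an assumption made for illustration. Each candidate node pair gets one feature per aligned network (its edge weight there, or 0 if absent), a handful of pairs carry hypothetical labels (1 for a known relationship, 0 otherwise), and a tiny gradient boosting model built from one-split regression trees (stumps, squared loss) learns a fused edge weight that is then applied to unlabeled pairs as well. The toy networks, names, and hyperparameters are all invented.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)

def edge_features(networks: List[Dict[Edge, float]], pairs: List[Edge]) -> List[List[float]]:
    """One feature per network: the pair's edge weight in that network (0.0 if absent)."""
    return [[net.get(p, 0.0) for net in networks] for p in pairs]

def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Least-squares regression stump: the single split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]

def stump_predict(stump: Stump, row: List[float]) -> float:
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_gbdt(X: List[List[float]], y: List[float], n_rounds: int = 50, lr: float = 0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, row) for p, row in zip(preds, X)]
    return base, lr, stumps

def gbdt_predict(model, row: List[float]) -> float:
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

# Toy aligned networks (invented): co-authorship counts and acknowledgment mentions.
coauthor = {("a", "b"): 3.0, ("b", "c"): 1.0}
acknowledgment = {("a", "b"): 1.0, ("c", "d"): 2.0, ("d", "e"): 1.0}
networks = [coauthor, acknowledgment]

# Hypothetical labels for a few pairs; ("d", "e") is left unlabeled.
labeled = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0, ("a", "d"): 0.0, ("b", "d"): 0.0}
train_pairs = list(labeled)
model = fit_gbdt(edge_features(networks, train_pairs), [labeled[p] for p in train_pairs])

# Fused network: one predicted weight per candidate pair, labeled or not.
all_pairs = train_pairs + [("d", "e")]
fused = {p: gbdt_predict(model, f) for p, f in zip(all_pairs, edge_features(networks, all_pairs))}
```

In this sketch the fused weight of a pair depends only on its per-network weights, so richer features (for example, structural or semantic similarity between the two nodes) would have to be added as extra columns. A production version would more plausibly rely on an existing gradient boosting library than on hand-written stumps.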

Notes

  1. https://github.com/fxsjy/jieba.

  2. https://aminer.org/open-academic-graph.

Acknowledgements

This work is supported by the National Key Research and Development Program of China (2018YFC0831500) and the Big Data Research Foundation of PICC.

Author information

Corresponding authors

Correspondence to Yunlei Zhang or Bin Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Long, F., Ning, N., Zhang, Y. et al. Mining latent academic social relationships by network fusion of multi-type data. Soc. Netw. Anal. Min. 10, 52 (2020). https://doi.org/10.1007/s13278-020-00663-6

  • DOI: https://doi.org/10.1007/s13278-020-00663-6
