Abstract
Social link identification, that is, identifying accounts across different online social networks that belong to the same user, is an important task in social network applications. Most existing methods for this problem directly apply machine learning classifiers to features extracted from rich user information. In practice, however, only limited user information can be obtained because of privacy concerns. In addition, we observe that existing methods cannot handle the huge number of potential account pairs across different online social networks. In this paper, we propose an effective method that addresses these two challenges by expanding known anchor links (seed account pairs belonging to the same person). In particular, we leverage the potentially useful information carried by the existing anchor links and develop a local expansion propagation model to identify new social links, which are then treated as generated anchor links and used iteratively to identify further social links. We evaluate our method on two of the most popular Chinese social networks. Experimental results show that the proposed method can quickly find most of the account pairs that belong to the same user across different online social networks.
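The iterative idea described above, growing a set of anchor links outward from seed pairs, can be illustrated with a minimal sketch. It is not the paper's model: the scoring rule (counting already-anchored neighbour pairs), the threshold `min_shared`, and the function name `expand_anchor_links` are assumptions chosen only to show how restricting candidates to the neighbourhood of known anchors avoids comparing every possible account pair.

```python
def expand_anchor_links(graph_a, graph_b, seeds, min_shared=2):
    """Iteratively grow a one-to-one set of anchor links from seed pairs.

    graph_a, graph_b: dict mapping an account to the set of its neighbours.
    seeds: iterable of (account_in_a, account_in_b) known to be the same user.
    min_shared: hypothetical threshold on shared anchored neighbours.
    """
    a_to_b = dict(seeds)
    b_to_a = {b: a for a, b in a_to_b.items()}
    frontier = list(a_to_b.items())

    while frontier:
        # Local candidates: unmatched neighbours of the newest anchors only.
        candidates = set()
        for a, b in frontier:
            for na in graph_a.get(a, ()):
                if na in a_to_b:
                    continue
                for nb in graph_b.get(b, ()):
                    if nb not in b_to_a:
                        candidates.add((na, nb))

        # Assumed score: number of anchored neighbour pairs the candidate shares.
        scored = []
        for na, nb in candidates:
            nb_neighbours = graph_b.get(nb, ())
            shared = sum(
                1 for x in graph_a.get(na, ())
                if x in a_to_b and a_to_b[x] in nb_neighbours
            )
            if shared >= min_shared:
                scored.append((shared, na, nb))

        # Accept greedily, best score first, keeping the mapping one-to-one.
        scored.sort(key=lambda t: (-t[0], t[1], t[2]))
        frontier = []
        for _, na, nb in scored:
            if na not in a_to_b and nb not in b_to_a:
                a_to_b[na] = nb
                b_to_a[nb] = na
                frontier.append((na, nb))  # generated anchors seed the next round

    return a_to_b
```

Each round examines only pairs adjacent to anchors found in the previous round, so the work scales with the local neighbourhood rather than with the product of the two networks' sizes. A real system would presumably combine such structural evidence with whatever limited profile information is available.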
Acknowledgements
We thank the anonymous reviewers for their very useful comments and suggestions. This work was partially supported by grants from the National Natural Science Foundation of China (Grant Nos. U1533104 and U1633110).
Cite this article
Zhang, Y., Fu, J., Yang, C. et al. A local expansion propagation algorithm for social link identification. Knowl Inf Syst 60, 545–568 (2019). https://doi.org/10.1007/s10115-018-1221-y
DOI: https://doi.org/10.1007/s10115-018-1221-y