Matching user accounts across social networks based on username and display name

World Wide Web

Abstract

Matching user accounts across social networks helps build better user profiles, which has practical significance for many applications, and the problem has attracted considerable scholarly attention. Existing work relies mainly on rich online profiles or activities. However, because of privacy settings or other user-specific purposes, such rich data is often unavailable, incomplete, or unreliable, which causes existing schemes to fail. Users do, however, often make their display names and/or usernames public on different social networks. Names belonging to the same user often contain abundant redundant information, which provides an opportunity to address the matching problem. In this paper, we focus on matching user accounts across social networks based solely on username and display name. The problem is two-fold: 1) how to characterize the information redundancies contained in usernames or display names; and 2) how to match user accounts based on these redundancies. To address it, we propose a solution, User Identification across Social Network based on Username and Display name (UISN-UD), which consists of three key components: 1) extracting features that exploit the information redundancies among names based on user naming habits; 2) training a two-stage classification framework to tackle the user identification problem based on the extracted features; and 3) employing the Gale-Shapley algorithm to eliminate the one-to-many or many-to-many relationships that exist in the identification results. Experiments on real social network datasets show that the proposed method performs well, with F1 values of 90% or higher. From a computational point of view, comparing display names and/or usernames is also more convenient than comparing the rich profile attributes or activities of two accounts. This work shows that user accounts can be matched using a small amount of highly accessible online data.
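The third component can be illustrated with a minimal sketch of the Gale-Shapley algorithm applied to candidate account pairs. The abstract does not say how preferences are formed, so this sketch assumes that each account ranks its candidates by a pairwise match score (for example, a classifier confidence). The function name `gale_shapley`, the `scores` layout, and the account names are hypothetical and are not taken from the paper.

```python
from collections import deque


def gale_shapley(scores):
    """Reduce many-to-many candidate matches to a stable one-to-one matching.

    scores[a][b] is an assumed match score between account a (network A)
    and candidate account b (network B); higher means more likely the same user.
    Accounts in A propose; accounts in B keep their best proposer so far.
    """
    # Each account in A ranks its candidates by descending score (ties by name).
    prefs = {a: sorted(cands, key=lambda b: (-cands[b], b))
             for a, cands in scores.items()}
    next_choice = {a: 0 for a in scores}  # index of the next candidate to try
    engaged_to = {}                       # b -> a currently holding b
    free = deque(scores)                  # accounts in A still unmatched

    while free:
        a = free.popleft()
        if next_choice[a] >= len(prefs[a]):
            continue  # a has no candidates left and stays unmatched
        b = prefs[a][next_choice[a]]
        next_choice[a] += 1
        current = engaged_to.get(b)
        if current is None:
            engaged_to[b] = a             # b was free, accept the proposal
        elif scores[a][b] > scores[current][b]:
            engaged_to[b] = a             # b prefers a, release the old partner
            free.append(current)
        else:
            free.append(a)                # b rejects a, a will try its next choice

    return {a: b for b, a in engaged_to.items()}


# Hypothetical scores: both accounts in A list both accounts in B as candidates.
example_scores = {
    "alice_tw": {"alice_fb": 0.9, "al_fb": 0.6},
    "al_tw": {"alice_fb": 0.8, "al_fb": 0.7},
}
matching = gale_shapley(example_scores)
```

In this example both accounts in network A prefer `alice_fb`, so the raw candidate lists are many-to-many. The procedure keeps the higher-scoring pair for `alice_fb` and moves the rejected account on to its next candidate, leaving each account in at most one pair.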

Author information

Corresponding author

Correspondence to Yongjun Li.

Additional information

This article belongs to the Topical Collection: Special Issue on Geo-Social Computing

Guest Editors: Guandong Xu, Wen-Chih Peng, Hongzhi Yin, Zi (Helen) Huang

About this article

Cite this article

Li, Y., Peng, Y., Zhang, Z. et al. Matching user accounts across social networks based on username and display name. World Wide Web 22, 1075–1097 (2019). https://doi.org/10.1007/s11280-018-0571-4

  • DOI: https://doi.org/10.1007/s11280-018-0571-4
