EEUPL: Towards effective and efficient user profile linkage across multiple social platforms

World Wide Web

Abstract

Linking user profiles that belong to the same person across multiple social networks underpins a wide range of applications, such as cross-platform prediction, cross-platform recommendation, and advertising. Most existing approaches focus on pairwise user profile linkage between two platforms and cannot effectively piece together information from three or more social platforms. In contrast to previous work, we investigate scalable user profile linkage across multiple social platforms and propose an effective and efficient model called EEUPL. The model can also detect duplicate profiles within a single platform that belong to the same person, and it is implemented with Apache Spark for distributed execution. EEUPL contains two key components: 1) to link cross-platform user profiles effectively, we propose a clustering method based on an average-link strategy; 2) to extend EEUPL to large-scale datasets, we develop an Apache Spark based approach. Extensive experiments on two real-world datasets demonstrate the superiority of EEUPL over state-of-the-art methods.
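
To make the average-link idea concrete, the following is a minimal single-machine sketch of generic average-link agglomerative clustering over user profiles. It is not the EEUPL implementation: the profile fields, the bigram-based `profile_similarity` function, the `threshold` value, and the sample usernames are all assumptions chosen for illustration, and the abstract does not specify the similarity measure the authors use.

```python
from itertools import combinations


def bigrams(text):
    """Character bigrams of a lowercased string (falls back to the string itself)."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)} or {text}


def profile_similarity(a, b):
    """Jaccard similarity of username bigrams; a hypothetical stand-in measure."""
    x, y = bigrams(a["username"]), bigrams(b["username"])
    return len(x & y) / len(x | y)


def average_link(c1, c2):
    """Average-link score: mean pairwise similarity between two clusters."""
    total = sum(profile_similarity(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def cluster_profiles(profiles, threshold=0.5):
    """Repeatedly merge the two most similar clusters until none reach the threshold."""
    clusters = [[p] for p in profiles]  # start with one cluster per profile
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), average_link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical profiles from three platforms (illustrative data only).
profiles = [
    {"platform": "twitter", "username": "alice_w"},
    {"platform": "instagram", "username": "alice.w"},
    {"platform": "flickr", "username": "alicew"},
    {"platform": "twitter", "username": "bob1987"},
    {"platform": "instagram", "username": "bob_1987"},
]

for cluster in cluster_profiles(profiles, threshold=0.4):
    print([(p["platform"], p["username"]) for p in cluster])
```

Because clusters are formed over the pooled set of profiles rather than per platform pair, a cluster may hold profiles from three or more platforms, or several profiles from the same platform. This exhaustive pairwise search is only practical for small inputs; a large-scale version would need the kind of distributed execution the abstract attributes to Apache Spark.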

Notes

  1. https://twitter.com/

  2. https://www.instagram.com/

  3. https://www.globalwebindex.com/reports/social

  4. https://economictimes.indiatimes.com/

  5. https://spark.apache.org/

  6. https://hadoop.apache.org/

  7. http://vishalshar.github.io/data/

  8. https://dbs.uni-leipzig.de

Acknowledgments

This work was supported by the Major Program of the Natural Science Foundation of Jiangsu Higher Education Institutions of China under Grant Nos. 19KJA610002 and 19KJB520050, and by the National Natural Science Foundation of China under Grant No. 61902270.

Author information

Corresponding authors

Correspondence to Wei Chen or Lei Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Explainability in the Web

Guest Editors: Guandong Xu, Hongzhi Yin, Irwin King, and Lin Li

About this article

Cite this article

Wang, M., Wang, W., Chen, W. et al. EEUPL: Towards effective and efficient user profile linkage across multiple social platforms. World Wide Web 24, 1731–1748 (2021). https://doi.org/10.1007/s11280-021-00882-7

  • DOI: https://doi.org/10.1007/s11280-021-00882-7
