Abstract
People tend to have multiple identities or personalities in their real and online lives. In real life, these identities can even be associated with different names used with parents, with groups of friends, or in formal contexts. Online, this tendency has exploded: people can express different identities under different names in different social networks (SNs), treating their interactions with these tools as having the same meaning as actions and connections in real life. Thus, a fundamental question arises: can profiles of the same user be linked across multiple SNs? In this paper, we present the Hiding Your Face Is Not Enough (HYFINE) model, a User Identity Linking model that fully exploits the images in profiles. The HYFINE model consists of two parts: (1) the corpus extraction system; (2) the classification system HYFINE-c, which determines whether two profiles are different identities of the same user by fully using images along with other features. We show that the HYFINE model, by exploiting the images in profiles, can match the profiles of users across different SNs with high performance.
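The abstract does not specify which features or classifier HYFINE-c uses, so the following is only an illustrative sketch of the general idea of pairwise profile matching, not the authors' implementation. It assumes each profile carries a username and a precomputed 64-bit perceptual hash of its profile image; the `Profile` type, the two similarity features, and the hand-picked threshold rule are all hypothetical stand-ins for a trained classifier.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Profile:
    """Hypothetical profile record; not the data model used in the paper."""
    username: str
    image_hash: int  # assumed: precomputed 64-bit perceptual hash of the profile image


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def image_similarity(h1: int, h2: int, bits: int = 64) -> float:
    """One minus the normalized Hamming distance between two image hashes."""
    return 1.0 - bin(h1 ^ h2).count("1") / bits


def pair_features(p: Profile, q: Profile) -> list[float]:
    """Feature vector describing how similar two profiles are."""
    return [
        name_similarity(p.username, q.username),
        image_similarity(p.image_hash, q.image_hash),
    ]


def same_user(p: Profile, q: Profile, threshold: float = 0.75) -> bool:
    """Illustrative decision rule: average the features and apply an
    arbitrary threshold. A real system would train a classifier on
    labeled profile pairs instead."""
    feats = pair_features(p, q)
    return sum(feats) / len(feats) >= threshold


# Toy profiles with made-up usernames and hashes.
a = Profile("jane_doe", 0xF0F0F0F0F0F0F0F0)
b = Profile("JaneDoe", 0xF0F0F0F0F0F0F0F1)     # hash differs in one bit
c = Profile("xq_gamer99", 0x0F0F0F0F0F0F0F0F)  # hash differs in every bit

print(same_user(a, b), same_user(a, c))
```

In this toy setting, the image feature separates the pairs even when usernames are chosen freely, which is the intuition behind using images alongside textual features.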

Notes
The system is available on https://github.com/leooJo/SeleniumWebScraper.
The implementation is available in PythonBook format at the following link: https://github.com/leooJo/SeleniumWebScraper
Cite this article
Ranaldi, L., Zanzotto, F.M. Hiding Your Face Is Not Enough: user identity linkage with image recognition. Soc. Netw. Anal. Min. 10, 56 (2020). https://doi.org/10.1007/s13278-020-00673-4
DOI: https://doi.org/10.1007/s13278-020-00673-4