Abstract
Social media platforms (SMPs) in general, and messaging platforms such as WhatsApp in particular, have changed how people connect. Unfortunately, SMPs are often used to spread fake information. We focus on images shared on WhatsApp; our goal is to detect whether a given image is fake. Our main contribution is in feature engineering. Given an image and its metadata, we compute three types of features: (1) image content-based features, (2) temporal features using the timestamps at which images were shared, and (3) social context features based on the users who shared images. We feed these features into machine learning models to predict whether the input is fake or not. We evaluate our approach on a fact-checked WhatsApp image dataset, released in 2020 and gathered over 2.5 months, that contains 810K and 34K images shared by 63K and 17K WhatsApp users in India and Brazil, respectively. We observe that temporal and social context features are essential predictors for fake image detection. Counter-intuitively, image content features derived by CNNs from raw images perform worse than socio-temporal features, although they are still better than random prediction. Our best model is an ensemble that fuses the outputs of Support Vector Machine, Random Forest, and Logistic Regression classifiers using socio-temporal features.
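To make the socio-temporal idea concrete, the sketch below derives a few temporal and social context features from the sharing events of one image and fuses three binary predictions by majority vote. It is only an illustration under stated assumptions: the abstract does not list the exact features or the fusion rule, so the `Share` record, the feature names, and hard voting are all hypothetical choices, not the authors' implementation.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Share:
    """One sharing event of an image (hypothetical record layout)."""
    user_id: str
    timestamp: float  # seconds since some reference time


def socio_temporal_features(shares):
    """Summarise when, and by whom, a single image was shared."""
    if not shares:
        raise ValueError("need at least one share")
    times = sorted(s.timestamp for s in shares)
    # Gaps between consecutive shares; empty if the image was shared once.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    per_user = Counter(s.user_id for s in shares)
    return {
        # Temporal features (assumed examples).
        "n_shares": len(shares),
        "lifespan_s": times[-1] - times[0],
        "mean_gap_s": mean(gaps) if gaps else 0.0,
        # Social context features (assumed examples).
        "n_users": len(per_user),
        "max_shares_by_one_user": max(per_user.values()),
    }


def majority_vote(predictions):
    """Fuse binary predictions (1 = fake, 0 = not fake) by hard voting.

    Hard voting is an assumption; ties resolve to 0.
    """
    votes = Counter(predictions)
    return 1 if votes[1] > votes[0] else 0


shares = [Share("u1", 0.0), Share("u2", 60.0), Share("u1", 180.0)]
features = socio_temporal_features(shares)
# Stand-ins for the outputs of three separately trained classifiers.
fused = majority_vote([1, 0, 1])
```

In practice each feature dictionary would become one row of the training matrix, and the three votes would come from the SVM, Random Forest, and Logistic Regression models named above.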
Notes
We will share the code and dataset used in this work with fellow researchers upon request.
https://opensource.google/projects/tesseract
Cite this article
Kaur, M., Daryani, P., Varshney, M. et al. Detection of fake images on whatsApp using socio-temporal features. Soc. Netw. Anal. Min. 12, 58 (2022). https://doi.org/10.1007/s13278-022-00883-y
DOI: https://doi.org/10.1007/s13278-022-00883-y