An attention based multi-modal gender identification system for social media users

  • 1183: Multimedia Processing to Tackle the Dark Side of Social Life
Multimedia Tools and Applications

Abstract

The rising use of social media has motivated the invention of different methods of anonymous writing, which has led to an increase in malicious and suspicious activities. This anonymity makes it difficult to identify a suspect. Author profiling deals with the characterization of an author through key attributes such as gender, age, language, dialect, regional variety, and personality. Identifying the gender of the writer of a suspect document is also a very popular task. Users regularly share their daily activities on social media platforms such as Twitter, Facebook, and Instagram. In this paper, we propose a neural architecture for gender prediction on multimodal Twitter data. A bidirectional GRU learns the encoded representation of the text part of a tweet, and ResNet-50 extracts features from the images. Different types of attention networks are applied to fuse the text and image representations, followed by a fully connected layer that predicts the gender of a Twitter user. The PAN-2018 author profiling data is used to evaluate the proposed approach. Experimental results show that weighted attention performs best for the gender prediction task. Our model achieves an accuracy of 84.03% and outperforms previous state-of-the-art work. A detailed analysis of the developed system is also carried out, which reveals the writing patterns of male and female users.
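The fusion step mentioned in the abstract can be illustrated with a toy sketch. The abstract does not spell out how the weighted attention is computed, so the code below assumes one simple variant: each modality vector (already projected to a common dimension) receives a scalar score from a hypothetical scoring vector `w`, the two scores are softmax-normalised, and the fused representation is the weighted sum of the two vectors. The function names, the vector `w`, and the example values are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_attention_fusion(text_vec, image_vec, w):
    """Fuse two same-length modality vectors with softmax attention weights.

    Assumption: `w` stands in for a learned scoring vector; the attention
    layers used in the paper may be defined differently.
    """
    if not (len(text_vec) == len(image_vec) == len(w)):
        raise ValueError("all vectors must share one dimension")
    # One scalar score per modality, normalised so the weights sum to 1.
    alphas = softmax([dot(w, text_vec), dot(w, image_vec)])
    # Weighted sum of the two modality representations.
    fused = [alphas[0] * t + alphas[1] * i for t, i in zip(text_vec, image_vec)]
    return fused, alphas


# Toy stand-ins for a BiGRU text encoding and a projected ResNet-50 feature.
text_vec = [0.2, -0.1, 0.7]
image_vec = [0.5, 0.3, -0.4]
w = [1.0, 0.0, 1.0]
fused, alphas = weighted_attention_fusion(text_vec, image_vec, w)
```

In a full model, `fused` would be passed to the fully connected classification layer, and `w` would be trained jointly with the text and image encoders rather than fixed by hand.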

Author information

Corresponding author

Correspondence to Chanchal Suman.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Suman, C., Chaudhary, R.S., Saha, S. et al. An attention based multi-modal gender identification system for social media users. Multimed Tools Appl 81, 27033–27055 (2022). https://doi.org/10.1007/s11042-021-11256-6

  • DOI: https://doi.org/10.1007/s11042-021-11256-6
