Towards Author Profiling from Modern Standard Arabic Texts: A Review

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 235)

Abstract

A topic that has recently attracted considerable research interest is the extraction of an author's personal or demographic characteristics, such as gender, age, political affiliation, or native language, from the texts they write. This task is known as Author Profiling (AP). Interest in AP has grown in recent years because of its wide range of applications, including criminal investigation, security, and marketing analysis. In this paper, we review the state of the art on the main author profiling problems, as well as the most commonly used techniques and features, focusing mainly on Modern Standard Arabic.

Notes

  1. PAN is a series of events on plagiarism analysis and authorship identification: https://pan.webis.de/events.html.

  2. ARFF (Attribute-Relation File Format) is an ASCII text file format that describes a set of instances sharing a common set of attributes. It was developed at the University of Waikato for use with the Weka software.
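
To make the ARFF description in note 2 concrete, here is a minimal sketch of how a handful of instances sharing a set of attributes could be serialized to ARFF text. The relation name, the attribute names (`word_count`, `avg_word_length`, `gender`), and the sample rows are hypothetical illustrations, not data or features taken from the paper; the helper `to_arff` is likewise an assumption and not part of any library.

```python
def to_arff(relation, attributes, rows):
    """Serialize rows into ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of allowed nominal values.
    rows: list of tuples, one value per attribute, in the same order.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            # Simple typed attribute, e.g. "@attribute word_count numeric"
            lines.append("@attribute " + name + " " + kind)
        else:
            # Nominal attribute: the allowed values are listed in braces
            lines.append("@attribute " + name + " {" + ",".join(kind) + "}")
    lines += ["", "@data"]
    for row in rows:
        # Each instance is one comma-separated line
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical attributes and instances, for illustration only
attributes = [
    ("word_count", "numeric"),
    ("avg_word_length", "numeric"),
    ("gender", ["male", "female"]),
]
rows = [(120, 4.3, "female"), (85, 5.1, "male")]

arff_text = to_arff("author_profiles", attributes, rows)
print(arff_text)
```

The header declares the relation and its attributes once, and every line after `@data` is one instance with values in the declared order, which is the "group of instances that share a set of attributes" the note refers to.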

Corresponding author

Correspondence to Asmaa Mansour Khoudja.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Mansour Khoudja, A., Loukam, M., Belkredim, F.Z. (2022). Towards Author Profiling from Modern Standard Arabic Texts: A Review. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Sixth International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 235. Springer, Singapore. https://doi.org/10.1007/978-981-16-2377-6_69
