Text classification models for personality disorders identification

Jain, Deepti; Arora, Sandhya; Jha, C. K.; Malik, Garima

doi:10.1007/s13278-024-01219-8

Original Article
Published: 19 March 2024

Social Network Analysis and Mining, volume 14, article number 64 (2024)

Deepti Jain¹, Sandhya Arora², C. K. Jha¹,³ & Garima Malik³

Abstract

This research focuses on identifying personality disorders in individuals from their social media text. We developed a unique collection of words (PD-Corpus) and a dataset (PD-TXT) of texts annotated with different personality disorder traits. Our goal was to classify these texts into six types of personality disorder using Natural Language Processing (NLP) classification models. The results showed that our transformer-based models, especially the BERT-base-uncased model, were more effective than traditional methods, achieving a 74.7% success rate in correctly classifying these disorders. Our models also consistently outperformed baseline models from the existing literature on the PD-TXT dataset, showing significant improvements. This study presents a new way to predict personality disorders through linguistic analysis and highlights the potential for further research combining language studies with mental health.
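As an illustration of how a figure such as the 74.7% success rate can be computed, the following is a minimal sketch that assumes the reported number is plain classification accuracy over the six classes. The labels below are hypothetical placeholders (integers 0 to 5 standing in for the six disorder types), not PD-TXT data, and the helper names `accuracy` and `per_class_recall` are introduced here only for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of texts whose predicted class equals the annotated class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions / annotated examples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical annotations and predictions for eight texts (assumption:
# integers 0-5 stand in for the six personality disorder classes).
y_true = [0, 1, 2, 3, 4, 5, 0, 1]
y_pred = [0, 1, 2, 3, 4, 0, 0, 2]

print(accuracy(y_true, y_pred))          # 6 of 8 correct -> 0.75
print(per_class_recall(y_true, y_pred))
```

Per-class recall is included because a single accuracy figure can hide classes that a model rarely predicts correctly, which matters when class sizes differ.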

Data availability

The PD-TXT data are available upon request.

Acknowledgements

The authors thank the psychologists Dr. Rajat Mitra and Dr. Puneet Jain for their collaboration.

Author information

Authors and Affiliations

Department of Computer Science, Banasthali Vidyapith, Rajasthan 304022, India
Deepti Jain & C. K. Jha
Department of Computer Engineering, Cummins College of Engineering, Pune, Maharashtra, 411052, India
Sandhya Arora
Department of Mechanical and Industrial Engineering, Toronto Metropolitan University, 350 Victoria St, Toronto, Ontario, M4C2C3, Canada
C. K. Jha & Garima Malik

Corresponding author

Correspondence to Deepti Jain.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This section includes additional results from the experiments carried out in this study. Figures 6 and 7 show the training and validation loss and accuracy curves for each deep learning model employed in the experiments. The curves show that the BiLSTM model with Keras embedding converges properly and achieves the highest performance across the training epochs.
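One way to read convergence off per-epoch curves like these is sketched below. It is not the authors' procedure: the validation-loss values are hypothetical, and the helper names `best_epoch` and `has_converged`, along with the `patience` and `min_delta` parameters, are assumptions made for illustration.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch with the lowest validation loss."""
    if not val_loss:
        raise ValueError("val_loss must not be empty")
    return min(range(len(val_loss)), key=val_loss.__getitem__) + 1


def has_converged(val_loss, patience=2, min_delta=1e-3):
    """True if none of the last `patience` epochs improved on the best
    earlier validation loss by at least `min_delta`."""
    if len(val_loss) <= patience:
        return False
    best_before = min(val_loss[:-patience])
    return all(v > best_before - min_delta for v in val_loss[-patience:])


# Hypothetical validation loss per epoch (not taken from Figures 6 and 7).
val_loss = [1.20, 0.95, 0.80, 0.78, 0.79, 0.81]

print(best_epoch(val_loss))         # epoch 4 has the lowest loss (0.78)
print(has_converged(val_loss))      # True: epochs 5 and 6 did not improve
print(has_converged(val_loss[:4]))  # False: loss was still falling at epoch 4
```

A validation loss that flattens while the training loss keeps falling usually points to overfitting rather than further learning, which is why the check uses the validation curve only.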

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jain, D., Arora, S., Jha, C.K. et al. Text classification models for personality disorders identification. Soc. Netw. Anal. Min. 14, 64 (2024). https://doi.org/10.1007/s13278-024-01219-8

Received: 09 May 2023
Revised: 04 February 2024
Accepted: 06 February 2024
Published: 19 March 2024
DOI: https://doi.org/10.1007/s13278-024-01219-8
