Depression and anorexia detection in social media as a one-class classification problem

Aguilera, Juan; Farías, Delia Irazú Hernández; Ortega-Mendoza, Rosa María; Montes-y-Gómez, Manuel

doi:10.1007/s10489-020-02131-2

Published: 29 January 2021

Volume 51, pages 6088–6103 (2021)

Applied Intelligence

Juan Aguilera¹,
Delia Irazú Hernández Farías²,
Rosa María Ortega-Mendoza³ (ORCID: orcid.org/0000-0003-4506-9653) &
…
Manuel Montes-y-Gómez¹

1647 Accesses
20 Citations
4 Altmetric

Abstract

Taking advantage of the increasing amount of user-generated content in social media, several computational methods have been proposed for detecting people suffering from depression and anorexia. These complex tasks have been tackled as binary classification problems using, in most cases, automatically generated training data. Despite its promising results, this approach has some important drawbacks: it suffers from a severely skewed class distribution; the negative class is very diverse, since it attempts to model all kinds of healthy users; and, above all, the annotations are not completely certain, especially for the negative cases (i.e., healthy users). Motivated by these issues, in this paper we propose to address the detection of these disorders with a one-class classification (OCC) approach. In particular, we introduce two new instance-based OCC methods especially suited to handling the high diversity of content in social media documents. Drawing on the idea of gravitational attraction, these methods evaluate the relation between documents by their strengths, considering their distances as well as their masses (relevance) with respect to the target task. Experiments were conducted on depression and anorexia benchmark datasets. The results are encouraging: the overall performance was better than that of other standard OCC methods, and competitive with state-of-the-art results from binary classification approaches.
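
The gravitational analogy described in the abstract can be illustrated with a minimal sketch. The exact formulation used by the proposed methods is not reproduced in this section, so the code below simply assumes Newton's form (strength equals the product of the two masses divided by the squared distance), with cosine distance as the distance term. The function names, the toy vectors, the mass values, and the `eps` guard against division by zero are all hypothetical and chosen only for illustration.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v); returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def strength(u, mass_u, v, mass_v, eps=1e-9):
    """Assumed Newton-style attraction: product of masses over squared distance.

    eps is a hypothetical guard so identical documents do not divide by zero.
    """
    d = cosine_distance(u, v)
    return (mass_u * mass_v) / (d * d + eps)


def total_strength(query, query_mass, docs):
    """Sum of the strengths between a query document and every target-class document."""
    return sum(strength(query, query_mass, vec, mass) for vec, mass in docs)


# Toy target-class documents as (vector, mass) pairs; the values are made up.
target_docs = [([1.0, 0.0, 1.0], 0.9), ([0.8, 0.1, 0.9], 0.7)]

query_near = [0.9, 0.0, 1.0]  # close to the target documents
query_far = [0.0, 1.0, 0.0]   # almost orthogonal to them

near_score = total_strength(query_near, 1.0, target_docs)
far_score = total_strength(query_far, 1.0, target_docs)
print(near_score > far_score)
```

In a one-class setting, a score of this kind would be compared against a threshold estimated from the target class alone. How the masses and the threshold are actually obtained is specific to the paper and is not assumed here.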

Data Availability

The data used in the experiments were not collected by the authors of this manuscript. We kindly refer readers to the original owners to obtain them.

Notes

Groups who defend eating disorders as a lifestyle, often denoted as proana.
http://clpsych.org/
https://early.irlab.org/
This component of Formula (1) can be computed using other measures such as the Euclidean distance. Indeed, we carried out experiments with different distance measures, obtaining a better performance when the cosine distance was used.
We define a personal phrase as a sentence that contains a singular first-person pronoun.
For example, the words listed in: https://github.com/first20hours/google-10000-english/blob/master/google-10000-english.txt
The data can be obtained upon request. More information can be found at https://early.irlab.org/
https://my.clevelandclinic.org/health/articles/9285-depression-glossary-of-depression-related-terms
https://urbanthesaurus.org/synonyms/anorexia
https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html
The DB lexicon was discarded because it is computed from the training set, and its quality depends heavily on the number of training instances.
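
The definition of a personal phrase given in the notes above (a sentence that contains a singular first-person pronoun) can be sketched as follows. This is only an illustration: the pronoun list, the regular-expression sentence splitter, and the function names are assumptions, and the tokenizer is deliberately naive (for instance, it would miscount the abbreviation "i.e." as containing the pronoun "I").

```python
import re

# Assumed pronoun list; the note only says "singular first-person pronoun".
SINGULAR_FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def is_personal_phrase(sentence):
    """True if any lowercase alphabetic token is a singular first-person pronoun."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(tok in SINGULAR_FIRST_PERSON for tok in tokens)


def personal_phrases(text):
    """Return the sentences of `text` that qualify as personal phrases."""
    return [s for s in split_sentences(text) if is_personal_phrase(s)]


post = "I feel tired all the time. The weather was nice. Nobody called me today."
for phrase in personal_phrases(post):
    print(phrase)
```

A production pipeline would more likely rely on a proper sentence tokenizer; the regular expression is used here only to keep the sketch dependency-free.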

Funding

This research was partially supported by CONACYT: project grant FC-2016-2410, postdoctoral fellowship CVU-174410, and graduate scholarship CVU-814295.

Author information

Authors and Affiliations

Coordinación de Ciencias Computacionales, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Puebla, Mexico
Juan Aguilera & Manuel Montes-y-Gómez
División de Ciencias e Ingenierías Campus León, Universidad de Guanajuato, Guanajuato, Mexico
Delia Irazú Hernández Farías
Universidad Politécnica de Tulancingo (UPT), Hidalgo, Mexico
Rosa María Ortega-Mendoza

Corresponding author

Correspondence to Rosa María Ortega-Mendoza.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Code Availability

Most of the code used in the experimental phase was developed by the authors.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Aguilera, J., Farías, D.I.H., Ortega-Mendoza, R.M. et al. Depression and anorexia detection in social media as a one-class classification problem. Appl Intell 51, 6088–6103 (2021). https://doi.org/10.1007/s10489-020-02131-2

Accepted: 08 December 2020
Published: 29 January 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s10489-020-02131-2
