Data Set Creation and Empirical Analysis for Detecting Signs of Depression from Social Media Postings

Sampath, Kayalvizhi; Durairaj, Thenmozhi

doi:10.1007/978-3-031-16364-7_11

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 654)

Included in the following conference series:

International Conference on Computational Intelligence in Data Science

Abstract

Depression is a common mental illness that has to be detected and treated at an early stage to avoid serious consequences. Many methods and modalities for detecting depression involve a physical examination of the individual. Diagnosing mental health from a person's social media data is more effective, however, because it avoids such examinations. Because people also express their emotions readily on social media, it is desirable to diagnose their mental health from their postings. Though many existing systems detect a person's mental illness by analysing their social media data, detecting the level of depression is also important for further treatment. Thus, in this research, we developed a gold standard data set for detecting the level of depression, with postings labelled as ‘not depressed’, ‘moderately depressed’ or ‘severely depressed’. Traditional learning algorithms were employed on this data set, and an empirical analysis is presented in this paper. A data augmentation technique was applied to overcome the class imbalance in the data. Among the variations implemented, the model with a Word2Vec vectorizer and a Random Forest classifier trained on the augmented data outperformed the others, with a score of 0.877 for both accuracy and F1 measure.
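
The balancing and scoring steps mentioned in the abstract can be illustrated with a minimal, standard-library sketch. It rests on several assumptions: the abstract does not name the augmentation technique, so plain random oversampling stands in for it here; the F1 averaging scheme is not stated, so macro averaging is assumed; and the function names `oversample`, `accuracy` and `macro_f1` are hypothetical, not taken from the paper.

```python
import random

# The three labels named in the abstract.
LABELS = ("not depressed", "moderately depressed", "severely depressed")


def oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class items until every class
    is as large as the largest one (a stand-in for the paper's
    unspecified augmentation technique)."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy predictions, invented purely for illustration.
    truth = [LABELS[0], LABELS[0], LABELS[1], LABELS[1], LABELS[2], LABELS[2]]
    preds = [LABELS[0], LABELS[1], LABELS[1], LABELS[1], LABELS[2], LABELS[0]]
    print("accuracy:", accuracy(truth, preds))
    print("macro F1:", macro_f1(truth, preds))
```

In a real pipeline, oversampling would be applied to the training split only, so that duplicated postings do not leak into the evaluation data.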

Acknowledgements

We would like to thank the Department of Science and Technology - Science and Engineering Research Board (DST-SERB) for providing funds to annotate the collected data.

Author information

Authors and Affiliations

Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
Kayalvizhi Sampath & Thenmozhi Durairaj

Corresponding author

Correspondence to Kayalvizhi Sampath.

Editor information

Editors and Affiliations

Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
Lekshmi Kalinathan, Priyadharsini R., Madheswari Kanmani & Manisha S.

About this paper

Cite this paper

Sampath, K., Durairaj, T. (2022). Data Set Creation and Empirical Analysis for Detecting Signs of Depression from Social Media Postings. In: Kalinathan, L., R., P., Kanmani, M., S., M. (eds) Computational Intelligence in Data Science. ICCIDS 2022. IFIP Advances in Information and Communication Technology, vol 654. Springer, Cham. https://doi.org/10.1007/978-3-031-16364-7_11

DOI: https://doi.org/10.1007/978-3-031-16364-7_11
Published: 29 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16363-0
Online ISBN: 978-3-031-16364-7
eBook Packages: Computer Science, Computer Science (R0)
