ABSTRACT
Depression is a prevalent psychiatric condition that must be identified and treated promptly; in severe cases it can lead to suicidal ideation. The need for an effective audio-based automated depression detection system has recently attracted considerable research interest. Most studies to date rely on a broad range of expertly hand-crafted audio features for depression diagnosis. This enlarges the feature space and creates a high-dimensionality problem, which complicates pattern recognition and increases the risk of data imbalance. This paper proposes a deep learning autoencoder-based method to extract compact, relevant features from speech signals in order to diagnose depression more precisely. The performance and efficacy of the proposed approach are evaluated on the DAIC-WoZ dataset, and the results are compared with those of other notable machine learning algorithms. The findings show that, when paired with an SVM classifier, this technique outperforms existing audio-based depression detection models, achieving an accuracy of 97% in diagnosing depression.
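To illustrate the core idea of the abstract, the sketch below trains a minimal linear autoencoder that compresses a high-dimensional acoustic feature vector into a small bottleneck code, which would then feed a downstream classifier such as an SVM. This is a hypothetical toy example, not the paper's implementation: the input dimensionality (20 features per frame), bottleneck size (4), learning rate, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: a one-hidden-layer linear autoencoder compressing
# 20 hand-crafted audio descriptors per frame into a 4-dimensional code.
# All sizes and hyperparameters are illustrative, not from the paper.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))          # 200 frames x 20 audio features (synthetic)
d_in, d_code = X.shape[1], 4

W_enc = rng.normal(scale=0.1, size=(d_in, d_code))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(d_code, d_in))   # decoder weights
lr = 0.01

def mse(A, B):
    """Mean squared reconstruction error."""
    return float(np.mean((A - B) ** 2))

err_before = mse(X, X @ W_enc @ W_dec)
for _ in range(500):
    Z = X @ W_enc                        # encode: compact bottleneck features
    X_hat = Z @ W_dec                    # decode: reconstruct the input
    G = 2.0 * (X_hat - X) / X.shape[0]   # gradient of the loss w.r.t. X_hat
    W_dec -= lr * (Z.T @ G)              # backprop through the decoder
    W_enc -= lr * (X.T @ (G @ W_dec.T))  # backprop through the encoder
err_after = mse(X, X @ W_enc @ W_dec)

# The 4-dimensional codes replace the raw high-dimensional features
# as input to a downstream depression classifier (e.g., an SVM).
codes = X @ W_enc
```

In the paper's pipeline the autoencoder plays the role of a learned dimensionality reducer, so the classifier sees a condensed representation rather than the full hand-crafted feature set, which is what mitigates the high-dimensionality problem described above.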
Index Terms
- Deep Learning Technique to Diagnose Depression in Audio