Rhythm Formant Analysis for Automatic Depression Classification

  • Conference paper
Speech and Computer (SPECOM 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14338)

Abstract

This paper studies the application of Rhythm Formant Analysis (RFA) to automatic depression classification from speech signals. The study uses the EATD-corpus, a dataset designed for studying depression in speech, and aims to build a classification system that distinguishes depressed from non-depressed speech on the basis of Rhythm Formant (RF) features. RFs are extracted from the speech signals using signal processing techniques. Two kinds of RFs, Amplitude Modulation (AM) RFs and Frequency Modulation (FM) RFs, are analyzed individually and in combination and used as classification features; they carry information about the temporal and spectral characteristics of the speech. The classification system is built on a Decision Tree (DT) classifier, and its results are compared with those of logistic regression and random forest classifiers. Performance is evaluated using accuracy, per-class F1 scores, and the macro and weighted averages of those F1 scores. With FM RFs as features, the DT classifier achieves an accuracy of 70%, a weighted average F1 score of 0.73, and a macro average F1 score of 0.53, outperforming the other feature sets and classifiers tested. These results indicate that the proposed approach captures features of speech that discriminate depressed from non-depressed speakers, and they suggest that RFs could serve as a useful tool for building automatic depression classification systems.
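The abstract reports accuracy together with macro and weighted average F1 scores. The following is a minimal sketch of how these metrics relate, not the authors' evaluation code. The labels `y_true` and `y_pred` are hypothetical toy data (0 = non-depressed, 1 = depressed), and the helper names `per_class_f1` and `f1_report` are illustrative assumptions. It shows why, on an imbalanced test set, the weighted average F1 can sit well above the macro average F1.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_report(y_true, y_pred):
    """Accuracy, per-class F1, and macro and weighted F1 averages."""
    support = Counter(y_true)  # number of true samples per class
    labels = sorted(support)
    f1 = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    # Macro average: every class counts equally, regardless of its size.
    macro = sum(f1.values()) / len(labels)
    # Weighted average: each class counts in proportion to its support.
    weighted = sum(f1[c] * support[c] for c in labels) / len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "per_class": f1,
        "macro_f1": macro,
        "weighted_f1": weighted,
    }


# Hypothetical, imbalanced toy data: 8 non-depressed (0), 2 depressed (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

report = f1_report(y_true, y_pred)
print(report)
```

In this toy case the majority class dominates the weighted average, while the macro average is pulled down by the weaker minority-class F1. A gap of this kind between the two averages usually indicates that the minority class is classified less well than the majority class.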

Author information

Corresponding author

Correspondence to Kumar Kaustubh.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kaustubh, K., Gogoi, P., Prasanna, S. (2023). Rhythm Formant Analysis for Automatic Depression Classification. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14338. Springer, Cham. https://doi.org/10.1007/978-3-031-48309-7_8

  • DOI: https://doi.org/10.1007/978-3-031-48309-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48308-0

  • Online ISBN: 978-3-031-48309-7

  • eBook Packages: Computer Science, Computer Science (R0)
