Research on depression detection algorithm combine acoustic rhythm with sparse face recognition

Zhao, Jian; Su, Weiwen; Jia, Jian; Zhang, Chao; Lu, Tingting

doi:10.1007/s10586-017-1469-0

Published: 09 December 2017

Volume 22, pages 7873–7884 (2019)

Cluster Computing

Jian Zhao¹,
Weiwen Su¹,
Jian Jia²,
Chao Zhang¹ &
…
Tingting Lu¹

Abstract

Because traditional methods of diagnosing depression suffer from false positives, this paper proposes a multi-modal fusion algorithm for depression diagnosis based on speech signals and facial image sequences. Spectral subtraction is introduced to enhance the depressed speech signal, and the cepstrum method is used to extract pitch frequency features, which show a large variation rate, and formant features, which show significant differences. The short-time energy and Mel-frequency cepstral coefficient (MFCC) parameters of speech with different emotions are analyzed in both the time domain and the frequency domain, and a model is established for training and identification. Meanwhile, the orthogonal matching pursuit (OMP) algorithm is implemented to obtain a sparse linear combination of face test samples, and the result is cascaded with the speech-based result according to the proportions of voice and facial emotions. The experimental results show that the depression detection algorithm fusing speech and facial emotions reaches a recognition rate of 81.14%. Compared with the existing doctors' accuracy rate of 47.3%, combining diagnosis with the proposed method yields a relative improvement of 71.54%. Additionally, the method can be applied at low cost to the hardware and software of existing hospital instruments. It is therefore an accurate and effective method for diagnosing depression.
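As an illustration of the orthogonal matching pursuit step named in the abstract, the following is a minimal sketch in pure Python, not the authors' implementation. It assumes the dictionary atoms (for example, vectorized training face samples) are unit-norm lists of floats; the function names `dot`, `solve_least_squares`, and `omp`, the parameters `max_atoms` and `tol`, and the toy data are all hypothetical choices made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve_least_squares(columns, y):
    """Least-squares coefficients for y ~ sum_j x_j * columns[j].

    Solves the normal equations with Gauss-Jordan elimination.
    Assumes the selected columns are linearly independent.
    """
    k = len(columns)
    # Augmented matrix [G | b], where G is the Gram matrix of the columns.
    aug = [
        [dot(columns[i], columns[j]) for j in range(k)] + [dot(columns[i], y)]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def omp(atoms, y, max_atoms, tol=1e-9):
    """Greedy sparse approximation of y; returns {atom index: coefficient}."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(max_atoms):
        # Select the unused atom most correlated with the current residual.
        scores = [
            -1.0 if i in support else abs(dot(a, residual))
            for i, a in enumerate(atoms)
        ]
        best = max(range(len(atoms)), key=lambda i: scores[i])
        if scores[best] <= tol:
            break
        support.append(best)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        chosen = [atoms[i] for i in support]
        coeffs = solve_least_squares(chosen, y)
        approx = [
            sum(c * a[t] for c, a in zip(coeffs, chosen)) for t in range(len(y))
        ]
        residual = [yi - ai for yi, ai in zip(y, approx)]
        if dot(residual, residual) ** 0.5 <= tol:
            break
    return dict(zip(support, coeffs))


# Toy dictionary of four unit-norm atoms in three dimensions.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
y = [2.0, 0.0, -1.0]  # equals 2 * atoms[0] - 1 * atoms[2]
print(omp(atoms, y, max_atoms=2))  # {0: 2.0, 2: -1.0}
```

In a sparse-representation classifier, the recovered coefficients would typically be grouped by class and the class with the smallest reconstruction residual chosen; how the paper performs that step is not stated in the abstract.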

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61379010).

Author information

Authors and Affiliations

School of Information Science and Technology, Northwest University, Xi’an, 710127, China
Jian Zhao, Weiwen Su, Chao Zhang & Tingting Lu
School of Mathematics, Northwest University, Xi’an, 710127, China
Jian Jia

Corresponding author

Correspondence to Jian Jia.

About this article

Cite this article

Zhao, J., Su, W., Jia, J. et al. Research on depression detection algorithm combine acoustic rhythm with sparse face recognition. Cluster Comput 22 (Suppl 4), 7873–7884 (2019). https://doi.org/10.1007/s10586-017-1469-0

Received: 02 November 2017
Revised: 22 November 2017
Accepted: 30 November 2017
Published: 09 December 2017
Issue Date: July 2019
DOI: https://doi.org/10.1007/s10586-017-1469-0
