
Automated facial video-based recognition of depression and anxiety symptom severity: cross-corpus validation

  • Original Paper, published in Machine Vision and Applications

Abstract

There is growing interest in computational approaches that permit accurate detection of nonverbal signs of depression and related symptoms (i.e., anxiety and distress), which may serve as minimally intrusive means of monitoring illness progression. The aim of the present work was to develop a methodology for detecting such signs and to evaluate its generalizability and clinical specificity. Our approach focused on dynamic descriptors of facial expressions, combining motion history images with appearance-based feature extraction algorithms (local binary patterns, histograms of oriented gradients) and with visual geometry group (VGG) features derived from deep networks through transfer learning. The relative performance of the alternative feature description and extraction techniques was first evaluated on a novel dataset comprising patients with a clinical diagnosis of depression (\(n=20\)) and healthy volunteers (\(n=45\)). Among the various schemes involving depression measures as outcomes, the best performance was obtained for continuous assessment of depression severity (as opposed to binary classification of patients versus healthy volunteers). Comparable performance was achieved on a benchmark dataset, the Audio/Visual Emotion Challenge (AVEC’14) corpus. Regarding clinical specificity, results indicated that the proposed methodology was more accurate in detecting visual signs associated with self-reported anxiety symptoms. Findings are discussed in relation to clinical and technical limitations and future improvements.
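The first two stages of the pipeline described in the abstract — a motion history image summarizing facial dynamics over a clip, followed by an appearance descriptor such as a local binary pattern histogram — can be sketched as follows. This is an illustrative NumPy-only reimplementation, not the authors' code: the toy frame sequence and the threshold and decay parameters are invented for the example, and the basic (non-uniform, single-radius) LBP variant is used for brevity.

```python
import numpy as np

def motion_history_image(frames, tau=10, thresh=25):
    """Basic motion history image (after Bobick & Davis): pixels that moved
    in the most recent frame pair are set to tau; older motion decays
    linearly toward zero. `frames` is a list of 2-D uint8 arrays."""
    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        moved = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > thresh
        mhi = np.where(moved, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi

def lbp_histogram(image):
    """Plain 8-neighbour local binary patterns on a 2-D array, returning a
    normalised 256-bin histogram usable as an appearance feature vector."""
    img = image.astype(np.float32)
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:img.shape[0] - 1 + dy,
                    1 + dx:img.shape[1] - 1 + dx]
        codes |= (neigh >= center).astype(np.uint8) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

# Toy sequence standing in for a face video: a bright square
# drifting one pixel per frame.
frames = []
for t in range(5):
    f = np.zeros((32, 32), dtype=np.uint8)
    f[8 + t:16 + t, 8:16] = 200
    frames.append(f)

mhi = motion_history_image(frames)   # one static image encoding the motion
feat = lbp_histogram(mhi)            # 256-dim texture descriptor of the MHI
print(mhi.max(), feat.shape)
```

In the full method, descriptors of this kind (LBP, HOG, or VGG activations computed on the MHI) would then be fed to a regressor, e.g. support vector regression, to predict a continuous symptom-severity score.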





Acknowledgements

Funding was provided by State Scholarships Foundation (Grant No. Legacy fund in the memory of Maria Zaousi).

Author information

Corresponding author: A. Pampouchidou.


Electronic supplementary material

Supplementary material 1 (PDF 597 KB)


Cite this article

Pampouchidou, A., Pediaditis, M., Kazantzaki, E. et al. Automated facial video-based recognition of depression and anxiety symptom severity: cross-corpus validation. Machine Vision and Applications 31, 30 (2020). https://doi.org/10.1007/s00138-020-01080-7

