Abstract
Artificial intelligence methods are widely applied to depression recognition and provide an objective means of assessment. Many effective automated methods for detecting depression use facial expressions, which are strong indicators of psychiatric disorders. However, existing approaches ignore the uneven distribution of depression-related information over time and space, which limits their ability to form discriminative depression representations. In this paper, we propose a framework for depression detection based on informative regions and clips. Specifically, we first define regions of interest (ROIs), which are regarded as spatially informative regions, according to pathological knowledge of depression. Next, the local-MHHLBP-BiLSTM (LMB) module is proposed as a feature extractor to exploit short-term and long-term temporal information. Finally, an improved attention mechanism with a balancing factor is introduced into LMB to increase attention to informative segments. The proposed model is evaluated with tenfold cross-validation on our 150-subject video dataset and outperforms most state-of-the-art approaches, with accuracy = 0.757, precision = 0.767, recall = 0.786, and F1 score = 0.761. These results demonstrate that focusing on informative regions and clips can effectively reduce the error in depression diagnosis. More importantly, we observe that the area near the eye is fairly informative and that depressed individuals blink more frequently.
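The evaluation protocol named in the abstract (tenfold cross-validation over 150 subjects, reported as accuracy, precision, recall, and F1 score) can be illustrated with a minimal sketch. The function names, the interleaved subject-level split, and the per-fold averaging below are assumptions made for illustration; the abstract does not state how folds were formed or how the metrics were aggregated.

```python
from statistics import mean


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = depressed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def tenfold_indices(n_subjects, n_folds=10):
    """Yield (train, test) index lists; each subject is tested exactly once.

    Assumption: a plain interleaved split at the subject level, so that all
    clips from one subject stay in the same fold.
    """
    indices = list(range(n_subjects))
    for k in range(n_folds):
        test = indices[k::n_folds]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def average_over_folds(per_fold):
    """Average each metric across folds (one possible aggregation)."""
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# Usage: 150 subjects give ten test folds of 15 subjects each.
folds = list(tenfold_indices(150))
example = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

When metrics are averaged per fold in this way, the mean F1 score need not equal the harmonic mean of the mean precision and mean recall. Whether the reported figures were aggregated per fold is not stated in the abstract.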
Data Availability
The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Funding
This work was supported in part by the National Key Research and Development Program of China (Grant No. 2019YFA0706200), the National Natural Science Foundation of China (Grant Nos. 61632014, 61627808, 61802159, and 61802158), and the Fundamental Research Funds for the Central Universities (lzujbky-2019-26, lzujbky-2021-kb26).
Ethics declarations
Ethical Approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Conflict of Interest
The authors declare no competing interests.
About this article
Cite this article
Yuan, X., Liu, Z., Chen, Q. et al. Combining Informative Regions and Clips for Detecting Depression from Facial Expressions. Cogn Comput 15, 1961–1972 (2023). https://doi.org/10.1007/s12559-023-10157-0