Abstract
Although automated depression diagnosis has made great progress, most recent works have focused on combining multiple modalities rather than strengthening a single one. In this work, we present a unimodal framework for depression detection based on facial expression and facial motion analysis. We investigate a wide set of visual features extracted from different facial regions. Due to the high dimensionality of the obtained feature sets, identifying informative and discriminative features is a challenge. This paper proposes a hybrid dimensionality reduction approach that leverages the advantages of both filter and wrapper methods. First, we use a univariate filter method, the Fisher Discriminant Ratio, to initially reduce the size of each feature set. Subsequently, we propose an Incremental Linear Discriminant Analysis (ILDA) approach to find an optimal combination of complementary and relevant feature sets. We compare the performance of the proposed ILDA with batch-mode LDA and with the Composite Kernel based Support Vector Machine (CKSVM) method. Experiments conducted on the Distress Analysis Interview Corpus Wizard-of-Oz (DAIC-WOZ) dataset demonstrate that the best depression classification performance is obtained by using different feature extraction methods in combination rather than individually. ILDA generates better depression classification results than the CKSVM. Moreover, ILDA-based wrapper feature selection incurs lower computational cost than the CKSVM and batch-mode LDA methods. The proposed framework significantly improves depression classification performance, achieving an F1 score of 0.805, which exceeds all video-based depression detection models reported in the literature for the DAIC-WOZ dataset. Salient facial regions and well-performing visual feature extraction methods are also identified.
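The filter stage described above, the Fisher Discriminant Ratio (FDR), scores each feature by the squared difference of the class means divided by the sum of the class variances. The following is a minimal sketch on toy data; the function name, the toy dimensions, and the planted signal are illustrative choices of ours, not the paper's code:

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Score each column of X by (mu0 - mu1)^2 / (var0 + var1)
    for the two classes indicated by the binary labels y."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12  # guard against zero variance
    return num / den

# Toy high-dimensional feature set: 100 samples, 500 features,
# of which only the first 10 carry class information.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))
y = np.repeat([0, 1], 50)
X[y == 1, :10] += 2.0                      # plant a discriminative signal

scores = fisher_discriminant_ratio(X, y)
top_k = np.argsort(scores)[::-1][:10]      # keep the k highest-scoring features
```

Since the FDR is univariate, it is cheap on high-dimensional sets, which is why it serves as the initial filter before the ILDA-based wrapper stage.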
References
Ahonen T, Hadid A, Pietikainen M (2006) Face description with local binary patterns: Application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041
Al Jazaery M, Guo G (2018) Video-based depression level analysis by encoding deep spatiotemporal features. IEEE Trans Affect Comput 12(1):262–268
Alghowinem S et al (2016) Multimodal depression detection: fusion analysis of paralinguistic, head pose and eye gaze behaviors. IEEE Trans Affect Comput 9(4):478–490
American Psychiatric Association (2013) Diagnostic and statistical manual of mental disorders (DSM-5). American Psychiatric Association, Washington, DC
Baltrušaitis T, Robinson P, Morency L-P (2016) Openface: an open source facial behavior analysis toolkit. In: 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1–10
Beck AT, Steer RA, Brown GK (1996) Beck depression inventory-II. San Antonio 78(2):490–498
Bellantonio M et al (2016) Spatio-temporal pain recognition in cnn-based super-resolved facial images. In: Video Analytics. Face and Facial Expression Recognition and Audience Measurement. Springer, pp 151–162
Buyukdura JS, McClintock SM, Croarkin PE (2011) Psychomotor retardation in depression: biological underpinnings, measurement, and treatment. Prog Neuro-Psychopharmacol Biol Psychiatry 35(2):395–409
Castro E, Martínez-Ramón M, Pearlson G, Sui J, Calhoun VD (2011) Characterization of groups using composite kernels and multi-source fMRI analysis data: application to schizophrenia. Neuroimage 58(2):526–536
Chen J et al (2009) WLD: A robust local image descriptor. IEEE Trans Pattern Anal Mach Intell 32(9):1705–1720
Cohen I, Garg A, Huang TS (2000) Emotion recognition from facial expressions using multilevel HMM. In: Neural information processing systems, vol. 2
Cohn JF et al (2009) Detecting depression from facial actions and vocal prosody. pp. 1–7.
Cummins N, Joshi J, Dhall A, Sethu V, Goecke R, Epps J (2013) Diagnosis of depression by behavioural signals: a multimodal approach. pp. 11–20
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. vol. 1, pp. 886–893
de Melo WC, Granger E, Hadid A (2019) Combining global and local convolutional 3d networks for detecting depression from facial expressions. pp. 1–8
Who.int (2020) Depression. [online] Available at: https://www.who.int/news-room/fact-sheets/detail/depression. Accessed 18 June 2020
Dibeklioğlu H, Hammal Z, Yang Y, Cohn JF (2015) Multimodal detection of depression in clinical interviews. Proc ACM Int Conf Multimodal Interact 2015:307–310
Duda RO, Hart PE, Stork DG (2006) Pattern classification. John Wiley & Sons
Fukunaga K (2013) Introduction to statistical pattern recognition. Elsevier
Giannakakis G et al (2017) Stress and anxiety detection using facial cues from videos. Biomedical Signal Proc Control 31:89–101
Girard JM, Cohn JF, Mahoor MH, Mavadati S, Rosenwald DP (2013) Social risk and depression: Evidence from manual and automatic facial expression analysis. pp. 1–8
Gong Y, Poellabauer C (2017) Topic modeling based multi-modal depression detection. pp. 69–76
Gratch J et al (2014) The distress analysis interview corpus of human and computer interviews. pp. 3123–3128
Gupta R et al (2014) Multimodal prediction of affective dimensions and depression in human-computer interactions. pp. 33–40
Haque A, Guo M, Miner AS, Fei-Fei L (2018) Measuring depression symptom severity from spoken language and 3D facial expressions. arXiv preprint arXiv:1811.08592
Hawton K, Comabella CCI, Haw C, Saunders K (2013) Risk factors for suicide in individuals with depression: a systematic review. J Affect Disord 147(1–3):17–28
He S, Soraghan JJ, O’Reilly BF, Xing D (2009) Quantitative analysis of facial paralysis using local binary patterns in biomedical videos. IEEE Trans Biomed Eng 56(7):1864–1870
He L, Jiang D, Sahli H (2018) Automatic Depression Analysis using Dynamic Facial Appearance Descriptor and Dirichlet Process Fisher Encoding. IEEE Trans Multimedia 21:1476–1486
Hill D (1974) Non-verbal behaviour in mental illness. Br J Psychiatry 124(580):221–230
Jain V, Crowley JL, Dey AK, Lux A (2014) Depression estimation using audiovisual features and fisher vector encoding. pp. 87–91
James SL et al (2018) Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 392(10159):1789–1858
Jan A, Meng H, Gaus YFA, Zhang F, Turabzadeh S (2014) Automatic depression scale prediction using facial expression dynamics and regression. pp. 73–80
Jan A, Meng H, Gaus YFBA, Zhang F (2017) Artificial intelligent system for automatic depression level analysis through visual and vocal expressions. IEEE Trans Cogn Dev Syst 10(3):668–680
Joshi J et al (2013) Multimodal assistive technologies for depression diagnosis and monitoring. J Multimodal User Interfaces 7(3):217–228
Kroenke K, Spitzer RL, Williams JB (2001) The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 16(9):606–613
Manfredonia J et al (2019) Automatic recognition of posed facial expression of emotion in individuals with autism spectrum disorder. J Autism Dev Disord 49(1):279–293
Marsaglia G, Styan GP (1974) Rank conditions for generalized inverses of partitioned matrices. Sankhyā: The Indian Journal of Statistics, Series A:437–442
Mehrabian A, Russell JA (1974) An approach to environmental psychology. the MIT Press
Meng H, Pears N, Freeman M, Bailey C (2009) Motion history histograms for human action recognition. In: Embedded Computer Vision. Springer, pp 139–162. https://doi.org/10.1007/978-1-84800-304-0_7
Meng H, Huang D, Wang H, Yang H, Ai-Shuraifi M, Wang Y (2013) Depression recognition based on dynamic facial and vocal expression features using partial least square regression. pp. 21–30
Nasir M, Jati A, Shivakumar PG, Nallan Chakravarthula S, Georgiou P (2016) Multimodal and multiresolution depression detection from speech and facial landmark features. pp. 43–50
Neumann D, Langner T, Ulbrich F, Spitta D, Goehring D (2017) Online vehicle detection using Haar-like, LBP and HOG feature based image classifiers with stereo vision preselection. In: 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 773–778
Nhat HTM, Hoang VT (2019) Feature fusion by using LBP, HOG, GIST descriptors and Canonical Correlation Analysis for face recognition. In: 2019 26th international conference on telecommunications (ICT). pp. 371–375
Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Ojansivu V, Heikkilä J (2008) Blur insensitive texture classification using local phase quantization. pp. 236–243
Ouellette DV (1981) Schur complements and statistics. Linear Algebra Appl 36:187–295
Pampouchidou A et al (2016) Depression assessment by fusing high and low level features from audio, video, and text. pp. 27–34
Ringeval F et al (2017) Avec 2017: Real-life depression, and affect recognition workshop and challenge. pp. 3–9
Senoussaoui M, Sarria-Paja M, Santos JF, Falk TH (2014) Model fusion for multimodal depression classification and level detection. pp. 57–63
Shao L, Mattivi R (2010) Feature detector and descriptor evaluation in human action recognition. In: Proceedings of the ACM International Conference on Image and Video Retrieval. pp. 477–484
Song S, Shen L, Valstar M (2018) Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features. pp. 158–165
Stratou G, Scherer S, Gratch J, Morency L-P (2015) Automatic nonverbal behavior indicators of depression and ptsd: the effect of gender. J Multimodal User Interfaces 9(1):17–29
Sun B et al (2017) A random forest regression method with selected-text feature for depression assessment. pp. 61–68
Syed ZS, Sidorov K, Marshall D (2017) Depression severity prediction based on biomarkers of psychomotor retardation. pp. 37–43
Tan X, Triggs B (2010) Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans Image Process 19(6):1635–1650
Turan C, Lam K-M (2018) Histogram-based local descriptors for facial expression recognition (FER): A comprehensive study. J Vis Commun Image Represent 55:331–341
Valstar M et al (2013) AVEC 2013: the continuous audio/visual emotion and depression recognition challenge. pp. 3–10
Valstar M et al (2014) Avec 2014: 3d dimensional affect and depression recognition challenge. pp. 3–10
Valstar M et al (2016) AVEC 2016: Depression, Mood, and Emotion Recognition Workshop and Challenge. In: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge - AVEC ‘16, Amsterdam, The Netherlands, pp. 3–10. https://doi.org/10.1145/2988257.2988258.
Wang Y et al (2020) Automatic Depression Detection via Facial Expressions Using Multiple Instance Learning. pp. 1933–1936
Wen L, Li X, Guo G, Zhu Y (2015) Automated depression diagnosis based on facial dynamic analysis and sparse coding. IEEE Trans Inf Forensics Secur 10(7):1432–1441
Williams JB (1988) A structured interview guide for the Hamilton Depression Rating Scale. Arch Gen Psychiatry 45(8):742–747
Williamson JR, Quatieri TF, Helfer BS, Horwitz R, Yu B, Mehta DD (2013) Vocal biomarkers of depression based on motor incoordination. pp. 41–48
Williamson JR, Quatieri TF, Helfer BS, Ciccarelli G, Mehta DD (2014) Vocal and facial biomarkers of depression based on motor incoordination and timing. pp. 65–72
Williamson JR et al (2016) Detecting depression using vocal, facial and semantic communication cues. pp. 11–18
Yang M, Zhang L, Shiu SC-K, Zhang D (2012) Monogenic binary coding: An efficient local feature extraction approach to face recognition. IEEE Trans Inf Forensics Secur 7(6):1738–1751
Yang B-Q, Zhang T, Gu C-C, Wu K-J, Guan X-P (2016) A novel face recognition method based on IWLD and IWBC. Multimed Tools Appl 75(12):6979–7002
Yang L, Jiang D, He L, Pei E, Oveneke MC, Sahli H (2016) Decision tree based depression classification from audio video and language information. pp. 89–96
Zheng W, Yan L, Gou C, Wang F-Y (2020) Graph Attention Model Embedded With Multi-Modal Knowledge For Depression Detection. pp. 1–6
Zhou X, Jin K, Shang Y, Guo G (2018) Visually interpretable representation learning for depression recognition from facial images. IEEE Trans Affect Comput 11(3):542–552
Zhu Y, Shang Y, Shao Z, Guo G (2018) Automated depression diagnosis based on deep networks to encode facial appearance and dynamics. IEEE Trans Affect Comput 9(4):578–584
Data Availability (data transparency)
Not Applicable
Code availability (software application or custom code)
The authors do not wish to share the code at this stage.
Funding
Not Applicable
Ethics declarations
Conflicts of interest/Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1. Composite kernel support vector machine (CKSVM)
Castro et al. [9] handled the high dimensionality of fMRI data for schizophrenia detection by using composite kernels and recursive feature elimination. The non-linear relationships between the features (voxels) within a brain region were captured with the Gaussian kernel, which maps the features to an infinite-dimensional Hilbert space equipped with a kernel inner product. To capture the linear relationships between regions, the mapped voxels of the different regions were combined in the Hilbert space using a summation kernel; this linear combination of kernels is called the composite kernel. The parameters of a composite kernel based support vector machine (SVM) classifier give the relevance of entire regions rather than of individual features, and recursive feature elimination was then used to identify the regions that best distinguished patients from controls. Motivated by this work, we used CKSVM to rank the 19 feature sets and applied a forward selection approach to incrementally add the feature set most relevant to the task of depression detection.
Let v_{i,f} denote a feature vector from the f-th feature set, 1 ≤ f ≤ F, for the i-th sample, 1 ≤ i ≤ N. In our study, N = 107 for the training set, N = 35 for the test set, and F = 19. Using a non-linear transformation φ_f, the feature vectors of feature set f are mapped to a high-dimensional Hilbert space such that

⟨φ_f(v_{i,f}), φ_f(v_{j,f})⟩ = k_f(v_{i,f}, v_{j,f}),
where ⟨·,·⟩ denotes the inner product for a pair of feature vectors in the Hilbert space and k_f(·,·) is a Mercer kernel function. We used the Gaussian kernel to non-linearly transform each of the F feature sets. Corresponding to feature set f, a kernel matrix K_f is generated, whose component (i, j) is computed as

K_f(i, j) = exp(−‖v_{i,f} − v_{j,f}‖² / (2σ²)),
where σ is the Gaussian kernel parameter. The individually mapped feature sets can be concatenated into a single vector as

φ(v_i) = [φ_1(v_{i,1}), φ_2(v_{i,2}), …, φ_F(v_{i,F})].
The inner product for a pair of vectors v_i and v_j can then be given as

⟨φ(v_i), φ(v_j)⟩ = Σ_{f=1}^{F} k_f(v_{i,f}, v_{j,f}).
The above inner product is a composite kernel, expressed as the sum of the kernels over the F feature sets. Accordingly, the dual optimization problem of the conventional support vector machine can be modified as:

max_α  Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j Σ_{f=1}^{F} k_f(v_{i,f}, v_{j,f}),
subject to 0 ≤ α_i ≤ C and Σ_{i=1}^{N} α_i y_i = 0.
Similarly, the decision function of the SVM learning algorithm is modified as:

y(v) = sign( Σ_{i=1}^{N} α_i y_i Σ_{f=1}^{F} k_f(v_{i,f}, v_f) + b ),
where α_i and b are the classifier parameters. Using the composite kernel and the SVM parameters α, the relevance of a particular feature set can be computed as

‖w_f‖² = Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j k_f(v_{i,f}, v_{j,f}).
The more relevant a feature set f is, the higher the quadratic norm ‖w_f‖². Using the forward selection approach, an optimal combination of feature sets for depression detection is determined incrementally, based on ‖w_f‖² for each distinct combination of feature sets. The CKSVM method entails high time complexity due to the initial kernel computation and the tuning of the Gaussian kernel parameter σ.
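The composite kernel and the ‖w_f‖² relevance computation above can be sketched as follows, assuming scikit-learn's precomputed-kernel SVC. The two toy feature sets, the σ value, and the planted class signal are our own illustrative choices, not the paper's 19 feature sets:

```python
import numpy as np
from sklearn.svm import SVC

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Toy data: two "feature sets" describing the same N samples.
rng = np.random.default_rng(1)
N = 60
y = np.repeat([0, 1], N // 2)
X1 = rng.normal(size=(N, 5))
X1[y == 1] += 2.0                      # informative feature set
X2 = rng.normal(size=(N, 8))           # pure-noise feature set
sigma = 3.0

# Composite kernel: sum of the per-feature-set Gaussian kernels.
kernels = [gaussian_kernel(X, X, sigma) for X in (X1, X2)]
K = sum(kernels)

# Train an SVM directly on the precomputed composite kernel.
svm = SVC(kernel="precomputed").fit(K, y)

# Signed dual coefficients alpha_i * y_i, expanded to all N samples.
s = np.zeros(N)
s[svm.support_] = svm.dual_coef_.ravel()

# Relevance of each feature set: ||w_f||^2 = s^T K_f s.
relevance = [float(s @ Kf @ s) for Kf in kernels]
# The informative set (X1) should obtain the larger ||w_f||^2.
```

In a forward selection loop, this relevance score would be recomputed for each candidate combination of feature sets, which is exactly the repeated kernel computation that makes CKSVM costly compared to ILDA.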
About this article
Cite this article
Rathi, S., Kaur, B. & Agrawal, R. Selection of Relevant Visual Feature Sets for Enhanced Depression Detection using Incremental Linear Discriminant Analysis. Multimed Tools Appl 81, 17703–17727 (2022). https://doi.org/10.1007/s11042-022-12420-2