Fusing Multi-Level Features from Audio and Contextual Sentence Embedding from Text for Interview-Based Depression Detection | IEEE Conference Publication | IEEE Xplore

Fusing Multi-Level Features from Audio and Contextual Sentence Embedding from Text for Interview-Based Depression Detection


Abstract:

Automatic depression detection based on audio and text representations from participants’ interviews has attracted widespread attention. However, most previous studies used only one type of feature from a single modality for depression detection, so the rich audio and text information in interviews has not been fully utilized. Moreover, an effective multi-modal fusion approach that leverages the independence among audio and text representations is still lacking. To address these problems, we propose a multi-modal fusion depression detection model based on the interaction of multi-level audio features and text sentence embeddings. Specifically, we first extract Low-Level Descriptors (LLDs), mel-spectrogram features, and wav2vec features from the audio. We then design a Multi-level Audio Features Interaction Module (MAFIM) to fuse these three levels of features into a comprehensive audio representation. For interview text, we use pre-trained BERT to extract sentence-level embeddings. Further, to effectively fuse audio and text representations, we design a Channel Attention-based Multi-modal Fusion Module (CAMFM) that takes into account both the independence of and the correlation between the two modalities. Our proposed model outperforms existing methods on two datasets, DAIC-WOZ and EATD-Corpus, so it has high potential for practical interview-based depression detection.
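The abstract does not describe the internals of CAMFM, so the following is only an illustrative sketch of the general idea of channel attention applied to two modalities, not the authors' module. It assumes a squeeze-and-excitation style gate in which each modality embedding is treated as one channel. The function name `channel_attention_fuse`, the weight matrices `w1` and `w2`, and the requirement that both embeddings share one length are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(audio, text, w1, w2):
    """Fuse two equal-length embeddings with a squeeze-and-excitation style gate.

    Hypothetical illustration only: each modality is one channel.
    w1 is a (hidden x 2) matrix and w2 is a (2 x hidden) matrix; in a real
    model these would be learned, here they are passed in by the caller.
    Returns the concatenated, re-weighted embedding and the two gate values.
    """
    if len(audio) != len(text):
        raise ValueError("audio and text embeddings must have the same length")
    channels = [audio, text]

    # Squeeze: summarize each channel by its mean activation.
    squeezed = [sum(c) / len(c) for c in channels]

    # Excitation: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]

    # Re-weight each channel by its gate, then concatenate the channels.
    fused = [g * v for g, c in zip(gates, channels) for v in c]
    return fused, gates


# Toy usage: with all-zero excitation weights both gates equal sigmoid(0) = 0.5.
fused, gates = channel_attention_fuse(
    [1.0, 2.0, 3.0],           # stand-in audio embedding
    [0.5, 0.5, 0.5],           # stand-in text embedding
    [[1.0, 0.0], [0.0, 1.0]],  # w1 (assumed values)
    [[0.0, 0.0], [0.0, 0.0]],  # w2 (assumed values)
)
print(gates)
print(fused)
```

In this sketch the gates depend on both modalities jointly (correlation), while each modality's values are kept in their own slice of the output (independence). That is one plausible reading of the abstract's description, not a confirmed detail of the paper.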
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024
Conference Location: Seoul, Korea, Republic of
