Abstract:
Multimodal Conversational Emotion Recognition (MMCER) aims to detect the multi-emotion label for each utterance from heterogeneous visual, textual, and audio modalities. In this paper, we focus on applying multi-modal graph data structures to conversational emotion recognition and use a novel and efficient graph structure, MMDCGs, to better integrate multi-modal contextual information into conversations. MMDCG provides a new way of encoding intrinsic structural connectivity. In addition, inspired by time-series analysis, we set a rolling time window as the receptive field, which reduces the interference of remote information on the detection of the current utterance and serves as a form of data augmentation. We innovatively combine such graph structures with Transformers, yielding rolling-window MMDCGs (RW-MMDCG). Comprehensive experiments on two representative multi-modal datasets, IEMOCAP and MELD, together with comparisons against existing baselines, demonstrate the advantages and effectiveness of RW-MMDCG.
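The rolling-window receptive field can be pictured as restricting which utterances are connected in the conversation graph. The sketch below is only an illustration of that idea, not the paper's implementation: the window size `w`, the function name, and the edge convention (linking each utterance to the `w` utterances preceding it) are all assumptions made for demonstration.

```python
# Minimal sketch of a rolling-window edge set for a conversation graph.
# Assumptions (not from the paper): window size `w`, directed edges from each
# of the previous `w` utterances to the current one, utterances indexed 0..n-1.

def rolling_window_edges(num_utterances: int, w: int):
    """Return (i, j) pairs linking utterance j to the w utterances before it."""
    edges = []
    for j in range(num_utterances):
        for i in range(max(0, j - w), j):
            edges.append((i, j))
    return edges

# Example: a 6-utterance dialogue with a window covering the 2 previous utterances.
print(rolling_window_edges(6, 2))
# [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)]
```

Limiting edges to a local window keeps remote utterances from directly influencing the current prediction, and sliding the window across the dialogue produces many overlapping sub-conversations, which is consistent with the data-augmentation effect described in the abstract.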
Published in: 2023 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD)
Date of Conference: 24-26 May 2023
Date Added to IEEE Xplore: 22 June 2023