Abstract:
Multimodal learning has shown a clear advantage in emotion recognition tasks because different modalities provide richer information. However, multimodal models tend to rely on the modalities that are easier to learn while under-fitting the others, which leads to sub-optimal results. To address this problem, we propose a novel plug-in module, Adaptive Mask Co-optimization (AMCo), which can be inserted into advanced models. The adaptive mask encourages the model to fit the other modalities better by making the dependent modalities (those the model relies on) harder to learn. The co-optimization preserves the model's performance on the dependent modalities. Extensive experiments on the IEMOCAP dataset show that AMCo improves four state-of-the-art models by 1.14% to 3.03% in accuracy.
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023