Abstract:
Multimodal Sentiment Analysis (MSA) is an essential task in multimodal learning that aims to combine multimodal signals to perceive human psychological states. Existing MSA methods focus on developing various fusion strategies over multimodal signals while hardly considering the reliability of each modality. However, the information carried by the input signals is not always trustworthy, since the uncertainty within each modality is not well quantified. To tackle this problem, we propose a novel MSA method, termed Uncertainty-Debiased Multi-modal Fusion (UDMF), which leverages uncertainty estimation and information bottleneck theory to learn an improved fusion of the vision, audio, and text modalities. Specifically, we quantify the uncertainty of each modality by placing evidence priors over its predictions and dynamically adjust the modality contributions accordingly. Moreover, information bottleneck theory is introduced to slim down the fused representation and learn a deterministic, minimal sufficient representation. Extensive experiments on the CMU-MOSI and CMU-MOSEI datasets show that our method outperforms recent state-of-the-art methods.
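The abstract does not specify how the evidence-based weighting is realized; the following is a minimal sketch of one common reading, in which a Dirichlet evidence head per modality yields a subjective-logic uncertainty mass u = K / S that scales that modality's contribution before fusion. All names (EvidenceWeightedFusion, dims) and the weighting scheme are hypothetical illustrations, not the authors' UDMF implementation.

```python
# Hypothetical sketch of evidence-based modality weighting (assumed, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class EvidenceWeightedFusion(nn.Module):
    """Scales each modality by the confidence of a Dirichlet evidence head.

    In subjective logic, a K-class head emits non-negative evidence e; the
    Dirichlet strength is S = sum(e) + K, the uncertainty mass is u = K / S,
    and confidence = 1 - u.
    """

    def __init__(self, dims, num_classes):
        super().__init__()
        # One evidence head per modality (e.g., vision, audio, text).
        self.heads = nn.ModuleList(nn.Linear(d, num_classes) for d in dims)
        self.num_classes = num_classes

    def forward(self, feats):
        weighted, confs = [], []
        for x, head in zip(feats, self.heads):
            evidence = F.softplus(head(x))                       # e >= 0
            strength = evidence.sum(-1, keepdim=True) + self.num_classes
            u = self.num_classes / strength                      # uncertainty in (0, 1]
            confs.append(1.0 - u)
            weighted.append((1.0 - u) * x)                       # down-weight uncertain modalities
        # Confidence-scaled features, ready for a downstream fusion network.
        return torch.cat(weighted, dim=-1), torch.cat(confs, dim=-1)


if __name__ == "__main__":
    fusion = EvidenceWeightedFusion(dims=[32, 16, 64], num_classes=3)
    v, a, t = torch.randn(8, 32), torch.randn(8, 16), torch.randn(8, 64)
    fused, conf = fusion([v, a, t])
    print(fused.shape, conf.shape)  # torch.Size([8, 112]) torch.Size([8, 3])
```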
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024