
Dense Fusion Network with Multimodal Residual for Sentiment Classification


Abstract:

In this paper, we propose a deep dense fusion network with multimodal residual (DFMR) that integrates multimodal information, including language, acoustic, and visual signals, for sentiment analysis. DFMR exploits a dense fusion (DF) block to fuse the multimodal features obtained by modality-specific sequence networks, jointly modelling their unimodal, bimodal, and trimodal interactions. Instead of concatenating the multimodal features directly, the DF block first fuses each pair of modalities, and the fused information is subsequently integrated with the remaining modality. Furthermore, DFMR stacks multiple DF blocks to capture high-level semantic information conveyed by the multimodal representations. In particular, DFMR adopts a multimodal residual (MR) block to integrate the modality-specific features and fused features in each DF block, avoiding loss of the multi-aspect information and alleviating gradient vanishing during stacking. Extensive experiments on four public benchmark datasets show that DFMR outperforms eleven state-of-the-art baselines.
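The pairwise-then-trimodal fusion and the residual integration described above can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the authors' implementation: the fusion layers are stand-in random linear maps with a tanh nonlinearity, the shared feature dimension `d`, the averaging of trimodal paths, and the additive residual are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # shared feature dimension (illustrative choice)

# Hypothetical fusion weights; in the real model these would be learned.
w_pair = rng.standard_normal((2 * d, d)) * 0.1
w_tri = rng.standard_normal((2 * d, d)) * 0.1

def fuse(x, y, w):
    """Stand-in for a learned fusion layer: concatenate, project, squash."""
    return np.tanh(np.concatenate([x, y]) @ w)

def df_block(l, a, v):
    """One dense fusion (DF) block, per the abstract's description:
    fuse each pair of modalities first, then integrate each pairwise
    result with the remaining modality, and add a multimodal residual
    (MR) so the modality-specific features are not forgotten."""
    la = fuse(l, a, w_pair)  # language-acoustic
    lv = fuse(l, v, w_pair)  # language-visual
    av = fuse(a, v, w_pair)  # acoustic-visual
    # Trimodal interactions: each pair fused with the remaining modality.
    fused = (fuse(la, v, w_tri) + fuse(lv, a, w_tri) + fuse(av, l, w_tri)) / 3.0
    # MR block (sketched here as a simple additive residual): keep the
    # modality-specific features so stacking does not wash them out.
    return l + fused, a + fused, v + fused

# Modality-specific features from per-modality sequence encoders.
h_l, h_a, h_v = (rng.standard_normal(d) for _ in range(3))

# Stack multiple DF blocks, as DFMR does.
for _ in range(3):
    h_l, h_a, h_v = df_block(h_l, h_a, h_v)

print(h_l.shape)  # (8,)
```

The residual sum keeps each output in the same `d`-dimensional space as its input, which is what makes the blocks stackable.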
Date of Conference: 05-09 July 2021
Date Added to IEEE Xplore: 09 June 2021
Conference Location: Shenzhen, China

