DOI: 10.1145/3382507.3418821

MORSE: MultimOdal sentiment analysis for Real-life SEttings

Published: 22 October 2020

Abstract

Multimodal sentiment analysis aims to detect and classify sentiment expressed in multimodal data. Research to date has focused on datasets with a large number of training samples, manual transcriptions, and nearly balanced sentiment labels. However, data collection in real settings often leads to small datasets with noisy transcriptions and imbalanced label distributions, making them significantly more challenging than data collected in controlled settings. In this work, we introduce MORSE, a domain-specific dataset for MultimOdal sentiment analysis in Real-life SEttings. The dataset consists of 2,787 video clips extracted from 49 interviews with panelists in a product usage study, with each clip annotated for positive, negative, or neutral sentiment. The characteristics of MORSE include noisy transcriptions from raw videos, a naturally imbalanced label distribution, and scarcity of minority labels. To address the challenging real-life settings in MORSE, we propose a novel two-step fine-tuning method for multimodal sentiment classification using transfer learning and the Transformer model architecture: our method starts with a pre-trained language model and one step of fine-tuning on the language modality alone, followed by a second step of joint fine-tuning that incorporates the visual and audio modalities. Experimental results show that while MORSE is challenging for various baseline models such as SVM and Transformer, our two-step fine-tuning method is able to capture the dataset characteristics and effectively address these challenges. Our method also outperforms related work that uses both single and multiple modalities under the same transfer learning settings.
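
To make the two-step method concrete, below is a minimal PyTorch sketch of how such a pipeline could be wired up. It is an illustrative reconstruction under stated assumptions, not the authors' implementation: the choice of BERT as the pre-trained language model, the late-fusion design, the placeholder feature dimensions, and the class-weighted loss for label imbalance are all assumptions introduced here.

```python
# Illustrative sketch of two-step fine-tuning for 3-class multimodal
# sentiment classification. Module names, feature dimensions, and the
# late-fusion design are assumptions, not the MORSE authors' exact method.
import torch
import torch.nn as nn
from transformers import BertModel


class TextSentimentClassifier(nn.Module):
    """Step 1: a pre-trained language model with a classification head,
    fine-tuned on the (noisy) transcripts alone."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.head = nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.pooler_output)


class MultimodalSentimentClassifier(nn.Module):
    """Step 2: warm-start the text encoder from step 1, then jointly
    fine-tune it together with linear projections of per-clip visual and
    audio feature vectors (dimensions here are placeholders)."""

    def __init__(self, step1_model: TextSentimentClassifier,
                 visual_dim: int = 35, audio_dim: int = 74,
                 num_classes: int = 3):
        super().__init__()
        self.bert = step1_model.bert  # reuse the fine-tuned encoder
        hidden = self.bert.config.hidden_size
        self.visual_proj = nn.Linear(visual_dim, hidden)
        self.audio_proj = nn.Linear(audio_dim, hidden)
        self.classifier = nn.Linear(3 * hidden, num_classes)

    def forward(self, input_ids, attention_mask, visual_feats, audio_feats):
        text = self.bert(input_ids=input_ids,
                         attention_mask=attention_mask).pooler_output
        fused = torch.cat([text,
                           self.visual_proj(visual_feats),
                           self.audio_proj(audio_feats)], dim=-1)
        return self.classifier(fused)


# With imbalanced labels, a class-weighted loss is one simple mitigation;
# inverse-frequency weights are a common heuristic, not a claim about what
# the paper uses. The class counts below are hypothetical placeholders.
label_counts = torch.tensor([300.0, 2000.0, 300.0])  # hypothetical counts
weights = label_counts.sum() / (len(label_counts) * label_counts)
loss_fn = nn.CrossEntropyLoss(weight=weights)
```

In use, one would first train TextSentimentClassifier on the transcripts, then build MultimodalSentimentClassifier from it and continue training on all three modalities, typically with a smaller learning rate so the warm-started language encoder is not overwritten.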

Supplementary Material

MP4 File (3382507.3418821.mp4)
Presentation for our paper "MORSE: MultimOdal sentiment analysis for Real-life SEttings", covering an introduction to the task, the definition of real-life settings, details of the dataset collection, a novel model based on two-step fine-tuning with the Transformer neural network, and experimental results.


Cited By

  • (2024) Revolutionizing Urdu Sentiment Analysis: Harnessing the Power of XLM-R and GPT-2. IEEE Access 12, 99779–99793. DOI: 10.1109/ACCESS.2024.3429496
  • (2023) Effects of Physiological Signals in Different Types of Multimodal Sentiment Estimation. IEEE Transactions on Affective Computing 14(3), 2443–2457. DOI: 10.1109/TAFFC.2022.3155604
  • (2022) An Emotionally Responsive Virtual Parent for Pediatric Nursing Education: A Framework for Multimodal Momentary and Accumulated Interventions. 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 365–374. DOI: 10.1109/ISMAR55827.2022.00052
  • (2021) Urdu Sentiment Analysis via Multimodal Data Mining Based on Deep Learning Algorithms. IEEE Access 9, 153072–153082. DOI: 10.1109/ACCESS.2021.3122025


Published In

ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction
October 2020, 920 pages
ISBN: 9781450375818
DOI: 10.1145/3382507

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. dataset
  2. imbalanced learning
  3. multimodal sentiment analysis
  4. transfer learning
  5. transformer

Qualifiers

  • Research-article

Conference

ICMI '20: International Conference on Multimodal Interaction
October 25–29, 2020
Virtual Event, Netherlands

Acceptance Rates

Overall Acceptance Rate: 453 of 1,080 submissions, 42%
