Abstract
We propose a deep-learning-based framework for multimodal sentiment analysis and emotion recognition. In particular, we leverage the power of convolutional neural networks to obtain a performance improvement of 10% over the state of the art by combining visual, textual, and audio features. We also discuss some major issues frequently ignored in multimodal sentiment analysis research, e.g., the role of speaker-independent models, the importance of different modalities, and generalizability. The framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field.
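As a rough illustration of the feature-level fusion the abstract alludes to, the sketch below concatenates per-utterance feature vectors from the three modalities into a single representation. The dimensions and feature sources here are placeholders for illustration, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-utterance feature vectors (dimensions are illustrative):
text_feat   = rng.standard_normal(300)   # e.g. CNN-derived textual features
audio_feat  = rng.standard_normal(128)   # e.g. openSMILE acoustic features
visual_feat = rng.standard_normal(64)    # e.g. facial expression features

# Feature-level fusion: concatenate the modality vectors into one
# joint representation, which is then fed to a classifier.
fused = np.concatenate([text_feat, audio_feat, visual_feat])
print(fused.shape)  # (492,)
```

A downstream classifier (e.g. an SVM or a fully connected layer) would then map `fused` to a sentiment or emotion label; decision-level fusion, by contrast, would classify each modality separately and combine the predictions.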
Notes
1. We have reimplemented the method by Poria et al. [5].
2. http://sentic.net/demo (best viewed in Mozilla Firefox).
References
Poria, S., Cambria, E., Bajpai, R., Hussain, A.: A review of affective computing: from unimodal analysis to multimodal fusion. Inf. Fusion 37, 98–125 (2017)
Cambria, E., Hussain, A., Havasi, C., Eckl, C.: Sentic computing: exploitation of common sense for the development of emotion-sensitive systems. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds.) Development of Multimodal Interfaces: Active Listening and Synchrony. LNCS, vol. 5967, pp. 148–156. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12397-9_12
Pérez-Rosas, V., Mihalcea, R., Morency, L.P.: Utterance-level multimodal sentiment analysis. In: ACL, pp. 973–982 (2013)
Wöllmer, M., Weninger, F., Knaup, T., Schuller, B., Sun, C., Sagae, K., Morency, L.P.: YouTube movie reviews: sentiment analysis in an audio-visual context. IEEE Intell. Syst. 28, 46–53 (2013)
Poria, S., Cambria, E., Gelbukh, A.: Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis. In: Proceedings of EMNLP, pp. 2539–2544 (2015)
Zadeh, A.: Micro-opinion sentiment intensity analysis and summarization in online videos. In: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pp. 587–591. ACM (2015)
Poria, S., Chaturvedi, I., Cambria, E., Hussain, A.: Convolutional MKL based multimodal emotion recognition and sentiment analysis. In: ICDM, Barcelona, pp. 439–448 (2016)
Cambria, E.: Affective computing and sentiment analysis. IEEE Intell. Syst. 31, 102–107 (2016)
Cambria, E., Poria, S., Hazarika, D., Kwok, K.: SenticNet 5: discovering conceptual primitives for sentiment analysis by means of context embeddings. In: AAAI, pp. 1795–1802 (2018)
Poria, S., Gelbukh, A., Agarwal, B., Cambria, E., Howard, N.: Common sense knowledge based personality recognition from text. In: Advances in Soft Computing and Its Applications, pp. 484–496. Springer (2013)
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of ACL, pp. 79–86 (2002)
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of EMNLP, pp. 1631–1642 (2013)
Oneto, L., Bisio, F., Cambria, E., Anguita, D.: Statistical learning theory and ELM for big social data analysis. IEEE Comput. Intell. Mag. 11, 45–55 (2016)
Ekman, P.: Universal facial expressions of emotion. In: Culture and Personality: Contemporary Readings, Chicago (1974)
Datcu, D., Rothkrantz, L.: Semantic audio-visual data fusion for automatic emotion recognition. Euromedia (2008)
De Silva, L.C., Miyasato, T., Nakatsu, R.: Facial emotion recognition using multi-modal information. In: Proceedings of ICICS, vol. 1, pp. 397–401. IEEE (1997)
Chen, L.S., Huang, T.S., Miyasato, T., Nakatsu, R.: Multimodal human emotion/expression recognition. In: Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 366–371. IEEE (1998)
Kessous, L., Castellano, G., Caridakis, G.: Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis. J. Multimodal User Interfaces 3, 33–48 (2010)
Schuller, B.: Recognizing affect from linguistic information in 3D continuous space. IEEE Trans. Affect. Comput. 2, 192–205 (2011)
Rozgic, V., Ananthakrishnan, S., Saleem, S., Kumar, R., Prasad, R.: Ensemble of SVM trees for multimodal emotion recognition. In: IEEE APSIPA, pp. 1–4 (2012)
Metallinou, A., Lee, S., Narayanan, S.: Audio-visual emotion recognition using Gaussian mixture models for face and voice. In: Tenth IEEE International Symposium on ISM 2008, pp. 250–257. IEEE (2008)
Eyben, F., Wöllmer, M., Graves, A., Schuller, B., Douglas-Cowie, E., Cowie, R.: On-line emotion recognition in a 3D activation-valence-time continuum using acoustic and linguistic cues. J. Multimodal User Interfaces 3, 7–19 (2010)
Wu, C.H., Liang, W.B.: Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Trans. Affect. Comput. 2, 10–21 (2011)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space (2013). arXiv preprint arXiv:1301.3781
Eyben, F., Wöllmer, M., Schuller, B.: openSMILE: the Munich versatile and fast open-source audio feature extractor. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 1459–1462. ACM (2010)
Baltrušaitis, T., Robinson, P., Morency, L.P.: 3D constrained local model for rigid and non-rigid facial tracking. In: IEEE CVPR, pp. 2610–2617 (2012)
Zadeh, A., Zellers, R., Pincus, E., Morency, L.P.: Multimodal sentiment intensity analysis in videos: facial gestures and verbal messages. IEEE Intell. Syst. 31, 82–88 (2016)
Busso, C., Bulut, M., Lee, C.C., Kazemzadeh, A., Mower, E., Kim, S., Chang, J.N., Lee, S., Narayanan, S.S.: IEMOCAP: interactive emotional dyadic motion capture database. Lang. Resour. Eval. 42, 335–359 (2008)
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Cambria, E., Hazarika, D., Poria, S., Hussain, A., Subramanyam, R.B.V. (2018). Benchmarking Multimodal Sentiment Analysis. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science(), vol 10762. Springer, Cham. https://doi.org/10.1007/978-3-319-77116-8_13
DOI: https://doi.org/10.1007/978-3-319-77116-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77115-1
Online ISBN: 978-3-319-77116-8
eBook Packages: Computer Science (R0)