
Impact Evaluation of Multimodal Information on Sentiment Analysis

  • Conference paper
  • Published in: Advances in Computational Intelligence (MICAI 2022)

Abstract

Text-based sentiment analysis is a popular application of artificial intelligence that has benefited over the past decade from the growth of digital social networks and their almost unlimited amounts of data. Currently, social network users can combine different types of information in a single post, such as images, videos, GIFs, and live streams, and can therefore express more complex thoughts and opinions. The goal of our study is to analyze the impact that incorporating different types of multimodal information may have on social media sentiment analysis. In particular, we pay special attention to the interaction between text messages and images with and without text captions. To study this interaction, we first create a new dataset in Spanish that contains tweets with images. Afterwards, we manually label several sentiments for each tweet: the overall tweet sentiment, the sentiment of the text, the sentiment of each individual image, the sentiment of the caption (if present), and, in cases where a single tweet has several images, the aggregate sentiment of all images in the tweet. We conclude that incorporating visual information into text-based sentiment analysis raises the performance of the classifiers that determine the overall sentiment of a tweet by an average of 25.5%.
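Although the paper's data and feature pipeline are not reproduced here, the following minimal Python sketch illustrates the kind of late-fusion evaluation the abstract describes: a text sentiment score and an image sentiment score are concatenated into one feature vector, an off-the-shelf SVM predicts the overall tweet sentiment, and balanced accuracy (see note 7 below) scores the result. All data, scores, and parameters in the sketch are toy stand-ins, not the authors' implementation.

```python
# Toy late-fusion sketch: combine per-modality sentiment scores and let a
# single classifier decide the overall tweet sentiment. Everything here is
# an illustrative stand-in, not the paper's dataset or feature pipeline.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)

# Hypothetical per-modality scores in [-1, 1] (negative to positive).
# In practice these would come from separate text and image classifiers.
n = 400
text_score = rng.uniform(-1, 1, size=(n, 1))
image_score = rng.uniform(-1, 1, size=(n, 1))
X = np.hstack([text_score, image_score])

# Toy overall label: positive (1) if the modalities agree on balance.
y = (text_score + image_score > 0).ravel().astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = SVC(kernel="rbf").fit(X_train, y_train)

# Balanced accuracy averages recall over classes, which compensates for
# label imbalance in the tweet dataset.
print("balanced accuracy:", balanced_accuracy_score(y_test, clf.predict(X_test)))
```

The point of the fusion step is that the classifier sees both modalities at once, so a sarcastic caption paired with a cheerful image (or vice versa) can be resolved rather than averaged away.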


Notes

  1. https://www.internetworldstats.com/stats7.htm.

  2. The dataset can be downloaded here: https://github.com/lzun/mssaid.

  3. https://developer.twitter.com/en/docs/twitter-api.

  4. Note that the example images in this paper are of our own authoring, rather than images from the dataset, to avoid violating the original authors' copyright.

  5. https://github.com/carpedm20/emoji/ (usage sketched after these notes).

  6. https://github.com/faustomorales/keras-ocr (usage sketched after these notes).

  7. https://scikit-learn.org/stable/modules/model_evaluation.html#balanced-accuracy-score.
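As a rough illustration of how the two preprocessing libraries in notes 5 and 6 are typically used, the sketch below rewrites emojis as text tokens with emoji and pulls embedded caption text out of an image with keras-ocr. The usage follows each project's README; the helper names are ours, and this should not be read as the paper's exact pipeline.

```python
# Usage sketch for the libraries in notes 5 and 6 (helper names are ours,
# not from the paper): emoji turns emoji characters into text tokens and
# keras-ocr extracts text captions embedded in images.
import emoji
import keras_ocr

def normalize_tweet_text(text: str) -> str:
    # e.g. "me encanta \U0001F600" -> "me encanta :grinning_face:"
    return emoji.demojize(text)

def extract_image_caption(image_path: str,
                          pipeline: keras_ocr.pipeline.Pipeline) -> str:
    # recognize() returns one list of (word, bounding_box) pairs per image;
    # we keep only the recognized words, in detection order.
    predictions = pipeline.recognize([keras_ocr.tools.read(image_path)])[0]
    return " ".join(word for word, box in predictions)

# pipeline = keras_ocr.pipeline.Pipeline()  # downloads pretrained weights
# print(extract_image_caption("tweet_image.png", pipeline))
# print(normalize_tweet_text("me encanta 😀"))
```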


Acknowledgements

This work was supported by the Universidad Iberoamericana Ciudad de México and the Institute of Applied Research and Technology.

Author information

Corresponding author

Correspondence to Luis N. Zúñiga-Morales.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zúñiga-Morales, L.N., González-Ordiano, J.Á., Quiroz-Ibarra, J., Simske, S.J. (2022). Impact Evaluation of Multimodal Information on Sentiment Analysis. In: Pichardo Lagunas, O., Martínez-Miranda, J., Martínez Seis, B. (eds) Advances in Computational Intelligence. MICAI 2022. Lecture Notes in Computer Science, vol. 13613. Springer, Cham. https://doi.org/10.1007/978-3-031-19496-2_2


  • DOI: https://doi.org/10.1007/978-3-031-19496-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19495-5

  • Online ISBN: 978-3-031-19496-2

  • eBook Packages: Computer Science, Computer Science (R0)
