Automatic Sentiment Labelling of Multimodal Data

Biswas, Sumana; Young, Karen; Griffith, Josephine

doi:10.1007/978-3-031-37890-4_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1860)

Abstract

This study investigates the challenging problem of automatically providing sentiment labels for multimodal data, containing both image and textual information, that is used to train and test supervised machine learning models. Because the image and text components convey sentiment both individually and collectively, assessing the sentiment of multimodal data typically requires both image and text information. Consequently, the majority of studies classify sentiment by combining image and text features (‘Image+Text-features’). In this study, we propose ‘Combined-Text-Features’ that incorporate the object names and attributes identified in an image, as well as any accompanying superimposed or captioned text of that image, and we use these text features to classify the sentiment of multimodal data. Inspired by our prior research, we employ the Afinn labelling method to automatically assign sentiment labels to the ‘Combined-Text-Features’. We test whether classifier models using these ‘Combined-Text-Features’ with Afinn labelling can achieve results comparable to those obtained with other multimodal features and with human labelling. CNN, BiLSTM, and BERT models are used for the experiments on two multimodal datasets. The experimental results demonstrate the usefulness of the ‘Combined-Text-Features’ as a representation of multimodal data for the sentiment classification task. The results also suggest that the Afinn labelling approach can be a feasible alternative to human labelling for providing sentiment labels.
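
The labelling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the ‘Combined-Text-Features’ are formed by simply joining the detected object and attribute names with the caption text, that the score is the sum of per-word valences (as in AFINN-style lexicons, where each word carries an integer between -5 and 5), and that the label is taken from the sign of that sum. The lexicon entries, the function names, and the sign-based threshold are all hypothetical and chosen only for illustration.

```python
import re

# Toy stand-in for an AFINN-style lexicon (word -> integer valence in [-5, 5]).
# These entries are illustrative assumptions, NOT copied from the real AFINN list.
TOY_LEXICON = {
    "happy": 3,
    "smiling": 2,
    "love": 3,
    "sad": -2,
    "broken": -1,
    "terrible": -3,
}


def combine_text_features(object_tags, caption):
    """Join detected object/attribute names and the caption into one string."""
    return " ".join(list(object_tags) + [caption])


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word that occurs in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


def label_from_score(score):
    """Map a summed score to a label by its sign (an assumed rule)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical post: two detected image regions plus a caption.
example = combine_text_features(
    object_tags=["smiling woman", "broken umbrella"],
    caption="I love rainy days",
)
score = lexicon_score(example)
label = label_from_score(score)
print(example, score, label)
```

In practice the toy dictionary would be replaced by the full AFINN word list, and the resulting labels would serve as training targets for the CNN, BiLSTM, or BERT classifiers.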

Acknowledgements

This work was supported by the College of Engineering, University of Galway, Ireland.

Author information

Authors and Affiliations

School of Computer Science, University of Galway, Galway, Ireland
Sumana Biswas, Karen Young & Josephine Griffith

Corresponding authors

Correspondence to Sumana Biswas or Josephine Griffith.

Editor information

Editors and Affiliations

University of Calabria, Rende, Italy
Alfredo Cuzzocrea
Ford Motor Company, Commerce Township, MI, USA
Oleg Gusikhin
Siège du Groupe ESEO, Angers, France
Slimane Hammoudi
Hochschule Niederrhein, Krefeld, Nordrhein-Westfalen, Germany
Christoph Quix

About this paper

Cite this paper

Biswas, S., Young, K., Griffith, J. (2023). Automatic Sentiment Labelling of Multimodal Data. In: Cuzzocrea, A., Gusikhin, O., Hammoudi, S., Quix, C. (eds) Data Management Technologies and Applications. DATA 2021, DATA 2022. Communications in Computer and Information Science, vol 1860. Springer, Cham. https://doi.org/10.1007/978-3-031-37890-4_8

DOI: https://doi.org/10.1007/978-3-031-37890-4_8
Published: 23 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37889-8
Online ISBN: 978-3-031-37890-4
eBook Packages: Computer Science, Computer Science (R0)
