research-article

Image–Text Multimodal Sentiment Analysis Framework of Assamese News Articles Using Late Fusion

Authors:
Ringki Das, Department of Computer Science and Engineering, National Institute of Technology Silchar, India (ORCID: 0000-0003-0483-7169)
Thoudam Doren Singh, Department of Computer Science and Engineering, National Institute of Technology Silchar, India (ORCID: 0000-0001-9906-9136)

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6, Article No. 161, pp. 1–30. https://doi.org/10.1145/3584861

Published: 17 June 2023

Abstract

Before the web became available as a corpus, readers judged whether news was positive or negative by reading the text of physical newspapers; there was no automatic identification over readily available e-newspapers. Consequently, earlier sentiment analysis approaches were based on unimodal data, and comparatively little effort has gone into multimodal data, even though multimodal information gives a clearer understanding of sentiment. To the best of our knowledge, little work exists on image–text multimodal sentiment analysis for Assamese, a low-resource Indian language spoken mostly in the northeastern part of India. To conduct an experimental study, we built a dataset of Assamese news articles consisting of news text, associated images, and one image caption. We propose two individual unimodal models, one textual and one visual, that focus on the important words and the discriminative image regions most related to sentiment. The visual model is developed using an encoder-decoder-based image caption generation system. An image–text multimodal approach is proposed to explore the internal correlation between textual and visual features for joint sentiment classification. Finally, we propose the multimodal sentiment analysis framework, Textual Visual Multimodal Fusion, which employs a late fusion scheme to merge the three different modalities for the final sentiment prediction. Experimental results on the Assamese dataset built in-house demonstrate that the contextual integration of multimodal features delivers better performance than unimodal features.
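To make the late fusion idea concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that each of three already-trained models outputs class probabilities for every article and that the fusion rule is a weighted average of those probabilities; the abstract does not state the actual rule. The function name `late_fusion`, the variable names, the two-class layout, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Probs = Sequence[Sequence[float]]  # one row of class probabilities per article


def late_fusion(
    model_probs: Sequence[Probs],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[List[float]], List[int]]:
    """Fuse per-model class probabilities by weighted averaging (assumed rule).

    Returns the fused probabilities and the index of the winning class
    for each article.
    """
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0] * n_models  # equal trust in every model
    total = float(sum(weights))
    norm = [w / total for w in weights]  # normalise so the weights sum to 1

    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    fused: List[List[float]] = []
    for i in range(n_samples):
        row = [
            sum(norm[m] * model_probs[m][i][c] for m in range(n_models))
            for c in range(n_classes)
        ]
        fused.append(row)

    # Decision step: pick the class with the highest fused probability.
    labels = [max(range(n_classes), key=row.__getitem__) for row in fused]
    return fused, labels


# Hypothetical outputs for two articles; columns are [negative, positive].
text_probs = [[0.70, 0.30], [0.40, 0.60]]    # textual model
visual_probs = [[0.40, 0.60], [0.20, 0.80]]  # visual (caption-based) model
joint_probs = [[0.60, 0.40], [0.55, 0.45]]   # joint image-text model

fused, labels = late_fusion([text_probs, visual_probs, joint_probs])
print(labels)
```

Because fusion happens on the model outputs rather than on the raw features, each model can be trained and tuned independently, and the `weights` argument offers one simple way to favour the more reliable model.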

Index Terms

Image–Text Multimodal Sentiment Analysis Framework of Assamese News Articles Using Late Fusion
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Machine translation

Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 22, Issue 6
June 2023
635 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3604597
Editor: Imed Zitouni, Google, USA
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 June 2023
- Online AM: 17 February 2023
- Accepted: 3 February 2023
- Revised: 26 August 2022
- Received: 1 September 2021

Author Tags
Multimodal sentiment analysis
low resource language
caption generation
machine learning classifier
late fusion
Qualifiers
- research-article

Article Metrics
- Total Citations: 3
- Total Downloads: 732
- Downloads (Last 12 months): 609
- Downloads (Last 6 weeks): 74
