SaliencyBERT: Recurrent Attention Network for Target-Oriented Multimodal Sentiment Classification

Wang, Jiawei; Liu, Zhe; Sheng, Victor; Song, Yuqing; Qiu, Chenjian

doi:10.1007/978-3-030-88010-1_1

Conference paper
First Online: 22 October 2021

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13021)

Abstract

As multimodal data become increasingly common on social media platforms, it is desirable to augment text-based approaches with other important data sources (e.g., images) for sentiment classification of social media posts. However, existing approaches rely primarily on textual content or are designed for coarse-grained multimodal sentiment classification. In this paper, we propose a recurrent attention network, called SaliencyBERT, built on the BERT architecture for Target-oriented Multimodal Sentiment Classification (TMSC). Specifically, we first adopt BERT and ResNet to capture the intra-modality dynamics of the textual content and the visual information, respectively. Then, we design a recurrent attention mechanism that derives target-sensitive visual representations to capture the inter-modality dynamics. With recurrent attention, our model progressively refines the alignment between target-sensitive textual features and visual features and produces an output after a fixed number of time steps. Finally, we combine the losses from all time steps for deep supervision, which guards against slow convergence and overfitting. Our empirical results on two multimodal Twitter datasets show that the proposed model consistently outperforms single-modality methods and performs on par with or better than several highly competitive methods.
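The abstract's two central ideas, attention repeated over a fixed number of steps and a loss accumulated at every step, can be illustrated with a toy sketch. This is not the authors' implementation: the vectors stand in for BERT target features and ResNet region features, and the query update rule (averaging the query with the attended context), the linear classifier, and all function names are assumptions made only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, regions):
    """Scaled dot-product attention of one query over a list of region vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, r) / scale for r in regions])
    context = [
        sum(w * r[i] for w, r in zip(weights, regions))
        for i in range(len(query))
    ]
    return context, weights


def cross_entropy(logits, label):
    """Negative log-likelihood of the gold label under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def recurrent_attention(target, regions, classifier, label, steps=3):
    """Run a fixed number of attention steps and sum the loss of every step.

    target     -- stand-in for a target-sensitive text vector (assumption)
    regions    -- stand-ins for visual region vectors (assumption)
    classifier -- one weight row per sentiment class (hypothetical linear head)
    """
    query = list(target)
    step_logits = []
    total_loss = 0.0
    for _ in range(steps):
        context, _ = attend(query, regions)
        # Hypothetical update: blend the current query with the attended context.
        query = [0.5 * (q + c) for q, c in zip(query, context)]
        logits = [dot(row, query) for row in classifier]
        step_logits.append(logits)
        # Deep supervision: every step contributes to the training loss.
        total_loss += cross_entropy(logits, label)
    return step_logits, total_loss


if __name__ == "__main__":
    target = [1.0, 0.0]
    regions = [[1.0, 0.0], [0.0, 1.0]]
    classifier = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    logits, loss = recurrent_attention(target, regions, classifier, label=0)
    print(len(logits), "steps, summed loss:", loss)
```

Summing the per-step losses means the early steps receive a direct training signal instead of relying only on gradients propagated back from the final step, which is the usual motivation for deep supervision.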

Author information

Authors and Affiliations

Jiangsu University, Zhenjiang, 212013, China
Jiawei Wang, Zhe Liu, Yuqing Song & Chenjian Qiu
Texas Tech University, Lubbock, TX, USA
Victor Sheng

Corresponding author

Correspondence to Zhe Liu.

Editor information

Editors and Affiliations

University of Science and Technology Beijing, Beijing, China
Huimin Ma
Chinese Academy of Sciences, Beijing, China
Liang Wang
Tsinghua University, Beijing, China
Changshui Zhang
Zhejiang University, Hangzhou, China
Fei Wu
Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hunan University, Changsha, China
Yaonan Wang
Sun Yat-Sen University, Guangzhou, Guangdong, China
Jianhuang Lai
Beijing Jiaotong University, Beijing, China
Yao Zhao

About this paper

Cite this paper

Wang, J., Liu, Z., Sheng, V., Song, Y., Qiu, C. (2021). SaliencyBERT: Recurrent Attention Network for Target-Oriented Multimodal Sentiment Classification. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13021. Springer, Cham. https://doi.org/10.1007/978-3-030-88010-1_1

DOI: https://doi.org/10.1007/978-3-030-88010-1_1
Published: 22 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88009-5
Online ISBN: 978-3-030-88010-1
eBook Packages: Computer Science, Computer Science (R0)