IAN-BERT: Combining Post-trained BERT with Interactive Attention Network for Aspect-Based Sentiment Analysis

  • Original Research
  • Published in: SN Computer Science

Abstract

Aspect-based sentiment analysis (ABSA), a subtask of sentiment analysis, predicts the sentiment polarity of specific aspects mentioned in an input sentence. Recent research has demonstrated the effectiveness of Bidirectional Encoder Representations from Transformers (BERT) and its variants in improving performance on various Natural Language Processing (NLP) tasks, including sentiment analysis. However, BERT is pre-trained on Wikipedia and BookCorpus and therefore lacks domain-specific knowledge. Moreover, for the ABSA task, an attention mechanism can leverage aspect information to determine the sentiment orientation of an aspect within a given sentence. Motivated by these observations, this paper proposes a novel model called IAN-BERT. IAN-BERT applies an interactive attention mechanism on top of a BERT model post-trained on Amazon and Yelp review data. The objective is to capture domain-specific knowledge in the BERT representation and to identify the relevance of context words to the aspect terms and vice versa. By incorporating interactive attention, IAN-BERT extracts more relevant and informative features from the input text, leading to better predictions. Experimental evaluations on the SemEval-2014 Restaurant and Laptop datasets and the MAMS dataset demonstrate the effectiveness and superiority of IAN-BERT for aspect-based sentiment analysis.
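To make the described architecture concrete, below is a minimal PyTorch sketch of the idea: a shared, post-trained BERT encoder produces token representations for the sentence (context) and the aspect term, and interactive attention pools each side against a summary of the other before classification. This is an illustration of the technique, not the authors' released code; the checkpoint name bert-post-trained-reviews, the mean pooling, the bilinear attention scoring, and the hidden/class sizes are all assumptions for the sketch.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class InteractiveAttention(nn.Module):
    """Scores each token of one span against a pooled summary of the
    other span, then returns the attention-weighted sum of the tokens."""

    def __init__(self, hidden):
        super().__init__()
        self.w = nn.Parameter(torch.empty(hidden, hidden))
        nn.init.xavier_uniform_(self.w)

    def forward(self, tokens, pooled, mask):
        # tokens: (B, T, H), pooled: (B, H), mask: (B, T) with 1 = real token
        scores = torch.tanh(tokens @ self.w @ pooled.unsqueeze(-1)).squeeze(-1)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)          # (B, T)
        return (alpha.unsqueeze(-1) * tokens).sum(1)   # (B, H)


class IANBert(nn.Module):
    def __init__(self, ckpt="bert-post-trained-reviews",  # hypothetical checkpoint name
                 hidden=768, n_classes=3):
        super().__init__()
        self.bert = BertModel.from_pretrained(ckpt)    # BERT post-trained on review text
        self.ctx_attn = InteractiveAttention(hidden)   # context attends to aspect summary
        self.asp_attn = InteractiveAttention(hidden)   # aspect attends to context summary
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, ctx_ids, ctx_mask, asp_ids, asp_mask):
        # Encode sentence and aspect term separately with the shared encoder.
        ctx = self.bert(input_ids=ctx_ids, attention_mask=ctx_mask).last_hidden_state
        asp = self.bert(input_ids=asp_ids, attention_mask=asp_mask).last_hidden_state
        # Mean-pool each span over its real tokens (an assumed pooling choice).
        ctx_pool = (ctx * ctx_mask.unsqueeze(-1)).sum(1) / ctx_mask.sum(1, keepdim=True)
        asp_pool = (asp * asp_mask.unsqueeze(-1)).sum(1) / asp_mask.sum(1, keepdim=True)
        # Interactive attention in both directions, then classify the concatenation.
        ctx_rep = self.ctx_attn(ctx, asp_pool, ctx_mask)
        asp_rep = self.asp_attn(asp, ctx_pool, asp_mask)
        return self.classifier(torch.cat([ctx_rep, asp_rep], dim=-1))  # logits (B, C)
```

In practice the two spans would be tokenized separately (e.g. with BertTokenizer), and the post-trained checkpoint would come from continued pre-training on Amazon and Yelp reviews; the sketch only shows how interactive attention composes the two BERT representations into a single aspect-aware feature for classification.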


Data availability

The datasets used in this article are publicly available.


Funding

The authors have no funding to disclose.

Author information

Corresponding author

Correspondence to Sharad Verma.

Ethics declarations

Conflict of Interest

Not applicable.

Ethics Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Financial Interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Research Trends in Computational Intelligence” guest edited by Anshul Verma, Pradeepika Verma, Vivek Kumar Singh and S. Karthikeyan.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Verma, S., Kumar, A. & Sharan, A. IAN-BERT: Combining Post-trained BERT with Interactive Attention Network for Aspect-Based Sentiment Analysis. SN COMPUT. SCI. 4, 756 (2023). https://doi.org/10.1007/s42979-023-02229-7

