
ACNN-TL: attention-based convolutional neural network coupling with transfer learning and contextualized word representation for enhancing the performance of sentiment classification

Published in: The Journal of Supercomputing

Abstract

The rapid growth of textual information on the web has made analyzing users' opinions about products, events, and services a crucial and challenging task, transforming sentiment analysis from an academic endeavor into an essential analytic tool in cognitive science and natural language understanding. Despite the remarkable success of deep learning models in textual sentiment classification, they still face several limitations. The convolutional neural network (CNN) excels at sentiment classification, but it tends to require a large amount of training data, it assumes that all words in a sentence contribute equally to the sentence's polarity, and its performance depends heavily on its hyper-parameters. To overcome these issues, this paper proposes an Attention-Based Convolutional Neural Network with Transfer Learning (ACNN-TL) that exploits both an attention mechanism and transfer learning to boost sentiment classification performance, while using language models, namely Word2Vec and BERT, as its backbone to better express sentence semantics as word vectors. We conducted experiments on widely studied sentiment classification datasets. According to the empirical results, the proposed ACNN-TL not only achieved comparable or even better classification results but also showed that employing contextual representation and transfer learning yields a remarkable improvement in classification accuracy.
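The central idea the abstract describes (weighting words by their contribution to sentence polarity instead of treating them equally) can be sketched as attention pooling over word vectors. The following is a minimal illustrative example, not the paper's implementation: the context vector and embeddings are random stand-ins for learned parameters and for Word2Vec/BERT outputs, and the pooled vector would feed the CNN-based classifier downstream.

```python
import math
import random

random.seed(0)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(word_vectors, context):
    """Score each word vector against a (learned) context vector,
    softmax the scores into attention weights, and return the
    weighted sum: words the attention deems more polarity-bearing
    contribute more to the sentence representation."""
    weights = softmax([dot(v, context) for v in word_vectors])
    dim = len(word_vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, word_vectors))
              for i in range(dim)]
    return pooled, weights

# Toy sentence: 5 words with 8-dimensional embeddings (in the paper
# these would come from Word2Vec or BERT rather than random init).
embeddings = [[random.gauss(0, 1) for _ in range(8)] for _ in range(5)]
context = [random.gauss(0, 1) for _ in range(8)]

pooled, weights = attention_pool(embeddings, context)
```

The attention weights sum to one, so the pooled vector is a convex combination of the word vectors; during training, the context vector would be learned jointly with the classifier so that sentiment-laden words receive larger weights.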





Corresponding author

Correspondence to Hossein Sadr.


Cite this article

Sadr, H., Nazari Soleimandarabi, M. ACNN-TL: attention-based convolutional neural network coupling with transfer learning and contextualized word representation for enhancing the performance of sentiment classification. J Supercomput 78, 10149–10175 (2022). https://doi.org/10.1007/s11227-021-04208-2
