Abstract
Due to the rapid growth of textual information on the web, analyzing users' opinions about particular products, events, or services has become a crucial and challenging task, turning sentiment analysis from an academic endeavor into an essential analytic tool in cognitive science and natural language understanding. Despite the remarkable success of deep learning models for textual sentiment classification, they still face several limitations. The convolutional neural network, one of the deep learning models that excels at sentiment classification, tends to require a large amount of training data, assumes that all words in a sentence contribute equally to its polarity, and is highly sensitive to its accompanying hyper-parameters. To overcome these issues, this paper proposes an Attention-Based Convolutional Neural Network with Transfer Learning (ACNN-TL) that not only exploits the attention mechanism and transfer learning to boost sentiment classification performance but also uses language models, namely Word2Vec and BERT, as its backbone to better express sentence semantics as word vectors. We conducted experiments on widely studied sentiment classification datasets; according to the empirical results, the proposed ACNN-TL not only achieved comparable or better classification results but also showed that employing contextual representation and transfer learning yields a remarkable improvement in classification accuracy.
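To make the described pipeline concrete, the sketch below shows the general shape of an attention-based CNN sentence classifier: pre-trained word vectors feed a 1D convolution, an additive attention layer weights the convolved positions so that informative words dominate the sentence representation, and a linear layer produces class logits. This is a minimal illustration of the technique, not the authors' exact ACNN-TL architecture; the layer sizes and the trainable `nn.Embedding` standing in for Word2Vec/BERT vectors are assumptions made so the example runs on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=300, num_filters=100,
                 kernel_size=3, num_classes=2):
        super().__init__()
        # In the paper the word vectors come from Word2Vec or BERT; a trainable
        # embedding layer is used here only to keep the sketch self-contained.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # Additive attention: one score per time step, so words are no longer
        # assumed to contribute equally to the sentence polarity.
        self.attn = nn.Linear(num_filters, 1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids)                  # (batch, seq_len, embed_dim)
        h = F.relu(self.conv(x.transpose(1, 2)))       # (batch, num_filters, seq_len)
        h = h.transpose(1, 2)                          # (batch, seq_len, num_filters)
        weights = torch.softmax(self.attn(h), dim=1)   # (batch, seq_len, 1)
        sentence = (weights * h).sum(dim=1)            # attention-weighted pooling
        return self.fc(sentence)                       # class logits

# Example forward pass on a batch of two padded sentences of length 20.
logits = AttentionCNN()(torch.randint(0, 10000, (2, 20)))
```

In a transfer-learning setting, the embedding (and, when BERT is used, the encoder producing the contextual vectors) would be pre-trained on a large corpus and fine-tuned on the target sentiment dataset rather than trained from scratch.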





Cite this article
Sadr, H., Nazari Soleimandarabi, M. ACNN-TL: attention-based convolutional neural network coupling with transfer learning and contextualized word representation for enhancing the performance of sentiment classification. J Supercomput 78, 10149–10175 (2022). https://doi.org/10.1007/s11227-021-04208-2