Abstract
Considering contextual features is a key issue in sentiment analysis. Existing approaches, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), lack the ability to account for and prioritize the informative contextual features that are necessary for better sentiment interpretation. CNNs are limited because they must be very deep to capture wide context, which can cause vanishing gradients, whereas RNNs fall short because they process input sequences strictly sequentially. Furthermore, both approaches treat all words equally. In this paper, we propose a novel approach named attentive convolutional gated recurrent network (ACGRN) that alleviates the above issues for sentiment analysis. The motivation behind ACGRN is to avoid the vanishing gradient caused by deep CNNs by applying a shallow-and-wide CNN that learns local contextual features. Afterwards, to overcome the limitations of the sequential structure of RNNs and to prioritize informative contextual information, we use a novel prior knowledge attention based bidirectional gated recurrent unit (ATBiGRU). Prior knowledge ATBiGRU captures global contextual features with a strong focus on the previous hidden states that carry the most valuable information for the current time step. The experimental results show that ACGRN significantly outperforms the baseline models on six small and large real-world datasets for the sentiment classification task.
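The two components the abstract names can be illustrated concretely. Below is a minimal NumPy sketch, not the authors' implementation: `shallow_wide_conv` applies parallel convolution filters of several widths over word embeddings (shallow-and-wide, rather than stacking deep layers), and `attention_context` computes an additive attention of the current hidden state over previous hidden states, standing in for the prior knowledge attention idea. All function names, shapes, and parameters here are illustrative assumptions.

```python
import numpy as np

def shallow_wide_conv(embeddings, filters):
    """Shallow-and-wide convolution over a sentence.

    embeddings: (T, d) word-embedding matrix for a T-word sentence.
    filters: dict mapping window width w -> weight matrix (n_filters, w*d).
    Returns one ReLU feature map per width (no deep stacking).
    """
    T, d = embeddings.shape
    feature_maps = []
    for w, W in filters.items():
        # Slide a width-w window over the sentence and flatten each window.
        windows = np.stack([embeddings[i:i + w].reshape(-1)
                            for i in range(T - w + 1)])
        feature_maps.append(np.maximum(windows @ W.T, 0.0))  # (T-w+1, n_filters)
    return feature_maps

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(H, h_t, v, Wa, Ua):
    """Additive attention of the current state h_t over past hidden states H.

    H: (T, k) previous hidden states; h_t: (k,) current state.
    v: (k,), Wa, Ua: (k, k) learned attention parameters (random here).
    Returns the attention-weighted context vector and the weights.
    """
    scores = np.array([v @ np.tanh(Wa @ h + Ua @ h_t) for h in H])
    alpha = softmax(scores)          # weights over previous states, sums to 1
    return alpha @ H, alpha
```

In the full model, the per-width feature maps would feed the BiGRU, and the context vector would be combined with the current state before classification; this sketch only shows the shape of each operation.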
Acknowledgements
This work is supported by the National Key Research and Development Program of China under Grants 2016YFB0800402 and 2016QY01W0202, National Natural Science Foundation of China under Grants U1836204, U1936108, 61433006, U1401258, and 61502185.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Habimana, O., Li, Y., Li, R. et al. Attentive convolutional gated recurrent network: a contextual model to sentiment analysis. Int. J. Mach. Learn. & Cyber. 11, 2637–2651 (2020). https://doi.org/10.1007/s13042-020-01135-1