Abstract
As a fundamental task, review sentiment classification aims to predict a user’s overall sentiment in a review of a product. Recent studies have demonstrated the critical effects of user and product attributes on this task. However, these studies usually incorporate the two attributes in the same way, which fails to account for their different effects and thus cannot leverage them effectively. To address this issue, we propose a simple and effective review sentiment classification model based on a hierarchical attention network, in which the different effects of user and product attributes are captured via fusion modules and attention mechanisms, respectively. We further propose a training framework based on mutual distillation to fully capture the individual effects of user and product attributes. Specifically, two auxiliary models that use only the user or only the product attribute are introduced to benefit our model. During joint training, our model and the auxiliary models boost each other iteratively via mutual knowledge distillation. On the benchmark IMDB and Yelp datasets, our model significantly outperforms competitive baselines, with a 1.6% improvement in average accuracy. When BERT embeddings are used as inputs, our model still performs much better than recent BERT-enhanced baselines.
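The mutual knowledge distillation objective sketched in the abstract can be illustrated as follows. This is a minimal, stdlib-only sketch assuming a standard softened-softmax KL formulation with temperature T and mixing weight alpha; the function names (`mutual_distillation_loss`, `softmax`, `kl_div`) and the loss weighting are illustrative assumptions, not the authors' exact implementation.

```python
import math

def softmax(logits, T=1.0):
    # temperature-scaled softmax; higher T yields a softer distribution
    exps = [math.exp(l / T) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    # KL(p || q) between two discrete distributions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def mutual_distillation_loss(ce_loss, own_logits, peer_logits, T=2.0, alpha=0.5):
    # The peer model's softened prediction acts as the teacher signal;
    # the T**2 factor rescales gradients as in standard distillation.
    p_peer = softmax(peer_logits, T)
    p_own = softmax(own_logits, T)
    return (1 - alpha) * ce_loss + alpha * (T ** 2) * kl_div(p_peer, p_own)
```

In the iterative scheme described above, the main model and each auxiliary model would compute such a loss against one another's predictions in turn, so that the user-only and product-only models transfer their attribute-specific knowledge to the full model and vice versa.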
Data Availability and Access
The datasets generated and/or analysed during the current study are publicly available. Note that the Yelp 2013, Yelp 2014, and IMDB datasets were provided by the authors of the ACL 2015 paper “Learning Semantic Representations of Users and Products for Document Level Sentiment Classification”. The datasets are also available from the corresponding author on reasonable request.
Notes
In the experimental datasets, only an anonymized user ID and product ID are provided for each review, without any other information about them.
This claim is supported by the experimental results in Table 5, where CFeat-MLoss (which adds a separate loss for each representation) achieves substantially better performance than CFeat (which directly concatenates the three review representations). Similar experimental phenomena have also been reported in [52] and [60].
In our experiments, we use a learnable vector to replace the masked product or user attribute. Note that we mask an attribute to mimic real situations where it is absent.
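The learnable replacement vector mentioned in this note can be sketched as follows. This is an illustrative sketch only; the table layout and the names (`mask_vec`, `lookup_user`, `MASK`) are assumptions, not the authors' implementation.

```python
import random

EMB_DIM = 4
MASK = None  # sentinel marking an absent (masked) attribute

# One embedding per known user ID, plus a single shared vector that is
# substituted whenever the attribute is masked. During training the mask
# vector would be updated by gradient descent like any other embedding.
user_emb = {uid: [random.gauss(0.0, 0.1) for _ in range(EMB_DIM)]
            for uid in ("u1", "u2")}
mask_vec = [0.0] * EMB_DIM

def lookup_user(uid):
    """Return the user embedding, falling back to the learnable mask vector."""
    return mask_vec if uid is MASK else user_emb[uid]
```

Masking an attribute this way lets the model see training examples that mimic test-time reviews whose user or product is unknown.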
References
Abbasi A, Chen H, Salem A (2008) Sentiment analysis in multiple languages: feature selection for opinion classification in web forums. ACM Trans Inf Syst 26:12:1–12:34
Amplayo RK (2019) Rethinking attribute representation and injection for sentiment classification. In: Proceedings of EMNLP, pp 5601–5612
Amplayo RK, Kim J, Sung S, Hwang SW (2018) Cold-start aware user and product attention for sentiment classification. In: Proceedings of ACL, pp 2535–2544
Amplayo RK, Yoo KM, Lee SW (2022) Attribute injection for pretrained language models: a new benchmark and an efficient method. In: Proceedings of COLING, pp 1051–1064
Ba J, Caruana R (2014) Do deep nets really need to be deep? In: Proceedings of NIPS
Ba JL, Kiros JR, Hinton GE (2016) Layer normalization. arXiv:1607.06450
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of ICLR
Cao X, Yu J, Zhuang Y (2022) Injecting user identity into pretrained language models for document-level sentiment classification. IEEE Access 10:30157–30167
Chen H, Sun M, Tu C, Lin Y, Liu Z (2016) Neural sentiment classification with user and product attention. In: Proceedings of EMNLP, pp 1650–1659
Deng D, Jing L, Yu J, Sun S (2019) Sparse self-attention LSTM for sentiment lexicon construction. IEEE/ACM Trans Audio Speech Lang Process 27:1777–1790
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL, pp 4171–4186
Diao Q, Qiu M, Wu CY, Smola AJ, Jiang J, Wang C (2014) Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). In: Proceedings of SIGKDD, pp 193–202
Evgeniou T, Pontil M (2004) Regularized multi-task learning. In: Proceedings of KDD, pp 109–117
Fan S, Lin C, Li H, Lin Z, Su J, Zhang H, Gong Y, Guo J, Duan N (2022) Sentiment-aware word and sentence level pre-training for sentiment analysis. In: Proceedings of EMNLP, pp 4984–4994
Feng S, Wang B, Yang Z, Ouyang J (2022) Aspect-based sentiment analysis with attention-assisted graph and variational sentence representation. Knowl-Based Syst 258:109975
Gabrilovich E, Markovitch S (2007) Harnessing the expertise of 70,000 human editors: knowledge-based feature generation for text categorization. J Mach Learn Res 8:2297–2345
Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18:602–610
Han W, Chen H, Poria S (2021) Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis. In: Proceedings of EMNLP, pp 9180–9192
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of CVPR, pp 770–778
Hinton G, Vinyals O, Dean J (2014) Distilling the knowledge in a neural network. In: Proceedings of NIPS deep learning workshop, pp 1–9
Hovy D (2015) Demographic factors improve classification performance. In: Proceedings of ACL, pp 752–762
Ji Y, Wu W, Chen S, Chen Q, Hu W, He L (2020) Two-stage sentiment classification based on user-product interactive information. Knowl-Based Syst 203:106091
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of ICLR, pp 1–11
Kiritchenko S, Mohammad S (2018) Examining gender and race bias in two hundred sentiment analysis systems. In: Proceedings of the seventh joint conference on lexical and computational semantics, pp 43–53
Kong L, Li C, Ge J, Zhang F, Feng Y, Li Z, Luo B (2020) Leveraging multiple features for document sentiment classification. Inf Sci 518:39–55
Li Z, Xu P, Chang X, Yang L, Zhang Y, Yao L, Chen X (2023) When object detection meets knowledge distillation: a survey. IEEE Trans Pattern Anal Mach Intell 45:10555–10579
Liang X, Wu L, Li J, Qin T, Zhang M, Liu TY (2022) Multi-teacher distillation with single model for neural machine translation. IEEE/ACM Trans Audio Speech Lang Process 30:992–1002
Liu P, Qiu X, Huang X (2016) Recurrent neural network for text classification with multi-task learning. In: Proceedings of IJCAI, pp 2873–2879
Liu X, Liu K, Li X, Su J, Ge Y, Wang B, Luo J (2020) An iterative multi-source mutual knowledge transfer framework for machine reading comprehension. In: Proceedings of IJCAI, pp 3794–3800
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692
Long Y, Lu Q, Xiang R, Li M, Huang CR (2017) A cognition based attention model for sentiment analysis. In: Proceedings of EMNLP, pp 462–471
Long Y, Ma M, Lu Q, Xiang R, Huang CR (2018) Dual memory network model for biased product review classification. In: Proceedings of EMNLP workshop, pp 140–148
Lyu C, Foster J, Graham Y (2020) Improving document-level sentiment analysis with user and product context. In: Proceedings of COLING, pp 6724–6729
Lyu C, Yang L, Zhang Y, Graham Y, Foster J (2023) Exploiting rich textual user-product context for improving personalized sentiment analysis. In: Findings of ACL, pp 1419–1429
Ma D, Li S, Zhang X, Wang H, Sun X (2017) Cascading multiway attentions for document-level sentiment classification. In: Proceedings of IJCNLP, pp 634–643
Manning C, Surdeanu M, Bauer J, Finkel J, Bethard S, McClosky D (2014) The Stanford CoreNLP natural language processing toolkit. In: Proceedings of ACL: system demonstrations, pp 55–60
Minaee S, Kalchbrenner N, Cambria E, Nikzad N, Chenaghlu M, Gao J (2021) Deep learning–based text classification: a comprehensive review. ACM Comput Surv 54:62:1–62:40
Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of NAACL, pp 2227–2237
Shen J, Ma MD, Xiang R, Lu Q, Vallejos EP, Xu G, Huang CR, Long Y (2020) Dual memory network model for sentiment analysis of review text. Knowl-Based Syst 188:105004
Song J (2019) Distilling knowledge from user information for document level sentiment classification. In: Proceedings of ICDE workshop, pp 169–176
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958
Su J, Tang J, Jiang H, Lu Z, Ge Y, Song L, Xiong D, Sun L, Luo J (2021) Enhanced aspect-based sentiment analysis models with progressive self-supervised attention learning. Artif Intell 296:103477
Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Comput Linguist 37:267–307
Tang D, Qin B, Liu T (2015) Learning semantic representations of users and products for document level sentiment classification. In: Proceedings of ACL, pp 1014–1023
Tian H, Gao C, Xiao X, Liu H, He B, Wu H, Wang H, Wu F (2020) SKEP: sentiment knowledge enhanced pre-training for sentiment analysis. In: Proceedings of ACL, pp 4067–4076
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Proceedings of NIPS, pp 6000–6010
Vosoughi S, Zhou H, Roy D (2015) Enhanced twitter sentiment classification using contextual information. In: Proceedings of the 6th workshop on computational approaches to subjectivity, sentiment and social media analysis, pp 16–24
Wang L, Yoon KJ (2022) Knowledge distillation and student-teacher learning for visual intelligence: a review and new outlooks. IEEE Trans Pattern Anal Mach Intell 44:3048–3068
Wen J, Huang A, Zhong M, Ma J, Wei Y (2023) Hybrid sentiment analysis with textual and interactive information. Expert Syst Appl 213:118960
Wu C, Cao L, Ge Y, Liu Y, Zhang M, Su J (2022) A label dependence-aware sequence generation model for multi-level implicit discourse relation recognition. In: Proceedings of AAAI, pp 11486–11494
Wu C, Wu F, Qi T, Huang Y (2021) Hi-transformer: hierarchical interactive transformer for efficient and effective long document modeling. In: Proceedings of ACL, pp 848–853
Wu Z, Dai XY, Yin C, Huang S, Chen J (2018) Improving review representations with user attention and product attention for sentiment classification. In: Proceedings of AAAI, pp 5989–5996
Xie B, Su J, Ge Y, Li X, Cui J, Yao J, Wang B (2021) Improving tree-structured decoder training for code generation via mutual learning. In: Proceedings of AAAI, pp 14121–14128
Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. In: Proceedings of NAACL, pp 1480–1489
Yuan Z, Wu F, Liu J, Wu C, Huang Y, Xie X (2019) Neural review rating prediction with user and product memory. In: Proceedings of CIKM, pp 2341–2344
Zeng J, Liu Y, Su J, Ge Y, Lu Y, Yin Y, Luo J (2019) Iterative dual domain adaptation for neural machine translation. In: Proceedings of EMNLP, pp 845–855
Zeng Y, Li Z, Chen Z, Ma H (2023) Aspect-level sentiment analysis based on semantic heterogeneous graph convolutional network. Front Comput Sci 17:176340
Zeng Y, Li Z, Tang Z, Chen Z, Ma H (2023) Heterogeneous graph convolution based on in-domain self-supervision for multimodal sentiment analysis. Expert Syst Appl 213:119240
Zhang Y, Wang J, Yu LC, Zhang X (2021) MA-BERT: learning representation by incorporating multi-attribute knowledge in transformers. In: Findings of ACL, pp 2338–2343
Zhang Y, Wang J, Zhang X (2021) Personalized sentiment classification of customer reviews via an interactive attributes attention model. Knowl-Based Syst 226:107135
Zhang Y, Xiang T, Hospedales TM, Lu H (2018) Deep mutual learning. In: Proceedings of CVPR, pp 4320–4328
Zhou D, Zhang M, Zhang L, He Y (2021) A neural group-wise sentiment analysis model with data sparsity awareness. In: Proceedings of AAAI, pp 14594–14601
Zhou X, Wang Z, Li S, Zhou G, Zhang M (2019) Emotion detection with neural personal discrimination. In: Proceedings of EMNLP, pp 5499–5507
Acknowledgements
We would like to thank all the reviewers for their constructive and helpful suggestions on this paper. This work was supported in part by the National Natural Science Foundation of China (Nos. 62266017, 62166018 and 61861032), and the Natural Science Foundation of Jiangxi Province of China (No. 20232BAB202050).
Author information
Authors and Affiliations
Contributions
Changxing Wu: Conceptualization, Methodology, Software, Writing - Original Draft. Liuwen Cao: Software, Validation, Data Curation. Jiayu Chen: Software, Validation, Data Curation. Yuanyun Wang: Conceptualization, Methodology. Jinsong Su: Conceptualization, Methodology, Writing - Original Draft.
Corresponding author
Ethics declarations
Ethical and Informed Consent for Data Used
The research did not involve human participants or animals.
Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, C., Cao, L., Chen, J. et al. Modeling different effects of user and product attributes on review sentiment classification. Appl Intell 54, 835–850 (2024). https://doi.org/10.1007/s10489-023-05236-6