Abstract
Word representation learning is a fundamental technique in cognitive computation that plays a crucial role in enabling machines to understand and process human language. By representing words as vectors in a high-dimensional space, computers can perform complex natural language processing tasks such as sentiment analysis. However, most word representation learning models are trained on open-domain corpora, which results in suboptimal performance on domain-specific tasks. To address this problem, we propose a unified learning framework that leverages external hybrid sentiment knowledge to enhance the sentiment information of distributed word representations. Specifically, we automatically acquire domain- and target-dependent sentiment knowledge from multiple sources. To mitigate knowledge noise, we introduce knowledge expectation and knowledge context weights to filter the acquired knowledge items. Finally, we integrate the filtered sentiment knowledge into the distributed word representations via a learning framework to enrich their semantic information. Extensive experiments on different sentiment analysis tasks verify the effectiveness of enhancing sentiment information in word representations. The experimental results show that the proposed models significantly outperform state-of-the-art baselines. Our work demonstrates the advantages of sentiment-enhanced word representations in sentiment analysis tasks and provides insights into the acquisition and fusion of sentiment knowledge from different domains for generating word representations with richer semantics.
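As a rough illustration of the filter-then-fuse idea summarized in the abstract, the following minimal sketch drops low-weight knowledge items and then shifts the remaining words along a sentiment direction. Everything in it is an assumption made for illustration: the `KnowledgeItem` layout, the `min_weight` threshold, the `sentiment_axis` vector, and the additive update are hypothetical stand-ins. They are not the paper's knowledge expectation, knowledge context weights, or learning framework, whose definitions are not given in this section.

```python
from typing import Dict, List, Tuple

# Hypothetical knowledge item: (word, polarity in [-1, 1], weight in [0, 1]).
KnowledgeItem = Tuple[str, float, float]


def filter_knowledge(items: List[KnowledgeItem], min_weight: float) -> List[KnowledgeItem]:
    """Keep only items whose weight reaches the assumed threshold."""
    return [item for item in items if item[2] >= min_weight]


def fuse_sentiment(
    vectors: Dict[str, List[float]],
    items: List[KnowledgeItem],
    sentiment_axis: List[float],
    step: float = 0.5,
) -> Dict[str, List[float]]:
    """Shift each known word along sentiment_axis, scaled by polarity * weight."""
    fused = {word: list(vec) for word, vec in vectors.items()}
    for word, polarity, weight in items:
        if word not in fused:
            continue  # no vector available for this word
        shift = step * polarity * weight
        fused[word] = [v + shift * a for v, a in zip(fused[word], sentiment_axis)]
    return fused


# Toy data: "good" and "bad" start with identical vectors.
vectors = {"good": [0.2, 0.1], "bad": [0.2, 0.1], "table": [0.5, 0.5]}
items = [("good", 1.0, 0.9), ("bad", -1.0, 0.8), ("table", 1.0, 0.1)]

kept = filter_knowledge(items, min_weight=0.5)  # the low-weight "table" item is dropped
fused = fuse_sentiment(vectors, kept, sentiment_axis=[0.0, 1.0])
```

In this toy setup, "good" and "bad" begin with the same vector and end up separated along the second dimension, while "table" is left unchanged because its only knowledge item was filtered out as noise.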
Data Availability
The datasets generated during the current study are available from the corresponding author on reasonable request.
Notes
Target-opinion word pair extraction is an established research area in sentiment analysis, and many methods have been proposed for it. A full treatment of extraction is beyond the scope of this paper; we use only two heuristic rules to extract target-opinion word pairs.
This sentence was annotated with CoreNLP, which is available at https://corenlp.run/.
Funding
This work was supported by the National Natural Science Foundation of China (No. 62062027), the Natural Science Foundation of Guangxi Province (No. 2020GXNSFAA159012), the Innovation Project of GUET Graduate Education (No. 2022YCXS093), and the project of Guangxi Key Laboratory of Trusted Software.
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Competing Interests
The authors declare no competing interests.
About this article
Cite this article
Li, Y., Lin, Z., Lin, Y. et al. Learning Sentiment-Enhanced Word Representations by Fusing External Hybrid Sentiment Knowledge. Cogn Comput 15, 1973–1987 (2023). https://doi.org/10.1007/s12559-023-10164-1
DOI: https://doi.org/10.1007/s12559-023-10164-1