Abstract
In the Chinese classics, the ancients generally expressed their sentiments and thoughts about specific environments, people, and events in the form of poetry. Compared with previous attempts to classify the polarity of poetry, sentiment terms can reveal more fine-grained humanities knowledge in literary information resources. However, existing techniques for constructing domain sentiment lexicons do not take full advantage of deep learning and linguistic knowledge, and therefore cannot ensure the integrity and accuracy of the extracted terms. To this end, this work proposes a novel approach to sentiment lexicon construction that combines supervised sentiment term extraction with term classification and incorporates multi-dimensional linguistic knowledge into a two-phase deep learning model. First, a character-sequence labeling model for term extraction is constructed by fusing the emotion radical features of Chinese characters. Second, term embeddings are augmented with word knowledge to classify the extracted terms. Experiments on Chinese poetry and its appreciation texts show that the model incorporating linguistic knowledge outperforms the benchmark models on several metrics. Through hierarchical term classification, a fine-grained sentiment lexicon is constructed with two first-level classes, five second-level classes, 15 third-level classes, and 14,368 domain terms and unregistered terms, which contributes to the interpretability of the humanities computing of classical Chinese poetry.
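The two-phase flow described above (character-level term extraction, then hierarchical classification into a lexicon) can be illustrated with a minimal sketch. It does not reproduce the paper's neural models: the B/I/O tag scheme, the hand-written tags, the class labels, and the helper names `decode_bio` and `add_to_lexicon` are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical three-level lexicon: first class -> second class -> third class -> terms.
Lexicon = Dict[str, Dict[str, Dict[str, Set[str]]]]


def decode_bio(chars: List[str], tags: List[str]) -> List[str]:
    """Join characters tagged B (begin) / I (inside) into candidate terms.

    The B/I/O scheme is an assumption; the tags would normally come from a
    trained character-sequence labeling model.
    """
    terms: List[str] = []
    current: List[str] = []
    for ch, tag in zip(chars, tags):
        if tag == "B":
            if current:                      # close the previous term
                terms.append("".join(current))
            current = [ch]
        elif tag == "I" and current:
            current.append(ch)               # extend the open term
        else:                                # "O" or a stray "I"
            if current:
                terms.append("".join(current))
            current = []
    if current:
        terms.append("".join(current))
    return terms


def add_to_lexicon(lexicon: Lexicon, path: Tuple[str, str, str], term: str) -> None:
    """File a term under a (first, second, third) class path."""
    first, second, third = path
    lexicon.setdefault(first, {}).setdefault(second, {}).setdefault(third, set()).add(term)


# Phase 1 (stand-in): tags written by hand instead of predicted by a model.
chars = list("惆怅东栏一株雪")
tags = ["B", "I", "O", "O", "O", "O", "O"]
terms = decode_bio(chars, tags)

# Phase 2 (stand-in): a fixed, hypothetical class path instead of a classifier.
lexicon: Lexicon = {}
for term in terms:
    add_to_lexicon(lexicon, ("negative", "sadness", "melancholy"), term)
```

In a full system, the tags would be predicted by the extraction model and the class path by the term classifier. The nested dictionary only mirrors the three-level hierarchy mentioned in the abstract.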
Data availability
The authors confirm that the datasets generated and analyzed during the current study are available in public repositories (see footnotes in the article), and all of them support the published claims and comply with field standards.
Acknowledgements
This paper was supported by the National Natural Science Foundation of China [72074108], the Postgraduate Research & Practice Innovation Program of Jiangsu Province [KYCX21_0026], the National Research Foundation of Korea [2022R1A2B5B02002359], the Fundamental Research Funds for the Central Universities [010814370113], and the program of the China Scholarship Council (awarded to Wei Zhang for one year of study abroad at Yonsei University). We thank the annotator, Tao Fan, a doctoral student in Information Science at Nanjing University, for his annotation work and valuable suggestions. We also express our special thanks to the editor and reviewers for their very constructive comments and suggestions.
Author information
Contributions
Conceptualization, software, and writing (original draft): WZ and HW. Formal analysis, validation, and supervision: MS and SD. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Zhang, W., Wang, H., Song, M. et al. A method of constructing a fine-grained sentiment lexicon for the humanities computing of classical chinese poetry. Neural Comput & Applic 35, 2325–2346 (2023). https://doi.org/10.1007/s00521-022-07690-8