Abstract
Document-level sentiment analysis has been among the most popular research fields in natural language processing in recent years. One of its major challenges is that existing approaches can hardly capture discourse structural information. In this paper, we propose a domain-independent framework for document-level sentiment classification with weighting rules based on Rhetorical Structure Theory (RST). First, the original text documents are parsed into rhetorical structure trees through a preprocessing pipeline. Next, the sentiment score of each elementary discourse unit is computed with a sentence-level sentiment classification method. Finally, according to the rhetorical relations between neighboring discourse units, we define a weighting schema and composition rules, by which the scores of elementary discourse units are summed recursively up to the whole document. Experimental results show that our approach outperforms state-of-the-art RST-based document-level sentiment analysis systems on datasets from different domains, and the best result is 15% higher than the baseline.
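The recursive composition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `RELATION_WEIGHTS` table, and its numeric values are hypothetical, and the sketch assumes a binary tree with one nucleus and one satellite per relation (multinuclear relations are ignored) and leaf scores that a sentence-level classifier has already produced.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical (nucleus_weight, satellite_weight) pairs per rhetorical relation.
# These values are illustrative assumptions, not the weights used in the paper.
RELATION_WEIGHTS: Dict[str, Tuple[float, float]] = {
    "contrast": (1.0, 0.5),
    "elaboration": (1.0, 0.75),
}
DEFAULT_WEIGHTS: Tuple[float, float] = (1.0, 1.0)


@dataclass
class Node:
    """A node of a simplified binary RST tree (hypothetical structure)."""
    score: Optional[float] = None      # sentiment score, set only on leaves (EDUs)
    relation: Optional[str] = None     # rhetorical relation, set only on inner nodes
    nucleus: Optional["Node"] = None
    satellite: Optional["Node"] = None


def document_score(node: Node) -> float:
    """Recursively combine EDU scores into a single score for the subtree."""
    if node.nucleus is None or node.satellite is None:
        # Leaf: an elementary discourse unit with a precomputed score.
        if node.score is None:
            raise ValueError("leaf node is missing a sentiment score")
        return node.score
    w_nucleus, w_satellite = RELATION_WEIGHTS.get(node.relation, DEFAULT_WEIGHTS)
    return (w_nucleus * document_score(node.nucleus)
            + w_satellite * document_score(node.satellite))


# Example: a positive nucleus contrasted with a mildly negative elaborated span.
tree = Node(
    relation="contrast",
    nucleus=Node(score=0.8),
    satellite=Node(
        relation="elaboration",
        nucleus=Node(score=-0.4),
        satellite=Node(score=0.2),
    ),
)
label = "positive" if document_score(tree) > 0 else "negative"
```

In this sketch the sign of the root score gives the document polarity; a real system would obtain the tree from an RST parser and the leaf scores from a trained classifier or lexicon.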
Notes
- 1. Available at http://stanfordnlp.github.io/CoreNLP/.
- 2.
- 3. Available at http://www.cs.jhu.edu/~mdredze/datasets/sentiment/.
- 4. Available at https://github.com/jiyfeng/DPLP.
- 5. Available at http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/.
- 6. Available at http://sentiwordnet.isti.cnr.it/.
Acknowledgement
This work is supported by the National Natural Science Foundation of China (NSFC) (61373165, 61373035 and 61672377).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhao, Z., Rao, G., Feng, Z. (2017). DFDS: A Domain-Independent Framework for Document-Level Sentiment Analysis Based on RST. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_23
DOI: https://doi.org/10.1007/978-3-319-63579-8_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63578-1
Online ISBN: 978-3-319-63579-8
eBook Packages: Computer Science, Computer Science (R0)