Abstract
Polarity estimation in large-scale, multi-topic domains is a difficult problem. Most state-of-the-art solutions rely essentially on the frequencies of sentiment-carrying words (e.g., taken from a lexicon) when analyzing the sentiment conveyed by natural language text. These approaches ignore the structural aspects of a document, which contain valuable information. Rhetorical Structure Theory (RST) provides important information about the relative importance of the different text spans in a document. This knowledge could be useful for sentiment analysis and polarity classification. However, RST has only been studied for polarity classification in constrained, small-scale scenarios. The main objective of this paper is to explore the usefulness of RST in large-scale polarity ranking of blog posts. We apply sentence-level methods to select the key sentences that convey the overall on-topic sentiment of a blog post. Then, we apply RST analysis to these core sentences to guide the classification of their polarity and thus to generate an overall estimate of the document's polarity with respect to a specific topic. Our results show that RST provides valuable information about the discourse structure of the texts, which can be used to rank documents more accurately by their estimated sentiment in multi-topic blogs.
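As a rough illustration of the idea described in the abstract, the following minimal sketch weights lexicon-based polarity by the RST role of each text span and averages the result over a document's key sentences. It is not the authors' implementation: the toy lexicon, the `ROLE_WEIGHTS` values, the `Span` type, and all function names are hypothetical, and the key sentences are assumed to have already been selected and segmented into nucleus and satellite spans by an external discourse parser.

```python
from dataclasses import dataclass
from typing import Dict, List

# Toy sentiment lexicon (hypothetical values, for illustration only).
LEXICON: Dict[str, float] = {"great": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}

# Hypothetical weights: nuclei count fully, satellites count half.
ROLE_WEIGHTS: Dict[str, float] = {"nucleus": 1.0, "satellite": 0.5}


@dataclass
class Span:
    """A text span with its RST role, as produced by some discourse parser."""
    role: str
    tokens: List[str]


def span_polarity(span: Span) -> float:
    """Sum the lexicon scores of the tokens in a span (unknown words score 0)."""
    return sum(LEXICON.get(tok.lower(), 0.0) for tok in span.tokens)


def sentence_polarity(spans: List[Span]) -> float:
    """Combine span polarities, weighting each span by its RST role."""
    return sum(ROLE_WEIGHTS.get(s.role, 1.0) * span_polarity(s) for s in spans)


def document_polarity(key_sentences: List[List[Span]]) -> float:
    """Average the RST-weighted polarity over the pre-selected key sentences."""
    if not key_sentences:
        return 0.0
    return sum(sentence_polarity(s) for s in key_sentences) / len(key_sentences)


def rank_by_positivity(docs: Dict[str, List[List[Span]]]) -> List[str]:
    """Return document ids ordered from most to least positive."""
    return sorted(docs, key=lambda d: document_polarity(docs[d]), reverse=True)


docs = {
    "a": [[Span("nucleus", ["the", "phone", "is", "great"]),
           Span("satellite", ["although", "the", "box", "was", "awful"])]],
    "b": [[Span("nucleus", ["the", "phone", "is", "awful"]),
           Span("satellite", ["although", "the", "box", "was", "great"])]],
}
ranking = rank_by_positivity(docs)
```

Both toy documents contain the same sentiment words, so a pure frequency count would not separate them; under the assumed role weights, the document whose positive word sits in the nucleus ranks first.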
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chenlo, J.M., Hogenboom, A., Losada, D.E. (2013). Sentiment-Based Ranking of Blog Posts Using Rhetorical Structure Theory. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2013. Lecture Notes in Computer Science, vol 7934. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38824-8_2
DOI: https://doi.org/10.1007/978-3-642-38824-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38823-1
Online ISBN: 978-3-642-38824-8
eBook Packages: Computer Science (R0)