Abstract
As an important aspect of review mining, sentence-level sentiment classification has received much attention from both academia and industry. Many recently developed methods, especially those based on deep learning models, have centred on the task. In most existing methods, training a sentence-level sentiment classifier requires sentence-level sentiment labels, which are usually expensive to obtain. In this research, we propose a novel approach, named ‘Averaged logits’, that uses widely available review ratings, instead of sentence-level sentiment labels, to train the classifiers. In the approach, the rating of a review is assumed to be the ‘average’ of the sentiments of its individual sentences. We experiment with this idea under the framework of the recurrent neural network model. The results show that the performance of the proposed approach is close to that of the traditional SVM and Naive Bayes classifiers trained on labelled sentences when their training sizes are approximately equal, and close to that of the neural network based classifiers trained on labelled sentences when the proposed approach uses approximately 5 times more training samples.
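The core idea can be illustrated with a minimal sketch. The abstract states only that a review's rating is assumed to be the ‘average’ of its sentences' sentiments; everything else here is an assumption made for illustration: that a sentence classifier emits one logit vector per sentence, that the vectors are averaged element-wise, and that the review-level loss is softmax cross-entropy against the rating. The function names (`average_logits`, `review_loss`, `predict_sentence`) are hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence


def average_logits(sentence_logits: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the per-sentence logit vectors of one review."""
    if not sentence_logits:
        raise ValueError("a review needs at least one sentence")
    n = len(sentence_logits)
    num_classes = len(sentence_logits[0])
    return [sum(s[k] for s in sentence_logits) / n for k in range(num_classes)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def review_loss(sentence_logits: Sequence[Sequence[float]], rating_index: int) -> float:
    """Cross-entropy between the averaged-logit distribution and the review rating.

    Assumption: the rating is treated as a class index (e.g. 0..4 for a
    5-point scale); the paper may define the training target differently.
    """
    probs = softmax(average_logits(sentence_logits))
    return -math.log(probs[rating_index])


def predict_sentence(logits: Sequence[float]) -> int:
    """At test time, classify a single sentence from its own logits."""
    return max(range(len(logits)), key=lambda k: logits[k])


# Toy review with two sentences and two classes (0 = negative, 1 = positive).
toy_review = [[2.0, 0.0], [0.0, 2.0]]
toy_loss = review_loss(toy_review, rating_index=1)
```

In this sketch only the review-level rating enters the loss, so no sentence-level labels are needed during training, while `predict_sentence` still yields a label for each sentence at test time.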
Notes
- 1.
- 2.
- 3. All the reviews were written by ‘audience’ with ratings on a 5-point scale.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tang, X., Ou, W., Huynh, VN. (2019). Averaged Logits: An Weakly-Supervised Approach to Use Ratings to Train Sentence-Level Sentiment Classifiers. In: Seki, H., Nguyen, C., Huynh, VN., Inuiguchi, M. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2019. Lecture Notes in Computer Science, vol 11471. Springer, Cham. https://doi.org/10.1007/978-3-030-14815-7_26
DOI: https://doi.org/10.1007/978-3-030-14815-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14814-0
Online ISBN: 978-3-030-14815-7
eBook Packages: Computer Science, Computer Science (R0)