Abstract
This paper addresses fine-grained sentiment classification, which distinguishes degrees of emotional intensity, through an in-depth analysis of the commonly used SST-1 dataset. The analysis shows that the dataset suffers from serious class imbalance and a small overall scale, both of which severely limit classification performance. To address these problems, the dataset is improved through data augmentation: IMDB and other data that are relatively homologous to the original dataset are annotated, with a focus on expanding the under-represented categories. This effectively alleviates the class imbalance and enlarges the original dataset. A benchmark classification model is then built on Bidirectional Encoder Representations from Transformers (BERT), a model with strong overall performance on natural language processing tasks. Comparison experiments on the original and the augmented data verify that the deficiencies of the original dataset degrade classification performance. They also show that the augmented data effectively improves test results and largely resolves the large performance differences between categories.
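The rebalancing idea described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `augmentation_targets` and `augment`, the integer class labels, the toy data, and the choice of the largest class as the default target size are all assumptions made for illustration.

```python
from collections import Counter


def augmentation_targets(labels, target=None):
    """Return how many extra examples each class needs to reach `target`.

    Assumption: by default the target is the size of the largest class.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    return {cls: max(0, target - n) for cls, n in counts.items()}


def augment(original, pool, targets):
    """Append newly annotated pool examples to the original data,
    taking at most as many per class as that class still needs."""
    remaining = dict(targets)
    augmented = list(original)
    for text, label in pool:
        if remaining.get(label, 0) > 0:
            augmented.append((text, label))
            remaining[label] -= 1
    return augmented


# Hypothetical toy data: (text, label) pairs with integer sentiment labels.
original = [("a", 0), ("b", 1), ("c", 1), ("d", 2),
            ("e", 3), ("f", 3), ("g", 3), ("h", 4)]
# Hypothetical pool of externally sourced, newly annotated examples.
pool = [("p1", 0), ("p2", 0), ("p3", 0), ("p4", 1),
        ("p5", 2), ("p6", 3), ("p7", 4)]

targets = augmentation_targets([label for _, label in original])
augmented = augment(original, pool, targets)
balance = Counter(label for _, label in augmented)
```

In this sketch the pool is consumed in order, so classes that are already at the target size receive no new examples, and a class stays below the target when the pool runs short of examples for it.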
Supported by National Key Laboratory of Parallel and Distributed Processing, College of Computer Science and Technology, National University of Defence Technology.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z., Wang, X., Chang, T., Lv, S., Guo, X. (2019). Research on Fine-Grained Sentiment Classification. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_39
DOI: https://doi.org/10.1007/978-3-030-32236-6_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6
eBook Packages: Computer Science, Computer Science (R0)