Abstract
Sentiment analysis is a research hotspot in Natural Language Processing, and a high-quality sentiment lexicon plays an important part in it. In this paper, we explore an approach to building a microblog-specific Chinese sentiment lexicon from massive microblog data. For feature learning, to improve the quality of the word embeddings, we build a neural architecture that trains sentiment-aware word embeddings by integrating three kinds of knowledge: the context words and their component characters, the polarity of sentences, and the polarity of labeled words. Experiments on several public datasets show that, in both unsupervised and supervised microblog sentiment classification, the lexicon generated by our approach achieves state-of-the-art performance compared with several existing Chinese sentiment lexicons, and that our feature learning method successfully captures both semantic and sentiment information.
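To make the role of a sentiment lexicon in unsupervised classification concrete, the following is a minimal sketch of one common scheme: sum the polarity scores of the words in a post and label the post by the sign of the total. It is an illustration only and not necessarily the procedure used in the experiments. The lexicon entries, the score range, the function names `lexicon_score` and `classify`, and the assumption that posts are already segmented into words are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> polarity score in [-1, 1].
# These entries are illustrative only and are not taken from the paper.
TOY_LEXICON: Dict[str, float] = {
    "开心": 0.9,   # happy
    "喜欢": 0.8,   # like
    "失望": -0.7,  # disappointed
    "讨厌": -0.9,  # hate
}


def lexicon_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Sum the polarity scores of the tokens; unknown words contribute 0."""
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def classify(tokens: List[str], lexicon: Dict[str, float]) -> str:
    """Label a pre-segmented post by the sign of its summed score."""
    score = lexicon_score(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under this scheme, classification quality depends directly on the coverage and accuracy of the lexicon, which is why a microblog-specific lexicon can matter for informal text. Word segmentation, negation handling, and score normalization are deliberately left out of the sketch.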
Acknowledgment
This work was supported by the National Key R&D Program of China (2016YFC0800803).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kong, L., Li, C., Ge, J., Yang, Y., Zhang, F., Luo, B. (2018). Construction of Microblog-Specific Chinese Sentiment Lexicon Based on Representation Learning. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_16
DOI: https://doi.org/10.1007/978-3-319-97304-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97303-6
Online ISBN: 978-3-319-97304-3
eBook Packages: Computer Science (R0)