Abstract
This paper describes modern approaches to sentiment analysis of news articles in Kazakh and Russian using deep recurrent neural networks. In particular, we use Long Short-Term Memory (LSTM) networks to capture long-term dependencies across the whole text. The research shows that good results can be achieved even without knowledge of the linguistic features of a particular language. We use word embeddings (word2vec, GloVe) as the main features in our machine learning algorithms. The main idea of word embedding is to represent words as vectors in such a way that semantic relationships between words are preserved as basic linear algebra operations.
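The claim that semantic relationships survive as linear algebra operations can be illustrated with a minimal sketch. The four-dimensional vectors below are hand-made for illustration and are an assumption, not embeddings trained by word2vec or GloVe and not data from the paper; the helper names `cosine` and `analogy` are likewise hypothetical. The sketch computes vec(a) - vec(b) + vec(c) and returns the nearest remaining word by cosine similarity.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings, hand-made for illustration only.
# Assumed axes: royalty, male, female, fruit. Real word2vec/GloVe vectors
# are learned from a corpus and typically have 100-300 dimensions.
EMBEDDINGS = {
    "king":  np.array([1.0, 1.0, 0.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0, 0.0]),
    "man":   np.array([0.0, 1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0, 0.0]),
    "apple": np.array([0.0, 0.0, 0.0, 1.0]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word closest to vec(a) - vec(b) + vec(c).

    The three query words are excluded from the candidates, which is the
    usual convention for analogy queries.
    """
    target = embeddings[a] - embeddings[b] + embeddings[c]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


result = analogy("king", "man", "woman")
print(result)  # prints "queen" for these toy vectors
```

With trained embeddings the match is approximate rather than exact, so the nearest neighbour is found by similarity ranking, as in this sketch, rather than by testing for equality.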
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sakenovich, N.S., Zharmagambetov, A.S. (2016). On One Approach of Solving Sentiment Analysis Task for Kazakh and Russian Languages Using Deep Learning. In: Nguyen, N., Iliadis, L., Manolopoulos, Y., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2016. Lecture Notes in Computer Science, vol 9876. Springer, Cham. https://doi.org/10.1007/978-3-319-45246-3_51
DOI: https://doi.org/10.1007/978-3-319-45246-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45245-6
Online ISBN: 978-3-319-45246-3
eBook Packages: Computer Science, Computer Science (R0)