Sentiment Dictionary Refinement Using Word Embeddings

  • Conference paper
  • In: Foundations of Intelligent Systems (ISMIS 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9384)

Abstract

Previous work on Polish sentiment dictionaries revealed the superiority of machine learning on vectors created from word contexts (concordances or word co-occurrence distributions), especially over the SO-PMI method (semantic orientation from pointwise mutual information). This paper demonstrates that this state-of-the-art method can be improved by extending the vectors with word embeddings obtained from skip-gram language models. Specifically, it proposes a new method of computing word sentiment polarity using feature sets that combine word embeddings with word co-occurrence distributions. The new technique is evaluated in a number of experimental settings.
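
As a rough sketch of the feature construction described above, the Python fragment below concatenates a word's skip-gram embedding with its co-occurrence distribution and trains a linear classifier on a small seed lexicon. All inputs here (embeddings, cooccurrence, seed_lexicon) are hypothetical placeholders, and scikit-learn's LinearSVC merely stands in for a LIBLINEAR-style linear model; none of this reproduces the paper's actual corpora, dimensions or settings.

    import numpy as np
    from sklearn.svm import LinearSVC

    def build_features(word, embeddings, cooccurrence):
        # Concatenate the word's skip-gram embedding with its
        # co-occurrence counts, normalized into a distribution.
        emb = embeddings[word]
        cooc = cooccurrence[word].astype(float)
        cooc /= max(cooc.sum(), 1.0)
        return np.concatenate([emb, cooc])

    def train_polarity_classifier(seed_lexicon, embeddings, cooccurrence):
        # seed_lexicon: {word: +1 or -1}, a small hand-labeled dictionary.
        words = [w for w in seed_lexicon if w in embeddings and w in cooccurrence]
        X = np.vstack([build_features(w, embeddings, cooccurrence) for w in words])
        y = np.array([seed_lexicon[w] for w in words])
        return LinearSVC().fit(X, y)

Unlabeled candidate words could then be scored with the fitted classifier's predict method to extend or refine the dictionary.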

This work was funded by the National Science Centre of Poland, grant no. UMO-2012/05/N/ST6/03587.


Notes

  1. Micro means to calculate metrics globally by counting the total true positives, false negatives and false positives. Thus, it takes label imbalance into account. The formula is \(F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\).

  2. Macro means to calculate metrics for each label and find their unweighted mean. This does not take label imbalance into account (both averaging schemes are illustrated in the sketch below).
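
    As a quick, made-up illustration of both averaging schemes (using scikit-learn, which is not part of the paper's setup):

        from sklearn.metrics import f1_score

        # Toy predictions over three imbalanced labels.
        y_true = ["pos", "pos", "pos", "pos", "neg", "neu"]
        y_pred = ["pos", "pos", "pos", "neg", "neg", "pos"]

        # Micro pools TP, FP and FN across labels first; for single-label
        # multi-class data this equals accuracy (4/6 here).
        print(f1_score(y_true, y_pred, average="micro"))  # 0.666...

        # Macro computes F1 per label, then the unweighted mean:
        # pos = 0.75, neg = 2/3, neu = 0.0, giving roughly 0.472.
        print(f1_score(y_true, y_pred, average="macro"))  # ~0.472

    The gap between the two scores (0.667 vs. 0.472) is exactly the label-imbalance effect the notes describe.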


Author information

Correspondence to Aleksander Wawer.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wawer, A. (2015). Sentiment Dictionary Refinement Using Word Embeddings. In: Esposito, F., Pivert, O., Hacid, M.-S., Raś, Z., Ferilli, S. (eds) Foundations of Intelligent Systems. ISMIS 2015. Lecture Notes in Computer Science (LNAI), vol 9384. Springer, Cham. https://doi.org/10.1007/978-3-319-25252-0_20

  • DOI: https://doi.org/10.1007/978-3-319-25252-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25251-3

  • Online ISBN: 978-3-319-25252-0
