Abstract
Aspect-Based Sentiment Analysis aims to identify the opinions expressed about specific targets in a document, together with their sentiment polarity. The proposed approach is designed for real-world scenarios, where the available information and annotated data are often too limited to train supervised models. We focus on the two core tasks of Aspect-Based Sentiment Analysis: aspect classification and sentiment polarity classification. The first task, which consists of identifying the opinion targets in a document, is tackled with a weakly-supervised technique based on Non-negative Matrix Factorization. This strategy lets users embed a priori domain knowledge through short lists of seed terms. Experimental results on publicly available data sets of online reviews suggest that the proposed approach is flexible and can be easily adapted to different languages and domains.
Notes
1. In this work we employed the stopwords and stemmers provided in Python nltk 3.2.5, https://www.nltk.org.
2. In this work we consider corpora in English and Spanish (see Sect. 4); the lists of negation terms for English (16 terms) and Spanish (12 terms) are included in our code repository, https://gitlab.dei.unipd.it/dl_dei/ws4absa.
3. The code for deploying and evaluating WS4ABSA is available at https://gitlab.dei.unipd.it/dl_dei/ws4absa.
4. Word2Vec implementation from https://radimrehurek.com/gensim/models/word2vec.html.
5. The difference was computed as: \(\frac{\text{difference}}{\text{other value}} \times 100\).
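The percentage difference in the last note can be sketched as follows. This is only an illustration: it assumes that "difference" means the signed gap between a value and the other value it is compared with, and the function name and the numbers are hypothetical, not results from the paper.

```python
def relative_difference_pct(value, other):
    """Return (value - other) / other * 100.

    Assumption: 'difference' is the signed gap between `value` and
    `other`, expressed as a percentage of `other`.
    """
    if other == 0:
        raise ValueError("the reference value must be non-zero")
    return (value - other) / other * 100


# Hypothetical scores, chosen only to show the arithmetic.
gap = relative_difference_pct(75, 60)
print(gap)
```

A positive result means the first value exceeds the reference value, and a negative result means it falls short of it.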
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Purpura, A., Masiero, C., Susto, G.A. (2018). WS4ABSA: An NMF-Based Weakly-Supervised Approach for Aspect-Based Sentiment Analysis with Application to Online Reviews. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science, vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_25
DOI: https://doi.org/10.1007/978-3-030-01771-2_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01770-5
Online ISBN: 978-3-030-01771-2
eBook Packages: Computer Science, Computer Science (R0)