Comparative Analyses of Multilingual Sentiment Analysis Systems for News and Social Media

Přibáň, Pavel; Balahur, Alexandra

doi:10.1007/978-3-031-24340-0_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

In this paper, we present an evaluation of three in-house sentiment analysis (SA) systems, originally designed for three distinct SA tasks, in a highly multilingual setting. For the evaluation, we collected a large number of available gold-standard datasets in different languages and of varied text types. Using datasets from different domains was intended to give a clear snapshot of the systems' overall performance and thus a more reliable evaluation. We compare our results with those of the best-performing systems evaluated on the same datasets and perform an in-depth error analysis. The results show that some systems perform better on datasets and tasks other than the ones they were designed for, so replacing one system with another could improve performance. Our results are hard to compare with the results originally reported for the datasets, because the datasets often contain a different number of polarity classes than we used, and for some datasets no baseline results exist at all. Where a comparison was possible, our results show that our systems perform very well considering the multilingual setting.

Notes

1. Dataset can be obtained from https://github.com/pmbaumgartner/text-feat-lib.
2. http://www.statmt.org/wmt10/translation-task.html.
3. For this task we also used the tweets described in Subsect. 3.1.
4. On the EMMTonality system we perform experiments with all available languages, but we report results only for English, German, French, Italian and Spanish.
5. Except baseline systems.
6. Each incorrectly classified example may be contained in more than one error group. Some examples were also (in our view) annotated incorrectly. For some cases, we were not able to discover the reason for misclassification.

Acknowledgments

This work was partially supported by ERDF “Research and Development of Intelligent Components of Advanced Technologies for the Pilsen Metropolitan Area (InteCom)” (no.: CZ.02.1.01/0.0/0.0/17_048/0007267) and by Grant No. SGS-2019-018 Processing of heterogeneous data and its specialized applications. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated.

Author information

Authors and Affiliations

European Commission Joint Research Centre, Via E. Fermi 2749, 21027, Ispra, VA, Italy
Pavel Přibáň & Alexandra Balahur
Faculty of Applied Sciences, Department of Computer Science and Engineering, University of West Bohemia, Univerzitni 8, 301 00, Pilsen, Czech Republic
Pavel Přibáň

Corresponding author

Correspondence to Pavel Přibáň.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Přibáň, P., Balahur, A. (2023). Comparative Analyses of Multilingual Sentiment Analysis Systems for News and Social Media. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_20

DOI: https://doi.org/10.1007/978-3-031-24340-0_20
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)
