Abstract
Sentiment analysis refers to retrieving an author’s sentiment from a text. We analyze the differences that occur in sentiment scoring across languages. We present our experiments for Dutch and English, based on forum, blog, news, and social media texts available on the Web, focusing on differences in language use and on the effect of a language’s grammar on sentiment analysis. We propose a multilingual pipeline for evaluating how an author’s sentiment is conveyed in different languages. We correctly classify positive and negative texts with an accuracy of approximately 71% for English and 79% for Dutch. The evaluation of the results shows, however, that the use of common expressions, emoticons, slang, irony, sarcasm, cynicism, and acronyms, as well as different ways of negation in English, prevents the underlying sentiment scores from being directly comparable.
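The abstract does not describe the pipeline's components, so the following is only a minimal sketch of how per-language sentiment scoring with negation handling might look. The toy lexicons, the negation cues, the scoring rule, and the function names `score` and `classify` are all assumptions made for illustration, not the authors' method.

```python
# Illustrative sketch only: the lexicons, negation cues, and scoring rule
# below are assumptions, not the pipeline described in the paper.
from typing import Dict, Set

# Hypothetical toy lexicons mapping words to polarity scores.
LEXICONS: Dict[str, Dict[str, float]] = {
    "en": {"good": 1.0, "great": 1.0, "bad": -1.0, "boring": -1.0},
    "nl": {"goed": 1.0, "mooi": 1.0, "slecht": -1.0, "saai": -1.0},
}

# Hypothetical negation cues per language.
NEGATIONS: Dict[str, Set[str]] = {
    "en": {"not", "never", "no"},
    "nl": {"niet", "nooit", "geen"},
}


def score(text: str, lang: str) -> float:
    """Sum word polarities; a negation cue flips only the next token."""
    lexicon = LEXICONS[lang]
    negations = NEGATIONS[lang]
    total = 0.0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in negations:
            negate = True
            continue
        polarity = lexicon.get(word, 0.0)
        total += -polarity if negate else polarity
        negate = False  # the cue applies to the immediately following token only
    return total


def classify(text: str, lang: str) -> str:
    """Label a text positive when its score is non-negative, else negative."""
    return "positive" if score(text, lang) >= 0 else "negative"
```

Because the negation cue in this sketch affects only the token that directly follows it, a phrase such as "not very good" is scored as if it were not negated. This is one simple way in which language-specific negation patterns can make scores hard to compare across languages.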
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bal, D., Bal, M., van Bunningen, A., Hogenboom, A., Hogenboom, F., Frasincar, F. (2011). Sentiment Analysis with a Multilingual Pipeline. In: Bouguettaya, A., Hauswirth, M., Liu, L. (eds) Web Information System Engineering – WISE 2011. WISE 2011. Lecture Notes in Computer Science, vol 6997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24434-6_10
DOI: https://doi.org/10.1007/978-3-642-24434-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24433-9
Online ISBN: 978-3-642-24434-6
eBook Packages: Computer Science, Computer Science (R0)