Sentiment Analysis with a Multilingual Pipeline

  • Conference paper
Web Information System Engineering – WISE 2011 (WISE 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6997)

Abstract

Sentiment analysis refers to retrieving an author’s sentiment from a text. We analyze the differences that occur in sentiment scoring across languages. We present experiments for Dutch and English based on forum, blog, news, and social media texts available on the Web, focusing on differences in language use and on the effect of a language's grammar on sentiment analysis. We propose a multilingual pipeline for evaluating how an author’s sentiment is conveyed in different languages. We correctly classify positive and negative texts with an accuracy of approximately 71% for English and 79% for Dutch. The evaluation of the results shows, however, that the use of common expressions, emoticons, slang, irony, sarcasm, cynicism, and acronyms, as well as different ways of negation in English, prevents the underlying sentiment scores from being directly comparable.
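
To make the idea of language-specific scoring concrete, the following is a minimal sketch of a lexicon-based classifier with one lexicon and one set of negation cues per language. It is not the pipeline from the paper, whose components are not described in this abstract. The lexicons, the negation lists, the zero threshold, and the names `LEXICONS`, `NEGATIONS`, `score`, `classify`, and `accuracy` are all assumptions made purely for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicons (word -> polarity); not taken from the paper.
LEXICONS: Dict[str, Dict[str, float]] = {
    "en": {"good": 1.0, "great": 1.0, "bad": -1.0, "boring": -1.0},
    "nl": {"goed": 1.0, "mooi": 1.0, "slecht": -1.0, "saai": -1.0},
}

# Hypothetical negation cues per language.
NEGATIONS: Dict[str, Set[str]] = {
    "en": {"not", "never"},
    "nl": {"niet", "nooit", "geen"},
}


def score(text: str, lang: str) -> float:
    """Sum word polarities; a negation cue flips the next sentiment word."""
    lexicon, negations = LEXICONS[lang], NEGATIONS[lang]
    total, negate = 0.0, False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in negations:
            negate = True
            continue
        polarity = lexicon.get(word, 0.0)
        if polarity != 0.0:
            total += -polarity if negate else polarity
            negate = False  # the negation has been consumed
    return total


def classify(text: str, lang: str) -> str:
    """Label a text as positive or negative (assumed threshold: zero)."""
    return "positive" if score(text, lang) >= 0 else "negative"


def accuracy(samples: List[Tuple[str, str, str]]) -> float:
    """Fraction of (text, lang, gold_label) samples classified correctly."""
    correct = sum(classify(text, lang) == gold for text, lang, gold in samples)
    return correct / len(samples)
```

Even this toy version hints at why raw scores are hard to compare across languages: the result depends on the coverage of each lexicon and on how each language marks negation, and it does not handle irony, emoticons, or slang at all.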

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bal, D., Bal, M., van Bunningen, A., Hogenboom, A., Hogenboom, F., Frasincar, F. (2011). Sentiment Analysis with a Multilingual Pipeline. In: Bouguettaya, A., Hauswirth, M., Liu, L. (eds) Web Information System Engineering – WISE 2011. WISE 2011. Lecture Notes in Computer Science, vol 6997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24434-6_10

  • DOI: https://doi.org/10.1007/978-3-642-24434-6_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24433-9

  • Online ISBN: 978-3-642-24434-6

  • eBook Packages: Computer Science, Computer Science (R0)
