Abstract
One of the objectives of sentiment analysis is to classify the polarity of the opinions conveyed in a text on the basis of textual evidence. Most work in the field has focused on English, and only a few experiments have explored other languages. In this paper, we present a supervised classification of posts in French online forums, where sentiment analysis is based on shallow linguistic features such as POS tagging, chunking and common negation forms. Furthermore, we incorporate word semantic orientation extracted from the English lexical resource SentiWordNet as an additional feature. Since SentiWordNet is an English resource, lexical entries in the studied French corpus must first be translated into English. For this purpose, we carry out a number of French-to-English translation experiments, including machine translation and WordNet synset translation using EuroWordNet. The results show that WordNet synset translation does not significantly improve classification performance over the bag-of-words baseline, owing to insufficient coverage. Machine translation does not significantly improve the results either, owing to insufficient translation quality. Proposals for improving classification performance are given at the end of the article.
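To make the idea of a translated polarity feature concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a tiny hypothetical French-to-English lookup (`FR_TO_EN`), a toy score table (`TOY_SCORES`) whose values are illustrative rather than real SentiWordNet entries, and a deliberately crude rule that swaps positive and negative scores whenever a common negation form appears anywhere in the post. The `coverage` value illustrates why a lexicon that misses many words contributes little to the classifier.

```python
from typing import Dict, List, Tuple

# Hypothetical French-to-English lookup standing in for the translation step
# (machine translation or synset mapping in the paper). Illustrative only.
FR_TO_EN: Dict[str, str] = {
    "bon": "good",
    "mauvais": "bad",
    "excellent": "excellent",
    "film": "film",
}

# Toy (positive, negative) scores keyed by English word.
# These are made-up values, not real SentiWordNet entries.
TOY_SCORES: Dict[str, Tuple[float, float]] = {
    "good": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "excellent": (1.0, 0.0),
}

# A few common French negation forms (assumed list, not exhaustive).
NEGATORS = {"ne", "n'", "pas", "jamais", "aucun"}


def polarity_features(tokens: List[str]) -> Dict[str, float]:
    """Return summed positive/negative scores and lexicon coverage for one post."""
    # Crude simplification: any negation form flips polarity for the whole post.
    negated = any(tok in NEGATORS for tok in tokens)
    pos_sum, neg_sum, covered = 0.0, 0.0, 0
    for tok in tokens:
        english = FR_TO_EN.get(tok)
        if english is None or english not in TOY_SCORES:
            continue  # untranslated or unscored word: contributes nothing
        pos, neg = TOY_SCORES[english]
        if negated:
            pos, neg = neg, pos
        pos_sum += pos
        neg_sum += neg
        covered += 1
    # Coverage is the share of non-negation tokens that received a score.
    content = [tok for tok in tokens if tok not in NEGATORS]
    coverage = covered / len(content) if content else 0.0
    return {"pos": pos_sum, "neg": neg_sum, "coverage": coverage}


if __name__ == "__main__":
    print(polarity_features(["ce", "film", "est", "bon"]))
    print(polarity_features(["ce", "film", "n'", "est", "pas", "bon"]))
```

In a supervised setting, the returned values would simply be appended to the bag-of-words vector of each post before training the classifier.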
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ghorbel, H. (2012). Experiments in Cross-Lingual Sentiment Analysis in Discussion Forums. In: Aberer, K., Flache, A., Jager, W., Liu, L., Tang, J., Guéret, C. (eds) Social Informatics. SocInfo 2012. Lecture Notes in Computer Science, vol 7710. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35386-4_11
DOI: https://doi.org/10.1007/978-3-642-35386-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35385-7
Online ISBN: 978-3-642-35386-4
eBook Packages: Computer Science, Computer Science (R0)