DOI: 10.1145/3390557.3394133
research-article

Unsupervised Bilingual Sentiment Word Embeddings for Cross-lingual Sentiment Classification

Published: 04 June 2020

Abstract

In recent years, bilingual word embeddings have been used to improve sentiment classification in low-resource languages. However, existing bilingual word embedding methods either require annotated cross-lingual data or fail to capture enough sentiment information. In this paper, we propose Unsupervised Bilingual Sentiment word Embeddings (UBSE), which require only source-language annotated corpora and a monolingual sentiment lexicon. The method is constructed in an unsupervised way: we pre-train a projection matrix between the source and target languages using Generative Adversarial Nets (GAN), without using any parallel corpora. We then incorporate a monolingual sentiment lexicon from the source language to fine-tune the model, making it more sensitive to sentiment. Experiments on Spanish, Catalan and Basque show that, on sentence-level cross-lingual sentiment classification, the proposed approach significantly outperforms competitive baseline models that use cross-lingual dictionaries and is even comparable with translation-based methods.
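
As a rough illustration of what a projection matrix between two embedding spaces does once it has been learned, the sketch below maps toy source-language vectors into a toy target space and retrieves the nearest target vector by cosine similarity. It does not reproduce the adversarial pre-training or the sentiment fine-tuning described above. The matrix `W` is a random orthogonal stand-in, the embeddings are synthetic, and the function names `normalize_rows`, `project` and `nearest_target` are hypothetical.

```python
import numpy as np

def normalize_rows(x):
    """Scale each row to unit L2 norm so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def project(src_emb, W):
    """Map source-language vectors into the target space with a linear projection."""
    return src_emb @ W

def nearest_target(src_emb, tgt_emb, W):
    """For each projected source vector, return the index of the most similar target vector."""
    sims = normalize_rows(project(src_emb, W)) @ normalize_rows(tgt_emb).T
    return sims.argmax(axis=1)

# Toy data (assumption): 5 "words" with 4-dimensional embeddings.
rng = np.random.default_rng(0)
src = rng.normal(size=(5, 4))

# Stand-in for a learned projection (assumption): a random orthogonal matrix
# taken from a QR decomposition, not the output of any GAN training.
W, _ = np.linalg.qr(rng.normal(size=(4, 4)))

# Toy target space (assumption): the rotated source vectors plus small noise.
tgt = src @ W + 0.01 * rng.normal(size=(5, 4))

# Index of the closest target "word" for each source "word".
matches = nearest_target(src, tgt, W)
print(matches)
```

In a real setting the projected source vectors and the target vectors would share one space, so a sentiment classifier trained on source-language data could in principle be applied to target-language text.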

    Published In

    ICIAI '20: Proceedings of the 2020 the 4th International Conference on Innovation in Artificial Intelligence
    May 2020
    271 pages
    ISBN:9781450376587
    DOI:10.1145/3390557

    In-Cooperation

    • The Hong Kong Polytechnic University
    • Xi'an Jiaotong-Liverpool University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Word embeddings
    2. bilingual
    3. sentiment classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • National Key R&D Program of China Subject II
    • National Key R&D Program of China
    • MoE-CMCC Artificial Intelligence Project

    Conference

    ICIAI 2020
