
Improving bilingual word embeddings mapping with monolingual context information

Published in: Machine Translation

Abstract

Bilingual word embeddings (BWEs) play an important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods for training BWEs rely on bilingual supervision, but bilingual resources are unavailable for many low-resource language pairs. Although some studies address this issue with unsupervised methods, they do not exploit monolingual contextual data to improve the quality of low-resource BWEs. To address these issues, we propose an unsupervised method that improves BWEs using optimized monolingual context information, without any parallel corpora. Specifically, we first build a bilingual word embedding mapping model between two languages by aligning their monolingual word embedding spaces through unsupervised adversarial training. We then use monolingual context information to further optimize these mappings during training. Experimental results show that our method significantly outperforms baseline systems, including on four low-resource language pairs.
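The adversarial alignment step described in the abstract can be illustrated with a toy sketch. The following is a minimal NumPy illustration, not the authors' implementation: a linear generator W maps toy "source" embeddings into the "target" space, a logistic-regression discriminator tries to tell mapped source vectors from real target vectors, and the two are trained in alternation. The iterative orthogonalization of W is a common stabilizing trick in this line of work; the data, dimensions, and learning rates here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200  # toy embedding dimension and vocabulary size

# Toy monolingual embeddings: the target space is a rotated copy of the
# source space, mimicking the near-isometry assumption behind mapping-based BWEs.
X = rng.normal(size=(n, d))                   # source-language embeddings
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden "true" rotation
Y = X @ Q                                     # target-language embeddings

W = np.eye(d)                  # generator: linear mapping, source -> target
w = rng.normal(size=d) * 0.01  # discriminator: logistic-regression weights
b = 0.0

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

lr_d, lr_g, beta, batch = 0.1, 0.05, 0.01, 32
for step in range(500):
    xs = X[rng.integers(0, n, batch)]
    ys = Y[rng.integers(0, n, batch)]

    # Discriminator step: label mapped source vectors 1, real target vectors 0.
    mapped = xs @ W
    p_map, p_tgt = sigmoid(mapped @ w + b), sigmoid(ys @ w + b)
    w -= lr_d * (mapped.T @ (p_map - 1.0) + ys.T @ p_tgt) / batch
    b -= lr_d * (np.sum(p_map - 1.0) + np.sum(p_tgt)) / batch

    # Generator step: update W so that mapped vectors fool the discriminator.
    p_map = sigmoid(xs @ W @ w + b)
    W -= lr_g * (xs.T @ np.outer(p_map, w)) / batch

    # Iterative orthogonalization: keep W close to an orthogonal matrix.
    W = (1.0 + beta) * W - beta * (W @ W.T) @ W
```

In practice the mapping is learned over pretrained embeddings in a few hundred dimensions, and the induced mapping is typically refined afterwards; the paper's monolingual-context optimization is a further refinement of such a mapping, whose details are given in the full text.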


Notes

  1. https://linguatools.org/tools/corpora/wikipedia-monolingual-corpora/.

  2. ftp://ftpmirror.your.org/pub/wikimedia/dumps/newiki/.

  3. https://github.com/BYVoid/OpenCC.

  4. https://pypi.org/project/jieba/.

  5. http://www.nltk.org.


Acknowledgements

This work is supported by the National Natural Science Foundation of China (61906158) and the Project of Science and Technology Research in Henan Province (212102210075).

Author information


Corresponding author

Correspondence to Chenggang Mi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, S., Mi, C., Li, T. et al. Improving bilingual word embeddings mapping with monolingual context information. Machine Translation 35, 503–518 (2021). https://doi.org/10.1007/s10590-021-09274-0

