
Improving bilingual word embeddings mapping with monolingual context information

Published in: Machine Translation

Abstract

Bilingual word embeddings (BWEs) play an important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods for training BWEs rely on bilingual supervision, but bilingual resources are unavailable for many low-resource language pairs. Although some studies address this issue with unsupervised methods, they do not exploit monolingual contextual data to improve the quality of low-resource BWEs. To address these issues, we propose an unsupervised method that improves BWEs using optimized monolingual context information, without any parallel corpora. Specifically, we first build a bilingual word embedding mapping model between two languages by aligning their monolingual word embedding spaces through unsupervised adversarial training. We then use monolingual context information to further optimize these mappings during training. Experimental results show that our method significantly outperforms baseline systems, including on four low-resource language pairs.
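The adversarial alignment step described in the abstract can be illustrated with a toy sketch. The following is a minimal NumPy illustration, not the authors' implementation: a linear generator W maps toy "source" embeddings into the "target" space, a logistic-regression discriminator tries to tell mapped source vectors from real target vectors, and the two are trained in alternation. The iterative orthogonalization of W is a common stabilizing trick in this line of work; the data, dimensions, and learning rates here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200  # toy embedding dimension and vocabulary size

# Toy monolingual embeddings: the target space is a rotated copy of the
# source space, mimicking the near-isometry assumption behind mapping-based BWEs.
X = rng.normal(size=(n, d))                   # source-language embeddings
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden "true" rotation
Y = X @ Q                                     # target-language embeddings

W = np.eye(d)                  # generator: linear mapping, source -> target
w = rng.normal(size=d) * 0.01  # discriminator: logistic-regression weights
b = 0.0

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

lr_d, lr_g, beta, batch = 0.1, 0.05, 0.01, 32
for step in range(500):
    xs = X[rng.integers(0, n, batch)]
    ys = Y[rng.integers(0, n, batch)]

    # Discriminator step: label mapped source vectors 1, real target vectors 0.
    mapped = xs @ W
    p_map, p_tgt = sigmoid(mapped @ w + b), sigmoid(ys @ w + b)
    w -= lr_d * (mapped.T @ (p_map - 1.0) + ys.T @ p_tgt) / batch
    b -= lr_d * (np.sum(p_map - 1.0) + np.sum(p_tgt)) / batch

    # Generator step: update W so that mapped vectors fool the discriminator.
    p_map = sigmoid(xs @ W @ w + b)
    W -= lr_g * (xs.T @ np.outer(p_map, w)) / batch

    # Iterative orthogonalization: keep W close to an orthogonal matrix.
    W = (1.0 + beta) * W - beta * (W @ W.T) @ W
```

In practice the mapping is learned over pretrained embeddings in a few hundred dimensions, and the induced mapping is typically refined afterwards; the paper's monolingual-context optimization is a further refinement of such a mapping, whose details are given in the full text.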


Notes

  1. https://linguatools.org/tools/corpora/wikipedia-monolingual-corpora/.

  2. ftp://ftpmirror.your.org/pub/wikimedia/dumps/newiki/.

  3. https://github.com/BYVoid/OpenCC.

  4. https://pypi.org/project/jieba/.

  5. http://www.nltk.org.


Acknowledgements

This work is supported by the National Natural Science Foundation of China (61906158) and the Project of Science and Technology Research in Henan Province (212102210075).

Author information


Corresponding author

Correspondence to Chenggang Mi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, S., Mi, C., Li, T. et al. Improving bilingual word embeddings mapping with monolingual context information. Machine Translation 35, 503–518 (2021). https://doi.org/10.1007/s10590-021-09274-0

