
Supervised Contrastive Learning for Cross-Lingual Transfer Learning

  • Conference paper

Chinese Computational Linguistics (CCL 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13603)


Abstract

Multilingual pre-trained representations are not well aligned by nature, which harms their performance on cross-lingual tasks. Previous methods post-align the multilingual pre-trained representations with multi-view alignment or contrastive learning. However, we argue that neither method suits the cross-lingual classification objective, and in this paper we propose a simple yet effective method to better align the pre-trained representations. Building on cross-lingual data augmentations, we make a minor modification to the canonical contrastive loss to remove false-negative examples that should not be contrasted: augmentations with the same class are pulled close to the anchor sample, while augmentations with a different class are pushed apart. Experimental results on three cross-lingual tasks from the XTREME benchmark show that our method improves transfer performance by a large margin with no additional resources needed. We also provide a detailed analysis and comparison of different post-alignment strategies.
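
The loss modification described in the abstract can be read as a supervised contrastive objective in the spirit of Khosla et al. (2020): in-batch examples that share the anchor's class label, including its cross-lingual augmentations, are treated as positives rather than negatives, so no false negatives are contrasted. The PyTorch sketch below is a minimal illustration under that reading, not the authors' released code; the function name, the temperature value, and the batching scheme are assumptions.

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    # features: (N, d) encoder outputs for anchor sentences and their
    # cross-lingual augmentations; labels: (N,) class labels.
    # Function name and temperature are illustrative assumptions.
    z = F.normalize(features, dim=1)                   # work on the unit hypersphere
    sim = z @ z.t() / temperature                      # (N, N) scaled cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)             # never contrast a sample with itself

    # Positives are the other in-batch examples with the same label, so
    # same-class augmentations are pulled toward the anchor instead of
    # being treated as (false) negatives.
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)      # guard anchors without positives
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    return loss.mean()

In a cross-lingual setup, a batch would typically hold each source-language sentence together with its code-switched or translated augmentations, all inheriting the source sentence's label; the positive mask then pulls them toward the anchor while augmentations of other classes are pushed apart.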


Notes

  1. https://comparable.limsi.fr/bucc2017/.

  2. https://comparable.limsi.fr/bucc2017.


Acknowledgement

This research work is supported by the National Key R&D Program of China (2020AAA0108001), the National Natural Science Foundation of China (Nos. 61976016, 61976015 and 61876198) and Toshiba (China) Co., Ltd. The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve this paper.

Author information

Corresponding author

Correspondence to Yufeng Chen.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, S. et al. (2022). Supervised Contrastive Learning for Cross-Lingual Transfer Learning. In: Sun, M., et al. Chinese Computational Linguistics. CCL 2022. Lecture Notes in Computer Science, vol 13603. Springer, Cham. https://doi.org/10.1007/978-3-031-18315-7_14


  • DOI: https://doi.org/10.1007/978-3-031-18315-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18314-0

  • Online ISBN: 978-3-031-18315-7

  • eBook Packages: Computer Science, Computer Science (R0)
