Abstract
Prompt learning has achieved remarkable performance in various natural language understanding scenarios because it intuitively bridges the gap between pre-training and fine-tuning. However, directly applying monolingual prompting methods to cross-lingual tasks introduces a discrepancy between source-language training and target-language inference, i.e., language bias in cross-lingual transfer. To address this gap, we propose a novel model called Cross-lingual Semantic Clustering Prompt (X-SCP). In the prompt engineering stage, we design a language-agnostic prompt template and introduce a progressive code-switching approach to strengthen the alignment between source and target languages. In the answer engineering stage, we construct a unified multilingual answer space through semantic consistency-guided clustering, and the model trains a cluster-based verbalizer on this pre-clustered answer space. In this way, X-SCP alleviates language bias in both prompt engineering and answer engineering. Experimental results show that our model outperforms strong baselines in the zero-shot cross-lingual setting on both the XGLUE-NC and MLDoc document classification datasets.
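To make the code-switching idea concrete, the snippet below is a minimal, illustrative Python sketch of progressively code-switching source-language training text using a bilingual lexicon. The function names (`code_switch`, `progressive_ratio`), the toy English-German lexicon, and the linear schedule are hypothetical assumptions for illustration only; they are not taken from the paper's implementation, which is available from the authors (see Code Availability below).

```python
import random

def code_switch(tokens, lexicon, ratio):
    """Replace a fraction `ratio` of source-language tokens with
    target-language translations drawn from a bilingual lexicon."""
    out = []
    for tok in tokens:
        if tok in lexicon and random.random() < ratio:
            out.append(random.choice(lexicon[tok]))  # pick one translation
        else:
            out.append(tok)
    return out

def progressive_ratio(epoch, total_epochs, max_ratio=0.5):
    """Linearly increase the switching ratio as training progresses."""
    return max_ratio * (epoch + 1) / total_epochs

# Toy example: English -> German lexicon (hypothetical entries)
lexicon = {"news": ["nachrichten"], "sports": ["sport"], "world": ["welt"]}
sentence = "world news about sports".split()
for epoch in range(3):
    r = progressive_ratio(epoch, total_epochs=3)
    print(epoch, round(r, 2), " ".join(code_switch(sentence, lexicon, r)))
```

In such a scheme, early epochs see mostly source-language text and later epochs see increasingly mixed text, which is one plausible way to expose the model gradually to target-language tokens.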
Code Availability
The code and experimental environment are available from the first author, Ahtamjan Ahmat (aihetamujiangaihemaiti20@mails.ucas.ac.cn).
Acknowledgements
This work is supported by the Natural Science Foundation of Xinjiang Uyghur Autonomous Region (Grant No. 2022D01D04), the National Natural Science Foundation of China (Grant No. U2003303), and the Outstanding Member Program of the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant No. Y2021112). This work is also supported by the "Tianshan Elite" Sci-Tech Innovation Leading Talents Program (Grant No. 2022TSYCLJ0046), the "Tianshan Elite" Sci-Tech Topnotch Youth Talents Program (Grant No. 2022TSYCCX0059), the Key Research and Development Program of Xinjiang Uyghur Autonomous Region (Grant No. 2022B03010), and the Shanghai Cooperation Organization Sci-Tech Partnership Program and International Sci-Tech Cooperation Program (Grant No. 2022E01044).
Author information
Authors and Affiliations
Contributions
Ahtamjan Ahmat: Conceptualization, Methodology, Implementation, Experiments, Result Analysis, Writing - original draft. Yating Yang: Writing - review & editing, Validation, Supervision. Bo Ma: Data curation, Investigation. Rui Dong: Formal analysis, Project administration. Rong Ma: Resources, Visualization. Lei Wang: Validation, Investigation, Supervision.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
This study does not involve any human participants or animals. All data used in this article are sourced from open, publicly accessible platforms. No proprietary, confidential, or private data were used.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ahmat, A., Yang, Y., Ma, B. et al. Cross-lingual prompting method with semantic-based answer space clustering. Appl Intell 55, 134 (2025). https://doi.org/10.1007/s10489-024-06101-w