Cross-lingual prompting method with semantic-based answer space clustering


Abstract

Prompt learning has achieved remarkable performance in various natural language understanding scenarios as it intuitively bridges the gap between pre-training and fine-tuning. However, directly applying monolingual prompting methods to cross-lingual tasks leads to discrepancies between source-language training and target-language inference, namely language bias in cross-lingual transfer. To address this gap, we propose a novel model called Cross-lingual Semantic Clustering Prompt (X-SCP). Specifically, in the prompt engineering stage, we design a language-agnostic prompt template and introduce a progressive code-switching approach to enhance the alignment between source and target languages. In the answer engineering stage, we construct a unified multilingual answer space through semantic consistency-guided clustering. The model trains a cluster-based verbalizer by learning a pre-clustered multilingual answer space. In this way, X-SCP alleviates language bias in both prompt engineering and answer engineering. Experimental results show that our model outperforms the strong baselines under zero-shot cross-lingual settings on both the XGLUE-NC and MLDoc document classification datasets.
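The following is a minimal, illustrative Python sketch of the two stages described above, not the authors' released implementation: it clusters a multilingual answer space by semantic similarity, scores classes with a cluster-based verbalizer, and progressively code-switches a source-language prompt toward the target language. The toy label-word vectors, the bilingual lexicon, and the mean-pooling choice are placeholder assumptions standing in for whatever multilingual encoder, lexicon, and scoring scheme X-SCP actually uses.

```python
# Illustrative sketch only (not the paper's released code). Assumes hypothetical
# label-word embeddings from some multilingual encoder and a toy bilingual lexicon.
import numpy as np
from sklearn.cluster import KMeans

# --- Answer engineering: cluster a multilingual answer space ----------------
# Candidate answer words from several languages are embedded; words expressing
# the same class should fall into the same cluster regardless of language.
label_word_vectors = {                       # toy stand-ins for encoder outputs
    "sports":   np.array([0.90, 0.10]),
    "deporte":  np.array([0.88, 0.12]),
    "politics": np.array([0.10, 0.90]),
    "politik":  np.array([0.12, 0.88]),
}
words = list(label_word_vectors)
X = np.stack([label_word_vectors[w] for w in words])

n_classes = 2
kmeans = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit(X)
clusters = {c: [w for w, k in zip(words, kmeans.labels_) if k == c]
            for c in range(n_classes)}

# --- Cluster-based verbalizer ------------------------------------------------
# Given MLM probabilities over the answer vocabulary at the [MASK] position,
# score each cluster by the mean probability of its member words.
def cluster_verbalizer(mask_word_probs: dict, clusters: dict) -> int:
    scores = {c: float(np.mean([mask_word_probs.get(w, 0.0) for w in ws]))
              for c, ws in clusters.items()}
    return max(scores, key=scores.get)       # id of the best-scoring cluster

# --- Prompt engineering: progressive code-switching --------------------------
# Replace a growing fraction of source-language tokens with target-language
# translations from a bilingual lexicon as training progresses (ratio 0 -> 1).
def code_switch(tokens, lexicon, ratio, rng=np.random.default_rng(0)):
    return [lexicon[t] if t in lexicon and rng.random() < ratio else t
            for t in tokens]

if __name__ == "__main__":
    probs = {"sports": 0.40, "deporte": 0.30, "politics": 0.10, "politik": 0.05}
    print(cluster_verbalizer(probs, clusters))      # cluster holding the sports words
    print(code_switch(["the", "match", "was", "great"],
                      {"match": "partido", "great": "genial"}, ratio=0.5))
```

In the actual model, the word vectors would come from a multilingual pre-trained encoder and the mask probabilities from its MLM head; only the cluster-then-pool structure is meant to carry over from this sketch.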

Code Availability

The code and experimental environment are available from the first author, Ahtamjan Ahmat (aihetamujiangaihemaiti20@mails.ucas.ac.cn).


Acknowledgements

This work is supported by the Natural Science Foundation of Xinjiang Uyghur Autonomous Region (Grant No. 2022D01D04), the National Natural Science Foundation of China (Grant No. U2003303), and the Outstanding Member Program of the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant No. Y2021112). This work is also supported by the "Tianshan Elite" Sci-Tech Innovation Leading Talents Program (Grant No. 2022TSYCLJ0046), the "Tianshan Elite" Sci-Tech Topnotch Youth Talents Program (Grant No. 2022TSYCCX0059), the Key Research and Development Program of Xinjiang Uyghur Autonomous Region (Grant No. 2022B03010), and the Shanghai Cooperation Organization Sci-Tech Partnership Program and International Sci-Tech Cooperation Program (Grant No. 2022E01044).

Author information

Contributions

Ahtamjan Ahmat: Conceptualization, Methodology, Implementation, Experiments, Result Analysis, Writing - original draft. Yating Yang: Writing - review & editing, Validation, Supervision. Bo Ma: Data curation, Investigation. Rui Dong: Formal analysis, Project administration. Rong Ma: Resources, Visualization. Lei Wang: Validation, Investigation, Supervision.

Corresponding authors

Correspondence to Yating Yang or Lei Wang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

This study does not involve any human participants or animals. All data used in this article are sourced from open, publicly accessible platforms. No proprietary, confidential, or private data were used.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ahmat, A., Yang, Y., Ma, B. et al. Cross-lingual prompting method with semantic-based answer space clustering. Appl Intell 55, 134 (2025). https://doi.org/10.1007/s10489-024-06101-w

