Narrowing the language gap: domain adaptation guided cross-lingual passage re-ranking

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Given a query, the objective of Cross-lingual Passage Re-ranking (XPR) is to rank a list of candidate passages written in multiple languages, only a portion of which are in the query’s language. Multilingual BERT (mBERT) is often used for XPR and achieves impressive performance. Nevertheless, two essential issues in mBERT remain to be addressed: the performance gap between high- and low-resource languages, and the lack of explicit alignment of embedding distributions across languages. Treating each language as a separate domain, we use the theory of domain adaptation to analyze how these issues lead to errors in XPR. Guided by this analysis, we propose a novel framework comprising two modules, knowledge distillation and adversarial learning. The former transfers knowledge from high-resource languages to low-resource ones, narrowing their performance gap. The latter encourages mBERT to align embedding distributions across languages through a novel language-discrimination task and adversarial training. Extensive experiments on in-domain and out-of-domain datasets confirm the effectiveness and robustness of the proposed framework and show that it outperforms state-of-the-art methods.
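To make the two training signals above concrete, the following minimal PyTorch sketch combines a supervised ranking loss on mBERT with (i) a distillation loss that matches the soft relevance scores of a teacher re-ranker trained on a high-resource language and (ii) an adversarial language-discrimination loss applied through a gradient-reversal layer. This is an illustrative assumption of how such a framework could be wired up, not the authors' implementation; the model name, the loss weights alpha and beta, and the batch fields (relevance, lang_labels) are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel


class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; negated, scaled gradient in the backward pass.
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None


class Reranker(nn.Module):
    # mBERT encoder with a relevance head and a language discriminator.
    def __init__(self, model_name="bert-base-multilingual-cased", n_langs=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.rank_head = nn.Linear(hidden, 1)        # query-passage relevance score
        self.lang_head = nn.Linear(hidden, n_langs)  # predicts the passage language

    def forward(self, input_ids, attention_mask, lamb=1.0):
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        score = self.rank_head(cls).squeeze(-1)
        lang_logits = self.lang_head(GradReverse.apply(cls, lamb))
        return score, lang_logits


def training_step(student, teacher, batch, optimizer, alpha=0.5, beta=0.1, temperature=2.0):
    # One hypothetical update combining ranking, distillation, and adversarial losses.
    scores, lang_logits = student(batch["input_ids"], batch["attention_mask"])

    # Supervised ranking loss on labelled (mostly high-resource) query-passage pairs.
    rank_loss = F.binary_cross_entropy_with_logits(scores, batch["relevance"].float())

    # Knowledge distillation: match the frozen teacher's soft relevance scores,
    # transferring its behaviour to low-resource inputs.
    with torch.no_grad():
        teacher_scores, _ = teacher(batch["input_ids"], batch["attention_mask"])
    kd_loss = F.mse_loss(scores / temperature, teacher_scores / temperature)

    # Adversarial alignment: the discriminator tries to identify the language,
    # while the reversed gradient pushes the encoder toward language-invariant embeddings.
    adv_loss = F.cross_entropy(lang_logits, batch["lang_labels"])

    loss = rank_loss + alpha * kd_loss + beta * adv_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In practice the distillation term would be computed on low-resource (e.g., machine-translated) query-passage pairs scored by a teacher trained on the high-resource language, and the adversarial weight is usually ramped up during training; those scheduling details are omitted from this sketch.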

Data availability statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Notes

  1. https://github.com/Samurais/insuranceqa-corpus-zh.

  2. https://github.com/sharejing/BiPaR.

  3. https://github.com/facebookresearch/MLQA.

  4. https://github.com/deepmind/xquad.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Corresponding author

Correspondence to Xin Zhang.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, D., Zhang, X. & Zhang, S. Narrowing the language gap: domain adaptation guided cross-lingual passage re-ranking. Neural Comput & Applic 35, 20735–20748 (2023). https://doi.org/10.1007/s00521-023-08803-7
