Probing Cross-lingual Transfer of XLM Multi-language Model

  • Conference paper
Advances in Internet, Data & Web Technologies (EIDWT 2024)

Abstract

This paper investigates the ability of the XLM language model to transfer linguistic knowledge cross-lingually, verifying whether, and to what extent, syntactic dependency relationships learnt in one language are maintained in other languages. In detail, a structural probe is developed to analyse the cross-lingual syntactic transfer capability of the XLM model, comparing cross-language syntactic transfer among languages that belong to different families in a typological classification and are therefore characterised by very different syntactic constructions. The probe aims to reconstruct the dependency parse tree of a sentence, representing the input sentences with the contextual embeddings from the XLM layers. The results of the experimental assessment improve on previous results obtained with the mBERT model.
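For readers who want a concrete picture of what such a probe computes, the sketch below shows a minimal structural probe in the style of Hewitt and Manning, on which this line of work builds. It is a hypothetical illustration, not the authors' implementation: the class and function names, the probe rank of 128, and the initialisation scale are assumptions. The probe learns a linear projection under which squared L2 distances between a sentence's contextual word embeddings approximate the number of edges separating the corresponding words in the gold dependency tree.

```python
import torch
import torch.nn as nn

class StructuralProbe(nn.Module):
    """Learn a linear map B so that ||B(h_i - h_j)||^2 approximates
    the dependency-tree distance between words i and j."""

    def __init__(self, model_dim: int, probe_rank: int = 128):
        super().__init__()
        # B has shape (model_dim, probe_rank); the rank is a hyperparameter.
        self.proj = nn.Parameter(torch.randn(model_dim, probe_rank) * 0.05)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (seq_len, model_dim), one XLM layer's vectors for a sentence
        transformed = embeddings @ self.proj                # (seq_len, rank)
        diffs = transformed.unsqueeze(1) - transformed.unsqueeze(0)
        return (diffs ** 2).sum(dim=-1)                     # predicted squared distances

def probe_loss(pred: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    # L1 difference between predicted and gold tree distances,
    # normalised by the squared sentence length.
    n = pred.size(0)
    return (pred - gold).abs().sum() / (n * n)
```

Trained on the embeddings of one language's treebank and evaluated on another, such a probe gives a direct measure of how much syntactic structure transfers across languages, typically reported via the undirected attachment score (UUAS) of the reconstructed trees or the Spearman correlation of predicted and gold distances.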


Notes

  1. XLM models are available at https://github.com/facebookresearch/XLM.

  2. https://github.com/UniversalDependencies/UD_English-EWT.

  3. https://github.com/UniversalDependencies/UD_Italian-ISDT.

  4. https://github.com/UniversalDependencies/UD_French-GSD.

  5. https://github.com/UniversalDependencies/UD_French-Sequoia.

  6. https://universaldependencies.org/treebanks/en_pud/index.html.
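As a purely illustrative companion, the following sketch shows one way the gold tree distances needed by a structural probe could be extracted from a CoNLL-U file such as the treebanks listed in the notes above, using the third-party conllu package (pip install conllu). The file name and helper function are assumptions for the example, not artefacts of the paper.

```python
import conllu

def tree_distances(sentence):
    """Pairwise tree distances (edge counts) between words of one sentence."""
    # Map token id -> head id, skipping multiword ranges and empty nodes,
    # whose ids are tuples rather than ints in the conllu representation.
    heads = {tok["id"]: tok["head"] for tok in sentence
             if isinstance(tok["id"], int)}

    def path_to_root(i):
        path = [i]
        while heads[i] != 0:        # head 0 marks the root in CoNLL-U
            i = heads[i]
            path.append(i)
        return path

    dist = {}
    for a in heads:
        depth_from_a = {node: d for d, node in enumerate(path_to_root(a))}
        for b in heads:
            for steps_from_b, node in enumerate(path_to_root(b)):
                if node in depth_from_a:            # lowest common ancestor
                    dist[(a, b)] = depth_from_a[node] + steps_from_b
                    break
    return dist

# Example usage on a hypothetical local copy of the English EWT treebank.
with open("en_ewt-ud-train.conllu", encoding="utf-8") as f:
    first_sentence = conllu.parse(f.read())[0]
print(tree_distances(first_sentence))
```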


Acknowledgements

This work is supported by the European Union - NextGenerationEU - National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) - Project: “SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics” - Prot. IR0000013 - Avviso n. 3264 of 28/12/2021.

Author information

Corresponding author

Correspondence to Raffaele Guarasci.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Guarasci, R., Silvestri, S., Esposito, M. (2024). Probing Cross-lingual Transfer of XLM Multi-language Model. In: Barolli, L. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 193. Springer, Cham. https://doi.org/10.1007/978-3-031-53555-0_21
