Probing Cross-lingual Transfer of XLM Multi-language Model

  • Conference paper
Advances in Internet, Data & Web Technologies (EIDWT 2024)

Abstract

This paper investigates the ability of the XLM language model to transfer linguistic knowledge cross-lingually, verifying whether, and to what extent, syntactic dependency relationships learnt in one language are maintained in other languages. In detail, a structural probe is developed to analyse the cross-lingual syntactic transfer capability of the XLM model, comparing cross-language syntactic transfer among languages that belong to different families in a typological classification and are therefore characterised by very different syntactic constructions. The probe aims to reconstruct the dependency parse tree of a sentence, representing the input sentences with the contextual embeddings from the XLM layers. The results of the experimental assessment improve on previous results obtained with the mBERT model.
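For readers who want a concrete picture of what such a probe computes, the sketch below shows a minimal structural probe in the style of Hewitt and Manning, on which this line of work builds. It is a hypothetical illustration, not the authors' implementation: the class and function names, the probe rank of 128, and the initialisation scale are assumptions. The probe learns a linear projection under which squared L2 distances between a sentence's contextual word embeddings approximate the number of edges separating the corresponding words in the gold dependency tree.

```python
import torch
import torch.nn as nn

class StructuralProbe(nn.Module):
    """Learn a linear map B so that ||B(h_i - h_j)||^2 approximates
    the dependency-tree distance between words i and j."""

    def __init__(self, model_dim: int, probe_rank: int = 128):
        super().__init__()
        # B has shape (model_dim, probe_rank); the rank is a hyperparameter.
        self.proj = nn.Parameter(torch.randn(model_dim, probe_rank) * 0.05)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (seq_len, model_dim), one XLM layer's vectors for a sentence
        transformed = embeddings @ self.proj                # (seq_len, rank)
        diffs = transformed.unsqueeze(1) - transformed.unsqueeze(0)
        return (diffs ** 2).sum(dim=-1)                     # predicted squared distances

def probe_loss(pred: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    # L1 difference between predicted and gold tree distances,
    # normalised by the squared sentence length.
    n = pred.size(0)
    return (pred - gold).abs().sum() / (n * n)
```

Trained on the embeddings of one language's treebank and evaluated on another, such a probe gives a direct measure of how much syntactic structure transfers across languages, typically reported via the undirected attachment score (UUAS) of the reconstructed trees or the Spearman correlation of predicted and gold distances.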


Notes

  1. XLM models are available at https://github.com/facebookresearch/XLM.

  2. https://github.com/UniversalDependencies/UD_English-EWT.

  3. https://github.com/UniversalDependencies/UD_Italian-ISDT.

  4. https://github.com/UniversalDependencies/UD_French-GSD.

  5. https://github.com/UniversalDependencies/UD_French-Sequoia.

  6. https://universaldependencies.org/treebanks/en_pud/index.html.
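As a purely illustrative companion, the following sketch shows one way the gold tree distances needed by a structural probe could be extracted from a CoNLL-U file such as the treebanks listed in the notes above, using the third-party conllu package (pip install conllu). The file name and helper function are assumptions for the example, not artefacts of the paper.

```python
import conllu

def tree_distances(sentence):
    """Pairwise tree distances (edge counts) between words of one sentence."""
    # Map token id -> head id, skipping multiword ranges and empty nodes,
    # whose ids are tuples rather than ints in the conllu representation.
    heads = {tok["id"]: tok["head"] for tok in sentence
             if isinstance(tok["id"], int)}

    def path_to_root(i):
        path = [i]
        while heads[i] != 0:        # head 0 marks the root in CoNLL-U
            i = heads[i]
            path.append(i)
        return path

    dist = {}
    for a in heads:
        depth_from_a = {node: d for d, node in enumerate(path_to_root(a))}
        for b in heads:
            for steps_from_b, node in enumerate(path_to_root(b)):
                if node in depth_from_a:            # lowest common ancestor
                    dist[(a, b)] = depth_from_a[node] + steps_from_b
                    break
    return dist

# Example usage on a hypothetical local copy of the English EWT treebank.
with open("en_ewt-ud-train.conllu", encoding="utf-8") as f:
    first_sentence = conllu.parse(f.read())[0]
print(tree_distances(first_sentence))
```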


Acknowledgements

This work is supported by the European Union - NextGenerationEU - National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) - Project: “SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics” - Prot. IR0000013 - Avviso n. 3264 of 28/12/2021.

Author information

Corresponding author

Correspondence to Raffaele Guarasci.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Guarasci, R., Silvestri, S., Esposito, M. (2024). Probing Cross-lingual Transfer of XLM Multi-language Model. In: Barolli, L. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 193. Springer, Cham. https://doi.org/10.1007/978-3-031-53555-0_21
