Abstract
This paper investigates the ability of the XLM language model to transfer linguistic knowledge cross-lingually, verifying whether and to what extent syntactic dependency relationships learnt in one language are maintained in other languages. In detail, a structural probe is developed to analyse the cross-lingual syntactic transfer capability of the XLM model and to compare cross-language syntactic transfer among languages belonging to different typological families, which are characterised by very different syntactic constructions. The probe aims to reconstruct the dependency parse tree of a sentence, representing the input sentences with the contextual embeddings extracted from the XLM layers. The results of the experimental assessment improve on the previous results obtained using the mBERT model.
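To make the probing setup concrete, the following is a minimal sketch of a distance-style structural probe in the spirit of Hewitt and Manning (2019), applied to contextual embeddings from one XLM layer. The class name, `probe_rank`, the `probe_loss` helper, and the default hidden size are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class StructuralProbe(nn.Module):
    """Distance probe in the spirit of Hewitt and Manning (2019):
    learns a linear map B such that squared L2 distances between
    transformed contextual embeddings approximate distances between
    words in the dependency parse tree."""

    def __init__(self, model_dim: int = 1024, probe_rank: int = 128):
        # model_dim depends on the XLM checkpoint (1024 here is an assumption).
        super().__init__()
        # B projects embeddings into a low-rank "syntactic" subspace.
        self.proj = nn.Parameter(torch.randn(model_dim, probe_rank) * 0.05)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, model_dim), taken from one XLM layer.
        transformed = embeddings @ self.proj          # (batch, seq_len, rank)
        diffs = transformed.unsqueeze(2) - transformed.unsqueeze(1)
        # Predicted squared distance between every pair of words.
        return (diffs ** 2).sum(dim=-1)               # (batch, seq_len, seq_len)

def probe_loss(pred_dist: torch.Tensor,
               gold_tree_dist: torch.Tensor,
               lengths: list[int]) -> torch.Tensor:
    """L1 loss between predicted and gold parse-tree distances,
    normalised per sentence by the number of word pairs."""
    total = pred_dist.new_zeros(())
    for b, n in enumerate(lengths):
        diff = (pred_dist[b, :n, :n] - gold_tree_dist[b, :n, :n]).abs()
        total = total + diff.sum() / (n * n)
    return total / len(lengths)
```

In a setup like this, `gold_tree_dist` would hold the pairwise distances between words in each sentence's dependency parse tree; cross-lingual transfer can then be assessed by training the probe on one language and evaluating the quality of the trees reconstructed from the predicted distances on another.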
Notes
1. XLM models are available at https://github.com/facebookresearch/XLM.
Acknowledgements
This work is supported by the European Union - NextGenerationEU - National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) - Project: “SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics” - Prot. IR0000013 - Avviso n. 3264 del 28/12/2021.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Guarasci, R., Silvestri, S., Esposito, M. (2024). Probing Cross-lingual Transfer of XLM Multi-language Model. In: Barolli, L. (ed.) Advances in Internet, Data & Web Technologies. EIDWT 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 193. Springer, Cham. https://doi.org/10.1007/978-3-031-53555-0_21
DOI: https://doi.org/10.1007/978-3-031-53555-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53554-3
Online ISBN: 978-3-031-53555-0
eBook Packages: Intelligent Technologies and Robotics