Exploring Parameter Sharing Techniques for Cross-Lingual and Cross-Task Supervision

Pikuliak, Matúš; Šimko, Marián

doi:10.1007/978-3-030-59430-5_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12379)

Included in the following conference series:

International Conference on Statistical Language and Speech Processing

Abstract

Many languages still lack the annotated training data needed for supervised learning. This issue is often addressed with auxiliary supervision through so-called transfer learning. In this work, we focus on combining two types of auxiliary supervision: cross-lingual and cross-task. Previous work has shown promising results for this combination. Here, we explore advanced parameter sharing techniques to improve on those results. We propose three distinct techniques with different properties and evaluate them on four Indo-European languages and four NLP tasks (dependency parsing, language modeling, named entity recognition, and part-of-speech tagging). We conclude that the proposed techniques significantly improve zero-shot learning performance.

Notes

1. https://github.com/matus-pikuliak/crosslingual-parameter-sharing

Acknowledgments

This work was partially supported by the Scientific Grant Agency of the Slovak Republic, grants No. VG 1/0725/19 and VG 1/0667/18 and by the Slovak Research and Development Agency under the contracts No. APVV-15-0508, APVV-17-0267 and APVV SK-IL-RD-18-0004.

Author information

Authors and Affiliations

Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Ilkovicova 2, Bratislava, Slovakia
Matúš Pikuliak & Marián Šimko

Corresponding author

Correspondence to Matúš Pikuliak.

Editor information

Editors and Affiliations

Cardiff University, Cardiff, UK
Luis Espinosa-Anke
Rovira i Virgili University, Tarragona, Tarragona, Spain
Carlos Martín-Vide
Computer Science, Cardiff University, Cardiff, UK
Irena Spasić

About this paper

Cite this paper

Pikuliak, M., Šimko, M. (2020). Exploring Parameter Sharing Techniques for Cross-Lingual and Cross-Task Supervision. In: Espinosa-Anke, L., Martín-Vide, C., Spasić, I. (eds) Statistical Language and Speech Processing. SLSP 2020. Lecture Notes in Computer Science, vol 12379. Springer, Cham. https://doi.org/10.1007/978-3-030-59430-5_8

DOI: https://doi.org/10.1007/978-3-030-59430-5_8
Published: 26 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59429-9
Online ISBN: 978-3-030-59430-5
eBook Packages: Computer Science, Computer Science (R0)
