Multilingual Dependency Parsing from Universal Dependencies to Sesame Street

  • Conference paper
Text, Speech, and Dialogue (TSD 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12284)

Abstract

Research on dependency parsing has always had a strong multilingual orientation, but the lack of standardized annotations long made it difficult both to compare results meaningfully across languages and to develop truly multilingual systems. Over the last five years, the Universal Dependencies project has worked to overcome this obstacle by developing cross-linguistically consistent morphosyntactic annotation for many languages. During the same period, dependency parsing (like the rest of NLP) has been transformed by the adoption of continuous vector representations and neural network techniques. In this paper, I introduce the framework and resources of Universal Dependencies and discuss advances in dependency parsing enabled by these resources in combination with deep learning techniques, ranging from traditional word and character embeddings to deep contextualized word representations such as ELMo and BERT.

Notes

  1. A multilingual perspective is also prevalent in the theoretical tradition of dependency grammar, starting with the seminal work of Tesnière [38], and in earlier rule-based approaches to dependency parsing [34].

  2. There was an overlap of 4 languages between the two shared tasks.

  3. This research trend was not limited to dependency parsing, but also included groundbreaking work on constituency parsing.

  4. https://universaldependencies.org.

  5. The features displayed in Fig. 1 are only a small subset of the features that would appear in a complete annotation of the two sentences.

  6. UD releases are numbered by letting the first digit (2) refer to the version of the guidelines and the second digit (5) to the number of releases under that version.

  7. The proportion of Indo-European languages has gone from 60% in v2.1 to 53% in v2.5.

  8. Except for Chinese, for which we make use of a separate, pretrained model.

  9. https://github.com/google-research/bert.

  10. The published paper contains a third extension, which we omit here because of space constraints, where we investigate whether the models exhibit a preference for different syntactic frameworks.

References

  1. Andrews, A.D.: The major functions of the noun phrase. In: Shopen, T. (ed.) Language Typology and Syntactic Description. Volume I: Clause Structure, 2nd edn., pp. 132–223. Cambridge University Press, Cambridge (2007)

  2. Buchholz, S., Marsi, E.: CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL), pp. 149–164 (2006)

  3. Che, W., Liu, Y., Wang, Y., Zheng, B., Liu, T.: Towards better UD parsing: deep contextualized word embeddings, ensemble, and treebank concatenation. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 55–64 (2018)

  4. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2019)

  5. Dozat, T., Manning, C.D.: Deep biaffine attention for neural dependency parsing. In: Proceedings of the 5th International Conference on Learning Representations (2017)

  6. Dozat, T., Qi, P., Manning, C.D.: Stanford’s graph-based neural dependency parser at the CoNLL 2017 shared task. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 20–30 (2017)

  7. Duong, L., Cohn, T., Bird, S., Cook, P.: Low resource dependency parsing: cross-lingual parameter sharing in a neural network parser. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 845–850 (2015)

  8. Futrell, R., Mahowald, K., Gibson, E.: Quantifying word order freedom in dependency corpora. In: Proceedings of the Third International Conference on Dependency Linguistics (Depling), pp. 91–100 (2015)

  9. Guo, J., Che, W., Yarowsky, D., Wang, H., Liu, T.: Cross-lingual dependency parsing based on distributed representations. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1234–1244 (2015)

  10. Hewitt, J., Manning, C.D.: A structural probe for finding syntax in word representations. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2019)

  11. Kiperwasser, E., Goldberg, Y.: Simple and accurate dependency parsing using bidirectional LSTM feature representations. Trans. Assoc. Comput. Linguist. 4, 313–327 (2016)

  12. Kondratyuk, D., Straka, M.: 75 languages, 1 model: parsing Universal Dependencies universally. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2779–2795 (2019)

  13. Kuhlmann, M., Gómez-Rodríguez, C., Satta, G.: Dynamic programming algorithms for transition-based dependency parsers. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 673–682 (2011)

  14. Kulmizev, A., de Lhoneux, M., Gontrum, J., Fano, E., Nivre, J.: Deep contextualized word embeddings in transition-based and graph-based dependency parsing - a tale of two parsers revisited. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2755–2768 (2019)

  15. Kulmizev, A., Ravishankar, V., Abdou, M., Nivre, J.: Do neural language models show preferences for syntactic formalisms? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 4077–4091 (2020)

  16. Levshina, N.: Token-based typology and word order entropy: a study based on Universal Dependencies. Linguist. Typology 23, 533–572 (2019)

  17. de Lhoneux, M.: Linguistically informed neural dependency parsing for typologically diverse languages. Ph.D. thesis, Uppsala University (2019)

  18. de Lhoneux, M., et al.: From raw text to Universal Dependencies - look, no tags! In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 207–217 (2017)

  19. de Lhoneux, M., Stymne, S., Nivre, J.: Arc-hybrid non-projective dependency parsing with a static-dynamic oracle. In: Proceedings of the 15th International Conference on Parsing Technologies, pp. 99–104 (2017)

  20. McDonald, R., Nivre, J.: Characterizing the errors of data-driven dependency parsing models. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 122–131 (2007)

  21. McDonald, R., Nivre, J.: Analyzing and integrating dependency parsers. Comput. Linguist. 37(1), 197–230 (2011)

  22. McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-projective dependency parsing using spanning tree algorithms. In: Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pp. 523–530 (2005)

  23. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

  24. Nivre, J.: An efficient algorithm for projective dependency parsing. In: Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pp. 149–160 (2003)

  25. Nivre, J.: Algorithms for deterministic incremental dependency parsing. Comput. Linguist. 34, 513–553 (2008)

  26. Nivre, J.: Non-projective dependency parsing in expected linear time. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP (ACL-IJCNLP), pp. 351–359 (2009)

  27. Nivre, J.: Towards a universal grammar for natural language processing. In: Gelbukh, A. (ed.) CICLing 2015. LNCS, vol. 9041, pp. 3–16. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18111-0_1

  28. Nivre, J., Boguslavsky, I.M., Iomdin, L.L.: Parsing the SynTagRus treebank of Russian. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 641–648 (2008)

  29. Nivre, J., et al.: The CoNLL 2007 shared task on dependency parsing. In: Proceedings of the CoNLL Shared Task of EMNLP-CoNLL 2007, pp. 915–932 (2007)

  30. Nivre, J., et al.: Universal Dependencies v1: a multilingual treebank collection. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC) (2016)

  31. Nivre, J., et al.: Universal Dependencies v2: an evergrowing multilingual treebank collection. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC) (2020)

  32. Östling, R.: Word order typology through multilingual word alignment. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 205–211 (2015)

  33. Peters, M.E., et al.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237 (2018)

  34. Schubert, K., Maxwell, D.: Metataxis in Practice: Dependency Syntax for Multilingual Machine Translation. Mouton de Gruyter, Berlin (1987)

  35. Smith, A., Bohnet, B., de Lhoneux, M., Nivre, J., Shao, Y., Stymne, S.: 82 treebanks, 34 models: universal dependency parsing with multi-treebank models. In: Proceedings of the 2018 CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (2018)

  36. Smith, A., de Lhoneux, M., Stymne, S., Nivre, J.: An investigation of the interactions between pre-trained word embeddings, character models and POS tags in dependency parsing. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (2018)

  37. Stevenson, M., Greenwood, M.A.: Dependency pattern models for information extraction. Res. Lang. Comput. 7, 13–39 (2009). https://doi.org/10.1007/s11168-009-9061-2

  38. Tesnière, L.: Éléments de syntaxe structurale. Editions Klincksieck (1959)

  39. Thompson, S.A.: Discourse motivations for the core-oblique distinction as a language universal. In: Kamio, A. (ed.) Directions in Functional Linguistics, pp. 59–82. John Benjamins, Amsterdam (1997)

  40. Tiedemann, J.: Cross-lingual dependency parsing with Universal Dependencies and predicted PoS labels. In: Proceedings of the Third International Conference on Dependency Linguistics (Depling), pp. 340–349 (2015)

  41. Tsarfaty, R., Seddah, D., Kübler, S., Nivre, J.: Parsing morphologically rich languages: introduction to the special issue. Comput. Linguist. 39, 15–22 (2013)

  42. Zeman, D., et al.: CoNLL 2018 shared task: multilingual parsing from raw text to universal dependencies. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (2018)

  43. Zeman, D., et al.: Universal Dependencies 2.5 (2019). LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University. http://hdl.handle.net/11234/1-3105

  44. Zeman, D., et al.: CoNLL 2017 shared task: multilingual parsing from raw text to universal dependencies. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 1–19 (2017)

  45. Zhang, Y., Nivre, J.: Analyzing the effect of global learning and beam-search on transition-based dependency parsing. In: Proceedings of COLING 2012: Posters, pp. 1391–1400 (2012)

Acknowledgments

I want to thank (present and former) members of the Uppsala parsing group – Ali Basirat, Miryam de Lhoneux, Artur Kulmizev, Paola Merlo, Aaron Smith and Sara Stymne – colleagues in the core UD group – Marie de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Chris Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Sebastian Schuster, Reut Tsarfaty, Francis Tyers and Dan Zeman – and all contributors in the UD community. I acknowledge the computational resources provided by CSC in Helsinki and Sigma2 in Oslo through NeIC-NLPL (www.nlpl.eu).

Author information

Corresponding author

Correspondence to Joakim Nivre.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Nivre, J. (2020). Multilingual Dependency Parsing from Universal Dependencies to Sesame Street. In: Sojka, P., Kopeček, I., Pala, K., Horák, A. (eds) Text, Speech, and Dialogue. TSD 2020. Lecture Notes in Computer Science, vol 12284. Springer, Cham. https://doi.org/10.1007/978-3-030-58323-1_2

  • DOI: https://doi.org/10.1007/978-3-030-58323-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58322-4

  • Online ISBN: 978-3-030-58323-1

  • eBook Packages: Computer Science, Computer Science (R0)
