Abstract
This paper introduces an effective method for improving dependency parsing that is based on a graph embedding model. The model extracts local and global connectivity patterns between tokens, which allows neural network models to perform better on dependency parsing benchmarks. We propose to incorporate node embeddings trained by a graph embedding algorithm into a bidirectional recurrent neural network scheme. The new model outperforms a baseline using a state-of-the-art method on three dependency treebanks, covering both low-resource and high-resource natural languages, namely Indonesian, Vietnamese and English. We also show that the popular BERT pretraining technique does not pick up the same kind of signal as graph embeddings. The new parser, together with all trained models, is made available under an open-source license, facilitating community engagement and advancing natural language processing research for two low-resource languages with around 300 million users worldwide in total.
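As a rough illustration of the idea summarized above, the sketch below shows how a pretrained graph-based node embedding for each token could be concatenated with its word embedding to form the input sequence of a bidirectional recurrent encoder. This is a minimal sketch in Python/NumPy (the paper's own implementation is in Julia with Flux); all names, dimensions, and the use of random vectors are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lookup tables with illustrative dimensions: in the paper,
# node embeddings are trained by a graph embedding algorithm over token
# connectivity, separately from the lexical word embeddings.
vocab = ["the", "cat", "sat"]
word_emb = {w: rng.random(100) for w in vocab}   # lexical view
graph_emb = {w: rng.random(32) for w in vocab}   # graph-based view

def token_vector(token):
    """Concatenate the lexical and graph-based views of a token; the
    resulting vectors form the input sequence of the BiLSTM encoder."""
    return np.concatenate([word_emb[token], graph_emb[token]])

sentence = ["the", "cat", "sat"]
X = np.stack([token_vector(t) for t in sentence])
assert X.shape == (3, 132)  # (sentence length, word dim + graph dim)
```

The key design point is that the graph embeddings are precomputed and simply concatenated to the lexical features, so any sequence encoder can consume them without architectural changes.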







Data availability
The datasets generated during and/or analyzed during the current study are available in the Universal Dependencies repository: https://universaldependencies.org.
Notes
https://github.com/phuonglh/jvl/, under the VLP/aep module.
In practice, the dimensionality is usually in the range of a dozen to one thousand.
Vietnamese Language and Speech Processing, http://vlsp.org.vn/.
The shape dictionary of a word includes a dozen different word shapes, such as number, date, allcaps, url....
Kiperwasser and Goldberg [18] select the top three tokens on the stack and the first token on the buffer. They use the arc-hybrid system, whereas our work uses the arc-eager system.
The GSD treebank is about five times larger than the PUD or CSUI treebanks.
All models are implemented in the Julia programming language using the Flux library (https://fluxml.ai).
Recall that in the SOF variant, 20 embedding vectors of individual features are concatenated, resulting in an embedding dimension of \(20e\).
Kiperwasser and Goldberg [18] evaluated their model on the English Penn Treebank corpus.
This small dimension makes sense given that there are only 12 possible word shapes.
These embedding dimensions have been tuned by Kiperwasser and Goldberg [18].
We use the Julia package HypothesisTests to perform the statistical tests.
More precisely, we use the model bert-uncased_L-12_H-768_A-12 which is publicly available.
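The notes above on feature extraction can be sketched concretely. The fragment below is a hedged illustration, not the paper's implementation (which is in Julia): the padding symbol, the per-feature embedding size, and the helper names are assumptions introduced for the example.

```python
import numpy as np

PAD = "<pad>"  # assumed padding symbol for short stacks or buffers

def focus_tokens(stack, buffer):
    """Select the top three stack tokens (deepest first) and the first
    buffer token, in the spirit of the feature templates in the notes."""
    top3 = ([PAD] * 3 + list(stack))[-3:]
    b0 = buffer[0] if buffer else PAD
    return top3 + [b0]

# In the SOF variant, each of the 20 discrete features has its own
# embedding of size e; their concatenation has dimension 20*e.
e = 16  # illustrative per-feature embedding size
feature_vectors = [np.random.rand(e) for _ in range(20)]
x = np.concatenate(feature_vectors)

assert focus_tokens(["root", "saw"], ["the", "cat"]) == [PAD, "root", "saw", "the"]
assert x.shape == (20 * e,)  # 320 in this sketch
```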
References
Alves M (1999) What’s so Chinese about Vietnamese? In: Proceedings of the Ninth Annual Meeting of the Southeast Asian Linguistics Society, pp 221–224, University of California, Berkeley, USA
Baroni M, Lenci A (2010) Distributional memory: a general framework for corpus-based semantics. Comput Linguist 36(4):673–721
Björkelund A, Falenska A, Yu X, and Kuhn J (2017) IMS at the CoNLL 2017 UD shared task: CRFs and perceptrons meet neural networks. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 40–51, Vancouver, Canada. Association for Computational Linguistics
Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in Neural Information Processing Systems, vol 26. Curran Associates Inc, pp 1–9
Buchholz S and Marsi E (2006) CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pp 149–164, New York City. Association for Computational Linguistics
Cavallari S, Cambria E, Cai H, Chang K, Zheng V (2019) Embedding both finite and infinite communities on graph. IEEE Comput Intell Mag 14(3):39–50
Chen D and Manning C (2014) A fast and accurate dependency parser using neural networks. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 740–750, Doha, Qatar. Association for Computational Linguistics
Dang HV and Le-Hong P (2021) A combined syntactic-semantic embedding model based on lexicalized tree-adjoining grammar. Comput Speech Lang 68
Devlin J, Chang M-W, Lee K, and Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL, pages 1–16, Minnesota, USA
Dozat T, Qi P, and Manning CD (2017) Stanford’s graph-based neural dependency parser at the CoNLL 2017 shared task. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 20–30, Vancouver, Canada. Association for Computational Linguistics
Dyer C, Ballesteros M, Ling W, Matthews A, and Smith NA (2015) Transition-based dependency parsing with stack long short-term memory. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp 334–343, Beijing, China. Association for Computational Linguistics
Fernandez Astudillo R, Ballesteros M, Naseem T, Blodgett A, and Florian R (2020) Transition-based parsing with stack-transformers. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp 1001–1007, Online. Association for Computational Linguistics
Green N, Larasati SD, and Zabokrtsky Z (2012) Indonesian dependency treebank: annotation and parsing. In: Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation, pp 137–145, Bali, Indonesia. Faculty of Computer Science, Universitas Indonesia
Harris ZS (1954) Distributional structure. Word 10(2–3):146–162
Ji S, Pan S, Cambria E, Marttinen P, Yu PS (2022) A survey on knowledge graphs: representation, acquisition and applications. IEEE Trans Neural Netw Learn Syst 33(10):494–514
Kingma DP and Ba J (2015) Adam: a method for stochastic optimization. In: Bengio Y and LeCun Y, eds, Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, pp 1–15, San Diego, CA, USA
Kiperwasser E, Goldberg Y (2016) Easy-first dependency parsing with hierarchical tree LSTMs. Trans Assoc Comput Linguist 4:445–461
Kiperwasser E, Goldberg Y (2016) Simple and accurate dependency parsing using bidirectional LSTM feature representations. Trans Assoc Comput Linguist 4:313–327
Kolen JF and Kremer SC (2001) Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies, pp 237–243. IEEE
Kondratyuk D and Straka M (2019) 75 languages, 1 model: parsing Universal Dependencies universally. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp 2779–2795, Hong Kong, China. Association for Computational Linguistics
Kübler S, McDonald R, and Nivre J (2009) Dependency parsing. Morgan & Claypool Publishers
Le P and Zuidema W (2014) The inside-outside recursive neural network model for dependency parsing. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 729–739, Doha, Qatar. Association for Computational Linguistics
Le-Hong P, Nguyen TMH, and Azim R (2012) Vietnamese parsing with an automatically extracted tree-adjoining grammar. In: Proceedings of the IEEE RIVF, pp 91–96, HCMC, Vietnam
Le-Hong P, Roussanaly A, Nguyen T-M-H (2015) A syntactic component for Vietnamese language processing. J Lang Modell 3(1):145–184
Lei T, Xin Y, Zhang Y, Barzilay R, and Jaakkola T (2014) Low-rank tensors for scoring dependency structures. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp 1381–1391, Baltimore, Maryland. Association for Computational Linguistics
Levy O and Goldberg Y (2014) Dependency-based word embeddings. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp 302–308, Baltimore, Maryland. Association for Computational Linguistics
Ling W, Tsvetkov Y, Amir S, Fermandez R, Dyer C, Black AW, Trancoso I, and Lin C-C (2015) Not all contexts are created equal: better word representations with variable attention. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1367–1372, Lisbon, Portugal. Association for Computational Linguistics
Liu J and Zhang Y (2017) Encoder-decoder shift-reduce syntactic parsing. In: Proceedings of the 15th International Conference on Parsing Technologies, pp 105–114, Pisa, Italy. Association for Computational Linguistics
McDonald R, Nivre J (2011) Analyzing and integrating dependency parsers. Comput Linguist 37(1):197–230
McDonald R and Pereira F (2006) Online learning of approximate dependency parsing algorithms. In: Proceedings of EACL, pp 81–88, Trento, Italy
McDonald R, Pereira F, Ribarov K, and Hajic J (2005) Non-projective dependency parsing using spanning tree algorithms. In: Proceedings of HLT-EMNLP, pp 522–530, Vancouver, Canada
Nguyen TL, Ha ML, Nguyen VH, Nguyen TMH, and Le-Hong P (2013) Building a treebank for Vietnamese dependency parsing. In The 10th IEEE RIVF, pp 147–151, Hanoi, Vietnam. IEEE
Nivre J (2003) An efficient algorithm for projective dependency parsing. In: Proceedings of the Eighth International Conference on Parsing Technologies, pp 149–160, Nancy, France
Nivre J, de Marneffe M-C, Ginter F, Goldberg Y, Hajič J, Manning CD, McDonald R, Petrov S, Pyysalo S, Silveira N, Tsarfaty R, and Zeman D (2016) Universal Dependencies v1: a multilingual treebank collection. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pp 1659–1666, Portorož, Slovenia. European Language Resources Association (ELRA)
Nivre J, Hall J, Kübler S, McDonald R, Nilsson J, Riedel S, and Yuret D (2007) The CoNLL 2007 shared task on dependency parsing. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp 915–932, Prague, Czech Republic. Association for Computational Linguistics
Nivre J and McDonald R (2008) Integrating graph-based and transition-based dependency parsers. In: Proceedings of ACL-08, pp 950–958, Columbus, Ohio, USA. ACL
Nivre J and Scholz M (2004) Deterministic dependency parsing of English text. In: Proceedings of COLING 2004, pp 1–7, Geneva, Switzerland
Nivre J et al (2018) Universal dependencies 2.2. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Lewis MP, Simons GF, Fennig CD (eds) (2014) Ethnologue: languages of the World, 17th edn. SIL International, Dallas, Texas, USA
Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, and Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of NAACL, pp 1–15, Louisiana, USA
Sneddon JN (2004) The Indonesian language: its history and role in modern society. UNSW Press
Turian J, Ratinov L, and Bengio Y (2010) Word representations: a simple and general method for semi-supervised learning. In: Proceedings of ACL, pp 384–394, Uppsala, Sweden
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc
Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, and Bengio Y (2018) Graph attention networks. In: Proceedings of the Sixth International Conference on Learning Representations (ICLR), pp 1–12, Vancouver, Canada
Wilie B, Vincentio K, Winata GI, Cahyawijaya S, Li X, Lim ZY, Soleman S, Mahendra R, Fung P, Bahar S, and Purwarianti A (2020) IndoNLU: benchmark and resources for evaluating Indonesian natural language understanding. In: Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. Association for Computational Linguistics
Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, and Le QV (2019) XLNet: generalized autoregressive pretraining for language understanding. In: Proceedings of NeurIPS, pp 5754–5764
Zeman D, Hajič J, Popel M, Potthast M, Straka M, Ginter F, Nivre, J, and Petrov S (2018) CoNLL 2018 shared task: multilingual parsing from raw text to Universal Dependencies. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 1–21, Brussels, Belgium. Association for Computational Linguistics
Zeman D, Popel M, Straka M, Hajič J, Nivre J, Ginter F, Luotolahti J, Pyysalo S, Petrov S, Potthast M, Tyers F, Badmaeva E, Gokirmak M, Nedoluzhko A, Cinková S, Hajič jr J, Hlaváčová J, Kettnerová V, Urešová Z, Kanerva J, Ojala S, Missilä A, Manning CD, Schuster S, Reddy S, Taji D, Habash N, Leung H, de Marneffe M-C, Sanguinetti M, Simi M, Kanayama H, de Paiva V, Droganova K, Martínez Alonso H, Çöltekin Ç, Sulubacak U, Uszkoreit H, Macketanz V, Burchardt A, Harris K, Marheinecke K, Rehm G, Kayadelen T, Attia M, Elkahky A, Yu Z, Pitler E, Lertpradit S, Mandl M, Kirchner J, Alcalde HF, Strnadová J, Banerjee E, Manurung R, Stella A, Shimada A, Kwak S, Mendonça G, Lando T, Nitisaroj R, and Li J (2017) CoNLL 2017 shared task: multilingual parsing from raw text to Universal Dependencies. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 1–19, Vancouver, Canada. Association for Computational Linguistics
Zhang Y and Nivre J (2011) Transition-based dependency parsing with rich non-local features. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp 188–193, Portland, Oregon, USA. Association for Computational Linguistics
Zhang Z, Liu S, Li M, Zhou M, and Chen E (2017) Stack-based multi-layer attention for transition-based dependency parsing. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp 1677–1682, Copenhagen, Denmark. Association for Computational Linguistics
Zhu C, Qiu X, Chen X, and Huang X (2015) A re-ranking model for dependency parser with recursive convolutional neural network. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp 1159–1168, Beijing, China. Association for Computational Linguistics
Acknowledgements
This study is supported by Vingroup Innovation Foundation (VINIF) in project code VINIF.2020.DA14.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Informed consent
Informed consent was not required as no humans or animals were involved.
Human and animal rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Le-Hong, P., Cambria, E. Integrating graph embedding and neural models for improving transition-based dependency parsing. Neural Comput & Applic 36, 2999–3016 (2024). https://doi.org/10.1007/s00521-023-09223-3