Abstract
As a method for exploiting multiple heterogeneous data sources, supervised treebank conversion can straightforwardly and effectively utilize the linguistic knowledge contained in heterogeneous treebanks. To encode the source-side tree efficiently and deeply, we investigate, for the first time, the use of a Full-Tree LSTM as the tree encoder for treebank conversion. Furthermore, we introduce a corpus weighting strategy and a concatenation-with-fine-tuning approach to mitigate the noise contained in the converted treebank. Experimental results on two benchmark datasets with bi-tree aligned trees show that (1) the proposed Full-Tree LSTM approach is more effective than previous treebank conversion methods, (2) the corpus weighting strategy and the concatenation-with-fine-tuning approach are both useful for exploiting the noisy converted treebank, and (3) supervised treebank conversion achieves higher final parsing accuracy than the multi-task learning approach.
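The abstract does not spell out the Full-Tree LSTM architecture itself. To make the idea of a tree encoder concrete, below is a minimal, dependency-free sketch of a Child-Sum Tree-LSTM cell in the style of Tai et al. (2015), which composes a representation for each node bottom-up from its dependents, with one forget gate per child. All parameter names, the scalar (rather than vector) states, and the toy tree are illustrative assumptions, not the paper's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy scalar parameters; a real encoder uses weight matrices over word
# (and POS) embeddings. Keys: i=input, f=forget, o=output, u=candidate.
W = dict(i=0.5, f=0.4, o=0.6, u=0.7)   # weights on the node's own input x
U = dict(i=0.3, f=0.2, o=0.1, u=0.5)   # recurrent weights on child states
B = dict(i=0.0, f=1.0, o=0.0, u=0.0)   # biases (positive forget bias)

def child_sum_treelstm(node):
    """node = (x, [children]); returns (h, c) for the subtree rooted here."""
    x, children = node
    states = [child_sum_treelstm(ch) for ch in children]
    h_sum = sum(h for h, _ in states)          # sum of children's hidden states
    i = sigmoid(W['i'] * x + U['i'] * h_sum + B['i'])
    o = sigmoid(W['o'] * x + U['o'] * h_sum + B['o'])
    u = math.tanh(W['u'] * x + U['u'] * h_sum + B['u'])
    c = i * u
    # One forget gate per child, conditioned on that child's hidden state,
    # so the cell can keep or discard each subtree's memory separately.
    for h_k, c_k in states:
        f_k = sigmoid(W['f'] * x + U['f'] * h_k + B['f'])
        c += f_k * c_k
    h = o * math.tanh(c)
    return h, c

# Encode a 3-node dependency tree: a root word with two leaf dependents.
tree = (1.0, [(0.5, []), (-0.5, [])])
h_root, c_root = child_sum_treelstm(tree)
```

The root's hidden state `h_root` would then serve as (part of) the source-side tree representation that the conversion model conditions on; the converted treebank produced this way is noisy, which is what the corpus weighting and concatenation-with-fine-tuning strategies address.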
Supported by the National Natural Science Foundation of China (Grant Nos. 61525205 and 61876116). Zhenghua Li is the corresponding author. We thank the anonymous reviewers for their helpful comments and Qingrong Xia and Houquan Zhou for their help in preparing this English version.
© 2019 Springer Nature Switzerland AG
Zhang, B., Li, Z., Zhang, M. (2019). Conversion and Exploitation of Dependency Treebanks with Full-Tree LSTM. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_41