Abstract
As a method for exploiting multiple heterogeneous data sources, supervised treebank conversion can straightforwardly and effectively utilize the linguistic knowledge contained in heterogeneous treebanks. To encode the source-side tree efficiently and deeply, we investigate, for the first time, the use of a Full-Tree LSTM as the tree encoder for treebank conversion. Furthermore, we introduce a corpus weighting strategy and a concatenation-with-fine-tuning approach to mitigate the noise contained in the converted treebank. Experimental results on two benchmark datasets with bi-tree aligned trees show that (1) the proposed Full-Tree LSTM approach is more effective than previous treebank conversion methods, (2) the corpus weighting strategy and the concatenation-with-fine-tuning approach are both useful for exploiting the noisy converted treebank, and (3) supervised treebank conversion achieves higher final parsing accuracy than the multi-task learning approach.
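The abstract does not spell out the Full-Tree LSTM architecture itself. To make the idea of a tree encoder concrete, below is a minimal, dependency-free sketch of a Child-Sum Tree-LSTM cell in the style of Tai et al. (2015), which composes a representation for each node bottom-up from its dependents, with one forget gate per child. All parameter names, the scalar (rather than vector) states, and the toy tree are illustrative assumptions, not the paper's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy scalar parameters; a real encoder uses weight matrices over word
# (and POS) embeddings. Keys: i=input, f=forget, o=output, u=candidate.
W = dict(i=0.5, f=0.4, o=0.6, u=0.7)   # weights on the node's own input x
U = dict(i=0.3, f=0.2, o=0.1, u=0.5)   # recurrent weights on child states
B = dict(i=0.0, f=1.0, o=0.0, u=0.0)   # biases (positive forget bias)

def child_sum_treelstm(node):
    """node = (x, [children]); returns (h, c) for the subtree rooted here."""
    x, children = node
    states = [child_sum_treelstm(ch) for ch in children]
    h_sum = sum(h for h, _ in states)          # sum of children's hidden states
    i = sigmoid(W['i'] * x + U['i'] * h_sum + B['i'])
    o = sigmoid(W['o'] * x + U['o'] * h_sum + B['o'])
    u = math.tanh(W['u'] * x + U['u'] * h_sum + B['u'])
    c = i * u
    # One forget gate per child, conditioned on that child's hidden state,
    # so the cell can keep or discard each subtree's memory separately.
    for h_k, c_k in states:
        f_k = sigmoid(W['f'] * x + U['f'] * h_k + B['f'])
        c += f_k * c_k
    h = o * math.tanh(c)
    return h, c

# Encode a 3-node dependency tree: a root word with two leaf dependents.
tree = (1.0, [(0.5, []), (-0.5, [])])
h_root, c_root = child_sum_treelstm(tree)
```

The root's hidden state `h_root` would then serve as (part of) the source-side tree representation that the conversion model conditions on; the converted treebank produced this way is noisy, which is what the corpus weighting and concatenation-with-fine-tuning strategies address.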
Supported by the National Natural Science Foundation of China (Grant Nos. 61525205 and 61876116). Zhenghua Li is the corresponding author. We thank the anonymous reviewers for their helpful comments and Qingrong Xia and Houquan Zhou for their help in preparing this English version.
© 2019 Springer Nature Switzerland AG
Zhang, B., Li, Z., Zhang, M. (2019). Conversion and Exploitation of Dependency Treebanks with Full-Tree LSTM. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_41