
Cross-Domain Transfer Learning for Dependency Parsing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

In recent years, research on dependency parsing has focused on improving accuracy on in-domain data and has made remarkable progress. However, the real world is not a single-scenario dataset: it is filled with countless scenarios that existing datasets do not cover, i.e., out-of-domain data. As a result, parsers that perform well on in-domain data often suffer significant performance degradation on out-of-domain data. Therefore, to adapt existing well-performing in-domain parsers to new domains, cross-domain transfer learning techniques are essential for solving the domain problem in parsing. In this paper, we examine two scenarios for cross-domain transfer learning: semi-supervised and unsupervised cross-domain transfer learning. Specifically, we adopt the pretrained language model BERT for training on the source-domain (in-domain) data at the subword level and introduce two tri-training variants, one for each scenario, to achieve cross-domain transfer. The system based on this paper participated in the NLPCC-2019 shared task on cross-domain dependency parsing and won first place on the “subtask3-un-open” and “subtask4-semi-open” subtasks, indicating the effectiveness of the adopted approaches.
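
As a rough illustration of the tri-training variants mentioned above, the following Python sketch shows the standard tri-training loop of Zhou and Li (2005) adapted to parsing. It is a minimal sketch under stated assumptions, not the authors' implementation: make_parser is a hypothetical factory returning a parser object with train and parse methods, and predicted trees are assumed to be directly comparable (e.g. tuples of head indices).

import random

def tri_train(make_parser, labeled_source, unlabeled_target, n_rounds=3):
    # Minimal sketch of tri-training (Zhou and Li, 2005) for cross-domain
    # parsing. `make_parser` is a hypothetical factory returning an object
    # with .train(pairs) and .parse(sentence) methods.
    parsers = []
    for _ in range(3):
        p = make_parser()
        # Train each parser on a bootstrap sample of the source treebank
        # so that the three models differ.
        p.train(random.choices(labeled_source, k=len(labeled_source)))
        parsers.append(p)

    for _ in range(n_rounds):
        pseudo = [[] for _ in range(3)]
        for sent in unlabeled_target:
            trees = [p.parse(sent) for p in parsers]
            for i in range(3):
                j, k = (i + 1) % 3, (i + 2) % 3
                # If the other two parsers agree on a target-domain tree,
                # it becomes a pseudo-labeled example for parser i.
                if trees[j] == trees[k]:
                    pseudo[i].append((sent, trees[j]))
        for i in range(3):
            # Retrain on gold source data plus the pseudo-labeled target data.
            parsers[i] = make_parser()
            parsers[i].train(labeled_source + pseudo[i])
    return parsers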

This paper was partially supported by the National Key Research and Development Program of China (No. 2017YFB0304100) and the Key Projects of the National Natural Science Foundation of China (U1836222 and 61733011).


Notes

1.

    Our code will be available at https://github.com/bcmi220/cddp.

2.

Subtasks “subtask1-un-closed” and “subtask2-semi-closed” are not our focus. Since our baseline parsing framework is based on BERT, and subtasks 1 and 2 prohibit the use of BERT and other external resources, we only use the Transformer structure of BERT without initializing it from the BERT pretrained weights. The Transformer network of BERT is very deep, and the training dataset currently offered is too small to train such a deep network well, so we only reached results comparable to those of other participants. This illustrates that deep neural networks need enough data for training.

3.

For the training phase, there is no need to consider this issue at all. As with other graph-based models, the predicted tree at training time is the one in which each word is attached to its highest-scoring head, including both intra-word and inter-word dependencies (a minimal sketch of this head-selection rule is given after these notes).

4.

Due to the iterative tri-training process, the amount of unlabeled data becomes much larger than the gold-annotated data. To balance the training of the model, we repeat the gold data until it matches the amount of unlabeled data and then shuffle the data during training (see the balancing sketch after these notes).

5.

The initial score for each model run is set to 0, so at least one model will be saved for each training session (see the checkpointing sketch after these notes).
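
For Note 3, the following minimal sketch (using NumPy toy scores rather than the paper's BERT-based biaffine scorer) illustrates the head-selection rule described there: at training time each token simply takes its highest-scoring candidate head, whether the arc is intra-word or inter-word.

import numpy as np

# Toy arc-score matrix: rows are dependents, columns are candidate heads,
# index 0 is the ROOT token. In the paper these scores would come from the
# subword-level scorer; here they are random numbers for illustration.
n = 5                                   # toy sentence length (subword tokens)
scores = np.random.randn(n + 1, n + 1)  # (n+1) x (n+1) arc scores
np.fill_diagonal(scores, -np.inf)       # a token cannot be its own head
heads = scores[1:].argmax(axis=1)       # highest-scoring head for tokens 1..n
print(heads)                            # predicted head index for each token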
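
For Note 4, this is a minimal sketch of the balancing step, assuming gold and pseudo are simple Python lists of training instances (the function name and arguments are hypothetical).

import random

def balance_and_shuffle(gold, pseudo, seed=42):
    # Repeat the gold-annotated data until it matches the amount of
    # pseudo-labeled (tri-training) data, then shuffle the mixture.
    repeats = -(-len(pseudo) // max(len(gold), 1))     # ceiling division
    oversampled_gold = (gold * repeats)[:len(pseudo)]  # same size as pseudo
    mixed = oversampled_gold + pseudo
    random.Random(seed).shuffle(mixed)
    return mixed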
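
For Note 5, the checkpointing rule can be sketched as follows; evaluate and save are hypothetical callables standing in for dev-set scoring (e.g. LAS) and checkpoint writing, not the authors' actual code.

def train_with_checkpoints(candidate_models, evaluate, save):
    # The best score starts at 0, so the first evaluated model always
    # clears the bar and at least one checkpoint is written per run.
    best = 0.0
    for step, model in enumerate(candidate_models):
        score = evaluate(model)       # e.g. LAS on the development set
        if score >= best:
            best = score
            save(model, step)
    return best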

References

  1. Andor, D., et al.: Globally normalized transition-based neural networks. In: Proceedings of ACL (2016)
  2. Angeli, G., Premkumar, M.J.J., Manning, C.D.: Leveraging linguistic structure for open domain information extraction. In: Proceedings of ACL-IJCNLP (2015)
  3. Bowman, S.R., Gauthier, J., Rastogi, A., Gupta, R., Manning, C.D., Potts, C.: A fast unified model for parsing and sentence understanding. In: Proceedings of ACL (2016)
  4. Chen, K., et al.: Neural machine translation with source dependency representation. In: Proceedings of EMNLP (2017)
  5. Clark, K., Luong, M.T., Manning, C.D., Le, Q.: Semi-supervised sequence modeling with cross-view training. In: Proceedings of EMNLP (2018)
  6. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
  7. Dozat, T., Manning, C.D.: Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734 (2016)
  8. Hatori, J., Matsuzaki, T., Miyao, Y., Tsujii, J.: Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese. In: Proceedings of ACL (2012)
  9. He, S., Li, Z., Zhao, H., Bai, H.: Syntax for semantic role labeling, to be, or not to be. In: Proceedings of ACL (2018)
  10. Jiang, X., Li, Z., Zhang, B., Zhang, M., Li, S., Si, L.: Supervised treebank conversion: data and approaches. In: Proceedings of ACL (2018)
  11. Kurita, S., Kawahara, D., Kurohashi, S.: Neural joint model for transition-based Chinese syntactic analysis. In: Proceedings of ACL (2017)
  12. Levy, O., Goldberg, Y.: Dependency-based word embeddings. In: Proceedings of ACL (2014)
  13. Li, Z., Cai, J., He, S., Zhao, H.: Seq2seq dependency parsing. In: Proceedings of COLING, pp. 3203–3214 (2018)
  14. Peters, M., et al.: Deep contextualized word representations. In: Proceedings of NAACL-HLT, pp. 2227–2237 (2018)
  15. Ruder, S., Plank, B.: Strong baselines for neural semi-supervised learning under domain shift. In: Proceedings of ACL (2018)
  16. Peng, X., Li, Z., Zhang, M., Wang, R., Si, L.: Overview of the NLPCC 2019 shared task: cross-domain dependency parsing. In: Proceedings of NLPCC (2019)
  17. Yan, H., Qiu, X., Huang, X.: A unified model for joint Chinese word segmentation and dependency parsing. arXiv preprint arXiv:1904.04697 (2019)
  18. Zhang, M., Zhang, Y., Che, W., Liu, T.: Character-level Chinese dependency parsing. In: Proceedings of ACL (2014)
  19. Zhang, Y., Wang, R.: Cross-domain dependency parsing using a deep linguistic grammar. In: Proceedings of ACL-AFNLP (2009)
  20. Zhang, Y., Li, Z., Lang, J., Xia, Q., Zhang, M.: Dependency parsing with partial annotations: an empirical comparison. In: Proceedings of IJCNLP (2017)
  21. Zhou, Z.H., Li, M.: Tri-training: exploiting unlabeled data using three classifiers. IEEE TKDE 17(11), 1529–1541 (2005)


Author information

Corresponding author: Hai Zhao.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Li, Z., Zhou, J., Zhao, H., Wang, R. (2019). Cross-Domain Transfer Learning for Dependency Parsing. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_77


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_77


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science; Computer Science (R0)
