
Cross-Domain Transfer Learning for Dependency Parsing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

In recent years, research on dependency parsing has focused on improving accuracy on in-domain data and has made remarkable progress. However, the real world is not a single-scenario dataset: it is filled with countless scenarios that existing datasets do not cover, i.e., out-of-domain data. As a result, parsers that perform well on in-domain data often suffer significant performance degradation on out-of-domain data. Therefore, to adapt existing well-performing in-domain parsers to new domains, cross-domain transfer learning techniques are essential for solving the domain problem in parsing. In this paper, we examine two scenarios for cross-domain transfer learning: semi-supervised and unsupervised cross-domain transfer learning. Specifically, we adopt the pretrained language model BERT for training on the source-domain (in-domain) data at the subword level and introduce two tri-training variants, one for each scenario, to achieve cross-domain transfer. The system based on this paper participated in the NLPCC-2019 shared task on cross-domain dependency parsing and won first place on the “subtask3-un-open” and “subtask4-semi-open” subtasks, indicating the effectiveness of the adopted approaches.
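
As a rough illustration of the tri-training variants mentioned above, the following Python sketch shows the standard tri-training loop of Zhou and Li (2005) adapted to parsing. It is a minimal sketch under stated assumptions, not the authors' implementation: make_parser is a hypothetical factory returning a parser object with train and parse methods, and predicted trees are assumed to be directly comparable (e.g. tuples of head indices).

import random

def tri_train(make_parser, labeled_source, unlabeled_target, n_rounds=3):
    # Minimal sketch of tri-training (Zhou and Li, 2005) for cross-domain
    # parsing. `make_parser` is a hypothetical factory returning an object
    # with .train(pairs) and .parse(sentence) methods.
    parsers = []
    for _ in range(3):
        p = make_parser()
        # Train each parser on a bootstrap sample of the source treebank
        # so that the three models differ.
        p.train(random.choices(labeled_source, k=len(labeled_source)))
        parsers.append(p)

    for _ in range(n_rounds):
        pseudo = [[] for _ in range(3)]
        for sent in unlabeled_target:
            trees = [p.parse(sent) for p in parsers]
            for i in range(3):
                j, k = (i + 1) % 3, (i + 2) % 3
                # If the other two parsers agree on a target-domain tree,
                # it becomes a pseudo-labeled example for parser i.
                if trees[j] == trees[k]:
                    pseudo[i].append((sent, trees[j]))
        for i in range(3):
            # Retrain on gold source data plus the pseudo-labeled target data.
            parsers[i] = make_parser()
            parsers[i].train(labeled_source + pseudo[i])
    return parsers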

This paper was partially supported by the National Key Research and Development Program of China (No. 2017YFB0304100) and the Key Projects of the National Natural Science Foundation of China (U1836222 and 61733011).


Notes

1.

    Our code will be available at https://github.com/bcmi220/cddp.

2.

Subtasks “subtask1-un-closed” and “subtask2-semi-closed” are not our focus. Since our baseline parsing framework is based on BERT, and subtasks 1 and 2 prohibit the use of BERT and other external resources, we only use the Transformer structure of BERT without initializing it from the BERT pretrained weights. The Transformer network of BERT is very deep, and the training dataset currently offered is too small to train such a deep network well, so we only reached results comparable to those of other participants. This illustrates that deep neural networks need enough data for training.

3.

For the training phase, there is no need to consider this issue at all. As with other graph-based models, the predicted tree at training time is the one in which each word is attached to its highest-scoring head, including both intra-word and inter-word dependencies (a minimal sketch of this head-selection rule is given after these notes).

4.

Due to the iterative tri-training process, the amount of unlabeled data becomes much larger than the gold-annotated data. To balance the training of the model, we repeat the gold data until it matches the amount of unlabeled data and then shuffle the data during training (see the balancing sketch after these notes).

5.

The initial score for each model run is set to 0, so at least one model will be saved for each training session (see the checkpointing sketch after these notes).
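
For Note 3, the following minimal sketch (using NumPy toy scores rather than the paper's BERT-based biaffine scorer) illustrates the head-selection rule described there: at training time each token simply takes its highest-scoring candidate head, whether the arc is intra-word or inter-word.

import numpy as np

# Toy arc-score matrix: rows are dependents, columns are candidate heads,
# index 0 is the ROOT token. In the paper these scores would come from the
# subword-level scorer; here they are random numbers for illustration.
n = 5                                   # toy sentence length (subword tokens)
scores = np.random.randn(n + 1, n + 1)  # (n+1) x (n+1) arc scores
np.fill_diagonal(scores, -np.inf)       # a token cannot be its own head
heads = scores[1:].argmax(axis=1)       # highest-scoring head for tokens 1..n
print(heads)                            # predicted head index for each token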
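
For Note 4, this is a minimal sketch of the balancing step, assuming gold and pseudo are simple Python lists of training instances (the function name and arguments are hypothetical).

import random

def balance_and_shuffle(gold, pseudo, seed=42):
    # Repeat the gold-annotated data until it matches the amount of
    # pseudo-labeled (tri-training) data, then shuffle the mixture.
    repeats = -(-len(pseudo) // max(len(gold), 1))     # ceiling division
    oversampled_gold = (gold * repeats)[:len(pseudo)]  # same size as pseudo
    mixed = oversampled_gold + pseudo
    random.Random(seed).shuffle(mixed)
    return mixed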
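
For Note 5, the checkpointing rule can be sketched as follows; evaluate and save are hypothetical callables standing in for dev-set scoring (e.g. LAS) and checkpoint writing, not the authors' actual code.

def train_with_checkpoints(candidate_models, evaluate, save):
    # The best score starts at 0, so the first evaluated model always
    # clears the bar and at least one checkpoint is written per run.
    best = 0.0
    for step, model in enumerate(candidate_models):
        score = evaluate(model)       # e.g. LAS on the development set
        if score >= best:
            best = score
            save(model, step)
    return best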

References

  1. Andor, D., et al.: Globally normalized transition-based neural networks. In: Proceedings of ACL (2016)
  2. Angeli, G., Premkumar, M.J.J., Manning, C.D.: Leveraging linguistic structure for open domain information extraction. In: Proceedings of ACL-IJCNLP (2015)
  3. Bowman, S.R., Gauthier, J., Rastogi, A., Gupta, R., Manning, C.D., Potts, C.: A fast unified model for parsing and sentence understanding. In: Proceedings of ACL (2016)
  4. Chen, K., et al.: Neural machine translation with source dependency representation. In: Proceedings of EMNLP (2017)
  5. Clark, K., Luong, M.T., Manning, C.D., Le, Q.: Semi-supervised sequence modeling with cross-view training. In: Proceedings of EMNLP (2018)
  6. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
  7. Dozat, T., Manning, C.D.: Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734 (2016)
  8. Hatori, J., Matsuzaki, T., Miyao, Y., Tsujii, J.: Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese. In: Proceedings of ACL (2012)
  9. He, S., Li, Z., Zhao, H., Bai, H.: Syntax for semantic role labeling, to be, or not to be. In: Proceedings of ACL (2018)
  10. Jiang, X., Li, Z., Zhang, B., Zhang, M., Li, S., Si, L.: Supervised treebank conversion: data and approaches. In: Proceedings of ACL (2018)
  11. Kurita, S., Kawahara, D., Kurohashi, S.: Neural joint model for transition-based Chinese syntactic analysis. In: Proceedings of ACL (2017)
  12. Levy, O., Goldberg, Y.: Dependency-based word embeddings. In: Proceedings of ACL (2014)
  13. Li, Z., Cai, J., He, S., Zhao, H.: Seq2seq dependency parsing. In: Proceedings of COLING, pp. 3203–3214 (2018)
  14. Peters, M., et al.: Deep contextualized word representations. In: Proceedings of NAACL-HLT, pp. 2227–2237 (2018)
  15. Ruder, S., Plank, B.: Strong baselines for neural semi-supervised learning under domain shift. In: Proceedings of ACL (2018)
  16. Peng, X., Li, Z., Zhang, M., Wang, R., Si, L.: Overview of the NLPCC 2019 shared task: cross-domain dependency parsing. In: Proceedings of NLPCC (2019)
  17. Yan, H., Qiu, X., Huang, X.: A unified model for joint Chinese word segmentation and dependency parsing. arXiv preprint arXiv:1904.04697 (2019)
  18. Zhang, M., Zhang, Y., Che, W., Liu, T.: Character-level Chinese dependency parsing. In: Proceedings of ACL (2014)
  19. Zhang, Y., Wang, R.: Cross-domain dependency parsing using a deep linguistic grammar. In: Proceedings of ACL-AFNLP (2009)
  20. Zhang, Y., Li, Z., Lang, J., Xia, Q., Zhang, M.: Dependency parsing with partial annotations: an empirical comparison. In: Proceedings of IJCNLP (2017)
  21. Zhou, Z.H., Li, M.: Tri-training: exploiting unlabeled data using three classifiers. IEEE TKDE 17(11), 1529–1541 (2005)


Author information

Corresponding author: Hai Zhao.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Li, Z., Zhou, J., Zhao, H., Wang, R. (2019). Cross-Domain Transfer Learning for Dependency Parsing. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_77


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_77


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science; Computer Science (R0)
