Abstract
We show that strong domain adaptation results for dependency parsing can be achieved using a conceptually simple method that learns domain-invariant word representations. Because labeled resources are scarce, dependency parsing for low-resource domains remains challenging. Existing work adapts a model trained on a resource-rich source domain to low-resource target domains, typically by finding a set of features shared across domains. For neural network models, word embeddings are a fundamental set of initial features, yet this simple aspect has received little investigation. We propose to learn domain-invariant word representations by adversarially fine-tuning pretrained word embeddings. Our parser achieves error reductions of 5.6% UAS and 7.9% LAS on PTB, and 4.2% UAS and 3.2% LAS on Genia, demonstrating that domain-invariant word representations alleviate lexical bias between source and target data.
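To make the idea concrete, below is a minimal sketch, not the authors' implementation, of adversarially fine-tuning a pretrained embedding table so a critic cannot distinguish source-domain words from target-domain words. It assumes PyTorch and a WGAN-style critic with weight clipping (Arjovsky et al., 2017); every name, size, and hyperparameter is an illustrative assumption.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the paper's actual configuration is not given here.
VOCAB_SIZE, EMB_DIM, CLIP = 10_000, 100, 0.01

# Shared embedding table, assumed initialized from pretrained vectors
# (e.g. GloVe) and then fine-tuned adversarially.
embedding = nn.Embedding(VOCAB_SIZE, EMB_DIM)

# WGAN-style critic that scores how "source-domain-like" an embedding is.
critic = nn.Sequential(nn.Linear(EMB_DIM, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

emb_opt = torch.optim.RMSprop(embedding.parameters(), lr=5e-5)
crit_opt = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

def adversarial_step(src_ids, tgt_ids, n_critic=5):
    """One adversarial round over batches of source/target word ids."""
    # 1) Train the critic to maximize the estimated Wasserstein distance
    #    between source and target embedding distributions.
    for _ in range(n_critic):
        crit_opt.zero_grad()
        dist = (critic(embedding(src_ids).detach()).mean()
                - critic(embedding(tgt_ids).detach()).mean())
        (-dist).backward()
        crit_opt.step()
        for p in critic.parameters():          # weight clipping, per WGAN
            p.data.clamp_(-CLIP, CLIP)

    # 2) Fine-tune the embeddings to minimize that distance, pushing the
    #    two domains toward a shared, domain-invariant vector space.
    emb_opt.zero_grad()
    dist = critic(embedding(src_ids)).mean() - critic(embedding(tgt_ids)).mean()
    dist.backward()
    emb_opt.step()

# Toy usage: random word-id batches standing in for words sampled from
# source-domain (e.g. PTB) and target-domain (e.g. Genia) text.
src_batch = torch.randint(0, VOCAB_SIZE, (32,))
tgt_batch = torch.randint(0, VOCAB_SIZE, (32,))
adversarial_step(src_batch, tgt_batch)
```

In practice the fine-tuned table would then initialize the parser's embedding layer, so both the adversarial objective and the parsing objective shape the shared representation space.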
Acknowledgments
This work was done when the first author was visiting Westlake University. We gratefully acknowledge funding from the National Key Research and Development Program of China (No. 2018YFC0830700). We also thank the anonymous reviewers for their helpful comments and suggestions.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Qiao, X., Zhang, Y., Zhao, T. (2019). Learning Domain Invariant Word Representations for Parsing Domain Adaptation. In: Tang, J., Kan, M.-Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11838. Springer, Cham. https://doi.org/10.1007/978-3-030-32233-5_62
Print ISBN: 978-3-030-32232-8
Online ISBN: 978-3-030-32233-5