An Empirical Study of Multi-domain and Multi-task Learning in Chinese Named Entity Recognition

Hu, Yun; Liao, Mingxue; Lv, Pin; Zheng, Changwen

doi:10.1007/978-3-030-30484-3_58

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11728)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Named entity recognition (NER) often suffers from a lack of annotated data. Multi-domain and multi-task learning alleviate this problem to some degree. However, previous work on multi-domain and multi-task learning has mostly focused on English. Moreover, multi-domain and multi-task learning are often studied independently. In this paper, we first summarize previous work on multi-domain and multi-task learning in NER. Then, we introduce multi-domain and multi-task learning in Chinese NER. Finally, we explore universal models that serve both multi-domain and multi-task learning. Experiments show that the universal models can be used in Chinese NER and outperform the baseline model.

This work is supported by both the National Scientific and Technological Innovation Zero (No. 17-H863-01-ZT-005-005-01) and the State’s Key Project of Research and Development Plan (No. 2016QY03D0505).

Author information

Authors and Affiliations

University of Chinese Academy of Sciences, Beijing, China
Yun Hu
Institute of Software, Chinese Academy of Sciences, Beijing, China
Yun Hu, Mingxue Liao, Pin Lv & Changwen Zheng

Corresponding author

Correspondence to Yun Hu.

Editor information

Editors and Affiliations

Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Igor V. Tetko
Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Pavel Karpov
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Fabian Theis

About this paper

Cite this paper

Hu, Y., Liao, M., Lv, P., Zheng, C. (2019). An Empirical Study of Multi-domain and Multi-task Learning in Chinese Named Entity Recognition. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_58

DOI: https://doi.org/10.1007/978-3-030-30484-3_58
Published: 09 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30483-6
Online ISBN: 978-3-030-30484-3
eBook Packages: Computer Science, Computer Science (R0)
