
TL-NER: A Transfer Learning Model for Chinese Named Entity Recognition

Abstract

Most current research on Named Entity Recognition (NER) in the Chinese domain assumes that annotated data are adequate. In many scenarios, however, the amount of annotated data required for the Chinese NER task is difficult to obtain, which leads to poor performance of machine learning methods. In view of this situation, this paper mines the information contained in massive amounts of unlabeled raw text and uses it to enhance performance on the Chinese NER task. We propose a deep learning model combined with a transfer learning technique; the method can be applied in domains where a large amount of unlabeled text is available but only a small amount of annotated data exists. Experimental results show that the proposed method performs well on datasets of different sizes and also avoids the errors that occur during word segmentation. We further evaluate the effect of transfer learning from different aspects through a series of experiments.
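
To make the approach described above more concrete, the sketch below shows one plausible instantiation of the idea; it is not taken from the paper and should not be read as the authors' exact TL-NER architecture. A character-level language model is first pretrained on unlabeled raw Chinese text, and its learned character embeddings are then transferred into a BiLSTM tagger that is fine-tuned on a small annotated NER corpus. The character-level choice reflects the abstract's point about avoiding word-segmentation errors; all class names, dimensions, and the use of PyTorch are illustrative assumptions, and the CRF decoding layer common in NER systems is omitted for brevity.

    # Illustrative sketch only -- not the authors' TL-NER implementation.
    import torch
    import torch.nn as nn

    class CharLM(nn.Module):
        """Character-level language model pretrained on unlabeled raw text."""
        def __init__(self, vocab_size, emb_dim=100, hidden_dim=200):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, vocab_size)   # next-character prediction

        def forward(self, chars):                          # chars: (batch, seq_len)
            hidden, _ = self.lstm(self.embed(chars))
            return self.out(hidden)                        # logits over the character vocabulary

    class CharNERTagger(nn.Module):
        """Character-based BiLSTM tagger; working on characters avoids word segmentation."""
        def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=200):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden_dim, num_tags)  # per-character tag scores

        def forward(self, chars):
            hidden, _ = self.lstm(self.embed(chars))
            return self.out(hidden)

    def transfer_embeddings(lm: CharLM, tagger: CharNERTagger) -> CharNERTagger:
        """Copy the pretrained character embeddings from the language model into
        the tagger before fine-tuning it on the (small) annotated NER corpus."""
        tagger.embed.weight.data.copy_(lm.embed.weight.data)
        return tagger

In a setup of this kind, the language model is trained with a standard next-character cross-entropy loss on the unlabeled corpus; only afterwards is the tagger, initialized with the transferred embeddings, trained on the annotated data, typically with a CRF layer and Viterbi decoding added on top of the emission scores.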

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61772342 and 61703278. We would like to express our special thanks to the reviewers for their precious comments and suggestions, which were of great help in improving this paper.

Author information

Corresponding author

Correspondence to DunLu Peng.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Peng, D., Wang, Y., Liu, C. et al. TL-NER: A Transfer Learning Model for Chinese Named Entity Recognition. Inf Syst Front 22, 1291–1304 (2020). https://doi.org/10.1007/s10796-019-09932-y
