
ALBERT-Based Chinese Named Entity Recognition

Conference paper

In: Cognitive Computing – ICCC 2020 (ICCC 2020). Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12408).

Abstract

Chinese named entity recognition (NER) is an important problem in the field of natural language processing (NLP). Most existing methods rely on traditional deep learning models that cannot fully exploit the contextual dependencies essential for capturing the relations between words or characters. To address this problem, language representation models such as BERT have been proposed to learn global context information. Although these models achieve good results, their large number of parameters limits their efficiency and applicability in real-world scenarios. To improve both performance and efficiency, this paper proposes an ALBERT-based Chinese NER method that uses ALBERT, a lite version of BERT, as the pre-trained model to reduce the parameter count and to improve performance through cross-layer parameter sharing. In addition, it uses a conditional random field (CRF) to capture sentence-level correlations between words or characters and thereby alleviate tagging inconsistencies. Experimental results demonstrate that our method outperforms the comparison methods by 4.23–11.17% in terms of relative F1-measure with only 4% of BERT's parameters.
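The architecture the abstract describes, an ALBERT encoder whose token-level emission scores feed a CRF tagging layer, can be sketched roughly as follows. This is a minimal illustrative reconstruction, not the authors' released implementation: the checkpoint name (a Chinese ALBERT checkpoint would be substituted in practice), the 7-tag BIO scheme, and the use of the third-party pytorch-crf package alongside Hugging Face transformers are all assumptions.

```python
# Minimal sketch of an ALBERT + CRF tagger (assumptions: checkpoint name,
# 7-tag BIO scheme for PER/LOC/ORG, and the pytorch-crf package; this is
# not the paper's released code).
import torch.nn as nn
from transformers import AlbertModel  # pip install transformers
from torchcrf import CRF              # pip install pytorch-crf


class AlbertCrfTagger(nn.Module):
    def __init__(self, checkpoint="albert-base-v2", num_tags=7):
        super().__init__()
        # ALBERT shares parameters across layers, so the encoder is far
        # smaller than BERT while still producing contextual embeddings.
        self.encoder = AlbertModel.from_pretrained(checkpoint)
        self.emission = nn.Linear(self.encoder.config.hidden_size, num_tags)
        # The CRF models sentence-level tag transitions, discouraging
        # inconsistent sequences such as "O -> I-PER".
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state
        emissions = self.emission(hidden)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: Viterbi decoding of the most likely tag sequence.
        return self.crf.decode(emissions, mask=mask)
```

During training, the CRF's negative log-likelihood replaces a per-token cross-entropy loss; because the CRF learns tag-transition scores, it can penalize invalid sequences such as an I- tag following O, which is how this kind of model alleviates the tagging inconsistencies the abstract mentions.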



Author information

Corresponding author: Yishuang Ning


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Lv, H., Ning, Y., Ning, K. (2020). ALBERT-Based Chinese Named Entity Recognition. In: Yang, Y., Yu, L., Zhang, L.-J. (eds.) Cognitive Computing – ICCC 2020. Lecture Notes in Computer Science, vol. 12408. Springer, Cham. https://doi.org/10.1007/978-3-030-59585-2_7

  • DOI: https://doi.org/10.1007/978-3-030-59585-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59584-5

  • Online ISBN: 978-3-030-59585-2

  • eBook Packages: Computer Science (R0)
