Abstract
Chinese named entity recognition (NER) is an important task in natural language processing (NLP). Most existing methods rely on traditional deep learning models that cannot fully leverage the contextual dependencies needed to capture relations between words or characters. To address this problem, language representation methods such as BERT have been proposed to learn global context information. Although these methods achieve good results, their large number of parameters limits their efficiency and applicability in real-world scenarios. To improve both performance and efficiency, this paper proposes an ALBERT-based Chinese NER method that uses ALBERT, a lite version of BERT, as the pre-trained model, reducing the parameter count through cross-layer parameter sharing while improving performance. In addition, it uses a conditional random field (CRF) to capture sentence-level correlations between words or characters and thereby alleviate tagging inconsistencies. Experimental results demonstrate that our method outperforms the comparison methods by 4.23–11.17% in relative F1-measure while using only about 4% of BERT's parameters.
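The abstract describes a standard two-stage tagging pipeline: an ALBERT encoder produces contextual token representations, a linear layer maps them to per-tag emission scores, and a CRF decodes the best tag sequence over the whole sentence. The sketch below is a minimal illustration of that architecture, not the authors' released code; it assumes PyTorch and Hugging Face Transformers, and the checkpoint name, tag count, and hand-rolled Viterbi decoder are illustrative assumptions.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel  # Hugging Face Transformers


class AlbertCrfTagger(nn.Module):
    """Sketch of an ALBERT + CRF sequence tagger for NER."""

    def __init__(self, model_name: str, num_tags: int):
        super().__init__()
        # Checkpoint name is an assumption; a Chinese ALBERT checkpoint
        # (e.g. an "albert_chinese_*" model) would be used in practice.
        self.encoder = AlbertModel.from_pretrained(model_name)
        self.emissions = nn.Linear(self.encoder.config.hidden_size, num_tags)
        # CRF transition scores: transitions[i, j] = score of moving
        # from tag i at position t-1 to tag j at position t.
        self.transitions = nn.Parameter(torch.randn(num_tags, num_tags) * 0.01)

    def forward(self, input_ids, attention_mask):
        # Contextual token representations from ALBERT.
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        return self.emissions(hidden)  # (batch, seq_len, num_tags)

    @torch.no_grad()
    def viterbi_decode(self, emissions):
        """Best tag sequence for one sentence; emissions is (seq_len, num_tags)."""
        seq_len, num_tags = emissions.shape
        score = emissions[0]  # scores of each tag at the first position
        backpointers = []
        for t in range(1, seq_len):
            # total[i, j] = score[i] + transitions[i, j] + emissions[t, j]
            total = score.unsqueeze(1) + self.transitions + emissions[t]
            score, best_prev = total.max(dim=0)  # best previous tag for each j
            backpointers.append(best_prev)
        # Backtrack from the best final tag.
        best_tag = score.argmax().item()
        path = [best_tag]
        for best_prev in reversed(backpointers):
            best_tag = best_prev[best_tag].item()
            path.append(best_tag)
        return list(reversed(path))
```

The learned transition matrix is what lets the CRF address the tagging inconsistencies mentioned in the abstract: an invalid transition such as O followed by I-PER receives a low score, so Viterbi decoding prefers label sequences that are consistent at the sentence level rather than scoring each character independently.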