ENER: Named Entity Recognition Model for Ethnic Ancient Books Based on Entity Boundary Detection

  • Conference paper
Cognitive Computing – ICCC 2023 (ICCC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14207)

Abstract

Entity identification rules for ethnic ancient books differ significantly from those assumed by existing methods. As a result, general-purpose models identify domain-specific terms with poor accuracy in entity extraction and fail to use boundary information effectively to resolve the ambiguity and nesting of Chinese entities. In this paper, we construct a small-scale named entity corpus of ethnic ancient books and propose an Ethnic Naming Entity Recognition (ENER) model that integrates entity boundary detection. In ENER, a BERT model is used for pre-training on the annotated corpus of ancient book text, and a Bidirectional Gated Recurrent Unit (BiGRU) encodes the contextual features of the ancient books. On top of the named entity recognition task, an auxiliary entity boundary detection task is added to enhance the model's ability to identify entity boundaries, and a Conditional Random Field (CRF) generates the named entity tag sequence of the ancient books. Experiments on the ancient book named entity corpus and on general Chinese data sets show the effectiveness of our approach. On the one hand, compared with the baseline BERT-BiLSTM-CRF model, ENER improves accuracy, recall and F1 by 2.09%, 1.62% and 1.85%, respectively, and achieves higher scores than the other models. On the other hand, ENER performs better at recognizing ancient book named entities in a small-scale corpus and remains stable on general Chinese data sets. It can be applied to text containing specific terms in the ethnic field and extended to more tasks in the future.
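
The auxiliary boundary detection task mentioned above needs its own labels. The abstract does not say how they are built, so the following is only a minimal sketch under stated assumptions: it derives type-agnostic boundary labels (B, I, E, S, O) from a BIO-tagged sequence and combines the two task losses with a fixed weight. The function names, the label scheme, the example tags, and the `aux_weight` parameter are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def extract_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_inclusive, entity_type) for each entity in a BIO sequence."""
    spans = []
    start, etype = None, None
    for i, tag in enumerate(tags):
        # An open entity continues only on an I- tag of the same type.
        continues = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not continues:
            if start is not None:
                spans.append((start, i - 1, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    if start is not None:
        spans.append((start, len(tags) - 1, etype))
    return spans


def boundary_labels(tags: List[str]) -> List[str]:
    """Derive boundary-only labels (assumed scheme): B(egin), I(nside), E(nd), S(ingle), O(utside)."""
    labels = ["O"] * len(tags)
    for start, end, _ in extract_spans(tags):
        if start == end:
            labels[start] = "S"
        else:
            labels[start] = "B"
            labels[end] = "E"
            for k in range(start + 1, end):
                labels[k] = "I"
    return labels


def joint_loss(ner_loss: float, boundary_loss: float, aux_weight: float = 0.5) -> float:
    """Weighted sum of the main NER loss and the auxiliary boundary loss (assumed form)."""
    return ner_loss + aux_weight * boundary_loss


# Hypothetical tag sequence with three entities of illustrative types.
tags = ["B-PER", "I-PER", "I-PER", "O", "B-LOC", "O", "B-BOOK", "I-BOOK"]
print(extract_spans(tags))
print(boundary_labels(tags))
```

In a multi-task setup of this kind, the shared encoder would feed two output heads, one trained on the entity tags and one on the derived boundary labels, so that boundary errors are penalized even when the entity type is uncertain.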

Author information

Corresponding author

Correspondence to Ziquan Feng.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, L., Feng, Z., Sun, N., Lu, Y. (2024). ENER: Named Entity Recognition Model for Ethnic Ancient Books Based on Entity Boundary Detection. In: Pan, X., Jin, T., Zhang, LJ. (eds) Cognitive Computing – ICCC 2023. ICCC 2023. Lecture Notes in Computer Science, vol 14207. Springer, Cham. https://doi.org/10.1007/978-3-031-51671-9_4

  • DOI: https://doi.org/10.1007/978-3-031-51671-9_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-51670-2

  • Online ISBN: 978-3-031-51671-9

  • eBook Packages: Computer Science, Computer Science (R0)
