
Multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1553))

Abstract

Achieving high-accuracy entity extraction is a crucial premise for the named entity recognition task. CCKS-2021 held a Knowledge Graph Fine-grained Entity Typing competition, in which 262 teams participated. The main challenges of the task are the extremely large amount of unlabeled data and the multi-label nature of the entity typing. In our approach, a semi-supervised learning strategy is adopted to cope with the unlabeled data, and a multi-label loss is employed to recognize multi-label entities. Our approach achieves an F1-score of 0.85498 on the final test data, which verifies its effectiveness and ranks second in the task.
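The paper itself does not include code on this page; the following is a minimal sketch, under stated assumptions, of how a multi-label typing head over a pre-trained encoder with a multi-label (binary cross-entropy) loss might look. The encoder name (`bert-base-chinese`), the size of the type inventory, and the mention-plus-context input format are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch: multi-label fine-grained entity typing with a pre-trained encoder.
# Not the authors' released code; model name, NUM_TYPES, and input format are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

NUM_TYPES = 24  # hypothetical size of the fine-grained type inventory

class MultiLabelTyper(nn.Module):
    def __init__(self, encoder_name="bert-base-chinese", num_types=NUM_TYPES):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_types)  # one logit per type

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] representation of mention + context
        return self.classifier(cls)

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = MultiLabelTyper()
loss_fn = nn.BCEWithLogitsLoss()  # multi-label loss: independent sigmoid per type

# Toy batch: an entity mention paired with its encyclopedia context sentence.
batch = tokenizer(["李白 [SEP] 李白是唐代著名诗人。"], return_tensors="pt",
                  padding=True, truncation=True)
labels = torch.zeros(1, NUM_TYPES)
labels[0, [3, 7]] = 1.0  # an entity can carry several types at once

logits = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, labels)
preds = torch.sigmoid(logits) > 0.5  # per-type decisions, so multiple labels can fire
```

On top of such a model, the semi-supervised handling of unlabeled data could take the form of, for example, pseudo-labeling (typing the unlabeled mentions and adding high-confidence predictions to the training set); the strategy actually used by the authors is detailed in the full paper.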




Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pu, K., Liu, H., Yang, Y., Lv, W., Li, J. (2022). Multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model. In: Qin, B., Wang, H., Liu, M., Zhang, J. (eds) CCKS 2021 - Evaluation Track. CCKS 2021. Communications in Computer and Information Science, vol 1553. Springer, Singapore. https://doi.org/10.1007/978-981-19-0713-5_13

  • DOI: https://doi.org/10.1007/978-981-19-0713-5_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0712-8

  • Online ISBN: 978-981-19-0713-5

  • eBook Packages: Computer Science, Computer Science (R0)
