
Multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1553))

Abstract

Achieving high-accuracy entity extraction is a crucial premise for the named entity recognition task. CCKS-2021 held a Knowledge Graph Fine-grained Entity Typing competition, in which 262 teams participated. The main challenges of the task are the extremely large amount of unlabeled data and the multi-label nature of the entity typing. In our approach, a semi-supervised learning strategy is adopted to cope with the unlabeled data, and a multi-label loss is employed to recognize multi-label entities. Our approach achieves an F1-score of 0.85498 on the final test data, which verifies its effectiveness and ranks second in the task.
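The paper itself does not include code on this page; the following is a minimal sketch, under stated assumptions, of how a multi-label typing head over a pre-trained encoder with a multi-label (binary cross-entropy) loss might look. The encoder name (`bert-base-chinese`), the size of the type inventory, and the mention-plus-context input format are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch: multi-label fine-grained entity typing with a pre-trained encoder.
# Not the authors' released code; model name, NUM_TYPES, and input format are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

NUM_TYPES = 24  # hypothetical size of the fine-grained type inventory

class MultiLabelTyper(nn.Module):
    def __init__(self, encoder_name="bert-base-chinese", num_types=NUM_TYPES):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_types)  # one logit per type

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] representation of mention + context
        return self.classifier(cls)

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = MultiLabelTyper()
loss_fn = nn.BCEWithLogitsLoss()  # multi-label loss: independent sigmoid per type

# Toy batch: an entity mention paired with its encyclopedia context sentence.
batch = tokenizer(["李白 [SEP] 李白是唐代著名诗人。"], return_tensors="pt",
                  padding=True, truncation=True)
labels = torch.zeros(1, NUM_TYPES)
labels[0, [3, 7]] = 1.0  # an entity can carry several types at once

logits = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, labels)
preds = torch.sigmoid(logits) > 0.5  # per-type decisions, so multiple labels can fire
```

On top of such a model, the semi-supervised handling of unlabeled data could take the form of, for example, pseudo-labeling (typing the unlabeled mentions and adding high-confidence predictions to the training set); the strategy actually used by the authors is detailed in the full paper.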




Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pu, K., Liu, H., Yang, Y., Lv, W., Li, J. (2022). Multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model. In: Qin, B., Wang, H., Liu, M., Zhang, J. (eds) CCKS 2021 - Evaluation Track. CCKS 2021. Communications in Computer and Information Science, vol 1553. Springer, Singapore. https://doi.org/10.1007/978-981-19-0713-5_13

  • DOI: https://doi.org/10.1007/978-981-19-0713-5_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0712-8

  • Online ISBN: 978-981-19-0713-5

  • eBook Packages: Computer Science, Computer Science (R0)
