Abstract
Existing research on weakly supervised named entity recognition (NER) deals only with flat entities and ignores nested entities. This paper proposes a multi-stage nested entity recognition method (MNR) that uses weakly labeled data to recognize nested entities. Weak labels generated from external knowledge bases suffer from two problems: incompleteness and labeling bias. To address these problems, MNR comprises two models. First, we propose a neural transition-based attention model (NTAM) that addresses weak-label incompleteness by learning the correlations between words; the NTAM also produces candidate entities, including nested ones. Second, we propose a multi-marker fusion attention judgment model (MAJM) that selects among candidate entities using context semantics, the candidates' meanings, and their boundary information, thereby addressing labeling bias. The boundary information of candidate entities is enhanced by fusing their type markers. To our knowledge, this is the first work to recognize nested entities under weak supervision by alleviating the noise in weakly labeled data. Experiments on three public nested NER datasets demonstrate the effectiveness of the proposed method under weak supervision and show that it outperforms previous state-of-the-art models in the supervised setting.
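To make the terms above concrete, the following is a minimal illustrative sketch, not the NTAM or MAJM themselves. It assumes a hypothetical toy lexicon standing in for an external knowledge base, and a hypothetical `<TYPE> ... </TYPE>` marker format; neither is specified in the abstract. It shows how lexicon matching can yield nested weak labels, how an entity missing from the lexicon goes unlabeled (incompleteness), and what wrapping a candidate entity in type markers might look like.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, entity_type) over token indices
Span = Tuple[int, int, str]


def weak_label(tokens: List[str],
               lexicon: Dict[Tuple[str, ...], str],
               max_len: int = 5) -> List[Span]:
    """Return every token span found in the lexicon, nested spans included."""
    spans: List[Span] = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            label = lexicon.get(tuple(tokens[start:end]))
            if label is not None:
                spans.append((start, end, label))
    return spans


def add_type_markers(tokens: List[str], span: Span) -> List[str]:
    """Wrap one candidate span in (assumed) type markers to expose its boundaries."""
    start, end, label = span
    return (tokens[:start] + [f"<{label}>"] + tokens[start:end]
            + [f"</{label}>"] + tokens[end:])


# Hypothetical example sentence and lexicon (assumptions for illustration only).
tokens = "the human IL-2 gene promoter".split()
lexicon = {("IL-2",): "protein", ("IL-2", "gene"): "DNA"}

candidates = weak_label(tokens, lexicon)
# "IL-2" (protein) is nested inside "IL-2 gene" (DNA); a longer mention such as
# "human IL-2 gene promoter" is absent from the lexicon and so receives no
# weak label, which is the incompleteness problem.
marked = add_type_markers(tokens, candidates[1])

print(candidates)
print(" ".join(marked))
```

In this sketch the marked sequence would be passed to a downstream classifier that judges whether the candidate is a true entity; how the paper actually fuses markers with attention is not described in the abstract.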
Data availability
The sources of the data used in this study are given in the article.
Acknowledgements
This research was supported by the Zhejiang Provincial Natural Science Foundation of China (No. LGF22F020014), the National Key Research and Development Program of China (No. 2020YFB1707700), and the National Natural Science Foundation of China (Nos. 62036009 and U1909203).
Author information
Contributions
NG contributed to Conceptualization, Methodology, Writing—review, Visualization. BY contributed to Investigation, Writing—original draft and editing, Software, Validation. PC contributed to Supervision, Project administration, Funding acquisition. LQ contributed to Supervision, Formal analysis.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.
About this article
Cite this article
Gao, N., Yang, B., Chen, P. et al. A multi-stage recognizer for nested named entity with weakly labeled data. J Supercomput 80, 3663–3693 (2024). https://doi.org/10.1007/s11227-023-05619-z
DOI: https://doi.org/10.1007/s11227-023-05619-z