Abstract
Discontinuous named entity recognition (NER) is more challenging than continuous NER: it aims to extract discontinuous entities composed of multiple non-adjacent spans, which requires representing and combining all the spans of each discontinuous entity. However, discontinuous NER may suffer from decoding ambiguity due to the large space of span combinations and the lack of association information between spans. To address this problem, we propose a simple yet effective span-level tagging scheme for discontinuous NER. The scheme defines simple span-level tags that represent and associate all the spans of each discontinuous entity simultaneously, effectively resolving the decoding ambiguity. Moreover, the proposed model employs a co-predictor, consisting of a span-level graph-based predictor and a position-aware biaffine predictor, to predict span-level tags. The span-level graph-based predictor enhances span representations by applying a graph convolutional network to span-level graphs, capturing the dependence between the spans of each discontinuous entity. The position-aware biaffine predictor incorporates relative position information into the biaffine mechanism to enrich the structural information of span representations. To verify the effectiveness of our method, we conduct experiments on three benchmark datasets (i.e., CADEC, ShARe 13 and ShARe 14). The results show that our method significantly outperforms previous state-of-the-art methods.
Data availability
The datasets used in this article are publicly available.
Code availability
Our code is available at https://github.com/isLouisHsu/span-scheme-for-disc-ner.
Acknowledgements
The authors greatly appreciate the anonymous reviewers for providing valuable and insightful comments on the manuscript.
Funding
The authors received no financial support for the research, authorship, and publication of this article.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest with respect to the research, authorship and/or publication of this article.
Ethical approval
This article does not involve human subjects for data collection; therefore, ethical approval was not required.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A
1.1 Analysis of time complexity
We analyze the time complexity of our method. The variables used in the analysis are defined as follows: l: the maximum text length; m: the number of entities; k: the number of non-O spans (see Line 1 in Algorithm 1); \(d_w\): the word embedding size; \(l_{b}\): the number of BERT layers; \(d_{l}\): the hidden dimension of the LSTM; \(d_h\): the size of the final word representation; \(l_{g}\): the number of GCN layers; \(d_{s}\): the size of the span representation; e: the number of edges in the span-level graph; c: the number of span-level tags; \(d_z\): the size of the RPE. Our method mainly includes three components: the encoder, the co-predictor and the decoding process.
Encoder The word embeddings of the target words are generated by BERT, whose time complexity is \(O(l_{b}l^2d_w + l_{b} ld_{w}^2)\). To obtain the final word representations, these embeddings are then fed into a Bi-LSTM, whose time complexity is \(O(ld_{l}^2+ld_{w}d_{l})\). Thus, the time complexity of the encoder is \(O(l_{b} l^2 d_{w} + l_{b} l d_{w}^2 +l d_{l}^2 + ld_{w} d_{l})\).
Co-predictor The tag scores of spans are predicted jointly by the span-level graph-based predictor and the position-aware biaffine predictor. The time complexity of the former is \(O(l_{g} e d_{s} + l_{g} \frac{l(l+1)}{2} d_{s}^2 + d_{s} c)\) and that of the latter is \(O(d_h^2 c + 2d_h c + (d_h c + c) d_z)\). Therefore, the time complexity of the co-predictor is \(O(l_{g} e d_{s} + l_{g} \frac{l(l+1)}{2} d_{s}^2 + d_{s} c + d_h^2 c + 2d_h c + (d_h c+c) d_z)\).
Decoding Process In the inference phase, all entities are decoded by the decoding algorithm, whose time complexity is \(O(k^2 + mk + k)\).
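To make the origin of the \(O(k^2 + mk)\) terms concrete, the following schematic sketch (not the paper's Algorithm 1; the `linked` pair set is an invented stand-in for the span-level association tags) scans every pair of the k non-O spans and merges linked spans into entities:

```python
# Schematic decoding sketch (NOT the paper's Algorithm 1): with k non-O
# spans, checking every ordered pair for an association link costs O(k^2),
# and merging spans into the m resulting entities adds O(mk).
def decode(spans, linked):
    """spans: list of (start, end); linked: set of index pairs (i, j)
    meaning span i and span j belong to the same entity."""
    entities = []
    for i in range(len(spans)):                # O(k^2) pair scan
        for j in range(i + 1, len(spans)):
            if (i, j) in linked:
                placed = False
                for ent in entities:           # O(mk) merge step
                    if i in ent or j in ent:
                        ent.update({i, j})
                        placed = True
                        break
                if not placed:
                    entities.append({i, j})
    return [sorted(spans[i] for i in ent) for ent in entities]

# Spans 0 and 1, and 0 and 2, are linked: one discontinuous entity.
print(decode([(0, 1), (3, 4), (6, 7)], {(0, 1), (0, 2)}))
```

The nested pair scan is what makes the decoding cost quadratic in k rather than in the text length l, which is why it is negligible next to the encoder and co-predictor terms.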
Overall Combining the terms above, the time complexity of our method for a single inference is \(O(l_{b} l^2 d_{w} + l_{b} l d_{w}^2 +l d_{l}^2 + ld_{w} d_{l} + l_{g} e d_{s} + l_{g} \frac{l(l+1)}{2} d_{s}^2 + d_{s} c + d_h^2 c + 2d_h c + (d_h c+c) d_z + k^2 + mk + k)\).
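As a rough sanity check, the terms above can be evaluated numerically. In the sketch below, every hyperparameter value is an illustrative assumption rather than the paper's actual setting; it only shows the relative share each component contributes to the total:

```python
# Illustrative magnitudes for each component's time-complexity term.
# All hyperparameter values below are assumptions for illustration only.
l, m, k = 128, 5, 10          # text length, entities, non-O spans
d_w, l_b = 768, 12            # BERT hidden size and number of layers
d_l, d_h = 256, 512           # LSTM hidden size, final word repr size
l_g, d_s = 2, 256             # GCN layers, span representation size
e, c, d_z = 500, 8, 64        # graph edges, tag count, RPE size

encoder = l_b * l**2 * d_w + l_b * l * d_w**2 + l * d_l**2 + l * d_w * d_l
copredictor = (l_g * e * d_s + l_g * (l * (l + 1) // 2) * d_s**2
               + d_s * c + d_h**2 * c + 2 * d_h * c + (d_h * c + c) * d_z)
decoding = k**2 + m * k + k

total = encoder + copredictor + decoding
for name, v in [("encoder", encoder), ("co-predictor", copredictor),
                ("decoding", decoding)]:
    print(f"{name:12s} {v:>14,d} ({100 * v / total:.1f}%)")
```

Under these assumed values the encoder and co-predictor account for essentially all of the cost, while the decoding term \(k^2 + mk + k\) is negligible because k is far smaller than l.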
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mao, T., Xu, Y., Liu, W. et al. A simple but effective span-level tagging method for discontinuous named entity recognition. Neural Comput & Applic 36, 7187–7201 (2024). https://doi.org/10.1007/s00521-024-09454-y