Abstract
Nested entities arise when one entity contains other entities. Nested named entity recognition (NER) is a fundamental and challenging task in various NLP applications. The state-of-the-art nested NER approach first enumerates all the text spans in a sentence and then classifies each of them. We observe that a large proportion of entities contain only one token and thus cannot contain other entities, and that most text spans in a sentence are not entities, so full enumeration is costly and unnecessary. In this paper, we propose an efficient selective enumeration approach named BOUNCE. We decompose the nested NER task into two subtasks: identifying unit-length entities and identifying the remaining, longer entities. We develop a dedicated model for each subtask and train the two jointly. To improve efficiency, we employ a head detection module to locate the start points of entities, which acts as a filtering step before enumeration. We provide a detailed analysis of the time complexity of existing nested NER techniques and conduct extensive experiments on two datasets. The results demonstrate that BOUNCE outperforms various nested NER techniques and achieves higher efficiency than the state-of-the-art method with comparable accuracy.
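To illustrate why filtering by predicted start points shrinks the candidate set, the following minimal Python sketch compares full span enumeration with enumeration restricted to a set of head positions. It is not the authors' implementation: the function names, the `max_len` cap, and the hard-coded head positions (standing in for the output of a head detection module) are all assumptions made for this example.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), both inclusive token indices


def enumerate_all_spans(n_tokens: int, max_len: int) -> List[Span]:
    """Full enumeration: every span of length 1..max_len."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start, min(start + max_len, n_tokens))
    ]


def enumerate_selected_spans(
    n_tokens: int, heads: List[int], max_len: int
) -> List[Span]:
    """Selective enumeration: multi-token spans starting at a predicted head.

    Unit-length spans are left out because, in this sketch, they are assumed
    to be handled by a separate per-token classifier.
    """
    return [
        (start, end)
        for start in sorted(set(heads))
        if 0 <= start < n_tokens
        for end in range(start + 1, min(start + max_len, n_tokens))
    ]


if __name__ == "__main__":
    n_tokens, max_len = 6, 4
    heads = [1, 4]  # hypothetical head-detector output, not real predictions

    full = enumerate_all_spans(n_tokens, max_len)
    selected = enumerate_selected_spans(n_tokens, heads, max_len)

    print(len(full))      # 18 candidate spans under full enumeration
    print(len(selected))  # 4 candidate spans after head filtering
    print(selected)       # [(1, 2), (1, 3), (1, 4), (4, 5)]
```

In this toy setting the span classifier would see 4 candidates instead of 18. The saving in practice depends on how many start points the head detector keeps, which this sketch does not model.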
Notes
- 1. Our code is available at https://github.com/LiujunWang/BOUNCE.
- 2. The NNE and ACE2004 datasets are inaccessible due to lack of license.
Acknowledgements
This work is supported by the National Key Research and Development Program of China (No. 2018YFC0831604), NSFC (No. 61602297), and the Tencent Wechat Rhino-Bird Focused Research Program.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, L., Shen, Y. (2021). BOUNCE: An Efficient Selective Enumeration Approach for Nested Named Entity Recognition. In: U, L.H., Spaniol, M., Sakurai, Y., Chen, J. (eds) Web and Big Data. APWeb-WAIM 2021. Lecture Notes in Computer Science, vol 12859. Springer, Cham. https://doi.org/10.1007/978-3-030-85899-5_7
DOI: https://doi.org/10.1007/978-3-030-85899-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85898-8
Online ISBN: 978-3-030-85899-5
eBook Packages: Computer Science, Computer Science (R0)