
Incorporating Boundary and Category Feature for Nested Named Entity Recognition

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12113)


Abstract

In natural language processing (NLP), entities are frequently nested inside other entities. Most existing named entity recognition (NER) models focus on flat entities and ignore nested ones. In this paper, we propose a neural model for nested named entity recognition. Our model employs a multi-label boundary detection module to detect entity boundaries, avoiding the boundary detection conflicts that arise in the boundary-aware model. In addition, because the model combines a boundary detection module with a category detection module, it detects entity boundaries and entity categories simultaneously, avoiding the error propagation problem of current pipeline models. Furthermore, we introduce multitask learning to train the boundary detection module and the category detection module jointly, capturing the underlying association between entity boundary information and entity category information. In this way, our model achieves better entity extraction performance. In evaluations on two nested NER datasets and one flat NER dataset, we show that our model outperforms previous state-of-the-art models on both nested and flat NER.
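The abstract does not spell out the label scheme, so the following is only a data-level sketch of how multi-label boundary targets could sidestep boundary conflicts. It assumes each token carries two independent 0/1 flags (starts an entity, ends an entity), that candidate spans are formed by pairing flagged starts with flagged ends, and that the two task losses are combined by a weighted sum. The function names, the example sentence, its spans, and the `weight` parameter are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, inclusive end index, category)


def boundary_labels(n_tokens: int, spans: List[Span]) -> List[Tuple[int, int]]:
    """Build per-token (is_start, is_end) flags; the two flags are independent."""
    flags = [[0, 0] for _ in range(n_tokens)]
    for start, end, _category in spans:
        flags[start][0] = 1  # token opens at least one entity
        flags[end][1] = 1    # token closes at least one entity
    return [(s, e) for s, e in flags]


def candidate_spans(labels: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Pair every start-flagged token with every end-flagged token at or after it."""
    starts = [i for i, (s, _) in enumerate(labels) if s]
    ends = [j for j, (_, e) in enumerate(labels) if e]
    return [(i, j) for i in starts for j in ends if j >= i]


def multitask_loss(boundary_loss: float, category_loss: float,
                   weight: float = 0.5) -> float:
    """Assumed weighted sum of the two task losses (the weighting is hypothetical)."""
    return weight * boundary_loss + (1.0 - weight) * category_loss


# Made-up nested example: token 1 both starts and ends an entity.
tokens = ["mouse", "interleukin-2", "receptor", "alpha", "gene"]
gold = [(1, 1, "protein"), (1, 3, "protein"), (0, 4, "DNA")]

labels = boundary_labels(len(tokens), gold)
candidates = candidate_spans(labels)
```

In this toy sentence, token 1 receives both flags, which a single-label boundary tag per token could not express. The pairing step yields six candidate spans, of which three are gold entities; a category detection module would then assign a category, or none, to each candidate.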


Notes

  1. The results on the GENIA test set are taken from [24], and the results on the GermEval 2014 test set were obtained by running the code shared by [24].

References

  1. Baxter, J.: A Bayesian/information theoretic model of learning to learn via multiple task sampling. Mach. Learn. 28(1), 7–39 (1997). https://doi.org/10.1023/A:1007327622663


  2. Benikova, D., Biemann, C., Reznicek, M.: NoSta-d named entity annotation for German: guidelines and dataset. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014), pp. 2524–2531 (2014)


  3. Finkel, J.R., Manning, C.D.: Nested named entity recognition. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol. 1, pp. 141–150. Association for Computational Linguistics (2009)


  4. Gridach, M.: Character-level neural network for biomedical named entity recognition. J. Biomed. Inform. 70, 85–91 (2017)


  5. Gu, B.: Recognizing nested named entities in GENIA corpus. In: Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology, pp. 112–113 (2006)


  6. Ju, M., Miwa, M., Ananiadou, S.: A neural layered model for nested named entity recognition. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1446–1459 (2018). https://doi.org/10.18653/v1/N18-1131

  7. Katiyar, A., Cardie, C.: Nested named entity recognition revisited. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long Papers), pp. 861–871. Association for Computational Linguistics (2018)


  8. Khot, T., Balasubramanian, N., Gribkoff, E., Sabharwal, A., Clark, P., Etzioni, O.: Exploring Markov logic networks for question answering. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 685–694 (2015). https://doi.org/10.18653/v1/D15-1080

  9. Kim, J.D., Ohta, T., Tateisi, Y., Tsujii, J.: Genia corpus–a semantically annotated corpus for bio-textmining. Bioinformatics 19(suppl_1), i180–i182 (2003)


  10. Kim, J.D., Ohta, T., Tsuruoka, Y., Tateisi, Y., Collier, N.: Introduction to the bio-entity recognition task at JNLPBA. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, pp. 70–75. Citeseer (2004)


  11. Lafferty, J., McCallum, A., Pereira, F.C.: Conditional random fields: probabilistic models for segmenting and labeling sequence data (2001)


  12. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C.: Neural architectures for named entity recognition. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 260–270 (2016)


  13. Liu, B., Gao, H., Qi, G., Duan, S., Wu, T., Wang, M.: Adversarial discriminative denoising for distant supervision relation extraction. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds.) DASFAA 2019. LNCS, vol. 11448, pp. 282–286. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-18590-9_29


  14. Lu, W., Roth, D.: Joint mention extraction and classification with mention hypergraphs. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 857–867 (2015). https://doi.org/10.18653/v1/D15-1102

  15. Ma, X., Hovy, E.: End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1064–1074 (2016)


  16. Muis, A.O., Lu, W.: Labeling gaps between words: recognizing overlapping mentions with mention separators. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2608–2618 (2017)


  17. Ruder, S.: An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017)

  18. Shen, D., Zhang, J., Zhou, G., Su, J., Tan, C.L.: Effective adaptation of hidden Markov model-based named entity recognizer for biomedical domain. In: Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine, pp. 49–56 (2003). https://doi.org/10.3115/1118958.1118965

  19. Sohrab, M.G., Miwa, M.: Deep exhaustive model for nested named entity recognition. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2843–2849 (2018). https://doi.org/10.18653/v1/D18-1309

  20. Sun, Y., Li, L., Xie, Z., Xie, Q., Li, X., Xu, G.: Co-training an improved recurrent neural network with probability statistic models for named entity recognition. In: Candan, S., Chen, L., Pedersen, T.B., Chang, L., Hua, W. (eds.) DASFAA 2017. LNCS, vol. 10178, pp. 545–555. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55699-4_33


  21. Tong, P., Zhang, Q., Yao, J.: Leveraging domain context for question answering over knowledge graph. Data Sci. Eng. 4(4), 323–335 (2019). https://doi.org/10.1007/s41019-019-00109-w


  22. Zhang, J., Li, J., Li, X.-L., Shi, Y., Li, J., Wang, Z.: Domain-specific entity linking via fake named entity detection. In: Navathe, S.B., Wu, W., Shekhar, S., Du, X., Wang, X.S., Xiong, H. (eds.) DASFAA 2016. LNCS, vol. 9642, pp. 101–116. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32025-0_7


  23. Zhang, J., Shen, D., Zhou, G., Su, J., Tan, C.L.: Enhancing HMM-based biomedical named entity recognition by studying special phenomena. J. Biomed. Inform. 37(6), 411–422 (2004)


  24. Zheng, C., Cai, Y., Xu, J., Leung, H.F., Xu, G.: A boundary-aware neural model for nested named entity recognition. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 357–366 (2019)


  25. Zhou, G., Zhang, J., Su, J., Shen, D., Tan, C.: Recognizing names in biomedical texts: a machine learning approach. Bioinformatics 20(7), 1178–1190 (2004)



Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities, SCUT (No. 2017ZD048, D2182480), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), the Science and Technology Programs of Guangzhou (No. 201704030076, 201802010027, 201902010046), the Hong Kong Research Grants Council (project no. PolyU 1121417), and an internal research grant from the Hong Kong Polytechnic University (project 1.9B0V).

Author information


Corresponding author

Correspondence to Yi Cai.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Cao, J. et al. (2020). Incorporating Boundary and Category Feature for Nested Named Entity Recognition. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_13


  • DOI: https://doi.org/10.1007/978-3-030-59416-9_13


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59415-2

  • Online ISBN: 978-3-030-59416-9

  • eBook Packages: Computer Science, Computer Science (R0)
