Abstract
Medical named entity recognition (NER), a fundamental task in medical information extraction, is crucial for medical knowledge graph construction, medical question answering, and automatic medical record analysis. Compared with named entities (NEs) in the general domain, medical NEs are usually more complex and more likely to be nested. To handle both flat and nested NEs, we propose a machine reading comprehension (MRC)-based approach with multi-task learning and multiple strategies. NER can be treated as either a sequence labeling (SL) task or a span boundary detection (SBD) task. We integrate an MRC-CRF model for SL and an MRC-Biaffine model for SBD into a multi-task learning architecture, and select the more efficient MRC-CRF as the final decoder. To further improve the model, we employ several strategies: adaptive pre-training, adversarial training, and model stacking with cross-validation. Experiments on the nested NER corpus CMeEE and the flat NER corpus CCKS2019 show the effectiveness of the MRC-based model with multi-task learning and multiple strategies.
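The difference between the SL and SBD views, and the reason an MRC formulation helps with nesting, can be illustrated with a small data-framing sketch. This is not the paper's MRC-CRF or MRC-Biaffine model; it only shows how the same annotations look as BIO tags versus spans, and how asking one query per entity type yields one tag sequence per type. The function names, the `DIS`/`BOD` labels, the English tokens, and the query strings are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A span is (start, end_inclusive, entity_type): the span boundary detection view.
Span = Tuple[int, int, str]


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Sequence labeling view: one BIO tag per token.

    Raises ValueError on overlapping spans, because a single tag
    sequence cannot encode nested entities.
    """
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        if any(t != "O" for t in tags[start:end + 1]):
            raise ValueError(f"overlapping span: {(start, end, etype)}")
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{etype}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into spans."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i - 1, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def mrc_examples(tokens: List[str], spans: List[Span],
                 queries: Dict[str, str]) -> List[dict]:
    """MRC-style framing: one (query, context) pair per entity type.

    Each pair gets its own tag sequence, so entities of different
    types may overlap. Same-type nesting would still raise here.
    """
    examples = []
    for etype, query in queries.items():
        typed = [s for s in spans if s[2] == etype]
        examples.append({
            "query": query,
            "context": tokens,
            "tags": spans_to_bio(len(tokens), typed),
        })
    return examples


# Hypothetical nested example: a body part inside a disease mention.
tokens = ["chronic", "kidney", "disease"]
spans = [(0, 2, "DIS"), (1, 1, "BOD")]
queries = {
    "DIS": "Find all disease mentions.",    # illustrative query text
    "BOD": "Find all body part mentions.",  # illustrative query text
}
examples = mrc_examples(tokens, spans, queries)
```

Calling `spans_to_bio(3, spans)` directly on the nested pair raises `ValueError`, whereas `mrc_examples` produces two separate tag sequences that each decode back to their original span with `bio_to_spans`.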
Acknowledgements
We would like to thank the anonymous reviewers for their insightful and valuable comments. This work was supported in part by the Major Program of the National Social Science Foundation of China (Grant Nos. 17ZDA318 and 18ZDA295), the National Natural Science Foundation of China (Grant No. 62006211), and the China Postdoctoral Science Foundation (Grant Nos. 2019TQ0286 and 2020M682349).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Du, X., Jia, Y., Zan, H. (2022). MRC-Based Medical NER with Multi-task Learning and Multi-strategies. In: Sun, M., et al. Chinese Computational Linguistics. CCL 2022. Lecture Notes in Computer Science, vol 13603. Springer, Cham. https://doi.org/10.1007/978-3-031-18315-7_10
DOI: https://doi.org/10.1007/978-3-031-18315-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18314-0
Online ISBN: 978-3-031-18315-7
eBook Packages: Computer Science, Computer Science (R0)