Research article · DOI: 10.1145/3578741.3578780

Chinese medical named entity recognition based on zero-shot learning

Published: 6 March 2023

ABSTRACT

To address the inability of existing named entity recognition (NER) models to handle unseen classes, zero-shot learning is applied to the task of Chinese medical NER. Zero-shot learning uses the description of an entity's class to establish a connection between the entity and the class, transferring information from observed classes to unseen target classes. The model proposed in this paper is mainly based on BERT, which is used to model the relationship between the entity and the class description. In addition, static word embeddings are concatenated, as supplementary information, with the features obtained from BERT, compensating for BERT's lack of adaptation to a specific domain. Correlation Searchers are also inserted between BERT's transformer layers to retrieve the word information most relevant to each character, addressing the problem that a model taking characters as its input unit cannot obtain complete word information. Experiments show that the model's recognition performance improves significantly after the static word embeddings and word information are added.
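As a rough illustration of the two ideas in the abstract — concatenating static word embeddings with BERT's per-character features, and linking an entity to an unseen class through its description — here is a minimal numpy sketch. All dimensions, names, and the cosine-similarity scorer are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, bert_dim, word_dim, n_classes = 6, 768, 300, 4

# Stand-ins for contextual character features from BERT and for static
# word embeddings aligned to the same characters (random here).
bert_feats = rng.normal(size=(seq_len, bert_dim))
static_embs = rng.normal(size=(seq_len, word_dim))

# Fusion step: concatenate each character's static word embedding onto
# its BERT feature as supplementary, domain-specific information.
fused = np.concatenate([bert_feats, static_embs], axis=-1)

# Zero-shot step (hypothetical scorer): represent a candidate entity
# span by mean pooling, then compare it against encoded class
# descriptions (random vectors here) via cosine similarity.
entity_vec = fused[1:3].mean(axis=0)
class_desc = rng.normal(size=(n_classes, bert_dim + word_dim))

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = np.array([cosine(entity_vec, d) for d in class_desc])
predicted = int(scores.argmax())  # best-matching class description
```

Because the class side of the comparison is built from descriptions rather than from learned per-class parameters, the same scorer can be applied to target classes that were never observed in training.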


Published in

MLNLP '22: Proceedings of the 2022 5th International Conference on Machine Learning and Natural Language Processing
December 2022, 406 pages
ISBN: 9781450399067
DOI: 10.1145/3578741
Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
