Biomedical Entity Normalization Using Encoder Regularization and Dynamic Ranking Mechanism

Chen, Siye; Xie, Chunmei; Wang, Hang; Ma, Shihan; Liu, Yarong; Shi, Qiuhui; Huang, Wenkang; Wang, Hongbin

doi:10.1007/978-3-031-44693-1_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14302)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Biomedical entity normalization is a fundamental task for many downstream applications. Because medical dictionaries provide rich additional information about biomedical entities, such as synonyms and definitions, transformer-based models have recently been applied to learn semantic representations for normalization. Despite the high performance of transformer-based models, over-fitting remains a challenging and unsolved problem. In addition, although bi-encoder and cross-encoder structures are widely used in biomedical entity normalization, measuring the distance between such encoder structures is very challenging. Moreover, the triplet margin ranking loss is widely used in the reranking stage of entity normalization. In this paper, we propose an encoder-level regularization to restrain the over-fitting caused by the deep representations of the transformer. We also use a dynamic margin ranking mechanism in place of a fixed margin in the reranking stage during training. We evaluate our model on three biomedical entity normalization datasets, and the empirical results show that it outperforms previous state-of-the-art models.
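
The contrast between a fixed and a dynamic margin in a ranking loss can be illustrated with a minimal sketch. The abstract does not state how the margin is chosen, so everything below is an assumption made for illustration only: the hinge form of the loss, the function names margin_ranking_loss and linear_margin_schedule, and the rule that the margin grows linearly with the training step. It is not the authors' mechanism.

```python
from typing import Sequence


def margin_ranking_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    margins: Sequence[float],
) -> float:
    """Mean hinge loss max(0, margin - (pos - neg)) over (positive, negative) pairs.

    Each pair gets its own margin, so a fixed margin is the special case
    where every entry of `margins` is the same value.
    """
    losses = [
        max(0.0, m - (p - n))
        for p, n, m in zip(pos_scores, neg_scores, margins)
    ]
    return sum(losses) / len(losses)


def linear_margin_schedule(
    step: int, total_steps: int, start: float = 0.1, end: float = 0.5
) -> float:
    """Hypothetical dynamic margin: interpolate linearly from `start` to `end`.

    This schedule is an assumption for illustration; the source does not
    describe how its dynamic margin is computed.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


# Scores for the correct concept (pos) and a wrong candidate (neg).
pos = [0.9, 0.6]
neg = [0.2, 0.5]

# Fixed margin: the same value for every pair at every step.
fixed_loss = margin_ranking_loss(pos, neg, [0.5, 0.5])

# Dynamic margin: the value depends on training progress.
early_margin = linear_margin_schedule(step=0, total_steps=10)
early_loss = margin_ranking_loss(pos, neg, [early_margin] * len(pos))
```

In this sketch the second pair is separated by only 0.1, so it is penalized under the fixed margin of 0.5 but not under the smaller early margin of 0.1. Any other rule for choosing the margin can be substituted without changing the loss function.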

Author information

Authors and Affiliations

Ant Group, Hangzhou, China
Siye Chen, Chunmei Xie, Hang Wang, Shihan Ma, Yarong Liu, Qiuhui Shi, Wenkang Huang & Hongbin Wang

Corresponding author

Correspondence to Hongbin Wang.

Editor information

Editors and Affiliations

Emory University, Atlanta, GA, USA
Fei Liu
Microsoft Research Asia, Beijing, China
Nan Duan
Soochow University, Suzhou, China
Qingting Xu
Soochow University, Suzhou, China
Yu Hong

About this paper

Cite this paper

Chen, S. et al. (2023). Biomedical Entity Normalization Using Encoder Regularization and Dynamic Ranking Mechanism. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_39

DOI: https://doi.org/10.1007/978-3-031-44693-1_39
Published: 08 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44692-4
Online ISBN: 978-3-031-44693-1
eBook Packages: Computer Science, Computer Science (R0)
