Abstract
Adverse drug reactions (ADRs) are a serious medical issue, so early ADR extraction from Electronic Medical Records (EMRs) is necessary. Most current research on ADR extraction from EMRs targets sentence-level, non-real, single-source data, which leaves a gap between research and practice. To address this problem, we propose LLMADR, a novel method based on fine-tuning style-aligned large language models (LLMs) for ADR extraction from document-level, real, multi-source Chinese EMRs. We use the comprehension and generation capabilities of LLMs to extract ADRs from document-level EMRs, which contain interfering irrelevant information and long-distance ADRs, and we craft prompts that guide LLMs to align multi-source EMRs of varying styles before training and inference, thereby enhancing the generalization capability of our model. Furthermore, we construct CADR, a document-level Chinese ADR dataset drawn from two medical organizations without simplifying the EMRs, for training and evaluation. Comparative experiments on CADR show that, from both classification and extraction perspectives, LLMADR outperforms several mainstream models and generalizes better.
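The two-stage idea described in the abstract (align the style of each record first, then extract ADRs) could be sketched roughly as follows. This is only an illustration under stated assumptions: the prompt wording, the function names (`align_style`, `extract_adr`, `parse_pairs`), the `drug -> reaction` output format, and the stand-in `echo_llm` model are all hypothetical and are not taken from the paper, whose actual prompts and fine-tuning setup are not given here.

```python
from typing import Callable, List, Tuple

# Hypothetical prompt templates; the paper's real prompts are not shown in the abstract.
ALIGN_TEMPLATE = (
    "Rewrite the following electronic medical record in a uniform style, "
    "keeping every drug name and symptom unchanged.\n\nRecord:\n{record}"
)
EXTRACT_TEMPLATE = (
    "List each adverse drug reaction in the record below as 'drug -> reaction', "
    "one per line. Answer 'none' if there is none.\n\nRecord:\n{record}"
)


def align_style(record: str, llm: Callable[[str], str]) -> str:
    """Ask the model to restate a record in one common style (assumed step)."""
    return llm(ALIGN_TEMPLATE.format(record=record)).strip()


def parse_pairs(answer: str) -> List[Tuple[str, str]]:
    """Parse 'drug -> reaction' lines; lines without '->' are ignored."""
    pairs = []
    for line in answer.splitlines():
        if "->" in line:
            drug, reaction = line.split("->", 1)
            pairs.append((drug.strip(), reaction.strip()))
    return pairs


def extract_adr(record: str, llm: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Align the record's style, then prompt for ADR pairs on the aligned text."""
    aligned = align_style(record, llm)
    answer = llm(EXTRACT_TEMPLATE.format(record=aligned))
    return parse_pairs(answer)


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model, used only to make the sketch runnable."""
    if prompt.startswith("Rewrite"):
        # Pretend the record is already in the target style.
        return prompt.split("Record:\n", 1)[1]
    return "aspirin -> rash"


result = extract_adr("Patient took aspirin and developed a rash.", echo_llm)
```

In practice `llm` would wrap a fine-tuned model rather than a stub; keeping it as a plain callable makes it possible to apply the same alignment step both when building training data and at inference time.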
Acknowledgments
This work was supported by the National Key Research and Development Project of China (No. 2021ZD0110700) and the Hunan Provincial Natural Science Foundation (Grant No. 2022JJ30668). The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve this paper.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yin, H., Tang, J., Li, S., Wang, T. (2025). LLMADR: A Novel Method for Adverse Drug Reaction Extraction Based on Style Aligned Large Language Models Fine-Tuning. In: Wong, D.F., Wei, Z., Yang, M. (eds) Natural Language Processing and Chinese Computing. NLPCC 2024. Lecture Notes in Computer Science, vol 15359. Springer, Singapore. https://doi.org/10.1007/978-981-97-9431-7_36
DOI: https://doi.org/10.1007/978-981-97-9431-7_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-9430-0
Online ISBN: 978-981-97-9431-7
eBook Packages: Computer Science, Computer Science (R0)