Abstract
Automatic knowledge extraction from medical dialogues has become an increasingly important problem in modern medical care. However, traditional medical information extraction methods ignore both the diagnostic characteristics of medical texts and the imbalanced distribution of item categories within inquiry points, which results in unsatisfactory performance. In this paper, we propose a Diagnosis-assisted Inquiry Point Extractor (DIPE), in which a novel hierarchical attention mechanism, named WSWC (Word-Sentence-Window-Context), is devised to simulate diagnosis-oriented inference and to capture semantic correlations in utterances more effectively. Additionally, we construct an imbalance-aware loss function that mitigates the imbalanced distribution of entity categories within inquiry points by assigning weights based on the disparity in sample counts across categories. Experimental results on two public datasets demonstrate that DIPE is an effective solution for inquiry point extraction in medical dialogues.
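The abstract does not give the form of the imbalance-aware loss, so the following is only a minimal sketch of the general idea of weighting categories by their sample counts. It assumes inverse-frequency weights applied to a cross-entropy loss, which may differ from the weighting DIPE actually uses; the function names, category names, and counts are hypothetical.

```python
import math


def class_weights(counts):
    """Inverse-frequency weights (an assumed scheme, not taken from the paper).

    Each category c with n_c samples gets weight total / (k * n_c), where k is
    the number of categories, so rarer categories receive larger weights and
    the sample-weighted mean of the weights is 1.
    """
    total = sum(counts.values())
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean weighted negative log-likelihood over a batch.

    probs:  list of dicts mapping category -> predicted probability
    labels: list of gold categories, one per example
    """
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


# Hypothetical category counts, for illustration only.
counts = {"symptom": 900, "test": 90, "surgery": 10}
weights = class_weights(counts)

# The same prediction confidence costs more on a rare category.
uniform = {"symptom": 0.5, "test": 0.25, "surgery": 0.25}
confident_wrong = {"symptom": 0.9, "test": 0.05, "surgery": 0.05}
loss_common = weighted_cross_entropy([uniform], ["symptom"], weights)
loss_rare = weighted_cross_entropy([confident_wrong], ["surgery"], weights)
```

Under this scheme a misclassified example from a rare category contributes far more to the loss than one from a frequent category, which is the effect the abstract attributes to the imbalance-aware loss.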
Data availability and access
All experiments were conducted using publicly accessible datasets.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Qi Li, Faliang Huang, Lin Ge, and Jie Zhao. The first draft of the manuscript was written by Qi Li and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Ethical and informed consent for data used
Not Applicable.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Q., Huang, F., Ge, L. et al. DIPE: a diagnosis-assisted inquiry point extractor towards medical dialogues. Appl Intell 55, 230 (2025). https://doi.org/10.1007/s10489-024-06138-x
DOI: https://doi.org/10.1007/s10489-024-06138-x