DOI: 10.1145/3638584.3638614
research-article

A Novel Classification Model for Automatic Multi-Label ICD Coding via BERT-LSTM

Published: 14 March 2024

Abstract

Clinical notes are text documents that physicians create at each patient visit to record details of diagnosis and treatment, and they are labeled with medical codes. However, assigning these codes manually is time-consuming and error-prone. To address this problem, we propose a new multi-label classification method, inspired by the encoder-decoder structure, that uses a BERT-LSTM network to automatically assign ICD codes to clinical texts. The model can accurately predict the appropriate medical codes from the content and contextual information of the clinical text, improving efficiency while reducing errors. Combining these two powerful neural network models lets us better handle the task of coding clinical notes. In comparative experiments, the model outperforms several basic neural network architectures, achieving an AUC of 85.7%, a precision@5 of 61.2%, and a Micro-F1 of 56.5%. These results demonstrate the robustness of the proposed method and its effectiveness for automatic ICD coding.
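
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision@5 and Micro-F1 are commonly computed for multi-label code assignment. It uses the standard textbook definitions and is not taken from the paper; the function names, the 0.5 decision cutoff, the label indices, and the toy scores are all illustrative assumptions.

```python
from typing import List, Sequence, Set


def precision_at_k(scores: Sequence[float], true_labels: Set[int], k: int = 5) -> float:
    """Fraction of the k highest-scoring labels that are correct for one note."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(1 for i in ranked[:k] if i in true_labels) / k


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> Set[int]:
    """Turn per-label scores into a predicted label set (cutoff is an assumption)."""
    return {i for i, s in enumerate(scores) if s >= cutoff}


def micro_f1(predicted: List[Set[int]], actual: List[Set[int]]) -> float:
    """Micro-averaged F1: pool TP, FP, and FN over every (note, label) pair."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))
    fp = sum(len(p - a) for p, a in zip(predicted, actual))
    fn = sum(len(a - p) for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: two notes scored against eight hypothetical codes (indices 0..7).
note_scores = [
    [0.9, 0.8, 0.1, 0.7, 0.2, 0.6, 0.3, 0.05],
    [0.1, 0.2, 0.95, 0.4, 0.85, 0.3, 0.6, 0.05],
]
note_truth = [{0, 3, 4}, {2, 4}]

p_at_5 = [precision_at_k(s, t) for s, t in zip(note_scores, note_truth)]
f1 = micro_f1([threshold(s) for s in note_scores], note_truth)
print(p_at_5, round(f1, 4))
```

Precision@5 depends only on the ranking of codes for each note, while Micro-F1 depends on the chosen decision cutoff, so the two numbers can move independently when a model is tuned.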


Cited By

  • (2025) Source Code Error Understanding Using BERT for Multi-Label Classification. IEEE Access 13, 3802-3822. DOI: 10.1109/ACCESS.2024.3525061. Online publication date: 2025


    Published In

    CSAI '23: Proceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence
    December 2023
    563 pages
    ISBN:9798400708688
    DOI:10.1145/3638584

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. BERT
    2. Clinical note automatic coding
    3. LSTM
    4. Multi-label classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CSAI 2023

    Article Metrics

    • Downloads (Last 12 months): 66
    • Downloads (Last 6 weeks): 6
    Reflects downloads up to 12 Feb 2025
