DOI: 10.1145/3308558.3313485
research-article

Improving Medical Code Prediction from Clinical Text via Incorporating Online Knowledge Sources

Published: 13 May 2019

Abstract

Clinical notes contain detailed information about the health status of patients for each of their encounters with a health system. Developing effective models that automatically assign medical codes to clinical notes has been a long-standing, active research area. Despite great recent progress in medical informatics fueled by deep learning, it remains a challenge to find the specific piece of evidence in a clinical note that justifies a particular medical code out of all possible codes. The large number of online disease knowledge sources, which contain detailed information about the signs and symptoms of different diseases, their risk factors, and their epidemiology, presents an opportunity to address this challenge. In this paper, we consider Wikipedia as an external knowledge source and propose Knowledge Source Integration (KSI), a novel end-to-end code assignment framework that can integrate external knowledge during the training of any baseline deep learning model. The main idea of KSI is to calculate matching scores between a clinical note and disease-related Wikipedia documents and to combine these scores with the output of the baseline model. To evaluate KSI, we experimented with automatic assignment of ICD-9 diagnosis codes to emergency department clinical notes from the MIMIC-III data set, aided by Wikipedia documents corresponding to the ICD-9 codes. We evaluated several baseline models, ranging from logistic regression to recently proposed deep learning models known to achieve state-of-the-art accuracy on clinical notes. The results show that KSI consistently improves the baseline models and that it is particularly successful in the assignment of rare codes. In addition, by analyzing the weights of KSI models, we can gain an understanding of which words in Wikipedia documents provide useful information for predictions.
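The core idea stated in the abstract (score how well a note matches each code's knowledge document, then combine that score with the baseline output) can be illustrated with a small sketch. This is not the paper's architecture: the binary bag-of-words representation, the weighted-overlap score, the additive combination with baseline logits, and all names and example texts (`bag_of_words`, `matching_score`, `combine`, `code_pneumonia`, `code_fracture`) are assumptions made purely for illustration.

```python
import math

def bag_of_words(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary (an assumed representation)."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocab]

def matching_score(note_vec, doc_vec, word_weights):
    """Weighted overlap between a note and one knowledge document.

    A word contributes only if it occurs in both texts; word_weights stands in
    for parameters that would be learned during training.
    """
    return sum(w * n * d for w, n, d in zip(word_weights, note_vec, doc_vec))

def combine(baseline_logits, note_vec, doc_vecs, word_weights):
    """Add each code's matching score to its baseline logit, then apply a sigmoid."""
    probs = []
    for logit, doc_vec in zip(baseline_logits, doc_vecs):
        z = logit + matching_score(note_vec, doc_vec, word_weights)
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs

# Hypothetical toy data: two codes, each with a short knowledge document.
vocab = ["fever", "cough", "chest", "pain", "fracture"]
knowledge_docs = {
    "code_pneumonia": "pneumonia symptoms include cough fever and chest pain",
    "code_fracture": "a fracture is a break in a bone causing pain",
}
note = "patient presents with fever and productive cough"

note_vec = bag_of_words(note, vocab)
doc_vecs = [bag_of_words(text, vocab) for text in knowledge_docs.values()]
word_weights = [1.0] * len(vocab)   # untrained placeholder weights
baseline_logits = [0.0, 0.0]        # stand-in for any baseline model's output

probs = combine(baseline_logits, note_vec, doc_vecs, word_weights)
for code, p in zip(knowledge_docs, probs):
    print(f"{code}: {p:.3f}")
```

In this toy setting the note shares "fever" and "cough" with the first document and nothing with the second, so only the first code's probability rises above the uninformative baseline. With trained weights, inspecting `word_weights` would be one way to see which knowledge-source words drive a prediction, in the spirit of the interpretability claim in the abstract.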

Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Multi-label classification
  2. attention mechanism
  3. document similarity learning
  4. healthcare

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 73
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 19 Feb 2025

Cited By

  • (2024) A Unified Review of Deep Learning for Automated Medical Coding. ACM Computing Surveys 56:12 (1-41). DOI: 10.1145/3664615. Online publication date: 17-May-2024
  • (2024) Automated ICD Coding via Contrastive Learning With Back-Reference and Synonym Knowledge for Smart Self-Diagnosis Applications. IEEE Transactions on Consumer Electronics 70:3 (6042-6053). DOI: 10.1109/TCE.2024.3419447. Online publication date: Aug-2024
  • (2024) Large Language Model in Medical Informatics: Direct Classification and Enhanced Text Representations for Automatic ICD Coding. 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (3066-3069). DOI: 10.1109/BIBM62325.2024.10822419. Online publication date: 3-Dec-2024
  • (2024) Knowledge-based dynamic prompt learning for multi-label disease diagnosis. Knowledge-Based Systems 286:C. DOI: 10.1016/j.knosys.2024.111395. Online publication date: 17-Apr-2024
  • (2023) SeqCare: Sequential Training with External Medical Knowledge Graph for Diagnosis Prediction in Healthcare Data. Proceedings of the ACM Web Conference 2023 (2819-2830). DOI: 10.1145/3543507.3583543. Online publication date: 30-Apr-2023
  • (2023) Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (2572-2582). DOI: 10.1145/3539618.3591918. Online publication date: 19-Jul-2023
  • (2023) MKFN: Multimodal Knowledge Fusion Network for Automatic ICD Coding. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2294-2297). DOI: 10.1109/BIBM58861.2023.10385669. Online publication date: 5-Dec-2023
  • (2023) Disease Diagnosis based on Multiple Semantic Relationship Prompt Subgraph. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (3398-3405). DOI: 10.1109/BIBM58861.2023.10385346. Online publication date: 5-Dec-2023
  • (2023) Automatic International Classification of Diseases Coding via Note-Code Interaction Network with Denoising Mechanism. Journal of Computational Biology 30:8 (912-925). DOI: 10.1089/cmb.2023.0079. Online publication date: 1-Aug-2023
  • (2023) Integrating domain knowledge for biomedical text analysis into deep learning: A survey. Journal of Biomedical Informatics 143 (104418). DOI: 10.1016/j.jbi.2023.104418. Online publication date: Jul-2023
