Deep Learning to Predict Patient Future Diseases from the Electronic Health Records

Miotto, Riccardo; Li, Li; Dudley, Joel T.

doi:10.1007/978-3-319-30671-1_66

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9626)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

The increasing cost of health care has motivated the drive towards preventive medicine, where the primary concern is recognizing disease risk and taking action at the earliest stage. We present an application of deep learning to derive robust patient representations from the electronic health records and to predict future diseases. Experiments showed promising results in different clinical domains, with the best performances for liver cancer, diabetes, and heart failure.

Notes

1. In this architecture, each patient can be described by just one single vector (as done in this study) or by a bag of vectors computed in, e.g., predefined temporal windows.
2. While this study focuses on future disease prediction, it should be noted that the patient representation derived from the stack of denoising autoencoders can also be applied to unsupervised tasks (e.g., patient clustering and similarity) as well as to other supervised applications (e.g., personalized prescriptions).
3. While in this study we favored a basic pipeline to process EHRs, it should be noted that more sophisticated techniques might lead to better features as well as to better predictive results.
4. All parameters in the feature learning models were identified through preliminary experiments, not reported here for brevity, on the validation set.
5. This experiment only evaluates the prediction of new diseases for each patient, therefore not considering the re-diagnosis of a disease previously reported.
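
The notes above refer to a patient representation derived from a stack of denoising autoencoders, with each patient described by a single vector. The following is a minimal sketch of that general technique, not the authors' implementation: this page does not give the architecture details, so the masking noise, sigmoid units, tied weights, cross-entropy loss, full-batch gradient descent, layer sizes, and every function and parameter name below are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_denoising_autoencoder(X, n_hidden, noise=0.05, lr=0.1, epochs=50, seed=0):
    """Train one denoising autoencoder layer (hypothetical helper).

    Assumptions: inputs scaled to [0, 1], masking noise, tied weights,
    cross-entropy reconstruction loss, full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.normal(0.0, 0.1, size=(n_features, n_hidden))
    b = np.zeros(n_hidden)    # encoder bias
    c = np.zeros(n_features)  # decoder bias
    for _ in range(epochs):
        # Corrupt the input by zeroing a random fraction of its entries.
        keep = rng.random(X.shape) >= noise
        X_noisy = X * keep
        H = sigmoid(X_noisy @ W + b)   # encode the corrupted input
        R = sigmoid(H @ W.T + c)       # reconstruct the clean input
        # Gradient of the cross-entropy loss w.r.t. the output pre-activation.
        d_out = (R - X) / n_samples
        d_hid = (d_out @ W) * H * (1.0 - H)
        # Tied weights: W receives gradient from both encoder and decoder.
        grad_W = X_noisy.T @ d_hid + d_out.T @ H
        W -= lr * grad_W
        b -= lr * d_hid.sum(axis=0)
        c -= lr * d_out.sum(axis=0)
    return W, b, c


def encode(X, W, b):
    return sigmoid(X @ W + b)


def fit_stack(X, layer_sizes, noise=0.05, lr=0.1, epochs=50):
    """Greedy layer-wise training: each layer learns from the previous layer's output."""
    layers = []
    layer_input = X
    for i, n_hidden in enumerate(layer_sizes):
        W, b, _ = train_denoising_autoencoder(
            layer_input, n_hidden, noise=noise, lr=lr, epochs=epochs, seed=i
        )
        layers.append((W, b))
        layer_input = encode(layer_input, W, b)
    return layers


def transform(X, layers):
    """Map each patient row to a single dense vector (the top-layer encoding)."""
    for W, b in layers:
        X = encode(X, W, b)
    return X


# Toy usage with synthetic data; real EHR features would replace X.
toy_rng = np.random.default_rng(42)
X_toy = toy_rng.random((20, 12))            # 20 hypothetical patients, 12 features
stack = fit_stack(X_toy, layer_sizes=[8, 4])
Z_toy = transform(X_toy, stack)             # one 4-dimensional vector per patient
```

Under these assumptions, the rows of `Z_toy` play the role of the per-patient vectors mentioned in note 1, and could be passed to any downstream classifier or clustering method as described in note 2.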

Author information

Authors and Affiliations

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, USA
Riccardo Miotto, Li Li & Joel T. Dudley

Corresponding author

Correspondence to Riccardo Miotto.

Editor information

Editors and Affiliations

Department of Information Engineering, University of Padua, Padova, Italy
Nicola Ferro
Faculty of Informatics, University of Lugano (USI), Lugano, Switzerland
Fabio Crestani
Department of Computer Science, Katholieke Universiteit Leuven, Heverlee, Belgium
Marie-Francine Moens
Systèmes d’informations, Big Data et Recherche d’Information, Institut de Recherche en Informatique de Toulouse IRIT/équipe SIG, Toulouse Cedex 04, France
Josiane Mothe
Yahoo! Labs London, London, UK
Fabrizio Silvestri
Department of Information Engineering, University of Padua, Padova, Italy
Giorgio Maria Di Nunzio
TU Delft - EWI/ST/WIS, Delft, The Netherlands
Claudia Hauff
Department of Information Engineering, University of Padua, Padova, Italy
Gianmaria Silvello

Cite this paper

Miotto, R., Li, L., Dudley, J.T. (2016). Deep Learning to Predict Patient Future Diseases from the Electronic Health Records. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_66

DOI: https://doi.org/10.1007/978-3-319-30671-1_66
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30670-4
Online ISBN: 978-3-319-30671-1
eBook Packages: Computer Science, Computer Science (R0)
