research-article

Deep Computational Phenotyping

Authors:
Zhengping Che, University of Southern California, Los Angeles, CA, USA
David Kale, University of Southern California, Los Angeles, CA, USA
Wenzhe Li, University of Southern California, Los Angeles, CA, USA
Mohammad Taha Bahadori, University of Southern California, Los Angeles, CA, USA
Yan Liu, University of Southern California, Los Angeles, CA, USA

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
Pages 507–516
https://doi.org/10.1145/2783258.2783365

Published: 10 August 2015

ABSTRACT

We apply deep learning to the problem of discovery and detection of characteristic patterns of physiology in clinical time series data. We propose two novel modifications to standard neural net training that address challenges and exploit properties that are peculiar, if not exclusive, to medical data. First, we examine a general framework for using prior knowledge to regularize parameters in the topmost layers. This framework can leverage priors of any form, ranging from formal ontologies (e.g., ICD9 codes) to data-derived similarity. Second, we describe a scalable procedure for training a collection of neural networks of different sizes but with partially shared architectures. Both of these innovations are well-suited to medical applications, where available data are not yet Internet scale and have many sparse outputs (e.g., rare diagnoses) but which have exploitable structure (e.g., temporal order and relationships between labels). However, both techniques are sufficiently general to be applied to other problems and domains. We demonstrate the empirical efficacy of both techniques on two real-world hospital data sets and show that the resulting neural nets learn interpretable and clinically relevant features.
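As an illustration of the first idea, the sketch below shows one way a label-similarity prior could regularize the weights of a topmost (output) layer. The abstract does not state the exact form of the regularizer, so the graph Laplacian penalty used here is an assumption, chosen because it is a familiar way to encode pairwise similarity. The function names and the toy matrices `W` and `A` are hypothetical. The penalty pulls the weight vectors of similar labels toward each other, which may help rare labels borrow strength from related, better-represented ones.

```python
# Sketch only: a graph Laplacian penalty on output-layer weights.
# Assumption: the prior is given as a symmetric, non-negative similarity
# matrix A over labels (e.g. derived from an ontology or from co-occurrence).

def laplacian_penalty(W, A):
    """Return 0.5 * sum_ij A[i][j] * ||W[:, i] - W[:, j]||^2.

    W: list of rows, shape (n_hidden, n_labels); column k holds the
       output-layer weights for label k.
    A: list of rows, shape (n_labels, n_labels), symmetric similarity.
    """
    n_labels = len(A)
    total = 0.0
    for i in range(n_labels):
        for j in range(n_labels):
            if A[i][j]:
                sq_dist = sum((row[i] - row[j]) ** 2 for row in W)
                total += 0.5 * A[i][j] * sq_dist
    return total


def laplacian_penalty_grad(W, A):
    """Gradient of the penalty with respect to W (requires symmetric A)."""
    n_labels = len(A)
    return [
        [2.0 * sum(A[i][j] * (row[i] - row[j]) for j in range(n_labels))
         for i in range(n_labels)]
        for row in W
    ]


# Toy example: labels 0 and 1 are declared similar, label 2 is unrelated.
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
W = [[1.0, 3.0, 5.0],
     [2.0, 2.0, 7.0]]

print(laplacian_penalty(W, A))       # penalty on the current weights
print(laplacian_penalty_grad(W, A))  # direction that increases the penalty
```

In training, a term like this would be scaled by a hyperparameter and added to the usual loss, with its gradient added to the output-layer gradient. Only the columns for labels 0 and 1 are affected in the toy example, since label 2 has no similarity links.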

Published in
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN:9781450336642
DOI:10.1145/2783258
General Chairs:
Longbing Cao, University of Technology, Sydney
Chengqi Zhang, University of Technology, Sydney
Program Chairs:
Thorsten Joachims, Cornell University
Geoff Webb, Monash University
Dragos D. Margineantu, Boeing Research
Graham Williams, Australian Taxation Office
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
healthcare
medical informatics
multi-label classification
multivariate time series
phenotyping
Acceptance Rates
KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%