DOI: 10.1145/2783258.2783365
research-article

Deep Computational Phenotyping

Published: 10 August 2015

ABSTRACT

We apply deep learning to the problem of discovery and detection of characteristic patterns of physiology in clinical time series data. We propose two novel modifications to standard neural net training that address challenges and exploit properties that are peculiar, if not exclusive, to medical data. First, we examine a general framework for using prior knowledge to regularize parameters in the topmost layers. This framework can leverage priors of any form, ranging from formal ontologies (e.g., ICD9 codes) to data-derived similarity. Second, we describe a scalable procedure for training a collection of neural networks of different sizes but with partially shared architectures. Both of these innovations are well-suited to medical applications, where available data are not yet Internet scale and have many sparse outputs (e.g., rare diagnoses) but which have exploitable structure (e.g., temporal order and relationships between labels). However, both techniques are sufficiently general to be applied to other problems and domains. We demonstrate the empirical efficacy of both techniques on two real-world hospital data sets and show that the resulting neural nets learn interpretable and clinically relevant features.
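
As a rough illustration of the first idea, one common way to turn label similarity into a regularizer is a graph Laplacian penalty on the output-layer weights, which pulls the weight vectors of similar labels toward each other. The sketch below is an assumption about the general form, not the paper's implementation: the function names, the toy similarity matrix, and the specific penalty tr(W^T L W) are illustrative choices.

```python
# Minimal sketch (assumed form, not the authors' code): a graph Laplacian
# penalty over the top-layer weights, given a label-similarity matrix.

def laplacian(similarity):
    """Graph Laplacian L = D - A for a symmetric similarity matrix A."""
    n = len(similarity)
    return [
        [(sum(similarity[i]) if i == j else 0.0) - similarity[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def laplacian_penalty(weights, similarity):
    """Return tr(W^T L W); row k of `weights` is the weight vector of label k."""
    lap = laplacian(similarity)
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(weights[i], weights[j]))
            total += lap[i][j] * dot
    return total


def pairwise_penalty(weights, similarity):
    """Equivalent form: 0.5 * sum_ij A_ij * ||w_i - w_j||^2."""
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(weights[i], weights[j]))
            total += 0.5 * similarity[i][j] * sq
    return total


if __name__ == "__main__":
    # Hypothetical prior: labels 0 and 1 are related, label 2 is unrelated.
    A = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
    W = [[1.0, 0.0],
         [0.0, 1.0],
         [5.0, 5.0]]
    print(laplacian_penalty(W, A))  # penalizes only the gap between labels 0 and 1
```

In training, a term like this would typically be added to the loss with a tunable weight. The similarity matrix could come from an ontology (for example, shared ICD9 ancestry) or from label co-occurrence in the data; which source works better is an empirical question.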


Published in

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN: 9781450336642
DOI: 10.1145/2783258

Copyright © 2015 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
