Original Research
Continuous-time probabilistic models for longitudinal electronic health records

https://doi.org/10.1016/j.jbi.2022.104084
Under an Elsevier user license
open archive

Highlights

  • We develop a continuous-time unsupervised learning model for heterogeneous, irregularly sampled electronic health record data.

  • The model is a joint distribution over two variables and the time between their measurements.

  • Data from the United States Veterans Health Administration is used to train the model.

  • Time-dependent likelihood ratio maps are produced for minimal vs moderate-severe depression.

Abstract

Analysis of longitudinal Electronic Health Record (EHR) data is an important goal for precision medicine. Difficulty in applying Machine Learning (ML) methods, whether predictive or unsupervised, stems in part from the heterogeneity and irregular sampling of EHR data. We present an unsupervised probabilistic model that captures nonlinear relationships between variables over continuous time. The method works with arbitrary sampling patterns and captures the joint probability distribution of variable measurements and the time intervals between them. We derive inference algorithms that can be used to evaluate the likelihood of future measurements under a trained model. As an example, we consider data from the United States Veterans Health Administration (VHA) in the areas of diabetes and depression. Likelihood ratio maps are produced showing the relative likelihood of moderate-severe vs minimal depression as measured by the Patient Health Questionnaire-9 (PHQ-9).
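To make the idea of a time-dependent likelihood ratio map concrete, the following is a minimal sketch rather than the model described here. It assumes a two-component mixture in which each component factorizes into a Gaussian for a first variable `x1`, a Gaussian for a second variable `x2` (standing in for a PHQ-9-like score), and an exponential density for the inter-measurement time `dt`. The component form, all parameter values, the representative scores `x2_high` and `x2_low`, and every function name are hypothetical and chosen only for illustration; nothing is fitted to VHA data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std**2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def exponential_pdf(t, rate):
    """Density of an exponential distribution at t (zero for negative t)."""
    return rate * math.exp(-rate * t) if t >= 0.0 else 0.0


# Hypothetical, hand-picked parameters (NOT fitted to any data).
# Each component: mixing weight, (mean, std) for x1 and x2, and a rate
# for the inter-measurement time dt.
COMPONENTS = [
    {"weight": 0.6, "x1": (6.0, 1.0), "x2": (3.0, 2.0), "rate": 0.02},
    {"weight": 0.4, "x1": (9.0, 1.5), "x2": (14.0, 4.0), "rate": 0.01},
]


def joint_density(x1, x2, dt):
    """Toy joint density p(x1, x2, dt) as a weighted sum over components."""
    total = 0.0
    for c in COMPONENTS:
        total += (
            c["weight"]
            * gaussian_pdf(x1, *c["x1"])
            * gaussian_pdf(x2, *c["x2"])
            * exponential_pdf(dt, c["rate"])
        )
    return total


def likelihood_ratio(x1, dt, x2_high=14.0, x2_low=3.0):
    """Ratio p(x1, x2_high, dt) / p(x1, x2_low, dt); assumes dt >= 0."""
    return joint_density(x1, x2_high, dt) / joint_density(x1, x2_low, dt)


def likelihood_ratio_map(x1_values, dt_values):
    """Grid of likelihood ratios: one row per x1 value, one column per dt."""
    return [[likelihood_ratio(x1, dt) for dt in dt_values] for x1 in x1_values]


if __name__ == "__main__":
    # Print a small map over a coarse grid of x1 values and time gaps.
    for row in likelihood_ratio_map([6.0, 7.5, 9.0], [10.0, 100.0, 200.0]):
        print(["%.3g" % value for value in row])
```

In this toy setup the ratio depends on `dt` only because the two components have different time rates, so the time gap shifts how much each component contributes. A fitted model would replace the hand-picked parameters with learned ones and might compare ranges of scores rather than two point values.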

Keywords

Electronic health records
Probabilistic models
Mixture models
Time-dependent modeling
