Using network projections to explore co-incidence and context in large clinical datasets: Application to homelessness among U.S. Veterans

https://doi.org/10.1016/j.jbi.2016.03.023
Under an Elsevier user license
open archive

Highlights

  • Network projections of ICD codes reveal patterns prior to recognition of homelessness.

  • Network projections efficiently display co-incidence in large clinical datasets.

  • Projections of ICD codes may be configured to show comparison or change over time.

  • Exploring data co-incidence with network projections can aid hypothesis generation.

  • These data exploration methods complement traditional statistical techniques.

Abstract

Introduction

Network projections can provide an efficient format for exploring co-incidence in large clinical datasets. We present a network projection approach and explore its utility for finding patterns in health care data that could be exploited to prevent homelessness among U.S. Veterans.

Method

We divided Veteran ICD-9-CM (ICD9) data into two time periods (0–59 and 60–364 days prior to the first evidence of homelessness) and then used Pajek social network analysis software to visualize these data as three different networks. A multi-relational network simultaneously displayed the magnitude of the ties for the most frequent ICD9 pairings. A new association network visualized ICD9 pairings that greatly increased or decreased between the two time periods. A signed subtraction network visualized the presence, absence, and difference in magnitude of ICD9 associations by time period.
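The projection and subtraction steps described above can be illustrated with a minimal sketch. It is not the authors' Pajek workflow: it uses plain Python, a handful of made-up visit records, and two assumptions that the abstract does not confirm, namely that a tie's magnitude is the number of visits on which two codes co-occur, and that the signed weight is the near-period count minus the distant-period count. The names `VISITS`, `period`, `co_incidence`, and `subtraction_network` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (not study data):
# (visit id, days before first evidence of homelessness, ICD9 codes on the visit).
VISITS = [
    ("v1", 300, ["V70.0", "401.9"]),
    ("v2", 120, ["401.9", "250.00", "V70.0"]),
    ("v3", 45, ["305.00", "V60.0"]),
    ("v4", 10, ["305.00", "V60.0", "V62.0"]),
    ("v5", 5, ["401.9", "V70.0"]),
]


def period(days_before):
    """Assign a visit to the near (0-59 days) or distant (60-364 days) period."""
    if 0 <= days_before <= 59:
        return "near"
    if 60 <= days_before <= 364:
        return "distant"
    return None  # outside the one-year window


def co_incidence(visits):
    """Project visits onto code pairs: count visits on which both codes appear.

    Assumption: tie magnitude is a simple visit count.
    """
    edges = {"near": Counter(), "distant": Counter()}
    for _visit_id, days, codes in visits:
        p = period(days)
        if p is None:
            continue
        # Sort so that each unordered pair always has the same key.
        for a, b in combinations(sorted(set(codes)), 2):
            edges[p][(a, b)] += 1
    return edges


def subtraction_network(edges):
    """Signed weights (assumed definition): near count minus distant count."""
    pairs = set(edges["near"]) | set(edges["distant"])
    return {pair: edges["near"][pair] - edges["distant"][pair] for pair in pairs}


edges = co_incidence(VISITS)
signed = subtraction_network(edges)

# Positive weights mark pairings more common close to the event,
# negative weights mark pairings more common in the distant period.
for pair, weight in sorted(signed.items(), key=lambda kv: kv[1]):
    print(pair, weight)
```

Under these assumptions, a pairing present in only one period keeps its full count with a positive or negative sign, so presence, absence, and magnitude of change can all be read from a single edge weight. An edge list of this kind could then be exported to a network tool for layout and display.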

Result

A cohort of 9468 U.S. Veterans was identified as having administrative evidence of homelessness and visits in both time periods. They were seen in 222,599 outpatient visits that generated 484,339 ICD9 codes (average of 11.4 (range 1–23) visits and 2.2 (range 1–60) ICD9 codes per visit). Using the three network projection methods, we were able to show distinct differences in the pattern of co-morbidities between the two time periods. In the more distant time period preceding homelessness, the network was dominated by routine health maintenance visits and physical ailment diagnoses. In the 0–59 days immediately prior to the identification of homelessness, alcohol-related diagnoses were noted along with economic circumstances such as unemployment, legal circumstances, and housing instability.

Conclusion

Network visualizations of large clinical datasets, which are traditionally treated as tabular and are difficult to manipulate, reveal rich, previously hidden connections between data variables related to homelessness. A key feature is the ability to visualize how variables change over time and in proximity to the event of interest. These visualizations support cognitive tasks such as the exploration of large clinical datasets as a prelude to hypothesis generation.

Keywords

Homelessness
U.S. Veterans
Social network analysis
Information foraging
Data exploration
ICD9 codes
