Development and evaluation of an osteoarthritis risk model for integration into primary care health information technology

https://doi.org/10.1016/j.ijmedinf.2020.104160

Abstract

Background

We developed and evaluated a prognostic prediction model that estimates osteoarthritis risk for patients and practitioners and is designed for integration into primary care health information technology systems. Osteoarthritis, a joint disorder characterized by pain and stiffness, causes significant morbidity among older Canadians. Because the model uses data that are readily available in primary care settings, it supports targeting of risk-reduction interventions delivered as part of clinical practice.

Methods

We used the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) database, which contains aggregated electronic health information from a cohort of primary care practices, to develop and evaluate a prognostic prediction model to estimate 5-year osteoarthritis risk, addressing contextual challenges of data availability and missingness. We constructed a retrospective cohort of 383,117 eligible primary care patients who had an encounter with their primary care practitioner between 1 January 2009 and 31 December 2010. Patients were excluded if they had a diagnosis of osteoarthritis prior to their first visit in this time period. Incident cases of osteoarthritis were observed over five years of follow-up. The model was constructed to predict incident osteoarthritis based on age, sex, BMI, previous leg injury, and osteoporosis. Evaluation of the model used internal 10-fold cross-validation; we argue that internal validation is particularly appropriate for a model that is to be integrated into the same context from which the data were derived.
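The eligibility rule described above can be illustrated with a minimal sketch. It assumes each patient is represented by a list of encounter dates and an optional date of first osteoarthritis diagnosis; the function names and data layout are hypothetical and do not reflect the CPCSSN schema or the authors' extraction code.

```python
from datetime import date

# Inclusion window stated in the Methods.
WINDOW_START = date(2009, 1, 1)
WINDOW_END = date(2010, 12, 31)


def first_visit_in_window(encounter_dates):
    """Return the earliest encounter inside the window, or None if there is none."""
    in_window = [d for d in encounter_dates if WINDOW_START <= d <= WINDOW_END]
    return min(in_window) if in_window else None


def is_eligible(encounter_dates, oa_diagnosis_date):
    """Hypothetical eligibility check.

    A patient is eligible if they have an encounter in the window and no
    osteoarthritis diagnosis dated before their first visit in that window.
    """
    index_date = first_visit_in_window(encounter_dates)
    if index_date is None:
        return False  # no encounter in 2009-2010
    if oa_diagnosis_date is not None and oa_diagnosis_date < index_date:
        return False  # prevalent case at baseline
    return True
```

A patient first seen in June 2009 with no prior diagnosis would be eligible under this sketch, while a patient diagnosed in 2008 would be excluded as a prevalent case.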

Results

The resulting prediction model for 5-year risk of osteoarthritis diagnosis demonstrated state-of-the-art discrimination (estimated AUROC 0.84) and good calibration (assessed visually). The model relies only on information that is readily available in Canadian primary care settings, and hence is appropriate for integration into Canadian primary care health information technology.
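For readers unfamiliar with these two measures, the sketch below shows one common way to compute them from predicted risks and observed outcomes. It is an illustration under stated assumptions, not the authors' evaluation code: the AUROC is computed by direct pairwise comparison (fine for small examples, slow for a cohort of this size), and the calibration table groups patients by predicted risk so that mean predicted risk can be compared with the observed event rate, which is the information a visual calibration plot displays. The function names and the default of ten groups are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen non-case (ties count as one half)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def calibration_table(labels, scores, n_bins=10):
    """Return (mean predicted risk, observed event rate) for each risk group.

    Patients are sorted by predicted risk and split into n_bins groups of
    nearly equal size; empty groups are skipped.
    """
    ordered = sorted(zip(scores, labels))
    n = len(ordered)
    table = []
    for b in range(n_bins):
        group = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue
        mean_predicted = sum(s for s, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        table.append((mean_predicted, observed_rate))
    return table
```

In a well-calibrated model, the two numbers in each row of the table are close to each other across the full range of predicted risk.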

Conclusions

If the contextual challenges arising when using primary care electronic medical record data are appropriately addressed, highly discriminative models for osteoarthritis risk may be constructed using only data commonly available in primary care. Because the models are constructed from data in the same setting where the model is to be applied, internal validation provides strong evidence that the resulting model will perform well in its intended application.
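The internal validation referred to here is 10-fold cross-validation. As a rough sketch of the splitting step only, the code below partitions patient indices into k folds and yields each fold once as the held-out set. The shuffling seed and function names are assumptions for illustration; the source does not describe how folds were assigned.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Every index appears in exactly one test set, so each patient is
    scored by a model that never saw them during fitting.
    """
    folds = k_fold_indices(n, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

A model would be fitted on each training set and scored on the matching held-out fold, and the per-fold performance estimates would then be summarized.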

Introduction

Prognostic prediction models (PPMs) estimate a patient’s risk of disease development [1,2] based on various predictors [3,4]. Predictors may include patient demographics (such as age and sex), family history, lifestyle factors (such as smoking status or physical activity level), prior medical conditions, laboratory test results, radiographic imaging, or genetic markers [5]. In turn, health care practitioners and patients can make decisions informed by disease risk [6,7]. For example, a patient found to be at high risk of lung cancer may be advised by their health care provider to quit smoking. Several PPMs estimate a patient’s risk of developing osteoarthritis [8,9,10,11]; however, all existing PPMs for osteoarthritis that we identified require information on predictors that are not routinely collected in primary care, such as the Kellgren and Lawrence grade (which requires radiographic imaging), and hence are not suitable for integration into primary care health information systems. Using existing data within primary care electronic medical records (EMRs) instead would eliminate the need for collection of additional, oftentimes burdensome, measures and would enable real-time risk estimation at the point of care.

There is the potential for significant benefit to be derived by deploying such a risk engine in primary care. Affecting an estimated 13 % of adults over the age of 20, osteoarthritis causes significant morbidity in Canada [12]. This estimate increases to 29 % in adults 70 years of age and older who receive primary care [13]. Symptoms of osteoarthritis include joint pain and stiffness [14], commonly affecting the joints of the hands, neck, lower back, hips, and knees. Osteoarthritis treatment largely consists of symptom management (e.g., non-steroidal anti-inflammatory medications for pain management), rather than treatment of underlying disease mechanisms [15]. Total joint replacement is often required after significant degradation of the affected joint. To mitigate this burden, prevention strategies have shown potential in reducing the incidence of osteoarthritis. For example, a diet and exercise program aimed at weight loss reduced the incidence of osteoarthritis, though not statistically significantly [16]. Injury prevention programs have been suggested as a potential strategy to prevent osteoarthritis [17]. Interventions such as these may be improved by selectively targeting those at the greatest risk of osteoarthritis in order to reduce their risk. To perform this selective targeting, individualized risk estimates for osteoarthritis are required.

Designing and evaluating a PPM specifically in the primary care context leads to two important design decisions: 1) the PPM should use only data that are easily available in the primary care context where it is to be deployed, for example those that exist in EMRs already, and 2) evaluation of the PPM should reflect the population where it is to be deployed, that is, in primary care encounters. Researchers have begun to recognize the value of EMR data for research purposes more generally [18]; however, data quality within these databases remains uncertain [19]. Issues such as implausible data and missing data are common in EMR data. When working with EMR data, researchers must address these contextual challenges [20].

In this work, we developed and validated a PPM to estimate a patient’s five-year risk of osteoarthritis development using primary care EMR data. Ultimately, we see this model being developed into a purpose-built tool to be used routinely by primary care practitioners during patient encounters to: 1) deliver a quantitative assessment of osteoarthritis risk in patients where the patient and/or primary care practitioner is concerned about osteoarthritis risk; and 2) act as a passive risk screening tool to identify high-risk patients who may have gone undetected otherwise. We are confident that the lessons from this work will be of use to others who are designing and evaluating PPMs using EMR data from and for primary care.

Section snippets

Methods

We developed and validated a prognostic prediction model to estimate the risk of osteoarthritis development within five years among Canadian adults receiving primary care. Model development was informed by strategies of PPM development suggested by the TRIPOD statement [21], Steyerberg [22], Lee et al. [5], and Hendriksen et al. [3]. First, we compiled a list of risk indicators for osteoarthritis development based on the existing literature. Next, we identified a cohort of patients whose risk

Results

The final cohort was composed of 383,117 patients (Fig. 1). Patient characteristics were typical of a primary care population, as they were slightly older and more likely to be female [38,39,40] (Table 3). After five years of follow-up, 12,803 (3.3 %) patients developed osteoarthritis.

Data were commonly missing for BMI, while sex was almost never missing (Table 4). Multiple imputation was used to address missing data for BMI and sex.
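The general shape of multiple imputation can be sketched as follows: fill in the missing values several times, analyze each completed dataset, and pool the results with Rubin's rules. The sketch below is illustrative only. The source does not state which imputation model was used, so the random draw from observed values is a deliberately simple stand-in (a proper imputation model would condition on other predictors), and the quantity being estimated (a mean BMI) is a placeholder for the model coefficients one would pool in practice. All function names are hypothetical.

```python
import random
from statistics import mean, variance


def impute_once(values, rng):
    """Fill each missing value (None) with a random draw from the observed values.

    Simplifying assumption: a real imputation model would use other
    predictors such as age and sex rather than an unconditional draw.
    """
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]


def pool_rubin(estimates, variances):
    """Pool per-dataset estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)                         # average sampling variance
    between = variance(estimates) if m > 1 else 0.0  # spread across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


def multiply_impute_mean(values, m=5, seed=0):
    """Estimate the mean of a variable with missing values from m imputed datasets."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(mean(completed))
        variances.append(variance(completed) / len(completed))
    return pool_rubin(estimates, variances)
```

The between-imputation term in the pooled variance is what distinguishes multiple imputation from filling in each missing value once: it carries the uncertainty about the missing values into the final estimate.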

Kernel density estimates of the distribution over the

Discussion

We produced a prognostic prediction model for the diagnosis of osteoarthritis using EMR data that are commonly available in primary care. We see this model ultimately being used in two ways in Canadian primary care settings. First, the model can provide estimates of osteoarthritis risk on demand when requested by a provider or patient who is interested in osteoarthritis risk. Second, the model can operate in the background of the provider’s EMR system during all patient visits and automatically

Conclusions

Primary care EMRs are a rich, yet underutilized, source of longitudinal health data that can support the development of novel tools for integration into primary care health information systems. Our work demonstrates the utility of these data for constructing PPMs despite contextual challenges such as missing data, using an osteoarthritis risk model as a success story, and provides a strategy and rationale for internal validation. Two key future directions for this work will be to 1) design and

Author statement

Lead author was JB. All authors initiated the research idea. JB drafted the research idea. JB wrote the article that is being submitted. AT and DL revised the article. JB extracted and analyzed the CPCSSN data. DL and AT supported the methodology development and contributed to the writing of the article. Each author has read and approved the final version of this article.

Ethics

Ethics approval was obtained from the Western University Research Ethics Board #107572.

Consent for publication

Not applicable as no individual person’s data were presented.

Availability of data and materials

The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) database is not publicly accessible, in keeping with the intent of the agreement made with primary health care practitioners contributing to the CPCSSN database. Researchers can request data from CPCSSN directly.

Funding

Funding for this research was provided by the Natural Sciences and Engineering Research Council of Canada. The funding source played no role in this research.

Declaration of Competing Interest

The authors have no conflicts of interest to declare.

References (41)

  • Public Health Agency of Canada. Canadian Chronic Disease Indicators, Quick Stats (2017)
  • R. Birtwhistle et al. Prevalence and management of osteoarthritis in primary care: an epidemiologic cohort study from the Canadian Primary Care Sentinel Surveillance Network. CMAJ Open (2015)
  • M. Doherty et al. Osteoarthritis and Crystal Arthropathy (2016)
  • M.C. Hochberg et al. American College of Rheumatology 2012 recommendations for the use of nonpharmacologic and pharmacologic therapies in osteoarthritis of the hand, hip, and knee. Arthritis Care Res. (Hoboken) (2012)
  • J. Runhaar et al. Prevention of knee osteoarthritis in overweight females: the first preventive randomized controlled trial in osteoarthritis. Am. J. Med. (2015)
  • C.A. Emery et al. OARSI Clinical Trials Recommendations: design and conduct of clinical trials for primary prevention of osteoarthritis by joint injury prevention in sport and recreation. Osteoarthr. Cartil. (2015)
  • D.M. Lloyd-Jones et al. Framingham risk score and prediction of lifetime risk for coronary heart disease. Am. J. Cardiol. (2004)
  • S.A.M. Nashef et al. European system for cardiac operative risk evaluation (EuroSCORE). Eur. J. Cardio-Thorac. Surg. (1999)
  • J.M.T. Hendriksen et al. Diagnostic and prognostic prediction models. J. Thromb. Haemost. (2013)
  • E.W. Steyerberg et al. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur. Heart J. (2014)
  • Y.H. Lee et al. How to establish clinical prediction models. Endocrinol. Metab. (Seoul, Korea) (2016)
  • D.T. Felson et al. Weight loss reduces the risk for symptomatic knee osteoarthritis in women. The Framingham study. Ann. Intern. Med. (1992)
  • D.T. Felson. Weight and osteoarthritis. Am. J. Clin. Nutr. (1996)
  • G.B. Joseph et al. Tool for osteoarthritis risk prediction (TOARP) over 8 years using baseline clinical data, X-ray, and MRI: data from the osteoarthritis initiative. J. Magn. Reson. Imaging (2018)
  • W. Zhang et al. Nottingham knee osteoarthritis risk prediction models. Ann. Rheum. Dis. (2011)
  • H.J.M. Kerkhof et al. Prediction model for knee osteoarthritis incidence, including clinical, genetic and biochemical risk factors. Ann. Rheum. Dis. (2014)
  • D.L. Riddle et al. The incident tibiofemoral osteoarthritis with rapid progression phenotype: development and validation of a prognostic prediction rule. Osteoarthr. Cartil. (2016)
  • H. Carr et al. Defining dimensions of research readiness: a conceptual model for primary care research networks. BMC Fam. Pract. (2014)
  • S. de Lusignan et al. Key concepts to assess the readiness of data for international research: data quality, lineage and provenance, extraction and processing errors, traceability, and curation. Contribution of the IMIA Primary Health Care Informatics Working Group. Yearb. Med. Inform. (2011)
  • A.L. Terry et al. A basic model for assessing primary health care electronic medical record data quality. BMC Med. Inform. Decis. Mak. (2019)