Abstract
Although machine learning (ML) models are increasingly developed for clinical decision support for patients with type 2 diabetes, their adoption into clinical practice remains limited. Currently, ML models are typically built on data from a single local healthcare system and validated only internally, with no expectation that they will validate externally; as a result, they are rarely transferable to a different healthcare system. In this work, we (1) demonstrate that even a complex ML model built on a national cohort can be transferred to two local healthcare systems; (2) show that a model constructed on a local healthcare system’s cohort is difficult to transfer; (3) examine the impact of training cohort size on transferability; and (4) discuss criteria for external validity. We built a model using our previously published Multi-Task Learning-based methodology on a national cohort extracted from the OptumLabs® Data Warehouse and transferred it to two local healthcare systems (University of Minnesota Medical Center and Mayo Clinic) for external evaluation. The model remained valid when applied to the local patient populations and performed as well as locally constructed models (concordance: .73–.92), demonstrating transferability. In contrast, the performance of the locally constructed models declined substantially when they were applied to each other’s healthcare system (concordance: .62–.90). We believe that our modeling approach, in which a model is learned from a national cohort and externally validated, produces a transferable model, allowing patients at smaller healthcare systems to benefit from precision medicine.
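The concordance values quoted above measure how well a model ranks patients by risk. As a rough illustration of how such a figure might be computed when a fitted model is scored on an external cohort, the following minimal sketch implements a plain Harrell-style concordance index for right-censored data. It is not the evaluation code used in this work: the function name `concordance_index`, the handling of ties, and the toy cohort are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style concordance for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if the complication was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; also skip pairs where the
        # patient with shorter follow-up was censored (not comparable).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical external cohort: the scores would come from a model trained
# elsewhere and applied unchanged to these patients.
external_times = [2, 4, 6, 8]
external_events = [1, 1, 0, 1]
external_scores = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(external_times, external_events, external_scores))  # 0.8
```

In this toy cohort, five pairs are comparable and four are ranked correctly, giving a concordance of 0.8. Under this reading, transferability corresponds to the externally computed value staying close to the value obtained on the training system's own held-out patients.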




Funding
This work was supported by NIH award R01 LM011972 and NSF award IIS 1602198. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.
Ethics declarations
Conflict of interest
The access to the claims and EHR data from the OLDW was made possible through use of an OptumLabs research credit. Author Era Kim owns stock in UnitedHealth Group.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
This article is part of the Topical Collection on Patient Facing Systems
Cite this article
Kim, E., Caraballo, P.J., Castro, M.R. et al. Towards more Accessible Precision Medicine: Building a more Transferable Machine Learning Model to Support Prognostic Decisions for Micro- and Macrovascular Complications of Type 2 Diabetes Mellitus. J Med Syst 43, 185 (2019). https://doi.org/10.1007/s10916-019-1321-6
DOI: https://doi.org/10.1007/s10916-019-1321-6