ABSTRACT
The rapid accumulation of large-scale Electronic Health Records (EHRs) presents considerable opportunities to generate real-world evidence that can inform clinical decision-making and accelerate drug development. However, the complexity of EHRs has made them a formidable testing ground for cutting-edge AI algorithms, and a significant gap still exists between algorithm development in the computer science community and clinical translation in the healthcare community. This tutorial aims to bridge this divide and foster mutual understanding between the two communities by discussing how advanced machine learning and data mining technologies can be tailored to real-world healthcare challenges, including 1) using EHRs and trial emulation to understand Long COVID and to repurpose drugs for Alzheimer's disease, and 2) risk prediction and the associated issues of fairness, interpretability, and generalizability, among others. We will conclude by discussing opportunities for future research and the prospects of a career as a health data scientist.
Index Terms
- Mining Electronic Health Records for Real-World Evidence