
Paper

Authors: Hasan Zafari (1), Jie Li (1), Farhana Zulkernine (1), Leanne Kosowan (2) and Alexander Singer (2)

Affiliations: (1) School of Computing, Queen's University, Kingston, Ontario, Canada; (2) Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Canada

Keyword(s): Machine Learning, EMR Data, Diabetes, Ensemble Models, Classification Algorithms, Imbalanced Data.

Abstract: As the prevalence of diabetes continues to increase globally, an efficient diabetes prediction model based on Electronic Medical Records (EMR) is critical to ensure the well-being of patients and reduce the burden on the healthcare system. Predicting diabetes at an early stage and analyzing its risk factors can enable primary and secondary prevention of diabetes. The objective of this study is to explore various classification models for identifying diabetes using EMR data. We extracted patient information, disease, health conditions, billing, and medication from EMR data. We used six machine learning algorithms: three ensemble classifiers (XGBoost, Random Forest, and AdaBoost) and three non-ensemble classifiers (Logistic Regression, Naive Bayes, and K-Nearest Neighbor (KNN)). We trained the models on both imbalanced data with the original class distribution and artificially balanced data. Our results indicate that the Random Forest model overall outperformed the other models. Trained on the imbalanced data (112,837 instances), it achieved the highest specificity (0.99) and F1-score (0.84); trained on the balanced data (35,858 instances), it achieved better sensitivity (1.00) and AUC (0.96). Analyzing feature importance, we identified a set of features with greater impact on the outcome, including a number of comorbid conditions such as hypertension, dyslipidemia, osteoarthritis, chronic kidney disease (CKD), and depression, as well as a number of medication codes such as A10, D08, C10, and C09.
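
The sensitivity, specificity, and F1-score values reported in the abstract are all derived from the binary confusion matrix. The following is a minimal sketch of those definitions, assuming labels are encoded as 1 for diabetes and 0 otherwise (the encoding is not stated in the source). The function names and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return sensitivity, specificity, precision, and F1 as a dict.

    A ratio with a zero denominator is reported as 0.0 (an assumption
    made here to keep the sketch simple).
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy, made-up labels: 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
result = classification_metrics(y_true, y_pred)
print(result)
```

On imbalanced data, specificity can be high simply because negatives dominate, which is one reason to read sensitivity and F1-score alongside it.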

CC BY-NC-ND 4.0


Paper citation in several formats:
Zafari, H.; Li, J.; Zulkernine, F.; Kosowan, L. and Singer, A. (2022). Predictive Modeling of Diabetes using EMR Data. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - HEALTHINF; ISBN 978-989-758-552-4; ISSN 2184-4305, SciTePress, pages 211-218. DOI: 10.5220/0010908900003123

@conference{healthinf22,
author={Hasan Zafari and Jie Li and Farhana Zulkernine and Leanne Kosowan and Alexander Singer},
title={Predictive Modeling of Diabetes using EMR Data},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - HEALTHINF},
year={2022},
pages={211-218},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010908900003123},
isbn={978-989-758-552-4},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - HEALTHINF
TI - Predictive Modeling of Diabetes using EMR Data
SN - 978-989-758-552-4
IS - 2184-4305
AU - Zafari, H.
AU - Li, J.
AU - Zulkernine, F.
AU - Kosowan, L.
AU - Singer, A.
PY - 2022
SP - 211
EP - 218
DO - 10.5220/0010908900003123
PB - SciTePress