ABSTRACT
Diabetes mellitus, a severe chronic condition characterized by impaired glucose metabolism, poses a substantial threat to public health. Its global impact continues to grow, with a rising incidence that challenges preventive measures. Despite these efforts, many individuals still develop diabetes, which calls for innovative approaches to disease management. Traditional methods of diabetes health monitoring have limitations, prompting the exploration of more advanced techniques. This study applies machine learning (ML) methods to diabetes with the aim of improving diagnostic accuracy. The primary objective is to develop a method that diagnoses diabetes with high precision. The investigation uses three ML algorithms: Random Forest (RF), K-Nearest Neighbor (KNN), and Logistic Regression. These algorithms are also included with a view to reducing data processing time. Notably, the study uses the Synthetic Minority Over-sampling Technique (SMOTE) as a data augmentation strategy. SMOTE addresses class imbalance in the dataset by generating synthetic minority-class samples, yielding a more balanced and representative sample. The research evaluates the effectiveness and accuracy of diabetes prediction with these algorithms both before and after applying SMOTE. By considering the impact of SMOTE, the study aims to determine the best algorithm for predicting the development of diabetes. The comparative analysis sheds light on how SMOTE enhances the overall performance of the machine learning models. This approach not only refines diabetes diagnostic protocols but also underscores the importance of addressing data imbalance in predictive modeling to improve precision in disease prediction.
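As a rough illustration of the oversampling step described in the abstract, the sketch below shows the core idea behind SMOTE: each synthetic minority record is placed at a random point on the line segment between a minority record and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy two-feature data are assumptions made only for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (made up for illustration): 6 majority vs 3 minority records
majority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7], [1.3, 1.0]]
minority = [[3.0, 3.0], [3.2, 2.8], [2.9, 3.3]]

# Oversample the minority class until both classes have the same size
balanced_minority = minority + smote(minority, len(majority) - len(minority), k=2)
```

In a before/after comparison such as the one the abstract describes, oversampling of this kind is normally applied to the training split only, so that the test data contains real records exclusively.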
Index Terms
- The effect of Data Augmentation Using SMOTE: Diabetes Prediction by Machine Learning Techniques