ABSTRACT
Type 2 diabetes mellitus is a chronic disease that poses a serious challenge to human health worldwide. Globally, about 8.3% of the population is diagnosed with the disease. Applications of predictive analytics in the diagnosis of diabetes are gaining significant momentum in medical research. The aim of this paper is to aid medical professionals in the early detection and efficient diagnosis of Type 2 diabetes. We use bioinformatics theory and supervised machine learning techniques to improve the accuracy of diabetes prediction, based on the 8 clinical measurements in the widely used PIMA dataset. We outline our methodology and highlight the implementation steps, while reviewing prominent past work in the field. We also apply several well-known machine learning algorithms and provide a detailed comparison of the results obtained from each method. The gradient boosting algorithm with parameter tuning proves to be the most successful, with an F1 score of 0.853 and an out-of-sample accuracy of 89.94%. Our prediction model computes the probability of the onset of diabetes in an individual from their clinical data. The most important benefits of applying this research within the healthcare sector are its cost-effectiveness and its ability to yield an instant diagnosis. With this work, we intend to improve the process of diagnosing Type 2 diabetes and to inspire other researchers to use machine learning based techniques for further inquiry into diabetes prediction.
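To make the two reported metrics concrete, the following minimal sketch shows how an F1 score and an accuracy figure are derived from the counts of a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration; they are not the paper's code or results.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the positive (diabetic) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(tp, tn, fp, fn):
    """Return the fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical held-out test set: these counts are illustrative only.
    tp, fp, fn, tn = 40, 10, 10, 94
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    acc = accuracy(tp, tn, fp, fn)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f} accuracy={acc:.3f}")
```

Because F1 ignores true negatives while accuracy counts them, the two figures can differ noticeably on a dataset with unequal class sizes, which is one reason to report both.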