Diabetes subtypes classification for personalized health care: A review

Artificial Intelligence Review

Abstract

Healthcare is evolving from a standard model to a personalized one, driven by patients’ needs. Personalized healthcare is a medical model that uses genetics, genomics, and other biological information to help predict disease risk. Machine learning and data mining are among the fastest-growing fields in healthcare; they are used to classify patient cohorts from large datasets, and their application to diabetes subtyping could be a breakthrough. In this review, we identify, analyze, and summarize how previous studies have distinguished diabetes into subtypes, and how they implemented diabetes subtyping using data mining and various clustering algorithms. We found that many studies suggest diabetes can be differentiated into subtypes in several ways: clinically, based on the risk of complications; genetically; using clinical features; and for treatment selection. Among clustering algorithms, k-means clustering and hierarchical clustering were widely used to determine sub-clusters of diabetes. For further investigation of diabetes subtyping, understanding the specific objective and method of subtyping with clustering algorithms on a large dataset will be crucial, and could contribute novel knowledge and improvements to diabetes management.
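To make the clustering step mentioned in the abstract concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the method of any reviewed study: the cohort is synthetic, the two features are hypothetical stand-ins for standardized clinical measurements, and the names `k_means` and `make_synthetic_cohort` are introduced here for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    # Initialise centroids by sampling k distinct data points.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return labels, centroids


def make_synthetic_cohort(seed=1):
    """Hypothetical data: 3 groups of 20 'patients', 2 standardized features."""
    rng = random.Random(seed)
    centres = [(-3.0, -3.0), (0.0, 3.0), (3.0, -3.0)]
    return [
        (rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
        for cx, cy in centres
        for _ in range(20)
    ]


cohort = make_synthetic_cohort()
labels, centroids = k_means(cohort, k=3)
```

In practice the number of clusters `k` and the choice of input features are study-specific decisions, and k-means can settle in a local optimum, so results are usually checked across several random initialisations.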




Acknowledgements

This work was supported by Universiti Teknologi Malaysia under the Fundamental Research Grant Scheme (FRGS) of the Ministry of Science, Technology and Innovation (MOSTI), reference number 5F364, and by the Ministry of Higher Education Malaysia, reference number 07G22.

Author information

Corresponding author

Correspondence to Asnida Abdul Wahab.



Cite this article

Omar, N., Nazirun, N.N., Vijayam, B. et al. Diabetes subtypes classification for personalized health care: A review. Artif Intell Rev 56, 2697–2721 (2023). https://doi.org/10.1007/s10462-022-10202-8


  • DOI: https://doi.org/10.1007/s10462-022-10202-8
