DOI: 10.1145/3616901.3617018
research-article

Diabetes Prediction Based on Limited Medical Indication

Published: 05 March 2024

ABSTRACT

Predicting disease from relevant physiological indicators with machine learning has become a relatively mature and widely applied technique. Diabetes is a common disease and a significant global public health problem that seriously threatens human health. Various machine learning and deep learning techniques, such as neural networks, have been applied to diabetes prediction, and most achieve strong results by using complex models on large datasets. However, many discussions of these methods lack a classification analysis of the experimental datasets, or pursue high precision alone. When screening potential patients, a model with a high recall rate is more important, because it reduces the probability that a potential patient is misclassified as a healthy individual. In this study, the attributes of the dataset were analysed in detail using a hybrid approach based on statistics and mathematics. Based on the conclusions of that analysis, and considering societal factors related to diabetes, the composition of potential cases in a small dataset was analysed. The performance of common machine learning models was then tested experimentally. Finally, a linear model was selected and optimized, and its performance was evaluated using the confusion matrix and ROC curve, showing balanced and satisfactory precision and recall scores.
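The trade-off between precision and recall mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the labels and the two sets of predictions below are hypothetical, and the helper names `confusion_counts` and `precision_recall` are introduced here only for illustration. The sketch derives both scores from confusion-matrix counts, with label 1 assumed to mean "diabetic".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = diabetic (assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn, _tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth: 4 diabetic patients, 6 healthy individuals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# A cautious model flags only one patient: no false alarms, but 3 missed cases.
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# A more sensitive model flags more people: one false alarm, one missed case.
sensitive = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(precision_recall(y_true, cautious))   # (1.0, 0.25)
print(precision_recall(y_true, sensitive))  # (0.75, 0.75)
```

In this toy example the cautious model has perfect precision yet misses three of the four patients, which is the failure mode a high-recall model is meant to avoid.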


          • Published in

            FAIML '23: Proceedings of the 2023 International Conference on Frontiers of Artificial Intelligence and Machine Learning
            April 2023
            296 pages
            ISBN: 9798400707544
            DOI: 10.1145/3616901

            Copyright © 2023 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • research-article
            • Research
            • Refereed limited