
Evaluation of Outcome Prediction for a Clinical Diabetes Database

  • Conference paper
Knowledge Exploration in Life Science Informatics (KELSI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3303)

Abstract

Diabetes is a metabolic disorder that can be greatly affected by lifestyle. The disease cannot be cured, but it can be controlled, which minimizes complications such as heart disease, stroke and blindness. Clinicians routinely collect large amounts of information on diabetic patients as part of their day-to-day management of the disease. We investigate the potential of data mining to spot trends in these data and to predict outcome. Feature selection is used to improve the efficiency of the data mining algorithms and to identify the contribution of different features to the prediction of diabetes control status. Decision trees can provide classification accuracy over 78%. However, while most bad control cases (90%) can be correctly classified, at least 50% of good control cases will be misclassified, which means that the current feature selection and prediction models show some potential but need additional refinement.
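The gap between overall accuracy and per-class performance reported above can be illustrated with a small worked example. This is a minimal sketch, not the authors' evaluation code: the patient counts below are hypothetical, chosen only so that the resulting rates match the figures quoted in the abstract (90% of bad control cases correct, 50% of good control cases correct). The class split of 700 to 300 is an assumption and is not taken from the paper.

```python
def summarize(bad_correct, bad_wrong, good_correct, good_wrong):
    """Return (overall accuracy, bad-control recall, good-control recall)
    from the four cells of a two-class confusion matrix."""
    total = bad_correct + bad_wrong + good_correct + good_wrong
    accuracy = (bad_correct + good_correct) / total
    recall_bad = bad_correct / (bad_correct + bad_wrong)
    recall_good = good_correct / (good_correct + good_wrong)
    return accuracy, recall_bad, recall_good


# Hypothetical counts (assumption, not data from the paper):
# 700 bad-control patients, 300 good-control patients.
accuracy, recall_bad, recall_good = summarize(
    bad_correct=630, bad_wrong=70, good_correct=150, good_wrong=150
)

print(f"overall accuracy:    {accuracy:.2f}")
print(f"bad-control recall:  {recall_bad:.2f}")
print(f"good-control recall: {recall_good:.2f}")
```

Under these assumed counts, a classifier that is right on only half of the good control cases still reaches a high overall accuracy because the bad control class dominates, which is why per-class rates are worth reporting alongside the headline figure.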




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, Y., McCullagh, P., Black, N., Harper, R. (2004). Evaluation of Outcome Prediction for a Clinical Diabetes Database. In: López, J.A., Benfenati, E., Dubitzky, W. (eds) Knowledge Exploration in Life Science Informatics. KELSI 2004. Lecture Notes in Computer Science, vol 3303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30478-4_16


  • DOI: https://doi.org/10.1007/978-3-540-30478-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23927-7

  • Online ISBN: 978-3-540-30478-4

  • eBook Packages: Springer Book Archive
