Abstract
Diabetes is the third major chronic disease threatening human health, after cardiovascular and cerebrovascular diseases and malignant tumors. The latest survey shows that there are as many as 463 million diabetic patients in the world, most of whom have type 2 diabetes, and incidence remains high. Preventing and controlling type 2 diabetes is therefore of great strategic significance for protecting human health and saving medical resources. This paper uses support vector machine (SVM) classification, a data mining technique, to establish a type 2 diabetes risk prediction model, and applies the model to the original data of diabetic patients from the endocrinology department of a third-class hospital in Wuhan. Finally, an evaluation tool is used to assess the prediction performance and quality of the model. The experimental results show that the SVM-based prediction model offers high prediction accuracy, good stability, fast learning speed, and good classification performance on complex clinical data sets. It can therefore help guide the clinical diagnosis and risk prediction of type 2 diabetes.
This article is part of the Topical Collection on: Big Data Security Track
Cite this article
Guo, H., Fan, Z. & Zeng, Y. Novel Data Mining Analysis Method on Risk Prediction of Type 2 Diabetes. J Sign Process Syst 94, 1183–1198 (2022). https://doi.org/10.1007/s11265-021-01717-4
DOI: https://doi.org/10.1007/s11265-021-01717-4