An intelligent data-driven model for disease diagnosis based on machine learning theory

Journal of Combinatorial Optimization

Abstract

In the data era, major decisions are increasingly driven by large volumes of data, especially in the healthcare industry. This paper proposes an intelligent data-driven model based on machine learning theory, specifically support vector machine (SVM) and random forest (RF). The model is applied to a disease diagnosis case, cough variant asthma (CVA), using a data set of 137 samples with 12 attributes. The results show that the proposed model achieves better prediction performance than SVM alone or RF alone. In addition, to identify the key medical indicators that improve diagnostic accuracy and efficiency, the proposed model produces the most important factors affecting CVA, including FENO, EOS%, MMEF75/25, FEV1/FVC, and PEF, among others. The results also demonstrate that the proposed model could serve as a user-friendly tool for improving the performance of disease diagnosis.
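The abstract does not state how the SVM and RF outputs are merged. One common option, assumed here purely for illustration, is soft voting: average the positive-class probabilities of the two classifiers and threshold the result. The function names, the equal weighting, and the sample probabilities below are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(p_svm: Sequence[float], p_rf: Sequence[float],
              w_svm: float = 0.5) -> List[float]:
    """Weighted average of two models' positive-class probabilities.

    Assumption: both inputs hold one probability per sample, in the same order.
    """
    if len(p_svm) != len(p_rf):
        raise ValueError("both models must score the same samples")
    if not 0.0 <= w_svm <= 1.0:
        raise ValueError("w_svm must lie in [0, 1]")
    return [w_svm * a + (1.0 - w_svm) * b for a, b in zip(p_svm, p_rf)]


def to_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map combined probabilities to 0/1 labels (1 = positive diagnosis)."""
    return [1 if p >= threshold else 0 for p in probs]


# Made-up probabilities for three hypothetical patients (not data from the paper).
svm_probs = [0.9, 0.4, 0.2]
rf_probs = [0.7, 0.7, 0.1]

combined = soft_vote(svm_probs, rf_probs)
labels = to_labels(combined)
print(labels)
```

In practice the two probability lists would come from fitted classifiers, for example the `predict_proba` output of scikit-learn's `SVC(probability=True)` and `RandomForestClassifier`. The second patient shows why a combination can differ from either model alone: the SVM score by itself falls below the threshold, while the averaged score does not.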


Acknowledgements

This research is supported by five projects: the Clinical Science and Technology Optimization project of the Shanghai Shenkang Hospital Development Center (SHDC12017623); the Doctoral Start-up Project of USST (BSQD201901); the National Natural Science Foundation of China (71840003, 71801150); and the Scientific and Technological Development Project of USST (2018KJFZ043).

Author information

Corresponding author

Correspondence to Wei Gao.

Cite this article

Huang, H., Gao, W. & Ye, C. An intelligent data-driven model for disease diagnosis based on machine learning theory. J Comb Optim 42, 884–895 (2021). https://doi.org/10.1007/s10878-019-00495-x

  • DOI: https://doi.org/10.1007/s10878-019-00495-x
