Evolutionary synthetic oversampling technique and cocktail ensemble model for warfarin dose prediction with imbalanced data

  • Original Article
Neural Computing and Applications

Abstract

To improve the accuracy of warfarin daily dose prediction, we develop an evolutionary synthetic oversampling technique (ESMOTE) combined with a cocktail ensemble model (CEM), called ESMOTE-CEM. Unlike conventional oversampling methods, ESMOTE finds near-optimal oversampling parameters by evolving the parameter representation based on the pre-predicted warfarin dose, and then synthesizes new samples to balance the data. The CEM, which improves on the performance of random forest (RF) and boosted regression tree (BRT) models by using a hybrid mechanism in the regression calculation, estimates the daily dose of warfarin. We test ESMOTE-CEM on a dataset of 733 samples collected from the First Affiliated Hospital of Soochow University and the International Warfarin Pharmacogenetics Consortium (IWPC). The results show that ESMOTE outperforms the other oversampling methods by at least 6.98% in R² and 5.03% in mean squared error (MSE). Measured by the percentage of patients whose predicted warfarin dose is within 20% of the actual stable therapeutic dose (the 20%-p value), ESMOTE-CEM achieves 50%. Moreover, compared with the RF, BRT and AdaBoost models, the CEM is the most suitable base predictive model for ESMOTE.
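The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `tolerance` parameter, and the four example doses are assumptions made only for illustration, and the standard textbook definitions of MSE and R² are used.

```python
import numpy as np


def pct_within_20(actual, predicted, tolerance=0.20):
    """Percentage of patients whose predicted dose lies within
    `tolerance` (20% by default) of the actual stable dose."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    relative_error = np.abs(predicted - actual) / actual
    return float(np.mean(relative_error <= tolerance) * 100.0)


def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted doses."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical daily doses in mg/day, not taken from the study dataset.
actual_dose = [2.0, 3.0, 5.0, 4.0]
predicted_dose = [2.2, 3.3, 4.5, 3.0]

print(pct_within_20(actual_dose, predicted_dose))       # 75.0
print(mean_squared_error(actual_dose, predicted_dose))  # approximately 0.345
print(r_squared(actual_dose, predicted_dose))           # approximately 0.724
```

In this toy example three of the four predictions fall within 20% of the actual dose, so the 20%-p value is 75%. The last prediction misses by 25% and also dominates the squared error, which shows why the abstract reports the 20%-p value alongside R² and MSE.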



Acknowledgments

This work was supported by the National Natural Science Foundation of China (grant nos. 61872259 and 81700298) and the Suzhou Science and Technology Plan Project (grant no. SYS201736). Additional support was provided by the Jiangsu Planned Projects for Postdoctoral Research Funds (2019K056A), a 2019 China Postdoctoral Science Foundation funded project (2019M661935), and a research project of the Jiangsu Health Commission (M2020023).

Author information

Contributions

Yanyun Tao designed and implemented the programming framework, conducted the experiments and wrote the manuscript. Yuzhen Zhang contributed to the design of the framework, collection and analysis of the data, and reviewed and edited the manuscript. Bin Jiang was a major contributor in writing the manuscript. Ling Xue and Cheng Xie collected and processed the dataset of patients treated with warfarin. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yuzhen Zhang.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Evolutionary synthetic oversampling technique and cocktail ensemble model for warfarin dose prediction with imbalanced data”.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tao, Y., Jiang, B., Xue, L. et al. Evolutionary synthetic oversampling technique and cocktail ensemble model for warfarin dose prediction with imbalanced data. Neural Comput & Applic 33, 11203–11221 (2021). https://doi.org/10.1007/s00521-020-05568-1

  • DOI: https://doi.org/10.1007/s00521-020-05568-1
