Comparing Effectiveness of Machine Learning Methods for Diagnosis of Deep Vein Thrombosis

Sorano, Ruslan; Magnusson, Lars V.; Abbas, Khurshid

doi:10.1007/978-3-031-10548-7_21

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13381)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

This paper presents the results of a comparative study of machine learning techniques for predicting deep vein thrombosis. We used the Ri-Schedule dataset, which contains electronic health records of patients with suspected thrombosis, for training and validation. A total of 1653 samples and 59 predictors were included in this study.

We compared 20 standard machine learning algorithms and identified the best-performing ones: the Random Forest, XGBoost, GradientBoosting and HistGradientBoosting classifiers. After hyper-parameter optimization, the GradientBoosting classifier achieved the best overall accuracy, 0.91, using only 15 of the original variables.

We also tuned the algorithms for maximum sensitivity. Under that constraint, Random Forest offered the best specificity. At a maximum sensitivity of 1.0 and a specificity of 0.41, the Random Forest model was able to identify 23% additional negative cases over the screening practice in use today.
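The trade-off between sensitivity and specificity described above can be illustrated with a minimal sketch. It assumes a classifier that outputs a score per patient and picks the highest decision threshold that still flags every true positive. The function names, labels and scores are hypothetical toy values, not data or code from the study.

```python
from typing import List, Tuple


def sensitivity_specificity(
    y_true: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def threshold_for_full_sensitivity(y_true: List[int], scores: List[float]) -> float:
    """Highest threshold that still classifies every positive case as positive.

    This is simply the lowest score assigned to any true positive.
    """
    positive_scores = [s for y, s in zip(y_true, scores) if y == 1]
    if not positive_scores:
        raise ValueError("at least one positive case is required")
    return min(positive_scores)


# Hypothetical toy data: 1 = thrombosis confirmed, 0 = thrombosis ruled out.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.60, 0.35, 0.50, 0.30, 0.20, 0.10, 0.05]

threshold = threshold_for_full_sensitivity(y_true, scores)
sensitivity, specificity = sensitivity_specificity(y_true, scores, threshold)
print(f"threshold={threshold}, sensitivity={sensitivity}, specificity={specificity}")
```

In this toy setting, every negative case scoring below the chosen threshold is ruled out without missing any positive case. A threshold chosen this way on training data does not guarantee a sensitivity of 1.0 on unseen patients, so it would need to be validated on held-out data.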

These results suggest that machine learning could offer practical value in real-life implementations if combined with traditional methods for ruling out deep vein thrombosis.

Author information

Authors and Affiliations

Østfold University College, Halden, Norway
Ruslan Sorano, Lars V. Magnusson & Khurshid Abbas

Authors

Ruslan Sorano
Lars V. Magnusson
Khurshid Abbas

Corresponding author

Correspondence to Ruslan Sorano.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Potenza, Italy
Beniamino Murgante
Østfold University College, Halden, Norway
Sanjay Misra
University of Minho, Braga, Portugal
Ana Maria A. C. Rocha
University of Cagliari, Cagliari, Italy
Chiara Garau

About this paper

Cite this paper

Sorano, R., Magnusson, L.V., Abbas, K. (2022). Comparing Effectiveness of Machine Learning Methods for Diagnosis of Deep Vein Thrombosis. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13381. Springer, Cham. https://doi.org/10.1007/978-3-031-10548-7_21

DOI: https://doi.org/10.1007/978-3-031-10548-7_21
Published: 26 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10547-0
Online ISBN: 978-3-031-10548-7
eBook Packages: Computer Science, Computer Science (R0)
