Evaluation of Machine Learning Algorithms for Early Prediction of Liver Disease

  • Conference paper
Data Science and Emerging Technologies (DaSET 2023)

Abstract

Liver disease is a major cause of morbidity and mortality globally, with around 2 million deaths annually worldwide. Diagnosis is usually made by measuring the levels of biomarkers, which may be enzymes linked to liver function tests and/or proteins. However, the levels of these biomarkers do not always change in cases of liver disease, and most biomarker-based diagnoses occur when the liver is already partially or fully damaged. Early diagnosis, by contrast, can prevent further complications, reduce the burden on healthcare systems and save lives. This research therefore evaluated machine learning algorithms for early prediction of liver disease. The algorithms were logistic regression, decision tree, random forest, adaptive boosting, extreme gradient boosting, support vector machine and Naïve Bayes; they were applied, after data preprocessing, to a dataset of patients with and without liver disease. The evaluation metrics were accuracy, precision, recall, AUC-ROC and F1-score. Based on these metrics, and after hyperparameter tuning, support vector machine, random forest, adaptive boosting and extreme gradient boosting were the best-performing models. However, overfitting could not be ruled out and may be related to the small sample size of the dataset. Future work involves applying these algorithms to a larger sample of patients and to more features.
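To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, precision, recall, F1-score and AUC-ROC can be computed for a binary classifier. It is not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`), the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper's dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = liver disease."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1-score from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC requires at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On this toy data, accuracy, precision, recall and F1-score are each 0.75 and AUC-ROC is 0.8125. Note that AUC-ROC is computed from the continuous scores rather than the thresholded predictions, which is why it can differ from the threshold-based metrics.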

Acknowledgements

The authors would like to thank UITAR International University for supporting this paper.

Author information

Corresponding author

Correspondence to Dhiya Al-Jumeily OBE.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Geddam, S., Assi, S., Naghavipour, H., Jayabalan, M., Al-Hamid, A., Al-Jumeily OBE, D. (2024). Evaluation of Machine Learning Algorithms for Early Prediction of Liver Disease. In: Bee Wah, Y., Al-Jumeily OBE, D., Berry, M.W. (eds) Data Science and Emerging Technologies. DaSET 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 191. Springer, Singapore. https://doi.org/10.1007/978-981-97-0293-0_37
