Optimized Machine Learning Models for Hepatitis C Prediction: Leveraging Optuna for Hyperparameter Tuning and Streamlit for Model Deployment

  • Conference paper
Pan-African Conference on Artificial Intelligence (PanAfriConAI 2023)

Abstract

Machine learning techniques have attracted significant attention for their potential to solve real-world problems across many fields. This study uses machine learning algorithms to predict the stages of hepatitis C, a prevalent liver disease that affects a substantial portion of the global population. Using a dataset of 615 patients that covers multiple factors associated with hepatitis C, we compared the performance of six machine learning algorithms: categorical boosting (CatBoost), Gaussian Naive Bayes (GNB), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and ExtraTreeClassifier (ExtraT). The hyperparameters of each algorithm were tuned with Optuna, a hyperparameter optimization framework. All models were then evaluated on a test set comprising 20% of the overall patient data. XGBoost was the most effective approach, with an accuracy of 94.31% and an F1-score, precision, and recall of 94.23%, 94.63%, and 94.31%, respectively. Building on these results, we deployed the XGBoost model in a web application built with Streamlit, which makes the model easy to access and use for the broader community.
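
The reported figures can be read with a small worked example. The sketch below is not the authors' evaluation code. It assumes support-weighted averaging of the per-class metrics, which the abstract does not state but which would be consistent with recall equalling accuracy (both 94.31%). The function name `weighted_metrics` and the toy labels are hypothetical. The sketch also notes that 20% of 615 patients is 123, and that 116 correct predictions out of 123 would give 94.31%, assuming the test set held exactly 123 patients.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1.

    Assumption: per-class metrics are averaged with weights equal to each
    class's share of the true labels (the abstract does not say which
    averaging scheme the paper used).
    """
    n = len(y_true)
    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predicted instances per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy labels invented for illustration only (not the paper's data).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)

# Under support weighting, recall always equals accuracy.
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")

# Test-set arithmetic: 20% of the 615 patients.
test_size = 615 * 20 // 100
print(test_size, round(100 * 116 / test_size, 2))
```

Weighted recall reduces to the total number of correct predictions divided by the number of samples, which is why it coincides with accuracy under this assumption; weighted precision and F1 generally differ from it.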

Author information

Corresponding author

Correspondence to Uriel Nguefack Yefou.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yefou, U.N., Choudja, P.O.M., Sow, B., Adejumo, A. (2024). Optimized Machine Learning Models for Hepatitis C Prediction: Leveraging Optuna for Hyperparameter Tuning and Streamlit for Model Deployment. In: Debelee, T.G., Ibenthal, A., Schwenker, F., Megersa Ayano, Y. (eds) Pan-African Conference on Artificial Intelligence. PanAfriConAI 2023. Communications in Computer and Information Science, vol 2068. Springer, Cham. https://doi.org/10.1007/978-3-031-57624-9_5

  • DOI: https://doi.org/10.1007/978-3-031-57624-9_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57623-2

  • Online ISBN: 978-3-031-57624-9

  • eBook Packages: Computer Science (R0)
