A novel hybrid algorithm for aiding prediction of prognosis in patients with hepatitis

  • Original Article
  • Neural Computing and Applications

Abstract

This study investigated a novel hybrid artificial intelligence (AI)-based classifier for aiding prediction of prognosis in patients with chronic hepatitis. Nineteen biomarkers from 155 patients with hepatitis, obtained from the University of California, Irvine (UCI) Machine Learning Repository, were used as input data. Weights derived by applying the geometric margin maximisation criterion of a Lagrangian support vector machine (LSVM) were used to select the features with the highest relative importance for the required classification, i.e. predicting whether a patient with hepatitis would survive or die. The 19 initial features were thus reduced to the 16 most important prognostic factors, which were fed into various AI-based classifiers. Results indicated an overall classification accuracy and an area under the receiver operating characteristic curve of 100% for the proposed hybrid algorithm, the LSVM multilayer perceptron (MLP), demonstrating its potential for aiding prediction of prognosis in patients with hepatitis in a clinical setting.
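
The feature selection step described in the abstract can be illustrated with a minimal sketch. It assumes that a weight vector from a linear classifier is already available and that importance is measured by absolute weight; both are assumptions made for illustration, as the abstract does not specify the ranking rule. The function name `select_top_features` and the toy weights are hypothetical.

```python
def select_top_features(weights, k):
    """Return the indices of the k features with the largest absolute weight.

    Assumption: feature importance is taken as |weight|; the indices are
    returned in their original column order.
    """
    if not 0 < k <= len(weights):
        raise ValueError("k must be between 1 and the number of features")
    # Rank feature indices from most to least important.
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical weights for five features (not values from the study).
w = [0.8, -0.05, 0.3, -1.2, 0.01]
kept = select_top_features(w, 3)
```

In the study's setting, the same idea would reduce 19 features to 16; the retained columns would then be passed to the downstream classifier.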


References

  1. World Health Organization (2016) Global health sector strategy on viral hepatitis 2016–2021. Towards ending viral hepatitis

  2. Nelson PK, Mathers BM, Cowie B, Hagan H, Des Jarlais D, Horyniak D, Degenhardt L (2011) Global epidemiology of hepatitis B and hepatitis C in people who inject drugs: results of systematic reviews. Lancet 378(9791):571–583

  3. Desmet VJ, Gerber M, Hoofnagle JH, Manns M, Scheuer PJ (1994) Classification of chronic hepatitis: diagnosis, grading and staging. Hepatology 19(6):1513–1520

  4. Castera L (2012) Noninvasive methods to assess liver disease in patients with hepatitis B or C. Gastroenterology 142(6):1293–1302

  5. Salkic NN, Jovanovic P, Hauser G, Brcic M (2014) FibroTest/Fibrosure for significant liver fibrosis and cirrhosis in chronic hepatitis B: a meta-analysis. Am J Gastroenterol 109(6):796–809

  6. Parisi L, Manaog ML (2016) Preliminary validation of the Lagrangian support vector machine learning classifier as clinical decision-making support tool to aid prediction of prognosis in patients with hepatitis. In: The 16th international conference on biomedical engineering, National University of Singapore (NUS)

  7. Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69

  8. Parisi L (2014) Exploiting kinetic and kinematic data to plot cyclograms for managing the rehabilitation process of BKAs by applying neural networks. Int J Biomed Biol Eng 8(10):664–668

  9. Parisi L (2014) Neural networks for distinguishing the performance of two hip joint implants on the basis of hip implant side and ground reaction force. Int J Med Health Biomed Bioeng Pharm Eng 8(10):659–663

  10. Parisi L, Biggs PR, Whatling GM, Holt CA (2015) A novel comparison of artificial intelligence methods for diagnosing knee osteoarthritis. In: XXV congress of the international society of biomechanics, pp 1227–1229

  11. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 1:318–362

  12. Bascil MS, Oztekin H (2012) A study on hepatitis disease diagnosis using probabilistic neural network. J Med Syst 36(3):1603–1606

  13. Nahato KB, Harichandran KN, Arputharaj K (2015) Knowledge mining from clinical datasets using rough sets and backpropagation neural network. Computational and Mathematical Methods in Medicine, 2015

  14. Kaya Y, Uyar M (2013) A hybrid decision support system based on rough set and extreme learning machine for diagnosis of hepatitis disease. Appl Soft Comput 13(8):3429–3438

  15. Çalişir D, Dogantekin E (2011) A new intelligent hepatitis diagnosis system: PCA–LSSVM. Expert Syst Appl 38(8):10705–10708

  16. Gong G (1988) Hepatitis data set. UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. University of California, School of Information and Computer Sciences, Irvine, CA

  17. Mangasarian OL, Musicant DR (2001) Lagrangian support vector machines. J Mach Learn Res 1:161–177

  18. Grunkemeier GL, Jin R (2001) Receiver operating characteristic curve analysis of clinical risk models. Ann Thorac Surg 72:323–326

  19. Parisi L, Manaog ML (2017) A minimum viable machine learning-based speech processing solution for facilitating early diagnosis of Parkinson’s disease. In: MATLAB conference 2017

  20. Parisi L, Manaog ML (2017) The importance of selecting appropriate k-fold cross-validation and training algorithms in improving postoperative discharge decision-making via artificial intelligence. In: 2017 AUT mathematical sciences symposium, 2017, 1(1), 16

  21. Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2(4):303–314

  22. Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C, Venkatesh S (2016) Guidelines for developing and reporting machine learning predictive models in biomedical research: a multidisciplinary view. J Med Internet Res 18(12):e323

  23. Stang A (2010) Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. Eur J Epidemiol 25(9):603–605

  24. Hayashi Y, Fukunaga K (2016) Accuracy of rule extraction using a recursive-rule extraction algorithm with continuous attributes combined with a sampling selection technique for the diagnosis of liver disease. Inform Med Unlocked 5:26–38

  25. Ulutasdemir N, Dagli O (2010) Evaluation of risk of death in hepatitis by rule induction algorithms. Sci Res Essays 5(20):3059–3062

  26. Yildirim P (2015) Filter based feature selection methods for prediction of risks in hepatitis disease. Int J Mach Learn Comput 5(4):258

  27. Tan KC, Teoh EJ, Yu Q, Goh KC (2009) A hybrid evolutionary algorithm for attribute selection in data mining. Expert Syst Appl 36(4):8616–8630

  28. Ansari S, Shafi I, Ansari A, Ahmad J, Shah SI (2011) Diagnosis of liver disease induced by hepatitis virus using artificial neural networks. In: 2011 IEEE 14th international multitopic conference (INMIC). IEEE, pp 8–12

  29. Dogantekin E, Dogantekin A, Avci D (2009) Automatic hepatitis diagnosis system based on linear discriminant analysis and adaptive network based on fuzzy inference system. Expert Syst Appl 36(8):11282–11286

  30. Neshat M, Sargolzaei M, Nadjaran Toosi A, Masoumi A (2012) Hepatitis disease diagnosis using hybrid case based reasoning and particle swarm optimization. ISRN Artificial Intelligence, 2012

  31. Avci D (2016) An automatic diagnosis system for hepatitis diseases based on genetic wavelet kernel extreme learning machine. J Electr Eng Technol 11(4):993–1002

  32. Patil BM, Joshi RC, Toshniwal D (2010) Effective framework for prediction of disease outcome using medical datasets: clustering and classification. Int J Comput Intell Stud 1(3):273–290

  33. Afif MH, Hedar AR, Hamid THA, Mahdy YB (2013) SS-SVM (3SVM): a new classification method for hepatitis disease diagnosis. Int J Adv Comput Sci Appl 4(2):54–58

  34. Friedrich-Rust M, Ong MF, Martens S, Sarrazin C, Bojunga J, Zeuzem S, Herrmann E (2008) Performance of transient elastography for the staging of liver fibrosis: a meta-analysis. Gastroenterology 134(4):960–974

  35. Shahangian S, LaBeau KM, Howerton DA (2006) Prothrombin time testing practices: adherence to guidelines and standards. Clin Chem 52(5):793–794

  36. Santos MS, Abreu PH, Garcia-Laencina PJ, Simao A, Carvalho A (2015) A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients. J Biomed Inform 58:49–59

Acknowledgements

The authors Luca Parisi and Narrendar RaviChandran would like to thank the University of Auckland for the opportunity to carry out their Ph.D. research projects, and the University of Auckland Rehabilitative Technologies Association (UARTA) and MedIntellego® for the opportunity to develop this collaborative research work.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Author information

Contributions

All authors directly participated in the planning, execution and analysis of the study. All authors also approved the final version of the manuscript and its submission for possible publication in Neural Computing and Applications.

Corresponding author

Correspondence to Luca Parisi.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Mean imputation to replace missing values from the initial hepatitis dataset

Mean imputation was applied to replace all missing values, which are represented by a question mark (“?”) in the initial hepatitis dataset, as per the dataset description [16]. For each of the following features, every missing value was replaced by the mean of all available instances of that feature:

  • fourth (4th) feature, “steroid”, with one (1) missing value;

  • sixth (6th) feature, “fatigue”, with one (1) missing value;

  • seventh (7th) feature, “malaise”, with one (1) missing value;

  • eighth (8th) feature, “anorexia”, with one (1) missing value;

  • ninth (9th) feature, “liver big”, with ten (10) missing values;

  • tenth (10th) feature, “liver firm”, with eleven (11) missing values;

  • eleventh (11th) feature, “spleen palpable”, with five (5) missing values;

  • twelfth (12th) feature, “spiders”, with five (5) missing values;

  • thirteenth (13th) feature, “ascites”, with five (5) missing values;

  • fourteenth (14th) feature, “varices”, with five (5) missing values;

  • fifteenth (15th) feature, “bilirubin”, with six (6) missing values;

  • sixteenth (16th) feature, “alkaline phosphate”, with twenty-nine (29) missing values;

  • seventeenth (17th) feature, “SGOT”, with four (4) missing values;

  • eighteenth (18th) feature, “albumin”, with sixteen (16) missing values;

  • nineteenth (19th) feature, “protime”, with sixty-seven (67) missing values.
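
The procedure above can be sketched in a few lines of Python. This is a minimal illustration, assuming the data are held as rows of strings with “?” marking missing entries and that every column is treated as numeric; the function name `mean_impute` and the toy values are hypothetical and are not taken from the hepatitis dataset.

```python
from statistics import fmean

MISSING = "?"  # missing-value marker, as in the dataset description


def mean_impute(rows, missing=MISSING):
    """Replace each missing entry with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [float(r[j]) for r in rows if r[j] != missing]
        # A column with no observed values cannot be imputed by its mean.
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(fmean(observed))
    # Build a new table of floats, filling gaps with the column means.
    return [
        [means[j] if r[j] == missing else float(r[j]) for j in range(n_cols)]
        for r in rows
    ]


# Toy rows with hypothetical values; each column has one missing entry.
toy = [
    ["1", "0.7", "?"],
    ["2", "?", "4.0"],
    ["?", "1.3", "3.0"],
]
imputed = mean_impute(toy)
```

Note that applying this to a binary feature, such as “steroid”, yields a fractional value for the imputed entries, which is a direct consequence of using the column mean.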

1.2 MedIntellego® Quality Assessment Scale (MQAS) for Studies on Applied Artificial Intelligence in Healthcare

Note: A maximum of two stars can be attributed for Pre-processing, Performance and Clinical Implications. A maximum of three stars can be attributed for Advantages.

  1. Rationale
    (1) Does the study have a clearly identified clinical goal?
      (a) Yes ☆
      (b) No

  2. Objective
    (1) Is it clear how the predictive task may help fulfil the clinical goal?
      (a) Yes ☆
      (b) No

  3. Classification
    (1) Is the measurement for the predictive task clearly defined?
      (a) Yes ☆
      (b) No
    (2) Is the problem correctly identified (e.g. diagnostic or prognostic)?
      (a) Yes ☆
      (b) No
    (3) Is the form of the predictive model (i.e. classification) correctly identified?
      (a) Yes ☆
      (b) No

  4. Pre-processing
    (1) Is data cleaning performed?
      (a) Yes ☆
      (b) No
    (2) Is data transformation (e.g. normalisation, standardisation) performed?
      (a) Yes ☆
      (b) No
    (3) Are outliers identified and removed?
      (a) Yes ☆
      (b) No
    (4) Are the methods used for handling missing values clearly described?
      (a) Yes ☆
      (b) No

  5. Methods
    (1) Clearly defined data sources
      (a) Which data are used ☆
      (b) Presence or lack of missing values ☆
      (c) Ethical approvals obtained or not required as data sources are correctly cited ☆
    (2) Clearly defined feature selection algorithm
      (a) Which feature selection algorithm is used ☆
    (3) Clearly defined artificial intelligence-based models
      (a) Which learning-based classifier is used ☆
      (b) Which parameters are used ☆
      (c) Which learning algorithm is used ☆
      (d) Which transfer function is used ☆

  6. Validation
    (1) Are the validation metrics (e.g. mean squared error, sensitivity, specificity and area under the receiver operating characteristic curve) clearly defined?
      (a) Yes ☆
      (b) No
    (2) Is the cross-validation set created from the initial data?
      (a) Yes ☆
      (b) No

  7. Performance
    (1) Are the results reported in confidence intervals?
      (a) Yes ☆
      (b) No
    (2) Are results compared with the literature based on confidence intervals?
      (a) Yes ☆
      (b) No
    (3) Are results compared with the literature based on accuracy and at least one performance measure (e.g. mean squared error, sensitivity, specificity, area under the receiver operating characteristic curve)?
      (a) Yes ☆
      (b) No

  8. Advantages
    (1) Are any assumed input and output data formats explicitly mentioned?
      (a) Yes ☆
      (b) No
    (2) Are there any potential pitfalls in interpreting the model?
      (a) Yes
      (b) No ☆
    (3) Is there any potential bias in the data used in the model?
      (a) Yes
      (b) No ☆
    (4) Do the findings report a clinically relevant test classification accuracy (≥ 80%)?
      (a) Yes ☆
      (b) No

  9. Clinical Translation
    (1) Is the need for balance between model accuracy and computational cost, as well as model simplicity and interpretability, discussed?
      (a) Yes ☆
      (b) No
    (2) Are any of the learning-based classifiers deployed in the study familiar or, at least, easy to understand for the end-user (e.g. physicians)?
      (a) Yes ☆
      (b) No

  10. Clinical Implications
    (1) Are any clinical implications derived from the obtained predictive performance clearly discussed?
      (a) Yes ☆
      (b) No
    (2) Is the amount of money that could be saved via a better prediction reported?
      (a) Yes ☆
      (b) No
    (3) Is it reported how many patients could benefit from a care model leveraging the model prediction? Some statistics reported in the introduction would also be fine.
      (a) Yes ☆
      (b) No
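
One way to tally an MQAS assessment is sketched below. It assumes that the overall score is the sum of starred answers per section, with the section caps taken from the note at the start of this subsection; the aggregation rule itself is an assumption, as the checklist does not state it explicitly. The item counts were counted from the starred options in the checklist, and the names `ITEMS`, `CAPS` and `mqas_score` are hypothetical.

```python
# Number of starrable items per section, counted from the checklist.
ITEMS = {
    "Rationale": 1,
    "Objective": 1,
    "Classification": 3,
    "Pre-processing": 4,
    "Methods": 8,
    "Validation": 2,
    "Performance": 3,
    "Advantages": 4,
    "Clinical Translation": 2,
    "Clinical Implications": 3,
}

# Maximum stars per section, as given in the note to the scale.
CAPS = {
    "Pre-processing": 2,
    "Performance": 2,
    "Clinical Implications": 2,
    "Advantages": 3,
}


def mqas_score(stars):
    """Sum the stars earned per section, applying the per-section caps.

    `stars` maps a section name to the number of starred answers earned;
    sections that are absent count as zero. Summation is an assumed rule.
    """
    total = 0
    for section, n_items in ITEMS.items():
        earned = stars.get(section, 0)
        if not 0 <= earned <= n_items:
            raise ValueError(f"invalid star count for {section}: {earned}")
        total += min(earned, CAPS.get(section, n_items))
    return total


# Score of a hypothetical study that earns every starred answer.
full_marks = mqas_score(dict(ITEMS))
```

Under these assumptions, capped sections contribute at most their cap regardless of how many starred answers they contain.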

1.3 Validation of the MQAS

See Table 5.

About this article

Cite this article

Parisi, L., RaviChandran, N. & Manaog, M.L. A novel hybrid algorithm for aiding prediction of prognosis in patients with hepatitis. Neural Comput & Applic 32, 3839–3852 (2020). https://doi.org/10.1007/s00521-019-04050-x

  • DOI: https://doi.org/10.1007/s00521-019-04050-x
