Abstract
Despite recent advances in machine learning (ML)-based applications for clinical decision making, no objective method exists to help physicians identify clinically significant features for predicting the prognosis of hepatitis. An expert-based, objective feature engineering (FE) method is therefore required to complement both the physician’s subjective multi-criteria decision making and ML-based feature sets. This study proposes a novel feature selection (FS) method that blends neighbourhood component analysis (NCA) and ReliefF by averaging their outcomes. It is coupled with a Lagrangian Support Vector Machine (LSVM), used as an ML-based classifier for decision support, to aid binary classification of prognosis (survival or death) in patients with chronic hepatitis. This syncretic FS technique, which integrates the best-performing FS methods tested, yields averaged feature ranks. Clinical data on 320 patients with hepatitis were obtained from two benchmark datasets in the University of California, Irvine (UCI) repository. The hybrid algorithms resulting from statistical- and ML-based FE, applied both individually and in a syncretic manner, i.e., via a multi-expert ML system, were evaluated and compared. The proposed hybrid classifier, NCA-ReliefF-LSVM, which uses ML-based syncretic FS, achieved the highest classification performance (AUC = 0.97/F1-score = 97.51 and AUC = 0.94/F1-score = 94.57) and the lowest computational cost (1 and 2 epochs; 13 and 11.67 s, respectively) among all algorithms tested on both benchmark datasets. These results strongly support the use of ML-based syncretic FS for predicting survival in individuals affected by hepatitis.
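The rank-averaging idea behind the syncretic FS step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how ties are handled or how many features are retained, so the tie rule (mean rank), the cut-off `k`, the helper names, and the per-feature scores below are all hypothetical. In practice the two score dictionaries would come from an NCA and a ReliefF implementation.

```python
from typing import Dict, List


def rank_features(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank features by descending score (1 = most relevant).

    Tied scores share the mean of the ranks they span (an assumption).
    """
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of features tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for name in ordered[i:j + 1]:
            ranks[name] = mean_rank
        i = j + 1
    return ranks


def average_ranks(*rankings: Dict[str, float]) -> Dict[str, float]:
    """Average the rank of each feature across several FS methods."""
    features = rankings[0].keys()
    return {f: sum(r[f] for r in rankings) / len(rankings) for f in features}


def select_top_k(avg: Dict[str, float], k: int) -> List[str]:
    """Keep the k features with the lowest averaged rank (name breaks ties)."""
    return sorted(avg, key=lambda f: (avg[f], f))[:k]


# Hypothetical feature weights, standing in for NCA and ReliefF outputs.
nca_weights = {"bilirubin": 0.90, "albumin": 0.70, "age": 0.20, "sex": 0.05}
relieff_weights = {"bilirubin": 0.30, "albumin": 0.45, "age": 0.10, "sex": 0.12}

avg = average_ranks(rank_features(nca_weights), rank_features(relieff_weights))
selected = select_top_k(avg, k=2)
print(avg)
print(selected)
```

Averaging ranks rather than raw weights avoids having to put the two methods' scores on a common scale; whether the original study averages ranks or normalised weights internally is not stated in the abstract.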
Acknowledgements
The authors would like to thank the University of Auckland Rehabilitative Technologies Association (UARTA) for the opportunity to develop this collaborative research work.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Parisi, L., RaviChandran, N. Syncretic Feature Selection for Machine Learning-Aided Prognostics of Hepatitis. Neural Process Lett 54, 1009–1033 (2022). https://doi.org/10.1007/s11063-021-10668-7
DOI: https://doi.org/10.1007/s11063-021-10668-7