HRDEL: High ranking deep ensemble learning-based lung cancer diagnosis model

doi:10.1016/j.eswa.2022.118956

Expert Systems with Applications

Volume 213, Part A, 1 March 2023, 118956

Highlights

• We propose a novel intelligent method for lung cancer diagnosis.
• We propose BF-SSA for reducing the dimensionality of the features.
• We develop the HRDEL learning approach with a high-ranking process.
• We evaluate the performance of the proposed approach on two different data sets.
• The proposed approach achieves better efficiency than existing methods.

Abstract

Among all human diseases, lung cancer is known as the most hazardous, leading to death more often than other cancers. Lung cancer is asymptomatic in its early stage and is therefore difficult to detect early. However, rapid identification of lung cancer helps sustain the survival rate of patients. Hence, many researchers have developed techniques for detecting lung cancer. Recently, computer technology has been used to solve these diagnosis problems. The developed systems combine diverse deep learning and machine learning approaches with image-processing techniques to forecast the severity level of lung cancer. Accordingly, this work develops a novel intelligent method for diagnosing lung cancer. Initially, data are gathered from two benchmark datasets, which contain attribute information from various patients' health records. Two standard techniques, Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), are then used to extract features, and deep features are retrieved from the pooling layer of a Convolutional Neural Network (CNN). To choose the significant features, optimal feature selection is performed by the Best Fitness-based Squirrel Search Algorithm (BF-SSA). This hybrid optimization concept is considered superior in various domains because it explores the search space efficiently and exploits it well during feature selection. In the final phase, prediction, High Ranking Deep Ensemble Learning (HR-DEL) is applied using five detection models, and the high ranking of all the classifiers yields the final predicted output. The developed HR-DEL predicts up to 8.79% more accurately than the conventional methods and provides high robustness by reducing the dispersion, or spread, of the classification and model efficiency. The classification is performed, and the results are evaluated by comparing the performance of various algorithms.

Introduction

Cancer is a severe disease with high mortality in both women and men, owing in part to unclear clinical examinations and non-invasive treatments. The survival rate of lung cancer patients is very low compared with other cancers (Suresh & Mohan, 2020). The challenging task is to detect the nodule regions present in the soft lung tissues at an early stage of lung cancer (Wang and Charkborty, 2021, Sori et al., 2021). Computerized Tomography (CT) and Chest Radiograph (CXR) are employed to detect pulmonary nodules when screening for lung cancer (Tian et al., 2021). Further, advanced technology can scan the whole chest region within a single breath hold and ensures low-noise images (Faruqui et al., 2021). Generally, pulmonary nodules are detected by radiologists, while cancer-related decisions (Xiao & Wu, 2021) are made by oncologists. However, the nodule is small at the primary stage of cancer, so doctors need more time to check for lung cancer (Naik & Edla, 2021), even with the help of experienced radiologists. Radiologists may also offer different opinions when only slight morphological variations occur between benign and malignant nodules (Li, 2021).

To achieve an accurate diagnosis and detect all nodules, expert doctors and radiologists must be involved (McWilliams et al., 2015, Yin et al., 2018). However, the symptoms of lung cancer appear only after it has reached a severe stage, where it is impossible to cure. This is due to two vital issues: the long time needed for a complete diagnosis of lung cancer and the lack of recognition of lung cancer symptoms at the early stage (Lee, 2018). The time taken to diagnose and treat lung cancer needs to be reduced to raise the probability of curing it (Masood, 2020). Hence, lung cancer screening with better identification approaches is the most important step for improving patient health (Cirujeda, 2016). On the other hand, cancer detection from lung images can produce false negatives and false positives, which cause anxiety, additional investigation, and cost for patients, and an additional burden for doctors. The recent development of computer technology in lung cancer detection (Ozdemir et al., May 2020, Pang et al., 2020, He, June 2020) attains better detection accuracy that surpasses human performance.

Deep learning techniques achieve high efficiency in natural image recognition and detection, and they have been extended to diverse medical imaging modalities and problems (Wu et al., 2019, Chen et al., 2020, Liu et al., 2013). Here, the deep CNN model has been highly successful in different medical image processing tasks compared with conventional methods (Feng et al., 2021, Yang Jian and Zhou Yikai, 2004, Xing and Kejing, 1999). Especially in medical imaging, the convolutional network attains accuracy and sensitivity superior to those of human experts (Shuji Sakai et al., 2006). A 3D-CNN was then introduced for lung cancer detection and achieved better detection performance (Ramami, 2020), but it is restricted by limitations such as the time and memory complexity of the network. Conventional CNN-based lung cancer detection models have not exceeded the 90% performance level needed for practical applications. Therefore, a new lung cancer detection model is required that processes lung images deeply using enhanced deep learning approaches.

The goal of this research is a lung cancer diagnosis model that deploys an RNN to generate an ensemble learning model based on the proposed BF-SSA, in which five sets of features are assigned to each classification stage for lung cancer classification. This ensemble-based classification improves the classification accuracy for correctly diagnosing lung cancer, which helps save lives through early recognition.

The remainder of this paper is organized as follows. Part II explains the related works and their challenging issues. Part III depicts the initial process of the proposed lung cancer diagnosis model. The feature extraction process is explored in Part IV. Part V describes the lung cancer classification and the improved algorithm. Part VI discusses the simulation results. Part VII summarizes the suggested lung cancer diagnosis model.

Section snippets

Related work

In 2020, Suresh and Mohan (Suresh & Mohan, 2020) implemented an automated way of extracting self-learned features with the help of an end-to-end learning CNN, and the results obtained from the proposed model were compared with the performance of a traditional computer-based system. The input images were acquired from standard datasets comprising 1018 cases. The images then underwent a pre-processing stage to effectively segment the required region. Here, the

Proposed model and description

Deep learning techniques have been improved to assist in detecting the severity of lung cancer. These methods help control the progression of the disease by identifying it at an early stage so that suitable measures can be taken. However, these techniques are used individually, which makes it difficult to achieve reliable and more standardized outcomes. Hence, there is a need for an ensemble learning approach to provide advanced features for

PCA features

The proposed lung cancer diagnosis model uses PCA (Cirujeda, 2016) to extract features from the input data $D_{i}^{in}$. PCA is an unsupervised feature extraction approach that reduces the dimensionality of large data. The input data $D_{i}^{in}$ is used to identify the informative features for lung cancer diagnosis and forms the input matrix $MRX$, where the mean value of each of the $p$ variables is calculated by Eq. (1).

$\bar{a}_{v} = \frac{1}{r} \sum_{u = 1}^{r} a_{uv}$ (1)

Here, $a$ refers to the variable in terms of total
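
As a rough illustration of Eq. (1) and the PCA step, the following Python sketch computes the per-variable mean and projects mean-centred data onto its leading principal components. It assumes that $r$ is the number of observations (rows) and that the columns are the $p$ variables; the function names and the toy matrix are hypothetical, and the eigen-decomposition route is a standard PCA formulation rather than the authors' MATLAB implementation.

```python
import numpy as np


def column_means(matrix: np.ndarray) -> np.ndarray:
    """Eq. (1): mean of each variable v over the r observations (rows)."""
    r = matrix.shape[0]
    return matrix.sum(axis=0) / r


def pca_features(matrix: np.ndarray, n_components: int) -> np.ndarray:
    """Project mean-centred data onto its top principal components."""
    centred = matrix - column_means(matrix)
    # Covariance of the p variables; rowvar=False treats rows as observations.
    cov = np.cov(centred, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    return centred @ eigvecs[:, order]


# Hypothetical toy input: r = 4 patient records, p = 3 attributes.
toy = np.array([[1.0, 2.0, 3.0],
                [2.0, 4.0, 1.0],
                [3.0, 6.0, 2.0],
                [4.0, 8.0, 6.0]])
means = column_means(toy)
features = pca_features(toy, n_components=2)
```

Here `means` holds one value per attribute and `features` has one row per record and one column per retained component.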

Proposed HR-DEL model

The recommended model employs an RNN to generate the ensemble learning model using the suggested BF-SSA, where the five sets of features are given in each classification stage for classifying lung cancer. This ensemble-based classification improves the classification accuracy for correctly detecting lung cancer, which helps save lives in time. The first stage of RNN-based classification uses the features extracted by PCA, $FE_{a}^{pca}$, for
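
The exact high-ranking rule is not given in this excerpt. As one plausible reading, the sketch below fuses the label predictions of five classifiers by picking, for each sample, the label that ranks highest by vote count. The function name, the tie-breaking rule (smallest label wins), and the toy predictions are assumptions for illustration only, not the authors' procedure.

```python
from collections import Counter
from typing import List, Sequence


def high_rank_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Fuse per-classifier label lists by taking the top-ranked label per sample."""
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict the same number of samples")
    fused = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        # Rank labels by vote count; on a tie, prefer the smaller label (assumption).
        label, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
        fused.append(label)
    return fused


# Hypothetical outputs of five classifiers on four samples (1 = cancer, 0 = normal).
toy_predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]
fused = high_rank_vote(toy_predictions)
```

With an odd number of binary classifiers, as in this toy case, ties cannot occur, so the tie-breaking rule only matters for multi-class or even-sized ensembles.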

Experimental setup

The proposed lung cancer diagnosis model was implemented in MATLAB 2020a, and the results were evaluated. The population size was set to 10 and the maximum number of iterations to 25. The proposed BF-SSA-HR-DEL was compared with several optimization algorithms, namely PSO-HR-DEL (Wang & Tan, 2018), SSA-HR-DEL (Nagarajan & Dinesh Babu, 2021), SA-SLnO-HR-DEL (Mohammad & Masadeh, 2018) and SA-SLnO-RNN (Pradhan et al., 2022), as well as deep learning algorithms like CNN (
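
To show where the population size and iteration budget enter a wrapper-style feature selection, here is a minimal Python sketch of a generic population-based search over binary feature masks. Only the values 10 and 25 come from the setup above. The update rule (mixing with the best mask plus one random bit flip) is a simple stand-in and is not BF-SSA, and `toy_fitness` is a hypothetical objective used in place of a classifier-based score.

```python
import random

POPULATION_SIZE = 10  # from the experimental setup
MAX_ITERATIONS = 25   # from the experimental setup


def toy_fitness(mask, informative):
    """Hypothetical fitness: reward informative features, penalise mask size."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in informative)
    return hits - 0.1 * sum(mask)


def select_features(n_features, informative, seed=0):
    """Generic population-based mask search (a stand-in, not BF-SSA)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(POPULATION_SIZE)]
    best = max(population, key=lambda m: toy_fitness(m, informative))
    best_score = toy_fitness(best, informative)
    history = [best_score]
    for _ in range(MAX_ITERATIONS):
        next_population = []
        for mask in population:
            # Mix each candidate with the best mask, then flip one random bit.
            child = [b if rng.random() < 0.5 else m for m, b in zip(mask, best)]
            flip = rng.randrange(n_features)
            child[flip] = 1 - child[flip]
            # Greedy replacement: keep the child only if it is not worse.
            if toy_fitness(child, informative) >= toy_fitness(mask, informative):
                next_population.append(child)
            else:
                next_population.append(mask)
        population = next_population
        candidate = max(population, key=lambda m: toy_fitness(m, informative))
        score = toy_fitness(candidate, informative)
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)  # best fitness found so far
    return best, best_score, history


# Hypothetical problem: 8 candidate features, of which indices 0, 3 and 5 matter.
best_mask, best_score, history = select_features(8, {0, 3, 5})
```

The `history` list records the best fitness after each iteration, which is the kind of convergence trace typically used to compare optimizers under a fixed budget.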

Conclusion

This paper has implemented a novel lung cancer diagnosis model for accurately detecting lung cancer. Initially, the dataset is gathered and the collected data is subjected to the feature extraction phase, in which deep features are obtained from a CNN and the input data is also given to PCA and t-SNE; the resulting features are concatenated. Then, the concatenated features are fed into the optimal feature selection, where the optimal features are selected using the

CRediT authorship contribution statement

Kanchan Sitaram Pradhan: Conceptualization, Methodology, Software, Data curation, Writing – original draft. Priyanka Chawla: Data curation, Supervision, Validation, Conceptualization, Writing – original draft, Writing – review & editing. Rajeev Tiwari: Visualization, Investigation, Validation, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (47)

Hieu Trung Huynh et al.
Nonparametric maximum likelihood estimation using neural networks
Pattern Recognition Letters
(2020)
D. Moitra et al.
Classification of non-small cell lung cancer using one-dimensional convolutional neural network
Expert Systems with Applications
(2020)
A. Masood
Cloud-Based Automated Clinical Decision Support System for Detection and Diagnosis of Lung Cancer in Chest CT
IEEE Journal of Translational Engineering in Health and Medicine
(2020)
A. McWilliams et al.
Sex and Smoking Status Effects on the Early Detection of Early Lung Cancer in High-Risk Smokers Using an Electronic Nose
IEEE Transactions on Biomedical Engineering
(2015)
Amrita Naik et al.
Lung Nodule Classification on Computed Tomography Images Using Deep Learning
Wireless Personal Communications
(2021)
Bhandary, Abhir; Prabhu, G. Ananth; Rajinikanth, V.; Thanaraj, K. Palani; Satapathy, Suresh Chandra; Robbins, David E.;...
C. Liu et al.
Early diagnostic value of circulating MiRNA-21 in lung cancer: A meta-analysis
Tsinghua Science and Technology
(2013)
D. Valluru et al.
IoT with cloud based lung cancer diagnosis model using optimal support vector machine
Health Care Management Science
(2020)
Gayathri Nagarajan and L. D. Dhinesh Babu, “A hybrid feature selection model based on improved squirrel search algorithm...
H. Chen et al.
Decision-Making Model Based on Ensemble Method in Auxiliary Medical System for Non-Small Cell Lung Cancer
IEEE Access
(2020)
H.K. Lee
A System-Theoretic Method for Modeling, Analysis, and Improvement of Lung Cancer Diagnosis-to-Surgery Process
IEEE Transactions on Automation Science and Engineering
(2018)
Imayanmosha Wahlang, Arnab Kumar Maji, Goutam Saha, Prasun Chakrabarti, Michal Jasinski, Zbigniew Leonowicz, Elzbieta...
Ioan-Daniel Borlea, Radu-Emil Precup, Alexandra-Bianca Borlea, Daniel Iercan, “A Unified Form of Fuzzy C-Means and...
J. Wu et al.
Diagnosis and data probability decision based on non-small cell lung cancer in medical system
IEEE Access
(2019)
J. Chena et al.
A visualized classification method via t-distributed stochastic neighbor embedding and various diagnostic parameters for planetary gearbox fault identification from raw mechanical data
Sensors and Actuators
(2018)
Karadal, C. H., Kaya, M. C., Tuncer, T., Dogan, S., & Acharya, U. R. (2021). “Automated classification of remote sensing...
Luis Vogado, Flávio Araújo, Pedro Santos Neto, João Almeida, João Manuel R.S. Tavares, Rodrigo Veras, “A ensemble...
M. Li
Research on the Auxiliary Classification and Diagnosis of Lung Cancer Subtypes Based on Histopathological Images
IEEE Access
(2021)
Maleki, Negar; Zeinali, Yasser; Niaki, Seyed Taghi Akhavan (2021). A k-NN method for lung cancer prognosis with the use...
M. Nagaraju et al.
Convolution network model based leaf disease detection using augmentation techniques
Expert Systems
(2021)
Togaçar Mesut, Cömert Zafer, Ergen Burhan, “Intelligent skin cancer detection applying autoencoder, MobileNetV2 and...
Mesut Toğaçar, Burhan Ergen, Zafer Cömert, “Tumor type detection in brain MR images of the deep model developed using...
Mingxiang Feng, Xin Ye, Baishen Chen, Juncheng Zhang, Miao Lin, Haining Zhou, Meng Huang, Yanci Chen, Yunhe Zhu, Botao...

¹: ORCID: 0000-0002-8245-4748.

²: ORCID: 0000-0002-6029-4122.
