HRDEL: High ranking deep ensemble learning-based lung cancer diagnosis model

https://doi.org/10.1016/j.eswa.2022.118956

Highlights

  • We propose a novel intelligent method for lung cancer diagnosis.

  • We propose BF-SSA to reduce the dimensionality of the features.

  • We develop the HRDEL learning approach with a high-ranking process.

  • We evaluate the performance of the proposed approach on two different data sets.

  • The proposed approach is more efficient than existing methods.

Abstract

Among human diseases, lung cancer is regarded as one of the most dangerous, leading to death more often than other cancers. Lung cancer is often asymptomatic, so it is difficult to detect at an early stage, yet rapid identification helps to sustain the survival rate of patients. Many researchers have therefore developed techniques for detecting lung cancer, and computer technology has recently been applied to these diagnosis problems. Such systems combine diverse deep learning and machine learning approaches with image-processing techniques to forecast the severity level of lung cancer. This work develops a novel intelligent method for diagnosing lung cancer. Initially, data are gathered from two benchmark datasets, which include attribute information from various patients' health records. Two standard techniques, Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), are used to extract features, and deep features are retrieved from the pooling layer of a Convolutional Neural Network (CNN). To choose the significant features, optimal feature selection is performed by the Best Fitness-based Squirrel Search Algorithm (BF-SSA). This hybrid optimization concept is considered superior in various domains because it explores the search space efficiently and performs well in exploiting the feature selection. In the final phase, prediction, High Ranking Deep Ensemble Learning (HR-DEL) is applied over five forms of detection models, and the high ranking of all the classifiers yields the final predicted output. The developed HR-DEL makes predictions up to 8.79% more accurate than the conventional methods and provides high robustness by reducing the dispersion or spread of the classification and model efficiency. The classification is performed, and the results are evaluated through a performance comparison of various algorithms.

Introduction

Cancer is a severe disease with high mortality in both women and men, owing to unclear clinical examinations and non-invasive treatments. The survival rate of lung cancer patients is very low compared with that of other cancers (Suresh & Mohan, 2020). The challenging task is to detect the nodule regions present in the soft lung tissues at an early stage of lung cancer (Wang and Charkborty, 2021, Sori et al., 2021). Computerized Tomography (CT) and Chest Radiograph (CXR) are employed to detect pulmonary nodules for lung cancer diagnosis (Tian et al., 2021). Further, advanced technology is used to scan the whole chest region within a single breath hold and ensures low-noise images (Faruqui et al., 2021). Generally, pulmonary nodules are detected by radiologists, while cancer-related decisions (Xiao & Wu, 2021) are made by oncologists. However, the nodule is small at the primary stage of cancer, so the doctor needs more time to check for lung cancer (Naik & Edla, 2021), even with the help of experienced radiologists. Radiologists also offer differing opinions when only slight variations in morphology separate benign and malignant nodules (Li, 2021).

Achieving an accurate diagnosis and detecting all nodules requires expert doctors and radiologists (McWilliams et al., 2015, Yin et al., 2018). However, the symptoms of lung cancer usually appear only after the disease has become severe, at which point it is impossible to cure. This is due to two vital issues: the long time needed for a complete diagnosis of lung cancer and the failure to recognize lung cancer symptoms at an early stage (Lee, 2018). The time taken to diagnose and treat lung cancer needs to be reduced to raise the probability of curing it (Masood, 2020). Hence, screening with better identification approaches is the most important step for improving patient health (Cirujeda, 2016). On the other hand, cancer detection using lung images leads to false negatives and false positives, which cause anxiety, additional investigation, and cost for patients, and an additional burden for doctors. The recent development of computer technology in lung cancer detection (Ozdemir et al., May 2020, Pang et al., 2020, He, June 2020) yields detection accuracy that surpasses human performance.

Different deep learning techniques achieve better efficiency in natural image recognition and detection, and they have also been extended to diverse medical imaging modalities and problems (Wu et al., 2019, Chen et al., 2020, Liu et al., 2013). Here, the deep CNN model has achieved great success in different medical image processing tasks when compared with conventional methods (Feng et al., 2021, Yang Jian and Zhou Yikai, 2004, Xing and Kejing, 1999). In the field of medical imaging in particular, the convolutional network attains accuracy and sensitivity superior to those of human experts (Shuji Sakai et al., 2006). A 3D-CNN was then introduced for detecting lung cancer and achieved better detection performance (Ramami, 2020), but it is restricted by limitations such as the time and memory complexity of the network. Conventional CNN-based lung cancer detection models have not succeeded in performing above the 90 % level in practical applications. Therefore, a new lung cancer detection model is needed that processes lung images deeply using enhanced deep learning approaches.

The goal of the research discussed in this paper is a lung cancer diagnosis model that deploys RNN to generate an ensemble learning model based on the proposed BF-SSA, in which five sets of features are assigned to each classification stage for lung cancer classification. This ensemble-based classification is used to improve the classification accuracy for correctly diagnosing lung cancer, which helps save lives by recognizing it early.

The remainder of this paper is organized as follows. Part II explains the related works and their challenging issues. Part III depicts the initial process of the proposed lung cancer diagnosis model. The feature extraction process is explored in Part IV. Part V describes the lung cancer classification and the improved algorithm. Part VI discusses the simulation results. Part VII summarizes the suggested lung cancer diagnosis model.

Section snippets

Related work

In 2020, Suresh and Mohan (Suresh & Mohan, 2020) implemented an automated way of extracting self-learned features with the help of an end-to-end learning CNN, where the results obtained from the proposed model were compared with the performance of a traditional computer-based system. The input images were acquired from standard datasets comprising 1018 cases. The images then underwent a pre-processing stage to segment the required region effectively. Here, the

Proposed model and description

Improvements have been made with deep learning techniques to assist in detecting the severity of lung cancer. These methods help control the spread of the disease by identifying it at an early stage so that suitable measures can be taken. However, these techniques are used individually, which makes it difficult to achieve reliable and more standardized outcomes. Hence, there is a need for an ensemble learning approach to provide advanced features for

PCA features

The proposed lung cancer diagnosis model utilizes PCA (Cirujeda, 2016) for extracting features from the input data Diin. PCA is an unsupervised feature extraction approach for reducing the dimensionality of large data. The input data Diin is given for identifying the informative features for lung cancer diagnosis. The input data forms the input matrix MRX, where the mean value of the p variables is calculated by Eq. (1).

$$\bar{a}_v = \frac{1}{r} \sum_{u=1}^{r} a_{uv} \tag{1}$$

Here, a refers to the variable in terms of total
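The mean in Eq. (1) and the centering step that PCA builds on can be illustrated with a minimal sketch. It assumes the input matrix stores the r samples as rows and the variables as columns; that layout, the function names, and the toy values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def column_means(data):
    """Mean of every variable (column) over the r samples (rows), as in Eq. (1)."""
    data = np.asarray(data, dtype=float)
    r = data.shape[0]  # number of samples (assumed to be the rows)
    return data.sum(axis=0) / r


def pca_project(data, n_components):
    """Project mean-centered data onto its leading principal directions."""
    data = np.asarray(data, dtype=float)
    centered = data - column_means(data)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical toy records: 3 samples, 2 attributes.
toy = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
means = column_means(toy)
projected = pca_project(toy, n_components=1)
```

Centering with the per-variable mean before the decomposition is what makes the projected features zero-mean, regardless of how many components are kept.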

Proposed HR-DEL model

The recommended model employs RNN to generate the ensemble learning model using the suggested BF-SSA, where the five sets of features are given in each classification stage for classifying lung cancer. This ensemble-based classification is performed to improve the classification accuracy for correctly detecting lung cancer, which helps save lives through timely diagnosis. The first stage of RNN-based classification utilizes the extracted features from PCA FEapca for
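The snippet does not spell out how the outputs of the five classification stages are ranked. As a rough sketch only, the code below assumes that "high ranking" means choosing, for each sample, the label predicted by the largest number of stages (a plain majority vote). The function name `high_rank_vote`, the tie-breaking rule, and the toy labels are hypothetical.

```python
from collections import Counter


def high_rank_vote(stage_predictions):
    """Combine per-stage label lists by picking the most frequent label per sample.

    Assumption: the top-ranked label is the one predicted by the most stages.
    Ties are broken in favor of the earliest stage that predicted a tied label.
    """
    if not stage_predictions:
        raise ValueError("at least one stage is required")
    n_samples = len(stage_predictions[0])
    if any(len(p) != n_samples for p in stage_predictions):
        raise ValueError("all stages must predict the same number of samples")

    final = []
    for votes in zip(*stage_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in stage order, that reaches the highest count.
        final.append(next(label for label in votes if counts[label] == top))
    return final


# Hypothetical outputs of five stages on four samples (1 = cancer, 0 = no cancer).
stage_outputs = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
final_labels = high_rank_vote(stage_outputs)
```

With an odd number of stages and two classes, ties cannot occur, so the tie-breaking rule only matters if the scheme is extended to more classes.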

Experimental setup

The proposed lung cancer diagnosis model was implemented in MATLAB 2020a, and the results were evaluated. The population size was set to 10, and a maximum of 25 iterations was performed. The proposed BF-SSA-HR-DEL was compared with several optimization algorithms such as PSO-HR-DEL (Wang & Tan, 2018), SSA-HR-DEL (Nagarajan & Dinesh Babu, 2021), SA-SLnO-HR-DEL (Mohammad & Masadeh, 2018) and SA-SLnO-RNN (Pradhan et al., 2022) as well as deep learning algorithms like CNN (

Conclusion

This paper has implemented a novel lung cancer diagnosis model for accurately detecting lung cancer. Initially, the dataset is gathered and the collected data is subjected to the feature extraction phase, in which deep features are obtained from the CNN and the input data is also given to PCA and t-SNE to obtain the concatenated features. Then, the concatenated features are fed into the optimal feature selection, where the optimal features are selected using the

CRediT authorship contribution statement

Kanchan Sitaram Pradhan: Conceptualization, Methodology, Software, Data curation, Writing – original draft. Priyanka Chawla: Data curation, Supervision, Validation, Conceptualization, Writing – original draft, Writing – review & editing. Rajeev Tiwari: Visualization, Investigation, Validation, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (47)

  • Hieu Trung Huynh et al.

    Nonparametric maximum likelihood estimation using neural networks

    Pattern Recognition Letters

    (2020)
  • D. Moitra et al.

    Classification of non-small cell lung cancer using one-dimensional convolutional neural network

    Expert Systems with Applications

    (2020)
  • A. Masood

    Cloud-Based Automated Clinical Decision Support System for Detection and Diagnosis of Lung Cancer in Chest CT

    IEEE Journal of Translational Engineering in Health and Medicine

    (2020)
  • A. McWilliams et al.

    Sex and Smoking Status Effects on the Early Detection of Early Lung Cancer in High-Risk Smokers Using an Electronic Nose

    IEEE Transactions on Biomedical Engineering

    (2015)
  • Amrita Naik et al.

    Lung Nodule Classification on Computed Tomography Images Using Deep Learning

    Wireless Personal Communications

    (2021)
  • Bhandary, Abhir; Prabhu, G. Ananth; Rajinikanth, V.; Thanaraj, K. Palani; Satapathy, Suresh Chandra; Robbins, David E.;...
  • C. Liu et al.

    Early diagnostic value of circulating MiRNA-21 in lung cancer: A meta-analysis

    Tsinghua Science and Technology

    (2013)
  • D. Valluru et al.

    IoT with cloud based lung cancer diagnosis model using optimal support vector machine

    Health Care Management Science

    (2020)
  • Gayathri Nagarajan and L. D. Dhinesh Babu,“A hybrid feature selection model based on improved squirrel search algorithm...
  • H. Chen et al.

    Decision-Making Model Based on Ensemble Method in Auxiliary Medical System for Non-Small Cell Lung Cancer

    IEEE Access

    (2020)
  • H.K. Lee

    A System-Theoretic Method for Modeling, Analysis, and Improvement of Lung Cancer Diagnosis-to-Surgery Process

    IEEE Transactions on Automation Science and Engineering

    (2018)
  • Imayanmosha Wahlang, Arnab Kumar Maji, Goutam Saha, Prasun Chakrabarti, Michal Jasinski, Zbigniew Leonowicz, Elzbieta...
  • Ioan-Daniel Borlea, Radu-Emil Precup, Alexandra-Bianca Borlea, Daniel Iercan, “A Unified Form of Fuzzy C-Means and...
  • J. Wu et al.

    Diagnosis and data probability decision based on non-small cell lung cancer in medical system

    IEEE Access

    (2019)
  • J. Chena et al.

    A visualized classification method via t-distributed stochastic neighbor embedding and various diagnostic parameters for planetary gearbox fault identification from raw mechanical data

    Sensors and Actuators

    (2018)
  • Karadal, C. H., Kaya, M. C., Tuncer, T., Dogan, S., & Acharya, U. R. (2021).”Automated classification of remote sensing...
  • Luis Vogado, Flávio Araújo, Pedro Santos Neto, João Almeida, João Manuel R.S. Tavares, Rodrigo Veras, “A ensemble...
  • M. Li

    Research on the Auxiliary Classification and Diagnosis of Lung Cancer Subtypes Based on Histopathological Images

    IEEE Access

    (2021)
  • Maleki, Negar; Zeinali, Yasser; Niaki, Seyed Taghi Akhavan (2021). A k-NN method for lung cancer prognosis with the use...
  • M. Nagaraju et al.

    Convolution network model based leaf disease detection using augmentation techniques

    Expert Systems

    (2021)
  • Togaçar Mesut, Cömert Zafer, Ergen Burhan, “Intelligent skin cancer detection applying autoencoder, MobileNetV2 and...
  • Mesut Toğaçar, Burhan Ergen, Zafer Cömert “Tumor type detection in brain MR images of the deep model developed using...
  • Mingxiang Feng, Xin Ye, Baishen Chen, Juncheng Zhang, Miao Lin, Haining Zhou, Meng Huang, Yanci Chen, Yunhe Zhu, Botao...
    1

    ORCID: 0000-0002-8245-4748.

    2

    ORCID: 0000-0002-6029-4122.
