Deep convolutional neural network and emotional learning based breast cancer detection using digital mammography

https://doi.org/10.1016/j.compbiomed.2021.104318

Highlights

  • An automatic Diverse Features based Breast Cancer Detection (DFeBCD) system to classify a mammogram as normal or abnormal.

  • The DFeBCD system performs better on features generated dynamically by a highway network-based CNN than on any of the individual sets of ad-hoc features.

  • The hybridization of the four types of features improves the performance of the system by nearly 2–3%.

Abstract

Breast cancer is one of the deadliest diseases among women. However, the risk of death is greatly reduced if it is diagnosed and treated at an early stage. Mammography is one of the reliable methods used by radiologists to detect breast cancer at its initial stage. Therefore, an automatic and secure breast cancer detection system that accurately detects abnormalities not only increases the radiologist's diagnostic confidence but also provides more objective evidence. In this work, an automatic Diverse Features based Breast Cancer Detection (DFeBCD) system is proposed to classify a mammogram as normal or abnormal. Four sets of distinct feature types are used. Among them, features based on taxonomic indexes, statistical measures and local binary patterns are static. The proposed DFeBCD dynamically extracts the fourth set of features from mammogram images using a highway-network based deep convolutional neural network (CNN). Two classifiers, Support Vector Machine (SVM) and Emotional Learning inspired Ensemble Classifier (ELiEC), are trained on these distinct features using the standard IRMA mammogram dataset. The reliability of the system performance is ensured by applying 5-fold cross-validation. Through experiments, we have observed that the DFeBCD system performs better on features generated dynamically through the highway network-based CNN than on any of the three individual sets of ad-hoc features. Furthermore, the hybridization of all four types of features improves the system's performance by nearly 2–3%. The two classifiers perform comparably on the individual sets of ad-hoc features. However, ELiEC outperforms SVM on both the hybrid and the dynamic features.

Introduction

Breast cancer has become one of the deadliest diseases among women around the world. It starts with uncontrolled cell division that results in the formation of a tumor in the breast [1]. The symptoms of breast cancer include abnormalities such as breast aches, change in breast skin color, change in dimension and shape, and formation of a breast mass. Normally, imaging techniques like ultrasound, X-rays and magnetic resonance imaging are used to analyze breast cancer. However, for early breast cancer detection, one of the most effective techniques is mammography [2], which uses low-dose X-rays for image formation. Abnormalities like calcifications and masses, and other subtle signs like architectural distortion and bilateral asymmetry, can be detected using mammography. A mass, density, nodule or distortion in the mammogram represents a potential abnormality. However, not all abnormalities are cancerous. For example, a mass (lump) with a smooth, well-defined border is normally benign. On the other hand, a mass with an irregular border and a starburst appearance (spiculated) might be cancerous, and a biopsy is needed to verify it. Micro-calcifications are small clusters of calcium deposits that might be of benign, indeterminate or suspicious nature. Most of these micro-calcification clusters are benign. However, in some cases, micro-calcifications appear in particular clusters and patterns. In that case, they indicate precancerous cells or may represent an initial stage of breast cancer, which can be verified by biopsy. To diagnose breast cancer, radiologists analyze these mammogram images. However, the opinions of radiologists regarding the presence of breast cancer may not be consistent due to differences in their prior knowledge and experience. Therefore, a machine learning based breast cancer detection system may be used to increase the radiologist's confidence and can also serve as a second opinion for the detection of breast cancer [3]. In the past two decades, machine learning algorithms have gained popularity by solving problems of complex nature, like clustering, prediction and classification [[3], [4], [5]].

Machine learning (ML) based techniques have shown exemplary performance for different image recognition problems [6]. Based on complexity, experimental data and biological description, different mathematical models [7] exist for the control [8] and prediction [9] of the glucose-insulin system. Previously, many ML-based computer-aided diagnostic systems have been successfully developed for the diabetic retinopathy problem [10,11]. In this regard, SVM is often applied to classification problems because of its strong discrimination ability [12]. To detect breast cancer at its early stage, researchers are continuously developing machine learning-based systems. These works differ mainly in the feature extraction methodology, the mammogram dataset and the machine learning model used. The problem of detecting breast cancer can be treated as a hierarchical classification problem [13,14]. First, a mammogram is classified as normal or abnormal. If it is detected as abnormal, it can be further classified as benign or malignant. Most previous work used models based on statistics, texture, or signal processing for feature extraction. The majority of the reported works used a Support Vector Machine (SVM) as the classifier and were evaluated on varying subsets of either the Digital Database for Screening Mammography (DDSM) or the database of the Mammographic Image Analysis Society (MIAS). Harefa et al. showed that using grey level co-occurrence matrices (GLCM), SVM outperformed k-Nearest Neighbour (k-NN) in detecting breast cancer abnormalities with an accuracy of 93.88% on the MIAS database [15]. De Oliveira et al. used the taxonomic distinctness index and taxonomic diversity index and achieved a maximum accuracy of 98.88% with the SVM classifier [16]. Görgel et al. used a local seed growing technique with spherical wavelet transformation in combination with an SVM classifier for the classification of mass/non-mass and reported an accuracy of 94% [17]. Hussain employed Gabor filters of various scales and directions and achieved an area under the ROC curve of 0.98 [18]. Berbar et al. exploited hybrid features based on local binary patterns and statistical measures and reported accuracies of 98.63% and 97.25% with SVM and k-NN, respectively [19]. Nithya et al. reported an accuracy of 98% using the statistical, grey level, and horizontal based features [20]. The above discussion suggests that texture-based features are well suited to achieving good breast cancer detection results with SVM.

In this work, statistical, textural and dynamic features are used, and different metrics are evaluated to quantify the classification performance. Statistical measures obtained from the image histogram are used as statistical features. To capture texture information, two types of textural features, local binary patterns (LBP) and taxonomic indices, are used. The local binary pattern offers high discriminative power and invariance against brightness for texture analysis and is widely used for texture-based image classification, while taxonomic indices are obtained from the phylogenetic tree [16]. On the other hand, dynamic features are extracted through a highway network-based Deep Convolutional Neural Network (CNN) [30]. CNNs [21] have shown effective and proven results, especially in image recognition and classification. The CNN architecture applies the convolution operator to the input, assuming that the input is spatially correlated, as in images, which allows encoding specific properties. This ability of CNNs, in turn, results in a more efficient forward function and a reduction in model parameters [22]. The layered architecture of a CNN tries to extract detailed information in its higher layers and more general information in its deeper layers [23]. These CNN architectures have attracted great attention in different areas, including cancer detection [24,25].
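To make the local binary pattern idea concrete, the following is a minimal sketch of the basic 3x3 LBP operator and its histogram descriptor in plain Python. The function names, the clockwise neighbour ordering and the 256-bin histogram are illustrative assumptions; the source does not state which LBP variant, radius or binning the DFeBCD system uses.

```python
def lbp_code(image, r, c):
    """Return the 8-bit basic LBP code for the pixel at (r, c)."""
    center = image[r][c]
    # Eight neighbours visited clockwise, starting at the top-left pixel
    # (this ordering is an assumption made for illustration).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour contributes a 1-bit when it is at least as bright
        # as the centre pixel.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over interior pixels."""
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = (rows - 2) * (cols - 2)
    return [count / total for count in hist]


# Toy 4x4 grey-level patch: a bright square on a dark background.
patch = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
features = lbp_histogram(patch)
```

Because each code depends only on comparisons between a pixel and its neighbours, adding a constant brightness offset to the whole patch leaves the histogram unchanged, which is the brightness invariance referred to above.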

The main contribution of this work is the development of an automatic Diverse Features based Breast Cancer Detection (DFeBCD) system. The proposed DFeBCD uses an emotional-learning based ensemble classifier (ELiEC) for classifying mammographic images as normal or abnormal. Four sets of distinct feature types are used; three of them are static features, based on taxonomic indexes, statistical measures, and local binary patterns. The fourth set of features is dynamically extracted from mammogram images using a highway-network based deep convolutional neural network (CNN). Two classifiers, Support Vector Machine (SVM) and Emotional Learning inspired Ensemble Classifier (ELiEC), are trained on these distinct features using the standard IRMA mammogram dataset. The reliability of the system performance is ensured by applying 5-fold cross-validation. Through experiments, we have observed that the system performs better on dynamically generated features than on any of the three individual sets of ad-hoc features. Furthermore, the hybridization of all four types of features improves the performance of the system by nearly 2–3%. All experiments in this research were performed using MATLAB on a GPU-enabled machine (GeForce GTX 1070 with compute capability 6.1) with 32 GB of RAM.
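As an illustration of the 5-fold cross-validation protocol, here is a minimal sketch of a stratified fold split in plain Python, using the class counts reported for the IRMA patches in the results section (932 normal, 1864 abnormal). The function name, the stratification and the seed are assumptions for illustration; the source does not say whether the folds were stratified, and the authors' experiments were run in MATLAB.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs with class ratios preserved."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds
        # so that every fold receives a near-equal share of each class.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Labels built from the reported class counts: 0 = normal, 1 = abnormal.
labels = [0] * 932 + [1] * 1864
splits = list(stratified_k_fold(labels))
for train, test in splits:
    print(len(train), len(test))
```

Each patch appears in exactly one test fold, so averaging a metric over the five folds uses every image for evaluation once while never testing on an image the classifier was trained on in that round.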

The commonly used symbols are given in Table 1. The rest of the paper is organized as follows. Section 2 explains the proposed DFeBCD technique. Section 3 discusses the results of the proposed methodology, and Section 4 concludes the paper.

Proposed DFeBCD technique

The major steps taken during this research are image acquisition, pre-processing, feature extraction, which includes ad-hoc and dynamic feature extraction, classification, and result evaluation and analysis. Fig. 1 shows the overall process for classification. In the following subsections, each step is explained in detail.

Results and discussion

The IRMA dataset is a well-known dataset that has been used by many researchers for the development of breast cancer detection systems. It consists of 2796 image patches of size 128 × 128 taken from mammogram images of the DDSM dataset. Among these images, 932 are from normal cases, while 1864 are from abnormal cases. First, the images are pre-processed to enhance their quality, as described in Sections 2.1 (Image acquisition) and 2.2 (Pre-processing). Then these images are split into

Conclusion

An automatic and secure breast cancer detection system that accurately detects abnormalities not only increases the radiologist's diagnostic confidence but also provides more objective evidence. In this work, an automatic Diverse Features based Breast Cancer Detection (DFeBCD) system is proposed to classify a mammogram as normal or abnormal. The proposed system uses a deep Highway-Network for extracting dynamic features. Using one of the renowned datasets, IRMA, first, the images are

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work is conducted with the support of the Higher Education Commission (HEC) of Pakistan under the Indigenous PhD Scholarship Program (PIN No. 213-59966-2EG2-143). We thank Anabia Sohail from Pattern Recognition Lab (PR-Lab) for reviewing the manuscript. We also thank Pattern Recognition Lab (PR-Lab) and Pakistan Institute of Engineering and Applied Sciences (PIEAS) for providing necessary computational resources and a healthy research environment.

References (30)

  • R.F. Mansour. Deep-learning-based automatic computer-aided diagnosis system for diabetic retinopathy. Biomed. Eng. Lett. (2018)

  • M.W. Khan et al. Fractional order Bergman's minimal model-A better representation of blood glucose-insulin system

  • M.W. Khan et al. Controller design for a fractional-order nonlinear glucose-insulin system using feedback linearization. Trans. Inst. Meas. Contr. (2020)

  • M.W. Khan et al. Sliding mode control for a fractional-order non-linear glucose-insulin system. IET Syst. Biol. (2020)

  • R.F. Mansour. Using genetic algorithm for identification of diabetic retinal exudates in digital color images. J. Intell. Learn Syst. Appl. (2012)