Neural network and multi-fractal dimension features for breast cancer classification from ultrasound images

https://doi.org/10.1016/j.compeleceng.2018.01.033

Abstract

Breast cancer is one of the most serious threats encountered in clinical practice. However, existing breast cancer diagnosis methods suffer from complexity, cost, dependence on human judgement, and inaccuracy. Recently, many computerized and interdisciplinary systems have been developed to avoid human errors in both quantification and diagnosis, and such systems can be further improved to make breast tumour identification more efficient. This paper presents an effort to automate the characterization of breast cancer from ultrasound images using multi-fractal dimensions and backpropagation neural networks. A total of 184 breast ultrasound images (72 abnormal (tumour) cases and 112 normal cases) were examined. Various setups were employed to achieve a good balance between the positive and negative rates of the diagnosed cases. The results showed high rates of precision (82.04%), sensitivity (79.39%), and specificity (84.75%).

Introduction

Breast cancer, first identified in Egypt in approximately 1600 BC, is one of the oldest known types of cancer [1]. However, even after intensive research over the last few decades, no solution is available to eradicate this fatal disease. Therefore, innovative methods are needed. In current medical practice, the quantification of the tumour region and the discrimination between positive and negative cases are done manually from real-time scanned images. Thus, the diagnostic procedure depends entirely on the experience of the operator and involves multiple subjective decisions. Subjective decision-making can result in inter- and intra-observer variation. Inter-observer variation is the difference between the outcomes obtained by two or more observers examining the same material. Intra-observer variation is the difference between the outcomes obtained when one observer evaluates the same material more than once. Such inconsistencies cause difficulties and errors in the diagnosis stage and consequently increase patient anxiety. To avoid human errors in both the quantification and diagnosis stages, computer-based image processing and analysis tools need to be developed.

Recent studies on cancer conducted in Iraq observed that approximately one-fourth of the registered malignancy cases were breast cancer, and that it is the leading cause of death among Iraqi women [1]. As Iraq is a low-income country, the establishment of full-featured detection systems, such as mammography and ultrasound machines, is difficult. A 2010 study noted that 143 out of 721 (19.8%) Iraqi females who presented with substantial breast masses in a screening for early discovery of breast malignancy had abnormal breast growth. Among the 90.6% of patients with recognized lumps, only 32% were treated in the first month, while 16% consulted a specialist 1 year later [1]. Mammography and ultrasound machines are accessible in the main hospitals in every city in Iraq. However, due to financial difficulties, it is impossible for specialists to use mammography and ultrasound to screen every Iraqi woman.

Breast cases can be classified into two categories, normal and abnormal, and the abnormal (tumour) category can be divided into two classes: benign (non-harmful) and malignant (cancerous). Benign tumours are not injurious to health; their cells closely resemble normal cells. Benign tumours grow relatively slowly and do not invade the adjacent tissues or spread to other parts of the body [2]. Images of a breast tumour provide oncologists with a full description of the disease, such as the size, shape, and area of spread, along with the location of the tumour. Formerly, mammography, which uses X-rays to capture images of the breast, was the most commonly used screening method. This method is hazardous because patients are exposed to a high amount of radiation during each screening, which can lead to leukaemia or other long-term diseases [3]. Ultrasound scanning is a safer screening alternative. It exposes body parts to high-frequency sound waves to create images of the internal structure of the human body and, unlike mammography, does not use ionizing radiation. Ultrasound scanning also produces images with a relatively high resolution [4]. Mammogram images are less clear and the tumour is often poorly defined, whereas an ultrasound image clearly shows the tumour and surrounding tissues (Fig. 1).

A biopsy is a traditional way of detecting breast cancer. While the ultrasound method detects breast abnormalities, a biopsy uses pathological investigation (by removing a sample of breast tissue) to determine whether the tumour is benign or malignant. In modern medical practice, several biopsy methods are available, chosen according to the location, size, appearance, and characteristics of the abnormality: fine-needle and core-needle biopsy, vacuum-assisted biopsy, large-core biopsy, and open surgical biopsy.

A risk factor is anything that increases the chance of developing a disease such as cancer [5], [6]. Different types of cancer have their own risk factors. For instance, sunlight is a risk factor for skin cancer, and smoking is a risk factor for various types of malignancies. Having a risk factor does not imply that one will be affected by cancer; many individuals with risk factors never develop a malignancy. Scientific studies have discovered several risk factors that make an individual more prone to breast cancer. Well-known machine learning (ML) algorithms, such as genetic algorithms, neural networks, support vector machines, and clustering analysis, are the most popular approaches for finding risk factors. Since many studies are available in this domain, the current study does not aim to present the technical aspects of these algorithms. The majority of IDSs consider only a single algorithm and solve a specific problem. However, recent studies have reported the use of structured frameworks involving ML algorithms, such as committees of machines, in which more than one algorithm works together to solve a given problem [7], [8].

Over recent decades, many researchers have tried to find the optimal solution for enhancing breast cancer diagnosis; however, they have only determined approximate solutions that differ in efficiency depending on the search space. In [9], several improved neural network (NN) training algorithms were used to classify breast tumour datasets. In addition, the NN ensemble concept was employed by changing the initialization of each NN-based technique to improve performance. The outcomes demonstrated that the NN ensemble-based classification methods performed better than the single NN-based methods, and that the best training method for classifying the dataset was RBP. Moreover, different ensemble algorithms were applied to enhance other classification methodologies (for example, SVM), and different training techniques were used to combine the outcomes obtained from different classifiers.

A recent study [10] presented learning classifier approaches for mammogram image recognition. That paper addressed the primary challenge of dealing with a large amount of information in feature-based image classification techniques. Discrete wavelet transform elements and local binary pattern components are large in number, so it is difficult to apply them successfully in learning classifier approaches. As a solution, a point-by-point distance-based approach was used to decrease the number of input features. Finally, five condition sets of different features and distances were created to show the development of generalization in classifier rules.

In another study [11], a novel automatic system for recognizing early breast cancer by analysing breast thermograms was presented. The proposed system consisted of three stages: segmentation of the breast region from the image, feature extraction, and classification. In the segmentation stage, the background region was removed by Otsu's thresholding technique followed by a reconstruction method. Next, the inframammary fold was identified to mark the lower limit of the breast, and the upper limit was found by detecting the axilla. The breast area was then extracted using these two limits. Various kinds of features were extracted from the IFI for the diagnosis of BCD. Finally, a feed-forward ANN with a gradient descent training rule was used as the classifier. The main issue in this study was the limited number of publicly accessible BCD databases. A total of 306 breast thermograms from 102 patients were collected from the Visual Lab, and the accuracy, sensitivity, and specificity were recorded as 90.48%, 87.6%, and 89.73%, respectively.

In another study [12], automatic breast segmentation of mammographic images was proposed. Thresholding and morphological pre-processing were used to separate the background region from the breast region and to remove radiopaque artifacts and labels. To validate the technique, a dataset of 322 mammographic images containing high-intensity rectangular labels was used. All square high-intensity labels were removed with an accuracy of 99.06%. The experimental results showed that the approach accurately segmented the breast region in a wide range of digitized mammograms covering all density classes.

An earlier work [13] presented a fuzzy-based decision-making approach for the analysis of breast tumours. In that study, an artificial intelligence procedure (a fuzzy structure) was adopted as a decision-support tool for the diagnosis of breast tumours. The fuzzy inference framework predicted the breast tumour and, in addition, provided basic and significant conditions for the determination of breast cancer. The proposed technique handled different inputs to effectively address uncertainty in the diagnosis stage.
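The segmentation stage described for [11] relies on Otsu's thresholding. As an illustration only, the following is a minimal NumPy sketch of the standard Otsu criterion (choose the grey level that maximises the between-class variance). The function name `otsu_threshold`, the 8-bit input assumption, and the synthetic test image are assumptions made for this example; they are not taken from [11] or from the present study.

```python
import numpy as np


def otsu_threshold(image: np.ndarray) -> int:
    """Return the grey level (0..255) that maximises between-class variance.

    Assumes `image` is an 8-bit (uint8) greyscale array.
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)           # probability of the "dark" class up to each level
    mu = np.cumsum(prob * levels)     # cumulative mean intensity up to each level
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0         # levels where one class is empty carry no information
    return int(np.argmax(sigma_b))


if __name__ == "__main__":
    # Synthetic image (hypothetical data): dark background with one bright square.
    img = np.full((64, 64), 20, dtype=np.uint8)
    img[24:40, 24:40] = 200
    t = otsu_threshold(img)
    foreground = img > t              # boolean mask of pixels brighter than the threshold
    print("threshold:", t, "foreground pixels:", int(foreground.sum()))
```

On real ultrasound or thermogram data the histogram is rarely this cleanly bimodal, so a threshold of this kind would normally be followed by a clean-up step such as the reconstruction method mentioned above.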

In [14], the diagnosis of breast tumours by GLSM was presented. The evolutionary feature selection and decision tree classification algorithms help radiologists distinguish benign cases from malignant ones in their visual diagnosis. Breast images could be classified using the GLSM features with an accuracy of 96.58%.

Another study [15] compared the performance of several clustering algorithms for the diagnosis of breast tumours. Five clustering methods available in the Weka tool (hierarchical clustering, farthest first, LVQ, canopy, and DBSCAN) were compared, and the results demonstrated that the farthest first clustering technique had the highest prediction accuracy (72%).

In another study [16], a CAD-based PCA method was applied to choose the most informative features (shape, texture, and histogram) for the digitized MG. The CAD framework automatically recognized the ROI in an unknown MG image. ROC analysis confirmed better performance in the identification of MCs and masses. The proposed method exhibited its best performance, with an accuracy of 94.4%, in the classification of the lesions (MCs and masses) when concentrating on only one kind of lesion.

Breast tumours are also diagnosed based on human experience [17], a method that is time-consuming and prone to human error. The cited study used an intelligent multi-objective classifier and a multilayer perceptron (MLP) neural network with a differential evolution (DE) method to diagnose breast tumours. The DE method was used to tackle the multi-objective optimization problem by tuning the parameters of the MLP neural network. Moreover, it optimized the number of hidden nodes in the hidden layer of the MLP neural network and decreased the network error rate by employing multi-objective differential evolution. The results showed that the proposed intelligent multi-objective classifier was suitable for breast tumour identification.
In another interesting work [18], the ANFIS (Adaptive Neuro-Fuzzy Inference System), a neuro-fuzzy model, was applied to soft tissue force modelling using a haptic interface. The data required for training the neuro-fuzzy system were taken from a breast tissue simulation model created in ANSYS 12.0. In the validation session, the numerical data were compared with the test data, and an average error of less than 3% was found. In the testing session, a root mean square error of approximately 0.02 (N) was obtained, and thus the model showed a high level of accuracy. The presented model used approximately 100 times that of the required continuous feature for the force modelling simulations.

In a separate study [19], a neural network (NN) was introduced for the diagnosis of breast tumours. An NN can be trained on large amounts of data and can learn hidden features of tumour cases. Consequently, the case studies of a particular specialist can be used to train a neural system for malignancy determination. An ANN-based diagnostic tool for breast tumours was created in two phases. First, the well-known Wisconsin Breast Cancer Database (WDBC) was used to improve the diagnostic tool. Then, supervised training of the NN using backpropagation was carried out. Two variations of backpropagation were explored with three and four layers of NN. The tested technique was applied to biopsy slide images of breast tissues, and the dataset comprised case studies of breast tumour cases. This work exhibited an effective NN-based decision-making system for tumour examination using biopsy slides.

In another study [20], object detection, recognition, and classification of digital optical images of human breast cells were applied to separate normal and abnormal (malignant) cells. The method depended on two types of feature vectors. The first vector included statistical features, such as the mean, standard deviation, mode, and median; the second vector consisted of Euclidean geometric parameters, such as the object border, region, and infill coefficient. All elements of the feature vectors were chosen to reflect the statistical attributes and geometric structure of the cells. The detection stage incorporated a segmentation technique based on an adaptive image thresholding process that was sensitive to local ranges in pixel intensity. The decision criteria depended on the use of fuzzy logic and the membership function hypothesis. Specifically, a method for creating and extracting the information needed to build the membership function was presented in that study.
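Several of the studies above, like the present one, train a feed-forward network with backpropagation. The sketch below shows that training rule on a toy problem: one hidden layer, sigmoid activations, full-batch gradient descent on half the mean squared error. It is an illustration under stated assumptions, not the network used in this paper or in [19]; the layer size, learning rate, epoch count, and the synthetic two-class data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, hidden=8, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer network with backpropagation.

    Returns the learned parameters and the loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W1 = rng.normal(0.0, 0.5, size=(n_features, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1).astype(float)
    losses = []
    for _ in range(epochs):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        losses.append(float(0.5 * np.mean((p - target) ** 2)))
        # Backward pass: chain rule through the output and hidden sigmoids.
        d_out = (p - target) * p * (1.0 - p)
        d_hid = (d_out @ W2.T) * h * (1.0 - h)
        # Gradient-descent update, averaged over the batch.
        W2 -= lr * (h.T @ d_out) / n_samples
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / n_samples
        b1 -= lr * d_hid.mean(axis=0)
    return (W1, b1, W2, b2), losses


def predict(params, X):
    """Return 0/1 class labels by thresholding the network output at 0.5."""
    W1, b1, W2, b2 = params
    p = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return (p >= 0.5).astype(int).ravel()


if __name__ == "__main__":
    # Hypothetical feature vectors: two well-separated clusters, 3 features each.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-1.0, 0.3, size=(40, 3)),
                   rng.normal(1.0, 0.3, size=(40, 3))])
    y = np.array([0] * 40 + [1] * 40)
    params, losses = train_mlp(X, y, epochs=2000)
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("training accuracy:", (predict(params, X) == y).mean())
```

In a diagnosis setting the rows of `X` would be per-image feature vectors and `y` the normal/abnormal labels; evaluation should then use held-out images rather than the training set used in this toy run.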

Early detection of breast cancer plays a significant role in reducing its fatality. The aim of our research is to develop a breast cancer detection system that analyses ultrasound images of breast scans using an ANN [21], [22]. The goal of the proposed system is to provide fast and accurate expert guidance on breast cancer diagnosis. In addition, as a training aid, it helps to reduce the gap in experience between different practitioners diagnosing breast cancer. The main goals of our study are as follows: (1) to help identify patients who have a high risk of developing breast cancer by utilizing ultrasound screening, in order to support early cancer identification and low death rates; (2) to build a precise and consistent imaging system to evaluate breast ultrasound cases; (3) to outline a proper method for the proposed breast cancer diagnosis; (4) to investigate and understand the infected region of the breast; (5) to develop a fractal extraction method that supports the implementation of the functionalities of the proposed system; and (6) to test and substantiate the performance of the system. This research aims to create a completely new image analysis method using ultrasound screening for the early detection of breast cancer.
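Goal (5) concerns fractal feature extraction. The excerpt does not specify how the multi-fractal dimensions are computed, so the following is only a sketch of the classical box-counting estimate of a single fractal dimension for a binary region: count the boxes of side s that contain at least one foreground pixel, then fit the slope of log N(s) against log(1/s). The function names and the set of box sizes are assumptions for illustration, not the authors' method.

```python
import numpy as np


def box_count(mask: np.ndarray, size: int) -> int:
    """Count size-by-size boxes that contain at least one foreground pixel."""
    h, w = mask.shape
    # Pad with background so both dimensions are multiples of the box size.
    padded = np.pad(mask, ((0, (-h) % size), (0, (-w) % size)), mode="constant")
    H, W = padded.shape
    blocks = padded.reshape(H // size, size, W // size, size)
    return int(blocks.any(axis=(1, 3)).sum())


def box_counting_dimension(mask: np.ndarray, sizes=(1, 2, 4, 8, 16, 32)) -> float:
    """Estimate the box-counting dimension of a non-empty boolean mask."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask must contain at least one foreground pixel")
    counts = np.array([box_count(mask, s) for s in sizes], dtype=float)
    # Slope of log N(s) versus log(1/s) is the dimension estimate.
    slope, _intercept = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return float(slope)


if __name__ == "__main__":
    filled = np.ones((64, 64), dtype=bool)   # a filled square behaves like a 2-D object
    line = np.zeros((64, 64), dtype=bool)
    line[0, :] = True                        # a straight line behaves like a 1-D object
    print(box_counting_dimension(filled), box_counting_dimension(line))
```

One plausible way to obtain several dimensions per image would be to repeat this estimate on masks produced at different intensity thresholds, but whether the study does so cannot be determined from the text available here.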

The rest of the paper is organized as follows. Section 2 describes the Materials and Methods and introduces our approach for identifying breast cancer, including the extraction of the multi-fractal dimension features and the neural network classification strategies. Sections 3 and 4 discuss the experimental results, including the dataset and testing protocol. Section 5 highlights the limitations and advantages. Finally, Section 6 concludes the study and outlines several possible recommendations for our approach.

Materials

Ultrasound images of breast cancer were collected from the Breast Cancer Department of the Oncology Specialist Hospital in Baghdad, Iraq. The dataset comprised a total of 184 breast ultrasound images, of which 72 were abnormal (tumour cases) and 112 were normal cases. MATLAB 2014a was used to implement the current approach. The experiments were conducted on an HP Pavilion desktop of XPS Series, Pentium (R) Core (TM) I7-4770 CPU (3.40 GHz) of 16 GB RAM, and 1 TB hard

The dataset

Every sample utilized in this experiment was a 2D B-mode ultrasound picture of a surgically extracted breast cancer and was collected from the Breast Cancer Department of the Oncology Specialist Hospital in Baghdad, Iraq. The dataset involved a total of 184 breast ultrasound images that included 72 abnormal (tumour) cases and 112 normal cases. Fig. 5 displays examples of benign and malignant cases in the database. These diverse subclasses influenced the classification outcome because of the

Results and discussion

The efficiency of our proposed strategy was assessed in three different phases:

  • Evaluation of the accuracy of the classification stage.

  • Comparison between the automatic measurements and the manual readings generated by the domain experts.

  • Assessment of the accuracy of the automatic measurements in recognizing early breast cancer for the cases that possessed a label.
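The rates quoted in the abstract (precision, sensitivity, specificity) are all derived from the four confusion-matrix counts. As a small worked sketch, the function below computes them using the standard definitions; the counts in the usage example are hypothetical and are not the study's results.

```python
def diagnostic_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic rates from confusion-matrix counts.

    tp/fn: abnormal cases classified correctly/incorrectly.
    tn/fp: normal cases classified correctly/incorrectly.
    """
    return {
        "precision": tp / (tp + fp),                    # correct among predicted abnormal
        "sensitivity": tp / (tp + fn),                  # correct among truly abnormal
        "specificity": tn / (tn + fp),                  # correct among truly normal
        "accuracy": (tp + tn) / (tp + fp + tn + fn),    # correct among all cases
    }


if __name__ == "__main__":
    # Hypothetical counts for a 20-image test set (illustration only).
    rates = diagnostic_rates(tp=8, fp=1, tn=9, fn=2)
    for name, value in rates.items():
        print(f"{name}: {value:.2%}")
```

Reporting sensitivity and specificity together, rather than accuracy alone, matters here because the two classes are unequal in size (72 abnormal versus 112 normal images).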

This subsection presents a number of experiments. The first experiment was to evaluate the effect of FD number on the

Limitations and advantages

The two main limitations of our study are as follows: First, the huge variation in tumour shape, size, and position in the images made the automatic segmentation of a breast tumour very complex, thus hindering the feature extraction process. Second, the dataset had a limited number of cases for both classes. Thus, our future work is aimed at collecting more samples for both classes.

On the other hand, our paper exhibits various merits. The breast classification process demonstrated high

Conclusion

A computerized system for classifying the region of interest could serve as a decision-support strategy for the analysis of a breast tumour. Such a computerized system is an interdisciplinary method that reduces the false negative and false positive rates and improves the accuracy, sensitivity, and specificity using image processing and learning techniques. Using ultrasound images to classify and diagnose breast tumours is a challenge. This study investigated the ultrasound image-based breast

Acknowledgment

This research was supported by the fellowship scheme (UTeM Zamalah Scheme) by Universiti Teknikal Malaysia Melaka, Malaysia.

Mazin Abed Mohammed is a Ph.D. candidate at the Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Malaysia. He received his B.Sc. in Computer Science from the College of Computer, University of Anbar, Iraq in 2008. He obtained his M.Sc. in Information Technology from the College of Graduate Studies, Universiti Tenaga Nasional, Malaysia in 2011. His current research interests include artificial intelligence and biomedical computing.

References (27)

  • M.H. Yap et al. A novel algorithm for initial lesion detection in ultrasound breast images. J Appl Clin Med Phys (2008)

  • H. Azami et al. A comparative study of breast cancer diagnosis based on neural network ensemble via improved training algorithms

  • A. Siddique et al. A comprehensive strategy for mammogram image classification using learning classifier systems


    Belal Al-Khateeb is currently an assist. prof. at the College of Computer Science and Information Technology, University of Anbar. He has published 37 refereed journal and conference papers. His current research interests include evolutionary and adaptive learning, expert systems, and heuristics and meta/hyper-heuristics. Dr. Al-Khateeb is a reviewer of nine international journals (including two IEEE Transactions) and twelve conferences.

    Ahmed N Rashid is an academic staff at the College of Computer Science and Information Technology, University of Anbar. He has published over 15 papers in various scientific journals and at international conferences. His research concerns wireless sensor networks and digital signal processing concerning the development of localization and tracking of sensor types for target detection and prototyping of multi-sensor systems.

    Dheyaa Ahmed Ibrahim received his B.Sc. in Computer Science from the College of Computer, University of Anbar, Iraq in 2009. He obtained his M.Sc. in Computer Science from the College of Computer, University of Anbar, in 2012. His current research interests include artificial intelligence, biomedical computing, medical image processing, and optimization methods.

    Mohd. Khanapi Abd. Ghani received his Ph.D. in Biomedical Computing from Coventry University, U.K. He obtained a Masters in Software Engineering from the Malaysia University of Technology (UTM) and a B.Sc. (Hons) in Computer Science from the Malaysia University of Technology (UTM). His research areas of interest include electronic healthcare systems, telemedicine, healthcare knowledge management, system architecture, and software reuse.

    Salama A. Mostafa obtained his B.Sc. in Computer Science from the University of Mosul, Iraq in 2003. He obtained his M.Sc. and Ph.D. in Information Technology from Universiti Tenaga Nasional (UNITEN), Malaysia in 2011 and 2016, respectively. His research interests are in the areas of software engineering, artificial intelligence, and their integration including software agents and intelligent autonomous systems.

    Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. G. R. Gonzalez.
