Elsevier

Applied Soft Computing

Volume 109, September 2021, 107533

Breast cancer diagnosis using thermal image analysis: A data-driven approach based on swarm intelligence and supervised learning for optimized feature selection

https://doi.org/10.1016/j.asoc.2021.107533

Highlights

  • Painless detection of breast lesions helps prevent breast cancer from progressing in women.

  • The extraction of features from breast thermography images can provide enough information for breast cancer detection.

  • Feature selection using multi-objective algorithms based on swarm intelligence can improve classification tasks.

  • Multi-Objective Binary Fish School Search (MOBFSS) is efficient at feature selection on a complex real-world problem.

  • Shape and texture moments are both important for breast cancer diagnosis.

Abstract

Breast cancer is one of the deadliest forms of cancer in women, but the disease has a good prognosis when diagnosed early. The gold standard for diagnosing breast cancer is mammography, but acquiring mammograms involves breast compression and is a painful and embarrassing procedure for women. Alternative methods have been investigated in recent years, including breast thermography, which involves no ionizing radiation, pain, or contact with the breast. The accuracy of these newer techniques still needs to improve before they can be widely used in practice; machine learning techniques, however, have increased accuracy and reduced false positives and false negatives in the analysis of breast thermograms. We propose a methodology for detecting and classifying breast lesions using a database of real images of Brazilian patients. We divide our methodology into three steps. In the first step, the shape and texture characteristics of breast thermograms are extracted using Zernike and Haralick moments. The second step is the feature selection process using multi-objective binary optimization algorithms based on swarm intelligence. In the third step, we assess the classification of the best feature vectors using eleven algorithms, including Convolutional Neural Networks, Extreme Learning Machines, and Support Vector Machines. Finally, we discuss the computational time and performance of various techniques based on swarm intelligence, artificial neural networks, and statistical models, with the aim of improving the computational time and accuracy of breast cancer diagnoses. We observe that feature selection decreased computational time and has high potential to improve diagnostic accuracy. We also demonstrate that features describing the shape of breast lesions are highly important for high diagnostic accuracy.

Introduction

Breast cancer is one of the deadliest diseases for women worldwide. It accounts for 25% of all new cancer diagnoses in women. In 2012, nearly 1.7 million new cases of breast cancer were diagnosed [1], [2]. Survival rates vary across the world but are improving overall. In countries with advanced care, the survival rate is 85% for those diagnosed at the first stage and 24% if the diagnosis occurs at a later stage [1], [2].

Most new diagnoses and deaths from breast cancer occur in developing rather than developed countries [3], [4]. The larger number of cases in developing countries is due in part to their larger share of the world’s population. However, rates in these countries have also increased steadily in recent decades [1], [2]. In global statistics, Brazil appears as one of the countries with the highest incidence and death rates due to breast cancer. More than 66,000 new cases of breast cancer are estimated for 2020 alone [5].

Breast cancer is currently the leading cause of cancer-related deaths in women in the developing regions of the world. The incidence of breast cancer, or the number of cases per 100,000 women, is still lower in developing countries than in the West, but the death rates from the disease are higher in developing countries. This pattern can be associated with late diagnosis and poor access to treatment [1], [2], [3], [4]. The rate of breast cancer per 100,000 women is higher in the US, Canada, and Europe than in developing countries, but the mortality rates are markedly lower. In developed countries, more breast cancer cases are detected at an early stage, when cure is more likely, and more women can receive treatment. Nevertheless, breast cancer is the second leading cause of cancer-related deaths in women in developed countries, just after lung cancer [1], [2].

Nowadays, the most efficient and popular methods in mastology are mammography, ultrasound, magnetic resonance, and clinical breast examination. They still have many limitations, including the fact that some early-stage breast lesions cannot be detected [6]. The chance of early identification of breast lesions is smaller in young women than in older women. The younger the woman is, the harder it is to find small lesions in her breast images due to breast tissue composition. In fact, only older women can undergo some of the exams, because they use a high dose of ionizing radiation. Many research groups are working on different ways to reduce the radiation dose delivered to the patient. However, reducing the number of people exposed to this radiation may be a more effective way to mitigate health damage.

Some groups started using infrared thermography as a screening technique, and it was first applied to mastology in 1982. In fact, it was not recommended for breast diagnosis at the time due to the lack of proper information extracted from these images. However, technological improvements in infrared cameras and image analysis allowed the development of tools to detect breast changes and lesions in images. As a consequence, thermography became a complementary screening test in mastology circa 2012 [7], [8].

Accelerated metabolic activity tends to increase the temperature of the breast, so lesions such as cancers and locations where angiogenesis is occurring can be seen in thermograms. These highly metabolic tissues appear in the images as hotter points [6]. Since the increase in temperature mainly occurs at the breast surface, the low penetration capacity of infrared has not been considered a limitation of breast thermograms [6], [9], [10]. Thus, infrared thermography creates a surface temperature map that provides enough physiological information to identify anomalies in the breast such as cancer [6].

Because physiological changes usually precede anatomical changes, mammary thermography can capture changes in the breast even during the early stages of a tumor [6]. The technique also involves no physical contact, radiation, or compression [6]. Therefore, thermograms can be used for women of all ages, including pregnant and lactating women [6], [9], [10].

The diagnosis of breast cancer based on imaging performed by human specialists presents limited accuracy. Diagnostic support tools based on digital image analysis with computational intelligence are therefore quite useful for maximizing accuracy and sensitivity and for minimizing false positives and false negatives. The potential for improving diagnostic accuracy through intelligent and cheap support tools is even more relevant to poor or underdeveloped countries, where there is a lack of experts and infrastructure.

Recent works have shown that it is possible to obtain useful diagnostic results by representing thermograms and mammograms in terms of shape and texture attributes [9], [10], [11], [12]. These characteristics are commonly represented by the Haralick and Zernike moments. However, the feature vectors that describe these thermographic images in terms of shape and texture suffer from the curse of dimensionality. A feature selection step is therefore needed to obtain an efficient classification, since it reduces memory usage and processing demand. Our strategy relies on reducing the number of attributes without reducing the classification accuracy. One can achieve this by selecting the most relevant characteristics with metaheuristic algorithms.
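The trade-off between the number of attributes and the classification error can be illustrated with a small Pareto-dominance filter. This is only a minimal sketch, not the MOBFSS algorithm: the binary masks, the error values, and the helper names (`objectives`, `dominates`, `pareto_front`) are hypothetical and chosen for illustration, and in practice the error would come from training a classifier on the selected features.

```python
from typing import List, Sequence, Tuple

# (number of selected features, classification error); both are minimized.
Objective = Tuple[int, float]


def objectives(mask: Sequence[int], error: float) -> Objective:
    """Pair the size of a binary feature mask with an externally measured error."""
    return (sum(mask), error)


def dominates(a: Objective, b: Objective) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objective]) -> List[Objective]:
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidates: each mask marks which of four features are kept.
# The error values are invented for illustration, not results from the paper.
candidates = [
    objectives([1, 1, 1, 1], 0.10),
    objectives([1, 0, 1, 0], 0.12),
    objectives([1, 0, 0, 0], 0.30),
    objectives([1, 1, 1, 0], 0.15),
]
front = pareto_front(candidates)
```

Here the three-feature candidate is dropped because the two-feature candidate uses fewer attributes and also has a lower error; the remaining candidates represent different compromises between the two objectives.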

In this work, we propose a method for the detection and classification of breast lesions using a database of real images of Brazilian patients. We divide our method into three steps. The first step is the extraction of features from breast thermography images [9]. The shape and texture characteristics of breast thermograms are extracted using the moments of Zernike and Haralick. The second step is the feature selection process using multi-objective algorithms based on swarm intelligence that minimize the computational cost while minimizing the classification error. After that, we assess different classifiers to further compare the accuracy of the diagnosis, aiming to define an efficient method. Using WEKA [13] and Keras [14], we test the diagnosis of eleven classifiers: BayesNet, NaiveBayes, MLP, SMO/SVM, J48, Random Forest, Random Tree, CNN, ELM sigmoid, mELM dilatation and mELM erosion. Then, using the best results from the classification and the feature selection steps, we analyze the three main metrics in the literature: specificity, sensitivity, and area under the ROC curve (AUC). Thus, this paper presents alternatives to improve breast cancer diagnosis by minimizing the computational cost while maximizing accuracy and robustness. Even though our results show the performance of a static methodology, we recommend using it in a dynamic process: as new images and features become available, the system can be recalibrated to further improve its efficiency. Feature and classifier selection are key to improving breast cancer diagnosis, and the shape characteristics play an important role in this improvement.
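The three metrics named above can be computed from labels and classifier scores as in the following minimal sketch. It is not the evaluation code used with WEKA or Keras; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration. The AUC is computed with the pairwise ranking definition: the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as one half.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, with 1 = lesion present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes the ranking quality of the scores across all thresholds, which is one reason the three are usually reported together.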

We organize the article as follows: Section 2 describes the related works in breast cancer diagnosis; Section 3 explains the methods, the theoretical and software tools, and the data sets; Section 4 presents the experimental results; we discuss these results in Section 5; finally, we give the general conclusions in Section 6.

Related works

Ng et al. [15] trained a backpropagation neural network to identify benign or malignant lesions from a database of 200 breast thermography images. Four different methods of extracting features were attempted. The neural network used in this study was configured to have a learning rate of 0.5, with a momentum equal to 0.4 and a sigmoid activation function. In their study, only one classifier configuration was used. Since the authors’ goal was to assess the different sets of features, they
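To make the role of the learning rate and momentum concrete, the sketch below trains a single sigmoid unit with a momentum-based gradient update using the two values quoted above (0.5 and 0.4). It is a simplified illustration, not the network of Ng et al.: the single-neuron model, the squared-error loss, the toy OR-style data, and the `train` helper are all assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

LEARNING_RATE = 0.5  # value quoted in the text
MOMENTUM = 0.4       # value quoted in the text


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: Sequence[Tuple[Sequence[float], float]],
          epochs: int = 200, seed: int = 0) -> Tuple[List[float], List[float]]:
    """Train one sigmoid unit; return (weights incl. bias, loss per epoch)."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]  # last entry is the bias
    velocity = [0.0] * (n_inputs + 1)
    history = []
    for _ in range(epochs):
        total_loss = 0.0
        for x, target in samples:
            inputs = list(x) + [1.0]  # constant input for the bias
            out = sigmoid(sum(w * v for w, v in zip(weights, inputs)))
            error = out - target
            total_loss += 0.5 * error * error
            delta = error * out * (1.0 - out)  # gradient through the sigmoid
            for i, v in enumerate(inputs):
                # Momentum update: reuse part of the previous step, then add the new gradient step.
                velocity[i] = MOMENTUM * velocity[i] - LEARNING_RATE * delta * v
                weights[i] += velocity[i]
        history.append(total_loss)
    return weights, history


# Toy, linearly separable data (logical OR), invented for illustration.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
weights, history = train(data)
```

The momentum term carries over a fraction of the previous weight change, which smooths the updates; a full backpropagation network applies the same update rule layer by layer.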

Material and methods

In this section, we present a brief context of breast thermography, the theoretical and software tools deployed in the system, and the database used to validate the proposal. Some previous results using the multi-objective binary optimization swarm-based algorithm for feature selection are presented and discussed.

Results

Our results cover two main contributions. First, we analyze the performance of several versions of MOBFSS on the selection of features extracted from breast thermographic images to the breast cancer diagnosis. Applying feature selection, we mainly decrease the computational cost and time of delivering the diagnosis. Second, we analyze the performance of several classifiers for the breast cancer diagnosis using the best feature vectors found from the feature selection step. After having a

Discussion

In this paper, we apply a methodology of three main steps to improve and analyze breast cancer diagnosis based on breast thermography images. Breast cancer diagnosis using mammography imaging is currently the best available procedure, with the highest efficiency. However, mammography imaging is a painful procedure for women, involves ionizing radiation, should not be performed on pregnant and lactating women, and does not accurately detect early stages of breast cancer, especially in

Conclusions

The breast cancer diagnosis can be less painful using a noninvasive device and better detected by new methodologies. In this paper, we contributed with further analyses regarding extracted features from Thermography Breast images from Brazilian patients. Our methodology discussed the usage of multi-objective fish school search algorithms for feature selection of extracted features from the thermography images. Moreover, we compared the performance of ten different classifiers in the diagnosis

CRediT authorship contribution statement

Mariana Macedo: Conceptualization, Methodology, Software, Validation, Investigation, Visualization, Writing - original, Review & editing. Maira Santana: Conceptualization, Methodology, Software, Data curation, Visualization, Writing - original, Review & editing. Wellington P. dos Santos: Conceptualization, Supervision, Funding acquisition, Writing - original, Review & editing. Ronaldo Menezes: Conceptualization, Supervision, Writing - original, Review & editing. Carmelo Bastos-Filho:

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We are grateful to the Brazilian research-funding agencies CAPES, CNPq and Facepe for the partial support of this research.

References (46)

  • Etehadtavakol M. et al.

    Breast thermography as a potential non-contact method in the early detection of cancer: a review

    J. Mech. Med. Biol.

    (2013)

  • Walker D. et al.

    Breast thermography: history, theory, and use. Is this screening tool adequate for standalone use

    Nat. Med. J.

    (2012)

  • Milosevic M. et al.

    Comparative analysis of breast cancer detection in mammograms and thermograms

    Biomed. Eng. / Biomed. Tech.

    (2015)

  • Santana M.A.d. et al.

    Breast cancer diagnosis based on mammary thermography and extreme learning machines

    Res. Biomed. Eng.

    (2018)

  • de Vasconcelos J.H. et al.

    Investigations on statistical classification methods for use in breast thermography

    IEEE Latin Amer. Trans.

    (2018)

  • Azevedo W.W. et al.

    Fuzzy morphological extreme learning machines to detect and classify masses in mammograms

  • Cordeiro F. et al.

    Segmentation of mammography by applying GrowCut for mass detection

    Stud. Health Technol. Inform.

    (2013)

  • Frank E. et al.

    Weka - a machine learning workbench for data mining

  • Chollet F.

    Keras

    (2015)

  • Ng E.-K. et al.

    Computerized detection of breast cancer with artificial intelligence and thermograms

    J. Med. Eng. Technol.

    (2002)

  • Raghavendra U. et al.

    An integrated index for breast cancer identification using histogram of oriented gradient and kernel locality preserving projection features extracted from thermograms

    Quant. InfraRed Thermogr. J.

    (2016)

  • Fernández-Ovies F.J. et al.

    Detection of breast cancer using infrared thermography and deep neural networks

  • Oliveira M.M.d.

    Desenvolvimento de protocolo e construção de um aparato mecânico para padronização da aquisição de imagens termográficas de mama [dissertation]

    (2012)