Patch-based system for Classification of Breast Histology images using deep learning
Introduction
Breast cancer is one of the leading causes of cancer death in women worldwide (Rangayyan et al., 2007). According to a report published by the WHO in 2013 (World Health Organization, 2018), nearly five lakh (500,000) women lost their lives to this disease worldwide in 2011. In India, too, the incidence of breast cancer is rising at an alarming rate. It has now become the most common cancer among women in most Indian cities and the second most common in rural areas. In addition, most women diagnosed with breast cancer are in the younger age group (25-40 years). The risk of breast cancer (Surakasula et al., 2014) increases alarmingly until menopause, after which it decreases gradually.
Breast cancer diagnosis consists of a series of steps. Whenever a lump or nodule is discovered in a breast during clinical examination, screening tests such as mammography (Behrens et al., 2007) or ultrasound are performed to detect changes in the breast. These screening tests are followed by a biopsy to make a definite diagnosis and detect any malignant growth in the breast tissue. A biopsy enables a doctor to analyze the microscopic structure of the tissue, differentiate between normal, benign and malignant lesions, and make a prognosis accordingly. Cancer may become fatal if it is not detected early, whereas early detection can decrease the mortality rate because more treatment options are available at an early stage. Traditionally, the inspection of biopsy slides under the microscope rests on the shoulders of pathologists. This manual inspection is time-consuming and depends on the expertise of the pathologist. Developing an automated system for breast cancer detection from breast histology images is therefore the need of the hour.
Carcinomas can be divided into two classes, namely in-situ and invasive carcinomas. An in-situ carcinoma is one in which the malignant growth is restricted to the tissue in which it arose and has not spread to the surrounding tissues. In contrast, an invasive carcinoma is one in which the malignant growth has spread from its point of origin to the surrounding areas. The primary task in developing an automated system for breast cancer detection is to classify breast histology images into four classes, namely normal, benign, in situ and invasive carcinoma. In this paper, we review some recent state-of-the-art techniques for automated breast histology classification and develop a patch-based classifier (PBC) using a deep learning approach for automated classification of breast histology images.
As mentioned in the previous section, breast cancer is one of the deadliest diseases among women worldwide, and the traditional method of microscopic inspection is highly time-consuming and prone to manual errors. This motivated us to develop an automated system for classification of microscopic breast histopathological images. Most of the reported work on this problem uses handcrafted features for breast cancer classification. Deep learning approaches, however, eliminate the need to extract handcrafted features. Thus, in this work, we have developed a patch-based classifier (PBC) using a convolutional neural network (CNN) for automated breast histology image classification. The details of the work are given in Section 2.
This section describes some of the standard state-of-the-art methodologies for breast cancer detection from histopathological images. These can be broadly categorized as either handcrafted feature-based approaches or deep learning-based approaches using convolutional neural networks (CNNs).
The handcrafted feature approaches (He et al., 2012) used by most researchers are thresholding-based, clustering-based, active contour-based, watershed-based, graph-cut-based, and so on. They mainly aim at segmenting the nuclei from the entire breast cancer (BCa) histopathology slide image. Distinguishing features are then extracted from the segmented nuclei to differentiate between malignant and benign slides. In (Veta et al., 2013), a fast radial symmetry-based approach followed by marker-controlled watershed segmentation was used for nuclei extraction from breast cancer histopathology images. In their work, 39 biopsy slide images were acquired from 38 different patients. In another work (Jain et al., 2014), a Chan-Vese (CV) model-based active contour technique was implemented to segment the cells from the background in breast histopathological images. Morphological features extracted from these segmented cells were used to classify the cells as either normal or cancerous. In (Basavanhally et al., 2013), a geodesic active contour model was used for segmenting nuclei from BCa histopathological images. Both architectural and textural features of the nuclei were considered: graph-based features (architectural) and 13 Haralick features (textural) were extracted from the segmented nuclei and used to develop an automated system for detecting the Modified Bloom–Richardson (mBR) (Breast cancer and breast pathology, 2018) grade of different histopathological slides. In their work, breast histopathological images collected from 126 different patients were considered. In (Khan et al., 2013), an automated system was proposed for segmenting tumor areas in BCa histopathological images by dividing the image into hypocellular and hypercellular stroma regions using magnitude and phase spectra in the frequency domain. They worked on the MITOS dataset (MITOS Dataset, 2018), which consists of 35 breast histopathological images collected from 5 different patients. In (Roullier et al., 2016), graph-based segmentation was used to extract the mitotic nuclei from BCa histopathological whole slide images (WSI). In (Kaymak et al., 2017), an artificial neural network (ANN) based approach was used for automatic breast histology classification.
In recent years, convolutional neural networks (CNNs) have gained immense importance in breast histopathological image classification. CNNs have a major advantage over handcrafted feature extraction techniques: they extract features automatically from the image patches, and the results obtained are comparable with those of traditional feature extraction techniques. Spanhol et al. (2016) worked on the BreaKHis dataset, which consists of microscopic histopathological breast images captured at different magnifications. They developed a CNN model to classify the images as either benign or malignant. The authors reported that the accuracy of the system decreased as magnification increased, since at higher magnifications their CNN architecture failed to extract useful features. In another work (Cruz-Roa et al., 2014), a CNN model was proposed for the automatic classification of invasive ductal carcinoma in whole slide images (WSI), and hence for differentiating between invasive and non-invasive images. Both (Spanhol et al., 2016) and (Cruz-Roa et al., 2014) address a 2-class classification problem in which the classes are either benign/malignant or invasive/non-invasive. In (Wahab et al., 2017), the authors proposed a CNN model for separating mitotic from non-mitotic nuclei in breast histopathological images. In (Vang et al., 2018), the authors used the pre-trained Inception-V3 model (Szegedy et al., 2016), together with some post-processing techniques, for 4-class classification of breast cancer histopathology images. Inception-V3 is a pre-trained model that was developed to classify the images of the ImageNet database into 1000 different image classes.
However, deep learning remains little explored for breast cancer histopathological image classification. The few existing methods perform 2-class classification, that is, they classify the histopathological images into two histological classes, namely normal and malignant. None of the reported methods separates benign images from normal ones. In addition, malignancy is itself of two types: in-situ (cancer cells are limited to the region in which they arose) and invasive (cancer cells have spread from their point of origin to the surrounding tissues). Thus, to develop a fully automated system for histopathological image classification, all of these categories should be considered. Further, deep learning models eliminate the need to extract handcrafted features for automatic classification and, in most cases, outperform the results obtained with handcrafted features. The performance of automated classification systems based on handcrafted features depends mainly on the nuclei segmentation step, as in (Veta et al., 2013), (Jain et al., 2014) and (Basavanhally et al., 2013). The performance of deep learning approaches is not limited by the results of a nuclei segmentation step, since training and classification are based on the direct processing of image regions. This motivated us to develop a fully automatic system for 4-class (normal, benign, in situ and invasive carcinoma) and 2-class breast cancer histopathological image classification using deep learning.
In this work, we propose a patch-based classifier (PBC) using a CNN to classify breast histopathological images into four as well as two histopathological classes. The details of the classes and the CNN architecture deployed for this work are given in Section 2.5.
The main contributions of this paper can be summarized as follows:
• We have developed a patch-based classifier (PBC), which uses an optimal architecture of a convolutional neural network (CNN), for automated classification of breast cancer histopathology images.
• The proposed classification system works in two different modes: one patch in one decision (OPOD) and all patches in one decision (APOD). The patch labels are predicted in OPOD mode, where the result is obtained unanimously, whereas in APOD mode the class label of the image is obtained by a majority voting scheme.
• To verify the classification ability of the proposed system, the breast histopathological images are classified into 2 classes (non-malignant and malignant) as well as 4 classes (normal, benign, in situ and invasive carcinoma), whereas most existing methods classify them broadly into 2 classes.
• We have also explored the potential of the proposed model in classifying both the images of a test set obtained by splitting the training set and the actual hidden test set of the ICIAR-2018 breast cancer histology image dataset.
• Our model achieves an accuracy of 87% in classifying the images of the ICIAR-2018 hidden test dataset.
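The APOD decision named in the contributions can be illustrated with a minimal sketch. It assumes that a patch classifier has already produced one label per patch (the OPOD step). The function names, the class-name strings, the tie-breaking rule, and the grouping of normal and benign as non-malignant are assumptions made for illustration, not details confirmed by this paper.

```python
from collections import Counter
from typing import Sequence

# Hypothetical class-name strings for the four histological classes.
CLASSES = ("normal", "benign", "in situ", "invasive")


def apod_image_label(patch_labels: Sequence[str]) -> str:
    """Majority vote over the per-patch labels of one image.

    Assumption: ties are broken in favour of the label that appears
    first in patch_labels (the behaviour of Counter.most_common).
    """
    if not patch_labels:
        raise ValueError("at least one patch label is required")
    return Counter(patch_labels).most_common(1)[0][0]


def to_two_class(label: str) -> str:
    """Collapse a 4-class label to the 2-class setting.

    Assumption: in situ and invasive carcinoma count as malignant,
    normal and benign as non-malignant.
    """
    if label not in CLASSES:
        raise ValueError(f"unknown class label: {label!r}")
    return "malignant" if label in ("in situ", "invasive") else "non-malignant"
```

For example, an image whose patches were labelled benign, invasive, invasive and normal would be assigned the label invasive by this vote, and malignant in the 2-class setting.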
This paper is organized as follows. Section 2 describes in detail the materials and methodology employed in this paper. Section 3 contains the experimental results, discussion, and comparison with the state-of-the-art. Finally, Section 4 concludes the paper.
Schematic representation
Fig. 1 shows the block diagram of the entire methodology used in this work. Each part is explained in detail in the following sections.
Preprocessing
For the examination of histopathological slides, stains are used to enhance the contrast between the different histological structures, especially the nuclei and the cytoplasm, which eases their manual inspection under the microscope. The most commonly used stain in histopathological slides for their microscopic examination is the hematoxylin and eosin
Experimental Results and Discussion
In this section, we evaluate the classification performance of our proposed model in terms of sensitivity, precision, F1-score, and accuracy. We first split the training set into three parts for training, validation, and testing (details in Section 3.1) and report the performance of both patch-wise and image-wise classification on this test set. In addition, the accuracy obtained in classifying the histology images in the hidden test dataset of the challenge has
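As a rough illustration of the evaluation measures named in this section, the sketch below computes per-class sensitivity, precision and F1-score, and overall accuracy, from lists of true and predicted labels. The function names are hypothetical, and returning 0 when a denominator is zero is an assumed convention rather than something stated in this paper.

```python
from typing import Dict, Sequence


def per_class_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """One-vs-rest sensitivity, precision and F1-score for each class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    metrics = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        # Assumed convention: a metric with a zero denominator is 0.0.
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[c] = {"sensitivity": sensitivity, "precision": precision, "f1": f1}
    return metrics


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

The same functions apply to patch-wise and image-wise evaluation; only the granularity of the label lists changes.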
Conclusion
In this work, we have developed a patch-based classifier (PBC) for automatic classification of the ICIAR-2018 breast histology dataset into four classes, namely normal, benign, in situ and invasive carcinoma. The number of filters in each layer and the kernel size were adjusted so that the number of trainable parameters is less than the number of samples, in order to prevent overfitting. The proposed classifier first predicts the class label of each input patch by the OPOD technique. The whole image label
Declarations of interest
None.
Acknowledgments
The first author is grateful to the Department of Science and Technology (DST), Government of India for providing her Junior Research Fellowship (JRF) under DST INSPIRE fellowship program (IF170366).
References (34)
A review of computer-aided diagnosis of breast cancer: Toward the detection of subtle signs. Journal of the Franklin Institute (2007)
Computer assistance for MR based diagnosis of breast cancer: present and future challenges. Computerized Medical Imaging and Graphics (2007)
Histology image analysis for carcinoma detection and grading. Computer Methods and Programs in Biomedicine (2012)
et al. Breast cancer image classification using artificial neural networks. Procedia Computer Science (2017)
Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Computers in Biology and Medicine (2017)
Segmentation methods of H&E-stained histological images of lymphoma: A review. Informatics in Medicine Unlocked (2017)
et al. A survey on deep learning in medical image analysis. Medical Image Analysis (2017)
(WHO) report on “Cancer” (2018)
A comparative study of pre-and post-menopausal breast cancer: Risk factors, presentation, characteristics and management. Journal of Research in Pharmacy Practice (2014)
Automatic nuclei segmentation in H&E stained breast cancer histopathology images. PLoS ONE (2013)
Cancerous cell detection using histopathological image analysis. International Journal of Innovative Research in Computer and Communication Engineering
Multi-field-of-view framework for distinguishing tumor grade in ER+ breast cancer from entire histopathology slides. IEEE Transactions on Biomedical Engineering
HyMaP: A hybrid magnitude-phase approach to unsupervised segmentation of tumor areas in breast cancer histology images. Journal of Pathology Informatics
Multi-resolution graph-based analysis of histopathological whole slide images: Application to mitotic cell extraction and visualization. Computerized Medical Imaging and Graphics
Breast cancer histopathological image classification using convolutional neural networks, IEEE International Conference on Neural Networks (IJCNN)