
Information Sciences

Volume 608, August 2022, Pages 1093-1112

Deconv-transformer (DecT): A histopathological image classification model for breast cancer based on color deconvolution and transformer architecture

https://doi.org/10.1016/j.ins.2022.06.091

Abstract

Histopathological image recognition of breast cancer is an onerous task. Although many deep learning models have achieved good results on histopathological image classification tasks, these models do not take full advantage of the staining properties of histopathological images. In this paper, we propose a novel Deconv-Transformer (DecT) network model, which incorporates color deconvolution in the form of convolution layers. The model uses a self-attention mechanism to match the independent properties of the HED channel information obtained by color deconvolution, and uses a method similar to a residual connection to fuse the information of the RGB and HED color space images, which compensates for the information lost when converting RGB images to HED images. The training process of the DecT model is divided into two stages so that the parameters of the deconvolution layer can better adapt to different types of histopathological images. We also apply color jitter during image data augmentation to reduce overfitting in training. The DecT model achieves an average accuracy of 93.02% and an F1-score of 0.9389 on the BreakHis dataset, and average accuracies of 79.06% and 81.36% on the BACH and UC datasets, respectively.

Introduction

In 2020, there were 19.3 million new cancer cases worldwide, including 2.26 million new cases of breast cancer in women, accounting for 11.7% of all new cancer cases. Breast cancer has thus become the most common cancer worldwide, followed by lung cancer (11.4%) and colorectal cancer (10.0%); in the same year, about 680,000 women died from breast cancer [1]. Breast cancer diagnosis and treatment are therefore critical to human health, especially for women. In the early stage of breast cancer, the lump is small and the symptoms are mild, so patients often fail to notice the disease and miss the best window for treatment. The cancer cells then gradually develop, invade the surrounding breast tissue, and spread to nearby lymph nodes or other organs. Extensive breast cancer screening is consequently important for the prevention of breast cancer [2].

In the process of breast cancer diagnosis, doctors can observe breast masses through X-rays. When cancer is suspected, a pathological diagnosis is further needed to examine the breast tissue and determine whether it is normal tissue, a benign lesion, or a malignant lesion. Pathological examination remains the gold standard in the diagnosis of breast cancer. Since tissues are mostly colorless and transparent, and there is little contrast between the various tissues and intercellular structures, stains must be used to make them visible. Hematoxylin-eosin (HE) staining is the most commonly used staining technique [3]: hematoxylin binds to nucleic acids and stains the nucleus dark blue or purple, while eosin adheres to proteins in the tissue, staining the cytoplasm and extracellular matrix pink. Pathologists can distinguish normal cells from cancer cells by observing the structure, size, and spatial arrangement of the nuclei as well as the tissue structure and density, and can grade the breast cancer.

Computer-aided diagnostic (CAD) systems [4] have long been a research hotspot in the medical field and an important aid for disease diagnosis. CAD mainly uses statistical methods, machine learning, image processing, and other techniques to process medical data such as numbers, text, and images to derive diagnostic predictions. It can reduce the workload of physicians and also the risk of misdiagnosis, thereby aiding diagnosis [5]. Detecting breast cancer is a challenging task because of the large number of patients, the tedious testing steps, and the time required to evaluate biopsy slides. Moreover, evaluating breast cancer tissue biopsies is not easy, with only approximately 75% agreement in diagnosis among different experts [6]. Although the development of CAD systems has contributed to advances in the medical field, the accuracy and reliability of their diagnostic predictions still need to be further improved.

In recent years, with the rapid development of deep learning in image processing, deep learning-based CAD systems have matured and been widely used in many fields [7], [8], [9], [10], [11]. Using deep learning to automatically classify histopathological images of breast cancer can reduce the time to diagnosis, increase the efficiency of breast cancer prevention and diagnosis, and improve the consistency of diagnostic results among pathologists [12]. Deep learning is a representation learning method that builds multiple levels of representation by repeatedly composing simple non-linear modules; it can thereby learn complex functions and express higher-level, more abstract features. The Convolutional Neural Network (CNN) is a feed-forward neural network that extracts multiple features of an image using different convolution kernels. It can extract high-dimensional features by increasing the depth of the model, and has the advantage of fast and accurate image classification. However, CNNs are weak at extracting global features of images, whereas the attention-based Transformer architecture [13] is stronger at global feature extraction and has been widely used for image classification tasks in recent years [14], [15], [16].
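The global interaction that distinguishes the Transformer from a CNN comes from scaled dot-product self-attention [13], in which every token (e.g. image patch embedding) attends to every other token. A minimal NumPy sketch of a single attention head follows; the dimensions and weight matrices are illustrative, not those of the DecT model:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token embeddings.

    x: (n_tokens, d_model); w_q, w_k, w_v: (d_model, d_head) projections.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (n_tokens, n_tokens)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v                                 # each token mixes all tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 32))                     # e.g. 16 patches, 32-dim each
w_q, w_k, w_v = (rng.normal(size=(32, 8)) for _ in range(3))
out = self_attention(tokens, w_q, w_k, w_v)
print(out.shape)  # (16, 8)
```

Because the attention weights couple every pair of patches, the receptive field is global from the first layer, whereas a convolution sees only its kernel-sized neighborhood.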

Both the CNN architecture and the Transformer architecture were originally developed for natural image classification tasks. Although these models can still achieve good classification results when transferred to histopathological image classification, they do not take into account that histopathological images are rendered in color by specific stains, and that the contribution of each stain can be separated from the others by color deconvolution. This is one of the biggest differences between histopathological images and natural images. In this paper, we study and exploit the staining properties of histopathological images to propose classification models suited to this type of image.

The contributions of this study are as follows:

(1) The classification effect of combining the color deconvolution with the CNN model is examined and analyzed in this study.

(2) A novel Deconv-Transformer (DecT) model and its variants are proposed in this study. The novelty is that both RGB and HED color space information are fused in the DecT model.

(3) A new training method is proposed for the DecT model and its variants. Moreover, the classification experiments are conducted on three datasets to verify the reliability of the proposed DecT model.

(4) The classification results obtained by the DecT model and its variants are demonstrated and analyzed. Moreover, the attention maps are shown.

The structure of this paper is as follows: Section 2 reviews the color deconvolution principle and deep learning. Section 3 presents the three datasets used in this paper, as well as the specific details of the proposed DecT model and its variants. Section 4 describes the training methods for all the models. Section 5 presents and analyzes the experimental results of the models. Section 6 analyzes the properties of the model and visualizes it. Section 7 summarizes this study.

Section snippets

Color deconvolution

Staining tissue cells allows different tissue structures to appear in different colors for easy observation by pathologists. The stain separation operation can isolate the effect of each stain on the tissue. According to the Beer-Lambert law, let I_0 be the intensity of the incident light of the input image, I be the intensity of the outgoing light of the output image, H be the amount of applied stain, and W be the absorption factor; then:

I = I_0 exp(-HW)
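Inverting this relation gives the stain amounts: taking the optical density OD = -log(I/I_0) turns the exponential model into the linear system OD = HW, so H = OD W^{-1}. A minimal NumPy sketch of this round trip follows; the stain matrix values are the commonly used Hematoxylin/Eosin/DAB absorption fingerprints of Ruifrok and Johnston, and the clipping threshold is illustrative:

```python
import numpy as np

# Rows of W are the RGB optical-density fingerprints of hematoxylin, eosin,
# and DAB (approximate Ruifrok-Johnston values), normalized to unit length.
W = np.array([[0.650, 0.704, 0.286],
              [0.072, 0.990, 0.105],
              [0.268, 0.570, 0.776]])
W /= np.linalg.norm(W, axis=1, keepdims=True)

def rgb_to_hed(rgb):
    """Invert Beer-Lambert I = I0*exp(-HW): H = (-log(I/I0)) W^{-1}, with I0 = 1."""
    od = -np.log(np.clip(rgb, 1e-6, 1.0))   # optical density per RGB channel
    return od @ np.linalg.inv(W)            # per-pixel stain amounts (H, E, D)

def hed_to_rgb(hed):
    """Forward Beer-Lambert model: recompose RGB from stain amounts."""
    return np.exp(-hed @ W)

rgb = np.array([[[0.5, 0.3, 0.6]]])         # one purple-ish pixel, values in (0, 1]
hed = rgb_to_hed(rgb)
assert np.allclose(hed_to_rgb(hed), rgb, atol=1e-4)   # round trip recovers the pixel
```

Because the inversion is a fixed 3x3 linear map applied in log space, it can be realized inside a network as a 1x1 convolution whose weights are initialized from W^{-1} and then fine-tuned, which is the spirit of incorporating color deconvolution as convolution layers.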


Datasets

In this section, we will briefly review three datasets used in the classification experiments, including two breast cancer tissue datasets: BreakHis and BACH, and one endometrial tissue dataset: UC. Fig. 2 shows the samples of these three datasets, and the specific details of these datasets are recorded in Table 1.

The BreakHis dataset [34] is one of the most commonly used datasets for studying the classification of breast cancer histopathological images. This dataset has four magnifications (40×, 100×, 200×, and 400×).

Data pre-processing

Before the images are fed into the network model, we perform online data augmentation, as documented in Table 3. It includes both basic data augmentation and color jitter. During the random rotation operation, the margins left after rotating the image are filled with 0-valued pixels. Samples of online data augmentation are depicted in Fig. 6. HE-stained images can show significant color variation due to differences in stains, staining conditions, preparation processes, and image acquisition
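Color jitter perturbs brightness, saturation, and channel balance so the model does not overfit to one staining appearance. A dependency-free NumPy sketch of one plausible jitter is shown below; the jitter ranges and the per-channel shift are illustrative and do not reproduce the exact settings in Table 3 (standard implementations such as torchvision's ColorJitter offer the canonical version):

```python
import numpy as np

def color_jitter(img, rng, brightness=0.2, saturation=0.2, shift=0.05):
    """Randomly perturb the color of an RGB image with values in [0, 1].

    img: (H, W, 3) float array. All jitter ranges here are illustrative.
    """
    img = img * rng.uniform(1 - brightness, 1 + brightness)   # global brightness
    gray = img.mean(axis=-1, keepdims=True)                   # per-pixel luminance
    s = rng.uniform(1 - saturation, 1 + saturation)
    img = gray + s * (img - gray)                             # saturation scale
    img = img + rng.uniform(-shift, shift, size=3)            # per-channel shift
    return np.clip(img, 0.0, 1.0)                             # keep valid range

rng = np.random.default_rng(0)
img = rng.uniform(size=(8, 8, 3))
aug = color_jitter(img, rng)
assert aug.shape == img.shape and aug.min() >= 0.0 and aug.max() <= 1.0
```

Applying such perturbations online (a fresh random draw each epoch) exposes the model to a wider range of staining appearances than the training set itself contains.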

Datasets classification results

We set the optimal values of k in DecT-HED, DecT, and DecT-conv models to 20, 40, and 5, respectively.

At least five classification experiments are conducted on the BreakHis, BACH, and UC datasets, and the classification accuracies of the models in the two training phases are then recorded. We use the ViT model to directly perform classification experiments on the RGB and HED images of these three datasets, denoted RGB-ViT and HED-ViT, respectively. The experimental results

Analysis and visualization

In this section, we focus on the reasons why the CNN model classifies HED images less accurately than RGB images in the experiments of Section 3.2, and on the main factors that allow our proposed DecT model to outperform the CNN and plain ViT models and achieve the highest classification accuracy. We also use model visualization to show the regions of the images that the model focuses on. Since the BreakHis dataset has only binary classification and the number of images is large, the trained model is

Conclusion and future work

In this paper, we pioneer the study of combining histopathological image color deconvolution with deep learning models. Based on this work, we propose the Deconv-Transformer (DecT) model for the classification of histopathological images of breast cancer. The proposed DecT model mainly uses the Transformer architecture instead of convolution layers to better match color deconvolution. In addition, we divide the training process of the proposed DecT model and its variants into two stages so that more

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research work was supported by the Natural Science Foundation of Fujian Province of China under Grant Nos. 2022J06020 and 2022J01193, and Chongqing Research Program of Basic Research and Frontier Technology under Grant No. cstc2021jcyj-msxmX0530.

References (47)

  • H.P. Chan et al., Computer-aided diagnosis in the era of deep learning, Med. Phys. (2020)
  • J.G. Elmore et al., Diagnostic concordance among pathologists interpreting breast biopsy specimens, JAMA (2015)
  • D. Wu, X. Luo, M. Shang, Y. He, G. Wang, X. Wu, A data-characteristic-aware latent factor model for web services QoS...
  • X. Luo, H. Wu, Z. Wang, J. Wang, D. Meng, A Novel Approach to Large-Scale Dynamically Weighted Directed Network...
  • M. Lin et al., Bibliometric analysis on Pythagorean fuzzy sets during 2013–2020, Int. J. Intell. Comput. Cybernet. (2020)
  • M. Lin et al., Directional correlation coefficient measures for Pythagorean fuzzy sets: their applications to medical diagnosis and cluster analysis, Complex Intell. Syst. (2021)
  • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you...
  • A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold,...
  • H. Wang, Y. Zhu, B. Green, H. Adam, A. Yuille, L.-C. Chen, Axial-deeplab: Stand-alone axial-attention for panoptic...
  • Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, B. Guo, Swin transformer: Hierarchical vision transformer...
  • A.C. Ruifrok et al., Quantification of histochemical staining by color deconvolution, Anal. Quant. Cytol. Histol. (2001)
  • A. Krizhevsky et al., Imagenet classification with deep convolutional neural networks, Adv. Neural Inform. Process. Syst. (2012)
  • K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: International...