MICaps: Multi-instance capsule network for machine inspection of Munro's microabscess

https://doi.org/10.1016/j.compbiomed.2021.105071

Highlights

  • Attempts to solve the clinical challenge of automatic inspection of Munro's Microabscess (MM).

  • Presents a fully automated capsule network for SC layer segmentation and MM detection.

  • Weakly supervised approach for localisation of SC layer's neutrophils.

  • Reports several experiments and in-depth analysis of the proposed capsule network.

Abstract

Munro's Microabscess (MM) is the diagnostic hallmark of psoriasis. Neutrophil detection in the Stratum Corneum (SC) of the skin epidermis is an integral part of MM detection in skin biopsy. The microscopic inspection of skin biopsy is a tedious task, and staining variations in skin histopathology often hinder human ability to differentiate neutrophils from skin keratinocytes. Motivated by this, we propose a computational framework that can assist human experts and reduce potential errors in diagnosis. The framework first segments the SC layer, and multiple patches sampled from the segmented regions are then classified to detect neutrophils. Both UNet and CapsNet are explored for segmentation and classification. Experiments show that of the two choices, CapsNet, owing to its better hierarchical object representation and its localisation ability, is the better candidate for both segmentation and classification tasks; hence, we term our framework MICaps. The training algorithm explores minimisation of both Dice loss and focal loss and presents a comparative study between the two. The proposed framework is validated on our in-house dataset of 290 skin biopsy images under two experimental protocols. Under the first protocol, 3-fold cross-validation is performed to compare the current results directly with the state-of-the-art ones. Under the second, the performance of the system on a held-out data set is reported. The experimental results show that MICaps improves the state-of-the-art diagnosis performance by up to 3.27% and reduces the number of model parameters by 50%.

Introduction

Psoriasis is a chronic, immune-mediated, recurrent, inflammatory skin disease [1,2]. The prevalence of psoriasis varies from 1% to 12% among different populations worldwide [3]. On many occasions, it is difficult to discriminate psoriasis from other erythemato-squamous skin diseases as they share similar clinical symptoms [2,[4], [5], [6]]. Hence, histopathological examination is used for confirmation. In clinical pathology, Munro's Microabscess (MM) is considered the diagnostic hallmark of psoriasis [2]. MM is characterised by skin keratinocytes (parakeratosis) along with neutrophils in the SC layer of the skin epidermis. The neutrophil distribution can be either confluent (throughout the SC layer) or focal (not confluent, i.e. localised).

The microscopic inspection of tissue is error-prone, subject to human limitations, and can lead to variable diagnostic outcomes. In the MM detection task, the key source of such human error lies in the recognition of neutrophils in the SC layer. In the SC layer, skin keratinocytes appear as lightly stained, oval-shaped objects, whereas neutrophils appear as darkly stained, circular objects (see Fig. 1). However, due to over-staining, a keratinocyte is sometimes misclassified as a neutrophil and vice versa. Note that parakeratosis occurs in many erythemato-squamous diseases (e.g. Pityriasis lichenoides chronica, Pityriasis rubra pilaris, Lichenification). However, neutrophils in the SC layer occur only in psoriasis [2]. Hence, we focus on developing a computational method for the automatic detection of neutrophils in the SC layer in the presence of skin keratinocytes.

MM detection in histopathology images is an image classification problem for which Convolutional Neural Networks (CNNs) [[7], [8], [9]] are a standard approach. However, training a CNN on whole biopsy images is computationally intractable due to their high spatial dimension (1936 × 2584 pixels). Since the SC layer covers only a small fraction of the affected skin biopsy (approx. 2–15%) and only a small portion of the SC region may contain neutrophils, down-sampling the biopsy images would destroy information essential for subsequent biopsy classification. Hence, instead of training a single deep classification model, we develop a deep learning-based framework that first segments the SC layer and then analyses multiple regions sampled from it to detect neutrophils.
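The segment-then-sample strategy above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function name `sample_patches`, the 64-pixel patch size, and the 0.5 mask-coverage threshold are our own illustrative assumptions.

```python
import numpy as np

def sample_patches(mask, image, patch_size=64, stride=64):
    """Collect fixed-size patches that lie mostly inside the segmented SC mask.

    mask:  binary (H, W) segmentation of the SC layer
    image: (H, W) grey-level biopsy image (same spatial size as mask)
    """
    patches = []
    h, w = mask.shape
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            window = mask[y:y + patch_size, x:x + patch_size]
            # keep only patches whose area is mostly covered by the SC layer
            if window.mean() > 0.5:
                patches.append(image[y:y + patch_size, x:x + patch_size])
    return patches
```

Each retained patch would then be passed to the classification network, so the full-resolution detail inside the SC layer is analysed without ever down-sampling the whole biopsy image.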

The most common approach to CNN-based medical image segmentation is to classify every image pixel [[10], [11], [12]] (or super-pixel [[13], [14], [15], [16]]) separately and aggregate the results into the segmentation output. These approaches are slow because all pixels (or super-pixels) are classified separately. Hence, end-to-end CNN models, which take original images as input and pixel-wise semantic labellings as the desired output, have also been developed for segmentation [17,18] and used in several medical image segmentation tasks [13,17,[19], [20], [21]]. This way of training allows a CNN to generate the semantic labelling of all image pixels together. For object detection problems, by contrast, a CNN takes images as input and object-availability information as ground truth; in the test phase, an image is processed with the trained CNN to predict the probability of the object's presence [[22], [23], [24], [25], [26], [27], [28], [29]].

Traditional approaches to object detection suffer from two limitations. First, an object in a digital image is an aggregation of several small object parts, every object part has sub-parts, and so on. Multi-layer deep neural networks are trained to capture this underlying hierarchy, representing object parts in the lower layers and the corresponding object in the higher layers. In a traditional deep CNN, sub-sampling layers (such as max-pooling) are employed to increase the “field of view” of higher-layer neurons so that they detect higher-level features over a larger region of the input image. However, a sub-sampling layer loses the important spatial hierarchy between object parts in the lower layer and the corresponding object in the higher layer, as it destroys the spatial arrangement of the object parts in the lower layer. Hence, the network cannot produce translation-equivariant features. Second, the trained models cannot localise the desired objects in the input images. To address this issue and to find discriminative image regions, localisation approaches such as Class Activation Map (CAM) [30] and Gradient-weighted Class Activation Mapping (Grad-CAM) [31] have been proposed. However, neither is capable of exactly localising image objects.
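For context on why CAM yields only coarse localisation: it forms a heat map as the class-weighted sum of the final convolutional feature maps, whose spatial resolution has already been reduced by the sub-sampling layers. A minimal NumPy sketch of the CAM computation (function name and tensor shapes are our own illustration):

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a CAM as the class-weighted sum of the last conv layer's maps.

    feature_maps: (C, H, W) activations from the final convolutional layer
    fc_weights:   (num_classes, C) weights of the final linear classifier
    class_idx:    class for which the activation map is computed
    """
    weights = fc_weights[class_idx]                    # (C,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                           # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalise to [0, 1]
    return cam
```

The resulting (H, W) map is far smaller than the input image and must be up-sampled to overlay it, which is why such maps highlight a region rather than localise individual small objects such as neutrophils.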

The ability to detect the spatial distribution of objects makes an automated prediction system trustworthy. In many scenarios, labelling the small objects that carry diagnostic information is difficult and time-consuming. To address this, we design a CapsNet-based computational framework that can detect and locate objects in a weakly supervised manner. We term our framework MICaps: a Capsule Network-based framework that detects and localises Multiple Instances of objects in the region of interest. The proposed framework is shown to solve image analysis tasks where multiple similar, small instances must be spotted from weak labelling. We demonstrate results for neutrophil spotting in skin biopsy and MM detection.

The specific contributions of the present research in terms of the characteristics of MICaps are highlighted below:

  • Design of MICaps addresses the current challenges behind analysing the skin cells and sub-cellular structures in large spatial dimensions of histopathology images.

  • MICaps conceptualises a weakly supervised method applicable to many image analysis tasks where manual labelling of multiple smaller objects in high-dimensional images is very expensive, time-consuming, and error-prone, and hence object-level annotations may not be available.

  • Design of MICaps caters to the need for a framework for detecting multiple object instances using CapsNet.

  • This research exhibits that the object localisation ability of MICaps further facilitates the analysis of histopathological images that require spotting of objects.

  • The proposed framework advances the existing state-of-the-art on MM detection [32].

The rest of the paper is organised as follows. Section 2 discusses the components of MICaps. The experimental protocol and analysis of experimental results are given in Section 3 and Section 4 respectively. Finally, Section 5 concludes the paper.

Section snippets

Methodology

MICaps (see Fig. 2) consists of the following two modules: (i) a training module, where a segmentation network is trained for SC layer segmentation and a classification network is trained for classifying SC patches; and (ii) a test module containing three sequential steps, viz. SC layer segmentation, sampling patches from the SC layer, and finally, classifying patches based on the presence of neutrophils. For performance optimisation, the patch selection procedure should satisfy these two important

Tissue imaging

Generally, biopsy samples are imaged with digital scanners, which use advanced imaging sensors and can store multiple optically magnified versions of the tissue. However, digital scanners are costly and inaccessible in many parts of the world. Hence, to reduce the imaging cost, we use a digital camera fixed on top of an optical microscope under which the tissue specimens were kept. Images are captured with 10× optical zoom, as 10× is the maximum possible magnification that fits the entire

Stratum Corneum segmentation

In [32], for SC layer segmentation, a UNet was trained by minimising Dice loss. The advantage of a Dice loss minimisation-based segmentation network is that it can handle the data imbalance present in the SC layer segmentation task. Hence, for an exact comparison with the previous effort, we train the UCaps (architecture shown in Fig. 3) by minimising Dice loss. Further, we experiment with focal loss minimisation-based UCaps training [40]. Focal loss is a parametric loss function
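For reference, the two losses compared here can be written in a minimal NumPy form for binary masks. This is a sketch under our own assumptions: the hyperparameter defaults γ = 2 and α = 0.25 follow the original focal loss formulation [40] and are not necessarily the values used in this work.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|), robust to class imbalance."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: cross-entropy down-weighted by (1 - p_t)^gamma,
    so well-classified (easy) pixels contribute little to the gradient."""
    pred = np.clip(pred, eps, 1.0 - eps)           # avoid log(0)
    p_t = np.where(target == 1, pred, 1.0 - pred)  # probability of the true class
    a_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

Dice loss handles imbalance by scoring the overlap of the foreground directly, whereas focal loss keeps the per-pixel cross-entropy form but suppresses the contribution of easy background pixels, which is why the two are natural candidates to compare on a task where the SC layer occupies only a small fraction of each image.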

Conclusion and scope of future work

This paper presents a capsule network-based computational framework (called MICaps) to detect multiple similar, small objects in high-dimensional images. MICaps consists of two deep capsule networks: (a) a segmentation network for segmenting the region of interest (RoI) and (b) a classification network for classifying patches selected from the RoI. The experimental analysis of MICaps has been performed for Munro's Microabscess (MM) detection in skin biopsy images which successfully

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors want to acknowledge all the volunteers who participated in this study. This work is partially supported by Science and Engineering Research Board (SERB), Dept. of Science and Technology (DST), Govt. of India through Grant File No. SPR/2020/000495.

References (41)

  • K. He et al.

    Deep residual learning for image recognition

  • P. Liskowski et al.

    Segmenting retinal blood vessels with deep neural networks

    IEEE Trans. Med. Imag.

    (2016)
  • E. Nasr-Esfahani et al.

    Vessel extraction in x-ray angiograms using deep learning

  • W. Li et al.

    Gland segmentation in colon histology images using hand-crafted features and convolutional neural networks

  • A. Farag et al.

    A bottom-up approach for pancreas segmentation using cascaded superpixels and (deep) image patch labeling

    IEEE Trans. Image Process.

    (2017)
  • D. Boschetto et al.

    Superpixel-based automatic segmentation of villi in confocal endomicroscopy

  • Z. Tian et al.

    Superpixel-based segmentation for 3d prostate mr images

    IEEE Trans. Med. Imag.

    (2016)
  • O. Ronneberger et al.

    Convolutional networks for biomedical image segmentation

  • V. Badrinarayanan et al.

    Segnet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-wise Labelling

    (2015)
  • A.G. Roy et al.

    Recalibrating fully convolutional networks with spatial and channel ‘squeeze & excitation’ blocks

    IEEE Trans. Med. Imag.

    (2018)