Elsevier

Neurocomputing

Volume 351, 25 July 2019, Pages 77-86
Deeply supervised full convolution network for HEp-2 specimen image segmentation

https://doi.org/10.1016/j.neucom.2019.03.067

Abstract

Human Epithelial-2 (HEp-2) cell images play an important role in the detection of antinuclear autoantibodies (ANA) in autoimmune diseases. Segmentation is the primary step for classification and for further treatment and diagnosis. However, HEp-2 specimen images vary widely in staining pattern and scale, which still makes segmentation a challenging task. To address this, we propose a novel deeply supervised full convolutional network (DSFCN) for robust segmentation of diverse HEp-2 cell image datasets. DSFCN is based on a very deep network that integrates dense deconvolution layers (DDL) and a hierarchical supervision (HS) structure. Specifically, the DDL replaces the traditional deconvolution layer, using up-sampling to restore the high resolution of the original input images, and hierarchical supervision is added to capture feature information from the shallow layers. The high-resolution predictive output is obtained by capturing local and global information between layers. Without relying on prior knowledge or complex post-processing, DSFCN can be effectively trained in an end-to-end manner. The proposed method is trained and tested on the public I3A-2014 dataset, and the segmentation results demonstrate that our model outperforms other state-of-the-art methods.

Introduction

Indirect immunofluorescence (IIF) on HEp-2 cells is a commonly used technique for detecting antinuclear antibodies (ANA), which can be visualized via a fluorescence microscope. Segmenting HEp-2 specimen images is indispensable in daily clinical practice, as it improves the efficiency of computer-aided diagnosis and detection. However, manual analysis of a large number of IIF images still has limitations: it requires extensive clinical experience, is time-consuming, and suffers from inter-observer variability among doctors. As a result, subjective readings and inter-laboratory variability restrict the true expression of the reading results [1]. To address these limitations, a number of automatic and robust HEp-2 cell classification models have been proposed in recent years [2], [3], [4]. In these methods, segmentation is the first step of HEp-2 cell image classification, since accurate segmentation results benefit the subsequent classification processing [3], [4].

Intensity thresholding is one of the most popular preliminary approaches for cell segmentation. Perner et al. proposed a HEp-2 cell image segmentation method based on Otsu thresholding [5]. Jiang et al. proposed a novel approach based on the framework of verification-based multi-threshold probing for HEp-2 cell image segmentation [6]. Many studies have targeted the segmentation of HEp-2 cells [7], [8] because of the large variance in appearance among different HEp-2 cell categories. However, most previous works achieved accurate segmentation only for images containing a certain pattern of cells, and failed to produce good results when different staining patterns were provided. Examples of the staining patterns of HEp-2 specimen images are illustrated in Fig. 1. There is therefore still room to improve the robustness of HEp-2 specimen image segmentation methods.
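Otsu's method, mentioned above, selects the intensity threshold that maximizes the between-class variance of the image histogram. A minimal NumPy sketch is given below; it illustrates the technique only and is not the implementation used in [5]:

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the intensity threshold maximising between-class variance
    (Otsu's method), as used for foreground/background cell separation."""
    hist, bin_edges = np.histogram(image, bins=nbins)
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    weight1 = np.cumsum(hist)                       # pixels at or below each bin
    weight2 = np.cumsum(hist[::-1])[::-1]           # pixels at or above each bin
    mean1 = np.cumsum(hist * centers) / np.maximum(weight1, 1)
    mean2 = (np.cumsum((hist * centers)[::-1])
             / np.maximum(np.cumsum(hist[::-1]), 1))[::-1]
    # Between-class variance for every candidate split point
    variance = weight1[:-1] * weight2[1:] * (mean1[:-1] - mean2[1:]) ** 2
    return centers[:-1][np.argmax(variance)]

# Bimodal toy "image": dim background around 10, bright cells around 200
rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(10, 2, 500), rng.normal(200, 5, 100)])
t = otsu_threshold(img)
print(10 < t < 200)  # the threshold falls between the two modes
```

For a real specimen image, the binary mask `image > t` would give the initial foreground estimate that later methods refine.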

To improve the segmentation performance on HEp-2 specimen images, a method with strong feature representations is highly desirable. In recent years, deep convolutional neural networks (CNNs) have attracted wide attention due to their impressive performance in various image processing tasks [9], [10], [11]. The fully convolutional network (FCN), one of the most representative segmentation models [12], [13], extends the traditional CNN. The main idea behind the FCN model is to adapt classification networks (AlexNet [14], VGG net [15], GoogLeNet [16], and ResNet [17]) to the segmentation task by transforming the final classifier layers into deconvolutional layers. In fact, fusing information from deeper network layers can further improve segmentation performance. Hence, the fully convolutional residual network (FCRN), built on a deeper residual network (ResNet), was proposed [18]. However, the repeated pooling and striding in a deep fully convolutional network reduce the resolution of the output feature maps. This causes a loss of information that is highly undesirable for the segmentation of medical images.
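The resolution loss described above can be illustrated with a toy example: a 2×2 max pooling halves the feature map, and a nearest-neighbour up-sampling (the simplest stand-in for a learned deconvolution layer) restores the spatial size but not the lost detail. Names and shapes here are purely illustrative:

```python
import numpy as np

def max_pool_2x(x):
    """2x2 max pooling with stride 2: halves spatial resolution,
    mimicking the downsampling that accumulates in deep FCNs."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x(x):
    """Nearest-neighbour up-sampling: restores the spatial size,
    but the detail discarded by pooling cannot be recovered."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

feat = np.arange(16, dtype=float).reshape(4, 4)
low_res = max_pool_2x(feat)      # 4x4 -> 2x2: three of every four values are dropped
restored = upsample_2x(low_res)  # 2x2 -> 4x4: resolution returns, detail does not
print(low_res.shape, restored.shape)
```

In a real FCN the up-sampling weights are learned, but the information bottleneck shown here is the same, which motivates the skip-connection designs discussed next.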

To tackle this problem, a lightweight neural network called U-Net was proposed [19]. The U-Net architecture consists of a contracting path that captures context and a symmetric expanding path that enables precise localization. However, the network struggles to capture the edge information of some HEp-2 specimen patterns. To better express the image information, Isola et al. explored generative adversarial networks (GANs) in the conditional setting and proposed the pix2pix framework based on U-Net [20]. This architecture makes conditional GANs suitable for image-to-image translation tasks, where an input image is fed into the network and a corresponding output image is generated. With adversarial learning, the network can learn rich edge information. Nevertheless, feature information is easily lost in the skip connections. To overcome this limitation, we adopt the dense deconvolution layer (DDL) structure in this paper. This idea has been proposed in recent years and has achieved considerable segmentation performance [21], [22], [23], [24], [25].
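The difference between summation skips (FCN/ResNet style) and concatenation skips (U-Net/dense style) can be sketched as follows; the channel counts and random feature values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder and decoder feature maps at the same spatial
# size (channels-first: C x H x W); shapes are illustrative only.
encoder_feat = rng.standard_normal((32, 64, 64))
decoder_feat = rng.standard_normal((32, 64, 64))

# Summation skip: channels stay at 32, so the two feature sets are
# blended immediately and per-channel detail can be diluted.
summed = encoder_feat + decoder_feat

# Concatenation skip: both feature sets are kept intact and left
# for subsequent convolutions to fuse adaptively.
concatenated = np.concatenate([encoder_feat, decoder_feat], axis=0)

print(summed.shape, concatenated.shape)
```

Concatenation doubles the channel count, so a following 1×1 or 3×3 convolution is typically used to fuse and compress the stacked features.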

The DDL consists of a series of skip connections between earlier and later layers that concatenate features instead of performing a summation operation. This architecture also alleviates the vanishing-gradient problem effectively. Inspired by these previous works, we propose a novel end-to-end deeply supervised fully convolutional network (DSFCN), which utilizes DDLs without requiring prior knowledge or post-processing. Improved loss functions are introduced at two lateral output layers to optimize the output feature maps, so that hierarchical supervision (HS) across the network depth is fully exploited. Our proposed DSFCN framework is able to learn rich hierarchical features and captures local and global contextual information effectively. Given its strong performance, the proposed approach can be regarded as a general technique for image segmentation. In summary, the main contributions of this paper are three-fold:

  • We propose a novel end-to-end deep learning framework based on FCN. Due to the DDL structure, the network can learn rich boundary information for HEp-2 specimen images.

  • An improved HS mechanism is added to the network, which can optimize the output feature maps.

  • Experimental results demonstrate that the proposed method achieves the state-of-the-art segmentation performance on the I3A-2014 dataset.
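The hierarchical supervision in the second contribution can be sketched as a combined loss: the fused output and each lateral output are compared against the ground truth, and the side losses are added with weights so that shallow layers receive a direct gradient signal. The function names, weights, and toy shapes below are illustrative assumptions, not the paper's exact loss:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy, averaged over the map."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def deep_supervised_loss(side_outputs, fused_output, target, side_weights):
    """Total loss = fused-output loss + weighted sum of lateral-output
    losses, giving shallow layers direct supervision."""
    loss = bce(fused_output, target)
    for w, side in zip(side_weights, side_outputs):
        loss += w * bce(side, target)
    return loss

rng = np.random.default_rng(1)
target = (rng.random((16, 16)) > 0.5).astype(float)
sides = [rng.random((16, 16)) for _ in range(2)]  # two lateral outputs, as in the paper
fused = rng.random((16, 16))
total = deep_supervised_loss(sides, fused, target, side_weights=[0.5, 0.5])
print(total > 0)
```

In training, all three predictions are produced by one forward pass and the summed loss is back-propagated through the whole network.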

The rest of this paper is organized as follows. The related work is presented in Section 2. Section 3 introduces the proposed network framework in detail. The experiment settings and comparison results are illustrated in Sections 4 and 5. Sections 6 and 7 are dedicated to discussions and conclusions, respectively.

Section snippets

Image segmentation

Image segmentation is a fundamental prerequisite for many other visual tasks, such as visual tracking [26], [27], classification [28], [29], [30], [31], [32], [33], [34], detection [35], [36], [37] and cropping [38]. Recently, many outstanding image segmentation methods have been proposed [39], [40], [41]. For example, Shen et al. proposed a novel method to optimize a higher-order energy with appearance entropy by transforming the higher-order energy function to a lower-order one at a local

Methodology

Our proposed deeply supervised full convolution network (DSFCN) consists of an adaptive convolution unit, DDLs, skip connection units and HS. The architecture of the proposed model is illustrated in Fig. 2. Similar to FCN, the adaptive convolution unit in DSFCN is used to adjust the weight parameters. We use the DDL to optimize and fuse the feature maps and generate a higher-resolution image. Thus, DSFCN can sharpen object boundaries in an end-to-end way. Here, we adopt DDL instead of the original
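The coarse-to-fine fusion performed by the DDL can be sketched as repeatedly up-sampling the running fused map and concatenating it with the next finer feature map, rather than summing. Channel counts and pyramid shapes below are illustrative, not the network's actual dimensions:

```python
import numpy as np

def upsample2(x):
    """Nearest-neighbour 2x up-sampling of a C x H x W feature map."""
    return np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)

def dense_deconv_fuse(feature_pyramid):
    """Walk the pyramid coarse-to-fine: up-sample the running fused map
    and concatenate it with the next finer feature map, so every scale's
    features survive to the final high-resolution output."""
    fused = feature_pyramid[0]  # coarsest map: C x H x W
    for finer in feature_pyramid[1:]:
        fused = np.concatenate([upsample2(fused), finer], axis=0)
    return fused

rng = np.random.default_rng(2)
pyramid = [rng.standard_normal((8, 8, 8)),     # coarse
           rng.standard_normal((8, 16, 16)),   # middle
           rng.standard_normal((8, 32, 32))]   # fine
out = dense_deconv_fuse(pyramid)
print(out.shape)  # channels accumulate as maps are concatenated
```

In the actual network, a convolution would follow each concatenation to compress the accumulated channels; the sketch only shows the dense fusion pattern.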

Database

Our experiments are conducted on the public I3A-2014 dataset. The I3A dataset was first released in the fluorescence-image-based cell classification contest organized at ICIP 2013 [54], and later used in the contest organized at ICPR 2014. The dataset records 252 specimens from seven categories: Homogeneous (53), Speckled (52), Nucleolar (50), Centromere (51), Golgi (10), Nuclear membrane (21), and Mitotic spindle (15). The number in brackets indicates the number of specimen samples for the

Comparison results of different augmented datasets

Due to the limited number of instances in the I3A-2014 dataset relative to the needs of a deep network, different data augmentation strategies are employed to avoid over-fitting during training. In addition, the augmented dataset is beneficial for improving segmentation performance. The proposed method is evaluated on the different augmented datasets described in Section 3.1. Fig. 5 shows the segmentation performance on the I3A-2014 dataset for the different data augmentation methods. In
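A minimal sketch of geometric augmentation (flips and 90° rotations) of the kind commonly used to enlarge a small dataset; the paper's exact augmentation recipe is the one described in Section 3.1 and may differ:

```python
import numpy as np

def augment(image):
    """Return simple geometric variants of one image: the original,
    horizontal/vertical flips, and three 90-degree rotations.
    (Illustrative only; the paper's recipe may differ.)"""
    views = [image, np.fliplr(image), np.flipud(image)]
    views += [np.rot90(image, k) for k in (1, 2, 3)]
    return views

img = np.arange(9).reshape(3, 3)
print(len(augment(img)))  # 6 views per input image
```

The same transform must be applied to the image and its ground-truth mask so that pixel labels stay aligned.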

Discussions

As described in the sections above, we present an automated deeply supervised full convolution network for HEp-2 specimen image segmentation in an end-to-end way. In the following, we discuss the effect of the dataset size and the network architecture.

Conclusion

In this paper, we propose an automated DSFCN framework for HEp-2 specimen image segmentation, which is able to tackle the localization problem for classification. The proposed model includes the DDL and HS mechanisms. DSFCN is able to learn discriminative feature representations and effectively integrate multi-level contextual information. Our method can automatically and accurately segment the region of interest. DSFCN can build a feature connection on DDL to learn the

Acknowledgments

This work was supported partly by National Natural Science Foundation of China (Nos. 61871274, and 61801305), Guangdong Province Key Laboratory of Popular High Performance Computers (No. 2017B030314073), Natural Science Foundation of Guangdong Province (Nos. 2017A030313377 and 2016A030313047), Shenzhen Peacock Plan (No. KQTD2016053112051497), Shenzhen Key Basic Research Project (Nos. JCYJ20170302153337765 and JCYJ20170818142347251), NTUT-SZU Joint Research Program (No. 2018006), and Open Fund

References (54)

  • P. Hobson et al.

    Benchmarking human epithelial type 2 interphase cells classification methods on a very large dataset

    Artif. Intell. Med.

    (2015)
  • P. Foggia et al.

    Benchmarking HEp-2 cells classification methods

    IEEE Trans. Med. Imaging

    (2013)
  • Y. Li et al.

    HEp-2 specimen image segmentation and classification using very deep fully convolutional network

    IEEE Trans. Med. Imaging

    (2017)
  • Y. Li et al.

    cC-GAN: a robust transfer-learning framework for HEp-2 specimen image segmentation

    IEEE Access

    (2018)
  • P. Perner et al.

    Mining knowledge for HEp-2 cell image classification

    Artif. Intell. Med.

    (2002)
  • X. Jiang et al.

    A verification-based multithreshold probing approach to HEp-2 cell segmentation

  • M. Merone et al.

    On using active contour to segment HEp-2 cells

  • T. Pohlen et al.

    Full-resolution residual networks for semantic segmentation in street scenes

  • H. Zhang et al.

    Context encoding for semantic segmentation

  • Z. Yu et al.

    A deep convolutional neural network-based framework for automatic fetal facial standard plane recognition

    IEEE J. Biomed. Health Inform.

    (2018)
  • J. Long et al.

    Fully convolutional networks for semantic segmentation

  • D. Nie et al.

    Fully convolutional networks for multi-modality isointense infant brain image segmentation

  • A. Krizhevsky et al.

    ImageNet classification with deep convolutional neural networks

  • K. Simonyan, A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv:1409.1556v6...
  • C. Szegedy et al.

    Going deeper with convolutions

  • K. He et al.

    Deep residual learning for image recognition

  • Z. Wu, C. Shen, A.v.d. Hengel, High-performance Semantic Segmentation Using Very Deep Fully Convolutional Networks,...

    Hai Xie received the B.E. degree in the Department of Information Science and Engineering, Wanfang College of Science and Technology HPU, Zhengzhou, Henan Province, China, in 2013 and the M.E. degree in the College of Computer Science and Software Engineering, Shenzhen University, China, in 2015. He is currently a Ph.D. candidate in the College of Information and Engineering at Shenzhen University, Shenzhen, China. His research interests are medical image analysis and deep learning.

    Haijun Lei received the M.E. degree in department of Electrical and Electronic Engineering, Huazhong University of Science and Technology, Wuhan, China, in 1997 and the Ph.D. degree in Institute for Image Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan in 2001. Since 2006, he has been with the faculty of the College of Computer Science and Software Engineering, Shenzhen University, China. His current research interests include image processing, and pattern recognition.

    Yejun He (SM’09) received the Ph.D. degree in information and communication engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2005. Since 2011, he has been a Full Professor with the College of Information Engineering, Shenzhen University, Shenzhen, China, where he is currently the Director of the Guangdong Engineering Research Center of Base Station Antennas and Propagation and the Shenzhen Key Laboratory of Antennas and Propagation, Shenzhen, China, and the Vice Director of Shenzhen Engineering Research Center of Base Station Antennas and Radio Frequency, Shenzhen, China. He has authored or co-authored over 100 research papers and books (chapters) and holds about 20 patents. His current research interests include wireless mobile communication, antennas, and RF. Dr. He is a Fellow of the IET.

    Baiying Lei received her M. Eng. degree in electronics science and technology from Zhejiang University, China in 2007, and Ph.D. degree from Nanyang Technological University (NTU), Singapore in 2013. She is currently with School of Biomedical Engineering, Health Science Center, Shenzhen University, China. Her current research interests include medical image analysis, machine learning, and pattern recognition. Dr. Lei has coauthored more than 100 scientific articles, e.g., IEEE TCYB, IEEE TMI, IEEE TBME, IEEE JBHI. Pattern Recognition and Information Sciences. She is an IEEE senior member and serves as the editorial board member of Scientific Reports, Frontiers in Neuroinformatics, Frontiers in Aging Neuroscience, and Academic Editor of Plos One.
