Elsevier

Neurocomputing

Volume 229, 15 March 2017, Pages 88-99
Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks

https://doi.org/10.1016/j.neucom.2016.08.103

Abstract

Automated cell segmentation is a critical step in computer-assisted pathology image analysis, such as automated grading of breast cancer tissue specimens. However, automated cell segmentation is complicated by (1) the complexity of the data (possibly touching cells, stains, background clutter, and image artifacts) and (2) the variability in size, shape, appearance, and texture of the individual nuclei. Recently, there has been growing interest in the application of “Deep Learning” strategies to the analysis of natural and pathological images. Histopathology, given its diversity and complexity, represents an excellent use case for deep learning strategies. In this paper, we put forward an automated nuclei segmentation method that works with hematoxylin and eosin (H&E) stained breast cancer histopathology images, which represent regions of whole digital slides. The procedure can be divided into three main stages. First, the sparse reconstruction method is employed to roughly remove the background and accentuate the nuclei of pathological images. Then, deep convolutional networks (DCN), formed by cascading multi-layer convolution networks, are trained using gradient descent techniques to efficiently segment the cell nuclei from the background. In this stage, input patches and their corresponding labels are randomly sampled from the pathological images and fed to the training networks. The size of the sampled patches is flexible, and the proposed method is robust when the number of sampling rounds and the number of feature maps vary over a wide range. Finally, morphological operations and some prior knowledge are introduced to improve the segmentation performance and reduce errors. Our method achieves about 92.45% pixel-wise segmentation accuracy and an F1-measure of 0.8393. This performance equals and sometimes surpasses recently published leading alternative segmentation methods on the same benchmark datasets.
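For readers who want to reproduce the two reported figures of merit, the following is a minimal sketch of how pixel-wise accuracy and the F1-measure are commonly computed from a predicted binary mask and a ground-truth mask. The function name `pixel_metrics` and the toy masks are illustrative assumptions and are not taken from the paper; the exact evaluation protocol used by the authors may differ.

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Return (accuracy, f1) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    tp = np.sum(pred & truth)      # nucleus pixels correctly labeled
    fp = np.sum(pred & ~truth)     # background labeled as nucleus
    fn = np.sum(~pred & truth)     # nucleus labeled as background
    tn = np.sum(~pred & ~truth)    # background correctly labeled
    accuracy = (tp + tn) / pred.size
    denom = 2 * tp + fp + fn
    # Convention chosen here (an assumption): F1 is 1.0 when both masks are empty.
    f1 = 2 * tp / denom if denom > 0 else 1.0
    return float(accuracy), float(f1)

# Toy masks (hypothetical data, not from the benchmark dataset).
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]])
pred = np.array([[1, 0, 0, 0],
                 [1, 1, 1, 0]])
acc, f1 = pixel_metrics(pred, truth)
```

Accuracy counts background pixels as correct predictions, so it can stay high even when nuclei are missed, which is one reason the F1-measure is usually reported alongside it.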

Introduction

Cell image analysis plays a very important role in pathological diagnosis, and cell segmentation is often the fundamental and key step in diagnosis from cell pathology images. Manual inspection of histopathology images is an extremely tedious and time-consuming process, and the results are subject to intra- and inter-individual variability. At the same time, automated image segmentation for cell analysis is generally a difficult problem due to the large variability (different microscopes, stains, cell types, inhomogeneous cell intensities) and complexity of the data (possibly touching cells, background clutter, image artifacts such as bright halos or shade-off, and large numbers of cells) [1], [2], [3]. In recent years, computer-aided diagnosis (CAD) systems have emerged as a promising technology for ensuring standardized, objective pathology specimen analysis and are of great research and clinical interest [4].

In this work, we propose a novel combined strategy to robustly segment the nuclei region of breast cancer histopathology images. In the next part of this section, we review the recent methods and present the motivation for developing a novel approach overcoming the current limitations.

Automated segmentation of cell nuclei is now a well-studied topic for which a large number of algorithms have been described in the literature [5], [6], [7], [8], [9], [10], [11], [12], [13]. Most of the developed cell and nuclei segmentation techniques revolve around thresholding, watershed segmentation, active contours and pixel-wise clustering/classification or a combination of the above, supplemented by different pre-processing and post-processing steps and detection/localization schemes.

Zhou et al. [5] have used adaptive thresholding and the watershed algorithm for cell nuclei segmentation, followed by a fragment merging method that combines two scoring models based on trend and no-trend features. Using the context information of time-lapse data, the phases of cell nuclei are identified accurately via a Markov model. Experimental results show that the proposed system is effective for nuclei segmentation and phase identification. Chen et al. [6] have employed Otsu’s method to segment nuclei from the background and then deployed an improved watershed technique to further separate touching nuclei. Both papers use time-lapse fluorescence microscopy images as a case study. Law et al. [7] have proposed a semi-supervised optimization model that determines an efficient segmentation of input images. The model requires only minimal tuning of model parameters during the initial stage. In [8], a marker-controlled watershed segmentation method has been designed with multiple scales and different markers. The procedure can be divided into four main steps: 1) pre-processing with color unmixing and morphological operators, 2) marker-controlled watershed segmentation at multiple scales and with different markers, 3) post-processing for rejection of false regions and 4) merging of the results from multiple scales.
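Otsu’s method, which Chen et al. [6] use for the initial separation of nuclei from the background, selects the gray level that maximizes the between-class variance of the image histogram. The sketch below illustrates that idea only; it is not the code of [6], it assumes 8-bit integer gray levels, and the function name `otsu_threshold` and the toy image are hypothetical.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    Assumes integer gray levels in [0, nbins - 1].
    """
    image = np.asarray(image)
    hist, _ = np.histogram(image, bins=nbins, range=(0, nbins))
    p = hist.astype(float) / hist.sum()      # gray-level probabilities
    levels = np.arange(nbins)
    w0 = np.cumsum(p)                        # weight of the class "<= t"
    mu_cum = np.cumsum(p * levels)           # cumulative first moment up to t
    mu_total = mu_cum[-1]                    # global mean gray level
    w1 = 1.0 - w0                            # weight of the class "> t"
    valid = (w0 > 0) & (w1 > 0)              # skip thresholds with an empty class
    sigma_b = np.zeros(nbins)
    sigma_b[valid] = (mu_total * w0[valid] - mu_cum[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

# Toy fluorescence-like image: bright "nuclei" (200) on a dark background (50).
image = np.array([[50, 50, 200],
                  [50, 200, 200]], dtype=np.uint8)
t = otsu_threshold(image)
foreground = image > t
```

A global threshold of this kind cannot separate touching nuclei on its own, which is why [6] follows it with a watershed step.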

The algorithm proposed in [9] is a combined method. After preprocessing the image, the authors employ the maximally stable extremal regions (MSER) algorithm to separate all foreground objects from the background. Then, they split clusters of multiple cells through marker-based watershed segmentation. Lu et al. [10] have employed a hybrid morphological reconstruction module to reduce the intensity variation within the nuclei regions and suppress the noise in the image. A local region adaptive threshold selection module, based on a local optimal threshold, is used to segment the nuclei. The technique incorporates domain-specific knowledge of skin histopathological images to obtain more accurate segmentation results.

Al-Kofahi et al. [11] have presented a robust and accurate novel method for segmenting cell nuclei using a combination of ideas. The image foreground is extracted automatically using a graph-cuts-based binarization. Next, nuclear seed points are detected by combining multiscale Laplacian-of-Gaussian filtering constrained by distance-map-based adaptive scale selection. These points are used to perform an initial segmentation that is refined using a second graph-cuts-based algorithm incorporating the method of alpha expansions and graph coloring to reduce computational complexity. Ali et al. [12] have proposed an elegant segmentation method that uses boundary- and region-based active contours with statistical shape model to accurately detect all the specific shapes in the scene. Furthermore, they have presented their model in a multiple level set formulation to segment multiple objects under mutual occlusion. Their proposed model is accurate compared to traditional active contours and statistical shape models. Lu et al. [13] have proposed an algorithm that addresses the challenging problem of segmenting each individual cell’s nucleus and cytoplasm from a clump of cervical cells deposited on a microscope slide. This method based on a joint optimization of several level set functions was demonstrated to perform well on clumps of up to 10 cells.

All the methods reviewed above yield good segmentation results under certain circumstances. In general, however, almost all of them have limitations. For example, the methods in [6], [8] are sensitive to initialization and to local noisy gradients, and those in [5], [9] are prone to over-segmentation. The presence of staining variations (intra-image and inter-image) and similar backgrounds can pose obstacles to accurate segmentation of the nuclei in H&E-stained histopathological skin epidermis images [10]. Over-segmentation usually happens when a nucleus’ chromatin is highly textured (especially true for large nuclei) or when the nucleus shape is extremely elongated [12]. In [13], the authors tested the algorithm only on normal-appearing cervical cytology images, not on abnormal cervical cells. In addition, reliable prior knowledge of the nucleus cytoplasm structure is required; without this information, the effectiveness of the methodology is likely to be severely compromised.

Recently, there has been growing interest in the application of “Deep Learning” strategies to the analysis of natural and pathological images. Histopathology, given its diversity and complexity, represents an excellent use case for deep learning strategies. The main computational challenge is to analyze all individual cells for accurate diagnosis, since the differentiation of most disease grades depends heavily on cell-level information. To this end, deep convolutional neural networks have been investigated to robustly and accurately detect and segment cells in histopathological images [14], [15], [16], [17], which can significantly benefit cell-level analysis for cancer diagnosis [17], [18], [19].

Liu et al. [14] have employed maximum weight independent set selection to choose the heaviest subset from a pool of cell detection candidates generated from different algorithms using various parameters and a deep convolutional neural network to compute the weights of the graph. Xie et al. [15] have proposed a novel deep voting model for accurate and robust nucleus localization, which extended the convolutional neural network (CNN) model to jointly learn the voting confidence and voting offset by introducing a hybrid non-linear activation function. Su et al. [16] have proposed a novel cell detection and segmentation algorithm. To handle the shape variations, inhomogeneous intensity, and cell overlapping, the sparse reconstruction, using an adaptive dictionary and trivial templates, is proposed to detect cells. In the segmentation stage, a stacked denoising autoencoder (sDAE) trained with structural labels is used for cell segmentation. Xu et al. [3] have employed stacked sparse autoencoder (SSAE) to detect the nuclei efficiently on high-resolution histopathological images of breast cancer. Zhang et al. [17] focus on the rank-level fusion of local and holistic features for the image-guided diagnosis of breast cancer and employ content-based image retrieval to discover clinically relevant instances from an image database, which can be used to infer and classify the new image.

Cruz-Roa et al. [20] have presented a novel unified approach for learning image representations, visual interpretation and automatic basal-cell carcinoma cancer detection from routine H&E histopathology images. Their approach demonstrates that a learned representation is better than a canonical predefined representation. Hatipoglu et al. [21] have introduced a cell classification method for histopathological images that uses spatial information via a CNN. In [22], a coarse-to-fine nucleus segmentation framework is developed with a multiscale convolutional network (MSCN) and a graph-partitioning-based method. In [23], a new deep convolutional neural network based model has been proposed for segmentation and classification of epithelial and stromal regions within histopathological images, and that model outperforms models based on handcrafted features. Su et al. [24] have applied a fast scanning deep convolutional neural network (fCNN) to pixel-wise region segmentation. The fCNN removes the redundant computations in the original CNN without sacrificing its performance. Wan et al. [25] have presented a computer-aided grading method based on multi-level features and cascaded SVM classification to automatically distinguish histopathological breast cancer images with low, intermediate, and high grades. Pixel-, object-, and semantic-level features are extracted to quantitatively characterize morphological patterns and interpretable concepts from the breast cancer tissue images. The semantic-level features are extracted by a CNN approach, and the performance suggests that the method could be useful in developing a computational diagnostic tool for differentiating breast cancer grades. In [26], a novel architecture has been defined that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. That method achieves state-of-the-art segmentation results on PASCAL VOC 2011-2, NYUDv2, and SIFT Flow. Ronneberger et al. [27] have presented a network and training strategy that relies on data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.

Almost all the deep learning networks reviewed above include not only convolutional layers but also subsampling layers. Subsampling is an important strategy in object recognition, where it helps achieve invariance to distortions of the visual image by discarding positional information about image features and details. However, it produces an output representation of much lower resolution than the input image. Many image processing applications require precise positional information, and the segmentation of H&E histopathology images of breast tissue is a good example. Therefore, our convolutional networks do not include subsampling.
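The resolution argument can be illustrated with a small sketch that compares stacked convolutions against a convolution followed by 2×2 max pooling. It assumes zero ("same") padding and a fixed 3×3 averaging kernel, neither of which is specified in the text; the helper names `conv2d_same` and `max_pool_2x2` are hypothetical, and this is not the authors' network.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Zero-padded, stride-1 2-D cross-correlation; output shape equals input shape.

    Assumes odd kernel dimensions.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    h, w = image.shape
    for i in range(kh):
        for j in range(kw):
            # Accumulate the shifted image weighted by one kernel coefficient.
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def max_pool_2x2(image):
    """Non-overlapping 2x2 max pooling; halves each spatial dimension."""
    h, w = image.shape
    h2, w2 = h // 2, w // 2
    return image[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2).max(axis=(1, 3))

image = np.arange(64, dtype=float).reshape(8, 8)   # toy 8x8 "image"
kernel = np.ones((3, 3)) / 9.0                     # toy averaging filter

# Two convolution layers without subsampling: one output value per input pixel.
full_res = conv2d_same(conv2d_same(image, kernel), kernel)
# One convolution layer followed by subsampling: resolution is halved.
pooled = max_pool_2x2(conv2d_same(image, kernel))
```

Here `full_res` keeps the 8×8 shape of the input, so each output value can be read as a label for one pixel, whereas `pooled` is 4×4 and would need upsampling before it could be compared with a pixel-level markup.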

In this paper, we propose a method to segment the nuclei regions from the pathological images. The main contributions of this work include three points.

First, a deep convolutional network, formed by cascading multi-layer convolution operations without subsampling layers, is employed for cell segmentation. In addition, the sparse reconstruction method is employed to accentuate the nuclei regions of pathological images before network training. To improve the segmentation performance, morphological operations and some prior knowledge are introduced as a post-processing step.

Second, input patches and their corresponding labels are randomly sampled from the pathological images and fed to the training networks. The size of the sampled patches is flexible, and the proposed method is robust when the number of sampling rounds and the number of feature maps vary over a wide range (see Section 3.3 for a detailed description).

Finally, extensive experiments and comparisons with recently published models show that our method achieves a promising segmentation performance, equaling and sometimes surpassing the published leading alternative segmentation methods with the same benchmark datasets.
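The random sampling of input patches and their labels mentioned in the second contribution could look like the following sketch. It assumes uniform sampling of square patches that lie fully inside the image; the function name `sample_patches`, the patch size of 16, the patch count of 10 and the synthetic image are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def sample_patches(image, labels, patch_size, n_patches, rng=None):
    """Draw aligned (patch, label patch) pairs at random positions.

    image  : H x W x C array
    labels : H x W array with one label per pixel
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = labels.shape
    if image.shape[:2] != (h, w):
        raise ValueError("image and labels must share height and width")
    if patch_size > min(h, w):
        raise ValueError("patch_size exceeds image dimensions")
    patches, targets = [], []
    for _ in range(n_patches):
        # Top-left corner chosen so the patch stays inside the image.
        top = int(rng.integers(0, h - patch_size + 1))
        left = int(rng.integers(0, w - patch_size + 1))
        patches.append(image[top:top + patch_size, left:left + patch_size])
        targets.append(labels[top:top + patch_size, left:left + patch_size])
    return np.stack(patches), np.stack(targets)

# Synthetic stand-in for an RGB image and its pixel-level markup.
rng = np.random.default_rng(0)
image = rng.random((64, 80, 3))
labels = image[..., 0] > 0.5
patches, targets = sample_patches(image, labels, patch_size=16, n_patches=10, rng=rng)
```

Because each label patch is cut from the same coordinates as its input patch, the pair stays pixel-aligned regardless of the patch size chosen.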

In the following sections, we demonstrate that our proposed method can accurately segment the nuclei regions from the pathological images. Section 2 presents our methodological contributions in detail, while Section 3 describes the experimental setup, validation results and detailed comparisons with previous methods. Section 4 discusses the results and offers some insights into potential extensions of the method. Section 5 concludes the paper.

Section snippets

Methodology

We propose a combined and efficient strategy to segment the H&E cell images. An overview of the proposed method is shown in Fig. 1. The sparse reconstruction (SR) method is first employed to roughly remove the background and accentuate the nuclei in the H&E breast cell images. Next, a DCN, formed by cascading multi-layer convolution networks, is trained to efficiently segment the cell nuclei from the background. Finally, a series of morphological operations and some prior knowledge are introduced to

Dataset description

There are 58 Hematoxylin and Eosin (H&E) histopathology images of breast tissue from David Rimm’s Laboratory at Yale (http://medicine.yale.edu/bbs/molecularcell/people/david_rimm.profile), of which 32 are benign and 26 are malignant.

As shown in Fig. 5, the color histopathology images have 3 channels (RGB color channels). Each channel is represented in an analog image as an 8 bit, 896×768 grayscale image. For each image, a pixel-level markup is specified within a designated truth window of

Discussions

In most previous image segmentation approaches, domain-specific knowledge is extracted by studying the characteristics and structure of the images and is then manually embedded into the behavior of the algorithm. These methods are simple and surprisingly effective in some cases. However, most domain-specific applications face limits because complex high-level knowledge is difficult to encode into general low-level algorithms.

Moreover, in histopathology, distinguishing

Conclusions

In this paper, we propose an automatic image processing pipeline that is able to accurately detect and segment nuclei in breast pathological images. First, sparse reconstruction with the K-SVD and Batch-OMP algorithms is employed to enhance the nucleus area and preliminarily remove the background. Next, the segmentation stage exploits the DCN trained with structural labels to obtain the accurate pixels of the cell nuclei. Finally, morphological operations and some prior knowledge are introduced to

Acknowledgments

The authors would like to thank Dr. Gelasca et al. for publishing the dataset and the authors of [9], [28] for making their code available. We are grateful for helpful comments from the anonymous reviewers. This research was supported by the National Natural Science Foundation of China (Grant numbers 21365008, 61105004, 61562013 and 61462018).

References (45)

  • X. Zhang et al., High-throughput histopathological image analysis via robust cell segmentation and hashing, Med. Image Anal. (2015)
  • J. Xu et al., A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images, Neurocomputing (2016)
  • E. Meijering, Cell segmentation: 50 years down the road, IEEE Signal Process. Mag. (2012)
  • H. Su, Z. Yin, T. Kanade, S. Huh, Phase contrast image restoration via dictionary representation of diffraction...
  • J. Xu et al., Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images, IEEE Trans. Med. Imaging (2015)
  • F. Xing et al., Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: a comprehensive review, IEEE Rev. Biomed. Eng. (2016)
  • X. Zhou et al., A novel cell segmentation method and cell phase identification using Markov model, IEEE Trans. Inf. Technol. Biomed. (2009)
  • X. Chen et al., Automated segmentation, classification, and tracking of cancer cell nuclei in time-lapse microscopy, IEEE Trans. Biomed. Eng. (2006)
  • Y.N. Law et al., A semisupervised segmentation model for collections of images, IEEE Trans. Image Process. (2012)
  • M. Veta et al., Automatic nuclei segmentation in H&E stained breast cancer histopathology images, PLoS One (2013)
  • F. Buggenthin et al., An automatic method for robust and fast cell detection in bright field images from high-throughput microscopy, BMC Bioinform. (2013)
  • C. Lu et al., A robust automatic nuclei segmentation technique for quantitative histopathological image analysis, Anal. Quant. Cytol. Histol. (2012)
  • Y. Al-Kofahi et al., Improved automatic detection and segmentation of cell nuclei in histopathology images, IEEE Trans. Biomed. Eng. (2010)
  • S. Ali et al., An integrated region-, boundary-, shape based active contour for multiple object overlap resolution in histological imagery, IEEE Trans. Med. Imaging (2012)
  • Z. Lu et al., An improved joint optimization of multiple level set functions for the segmentation of overlapping cervical cells, IEEE Trans. Image Process. (2015)
  • F. Liu, L. Yang, A novel cell detection method using deep convolutional neural network and maximum-weight independent...
  • Y. Xie, X. Kong, F. Xing, F. Liu, H. Su, L. Yang, Deep voting: a robust approach toward nucleus localization in...
  • H. Su, F. Xing, X. Kong, Y. Xie, S. Zhang, L. Yang, Robust cell detection and segmentation in histopathological images...
  • X. Zhang et al., Fusing heterogeneous features from stacked sparse autoencoder for histopathological image analysis, IEEE J. Biomed. Health Inform. (2015)
  • X. Zhang et al., Towards large-scale histopathological image analysis: hashing-based image retrieval, IEEE Trans. Med. Imaging (2015)
  • A. Cruz-Roa, J.A. Ovalle, A. Madabhushi, F. González, A deep learning architecture for image representation, visual...
  • N. Hatipoglu, G. Bilgin, Classification of histopathological images using convolutional neural network, in: Proceedings...
Xipeng Pan is a Ph.D. candidate with the School of Automation, Beijing University of Posts and Telecommunications, China. His research interests include machine learning and medical image processing. He is a member of CCF.

Lingqiao Li is a Ph.D. candidate with the School of Automation, Beijing University of Posts and Telecommunications, China. Currently, he is an assistant researcher at Guilin University of Electronic Technology, China, and his research interests include machine learning and spectrum analysis.

Huihua Yang received his Ph.D. degree from East China University of Science and Technology, China in 2005. He was a postdoctoral research fellow at Tsinghua University from 2005 to 2007. Currently, he is a professor at the School of Automation, Beijing University of Posts and Telecommunications, China. His research interests include machine learning, spectrum analysis, and optimization. Dr. Yang has published more than 40 papers. He serves as Director of China Instrument and Control Society (CICS) and Vice Director of the NIR Division of CICS, and is a senior member of CCF and a member of ACM.

Zhenbing Liu received his Ph.D. degree in Computer Science from Huazhong University of Science and Technology, China in 2010. He was a visiting research fellow at Pennsylvania University in 2015. Currently, he is a professor at Guilin University of Electronic Technology, China. His research interests include machine learning and medical image processing. He has published more than 30 papers.

Jinxin Yang is a Master’s candidate with the School of Automation, Beijing University of Posts and Telecommunications, China. His research interests include deep learning and segmentation of medical cell images.

Lingling Zhao is a Master’s candidate with the School of Mechanical and Electrical Engineering, Guilin University of Electronic Technology, China. His research interests include machine learning and segmentation of medical cell images.

Yongxian Fan received his Ph.D. degree from the Department of Automation, Shanghai Jiao Tong University, China in 2013. Currently, he is an associate professor at the School of Computer Science and Information Security, Guilin University of Electronic Technology, China. His research interests include machine learning, pattern recognition and bioinformatics.
