1 Introduction

Sickle cell disease (SCD) is an inherited blood disorder in which abnormal hemoglobin can cause normal disc-shaped red blood cells (RBCs) to distort into heterogeneous shapes. The differences in cell morphology between healthy and pathological cells make it possible to perform image-based diagnosis using image processing techniques, which is important for faster and more accurate detection of potential SCD crises. Various methods have been developed for RBC segmentation and/or classification, such as thresholding, region growing [1], the watershed transform [2], deformable models [3], and clustering [4]. However, traditional image processing methods such as thresholding and region growing are susceptible to noisy image backgrounds and blurred cell boundaries, which are common in microscopy images. Moreover, deformable models such as active contours [3] need good initialization and rely on relatively clear cell morphology. In addition, due to the heterogeneous shapes and touching RBCs in SCD, recent open-source cell detection tools such as CellProfiler [5], CellTrack [6], and Fiji [7] cannot readily detect and classify SCD RBCs accurately. Hence, effective SCD cell segmentation and classification remains an open problem for the field.

Recently, deep learning methods based on convolutional neural networks (CNNs) have achieved remarkable success in both natural image [8] and medical image analysis [9]. Among these methods, the fully convolutional network (FCN) has shown state-of-the-art performance in various real-world applications [10]. Specifically, FCNs have been applied to cell segmentation problems [11, 12] with good results. U-Net was developed from the FCN and adds skip connections between encoder and decoder [13]; it has also been applied to medical images. On the other hand, a major challenge in capturing the most discriminative shape and texture features of RBCs is that cells can be imaged in various poses and sizes, so a spatially invariant scheme is needed to overcome these variations. For example, a dense transformer network based on the thin-plate spline has achieved superior performance on brain electron microscopy image segmentation [14]. In this work, we apply deformable convolution [15] to the U-Net architecture and develop a deformable U-Net framework for semantic cell segmentation. Deformable convolution accommodates geometric variations in images by learning adaptive, data-driven receptive fields [15], in contrast to standard CNNs, whose receptive fields are fixed. It can therefore be more robust to the spatial variations of RBCs.

The proposed framework is trained and tested on a large, multi-institutional RBC microscopy image database with manual annotations, covering both healthy and pathological populations. We perform simultaneous segmentation and classification of the RBCs using the trained network under various experimental settings. The high accuracy of both segmentation and classification indicates that the proposed framework is an effective solution for the automatic detection of SCD RBCs. To the best of our knowledge, this work is the first attempt to solve the SCD detection problem with an end-to-end semantic segmentation approach.

2 Materials and Methods

Since the traditional U-Net is inherently limited in dealing with object shape transformations due to its regular square receptive field, we propose the deformable U-Net, which replaces the standard convolution kernels with deformable convolutions throughout the U-Net. In a classic CNN architecture, a convolution kernel has fixed shape and size and samples the input feature map on a regular grid. For example, the grid \(\mathcal {R}\) for a \(3\times 3\) kernel is \(\mathcal {R}=\{(-1,-1),(-1,0),\cdots ,(0,1), (1,1)\}\). For each pixel \(\mathbf {p}_0\) on the output feature map \(\mathbf {y}\), the standard convolution can be expressed as:

$$\begin{aligned} \mathbf {y}(\mathbf {p}_0)=\sum _{\mathbf {p}_n\in \mathcal {R}}\mathbf {w}(\mathbf {p}_n)\cdot \mathbf {x}(\mathbf {p}_0+\mathbf {p}_n), \end{aligned}$$
(1)

where \(\mathbf {y}(\mathbf {p}_0)\) denotes the value of pixel \(\mathbf {p}_0\) on the output feature map, \(\mathbf {x}(\mathbf {p}_0+\mathbf {p}_n)\) denotes the value of pixel \(\mathbf {p}_0+\mathbf {p}_n\) on the input feature map, and \(\mathbf {w}(\mathbf {p}_n)\) is the weight parameter. In contrast, deformable convolution adds 2D offsets to the regular sampling grid \(\mathcal {R}\), thus Eq. (1) becomes:

$$\begin{aligned} \mathbf {y}(\mathbf {p}_0)=\sum _{\mathbf {p}_n\in \mathcal {R}}\mathbf {w}(\mathbf {p}_n)\cdot \mathbf {x}(\mathbf {p}_0+\mathbf {p}_n+\varDelta \mathbf {p}_n). \end{aligned}$$
(2)
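As a concrete illustration, the following NumPy sketch (a toy kernel and feature map, not the paper's code) evaluates the standard convolution of Eq. (1) at a single output pixel; the deformable convolution of Eq. (2) would simply add the learned offsets \(\varDelta \mathbf {p}_n\) to each sampling location before reading the input:

```python
import numpy as np

# Regular 3x3 sampling grid R, as defined above
R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def standard_conv_at(x, w, p0):
    """Eq. (1): y(p0) = sum over p_n in R of w(p_n) * x(p0 + p_n)."""
    y = 0.0
    for (dy, dx), wn in zip(R, w.ravel()):
        y += wn * x[p0[0] + dy, p0[1] + dx]
    return y

x = np.arange(25, dtype=float).reshape(5, 5)  # toy input feature map
w = np.ones((3, 3)) / 9.0                     # toy averaging kernel
print(standard_conv_at(x, w, (2, 2)))         # the 3x3 mean around (2, 2) -> 12.0
```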

As the offset \(\varDelta \mathbf {p}_n\) is typically fractional, Eq. (2) is implemented via bilinear interpolation as:

$$\begin{aligned} \mathbf {x}(\mathbf {p})=\sum _{\mathbf {q}}\max (0,1-|q_x-p_x|)\cdot \max (0,1-|q_y-p_y|)\cdot \mathbf {x}(\mathbf {q}), \end{aligned}$$
(3)

where \(\mathbf {p}\) denotes an arbitrary fractional location on the input feature map, \(\mathbf {q}\) enumerates all integer locations on the input feature map, and \(p_x\) (\(p_y\)) denotes the x-coordinate (y-coordinate) of \(\mathbf {p}\). Equation (3) is cheap to compute, as it only involves the four nearest integer coordinates \(\mathbf {q}_i, i=1,\dots ,4\) of \(\mathbf {p}\). Equation (3) is also equivalent to:

$$\begin{aligned} \mathbf {x}(\mathbf {p})=\sum _{i=1}^{4}\mathbf {x}(\mathbf {q}_i)\cdot S_i, \end{aligned}$$
(4)

where \(S_i, i=1,\dots ,4\) is the interpolation weight assigned to \(\mathbf {q}_i\), given by the area of the rectangle formed by \(\mathbf {p}\) and the integer corner diagonally opposite \(\mathbf {q}_i\), as illustrated in Fig. 1.
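The bilinear interpolation of Eqs. (3)–(4) can be sketched as follows (a minimal NumPy illustration, not the paper's implementation); the weight of each neighbor \(\mathbf {q}_i\) is exactly the \(\max (0,1-|\cdot |)\) product in Eq. (3):

```python
import numpy as np

def bilinear_sample(x, p):
    """Eq. (3): interpolate x at a fractional location p = (py, px)."""
    py, px = p
    val = 0.0
    # Only the four integer neighbors of p contribute (Eq. (4)).
    for qy in (int(np.floor(py)), int(np.floor(py)) + 1):
        for qx in (int(np.floor(px)), int(np.floor(px)) + 1):
            S = max(0.0, 1 - abs(qy - py)) * max(0.0, 1 - abs(qx - px))
            val += S * x[qy, qx]
    return val

x = np.array([[0.0, 1.0],
              [2.0, 3.0]])
print(bilinear_sample(x, (0.5, 0.5)))  # center of the four values -> 1.5
```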

The detailed procedure of deformable convolution is described in Fig. 1. First, we apply an additional classic convolution with a tanh activation to learn an offset field from the input feature map, normalized to \([-1,1]\). The offset field has the same height and width as the input feature map, while its number of channels is \(2N\) \((N=|\mathcal {R}|)\). The offset field is then multiplied by a parameter s (which adjusts the scope of the receptive field) and added to the regular grid \(\mathcal {R}\) to obtain the sampling locations (each coordinate of the offset field carries N pairs of values corresponding to the regular grid \(\mathcal {R}\)). Finally, the values at the irregular sampling coordinates are computed via bilinear interpolation, and the original convolution kernel then samples the deformed feature map to produce the new feature map. In this work, the deformable kernel operates in the same way across channels, rather than learning a separate set of offsets per channel, which improves learning efficiency. Deformable convolution samples the input feature map in a local and dense manner and adapts its localization to objects with different shapes [15], which is exactly what we need in SCD RBC semantic segmentation.
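The offset-learning step above can be sketched in terms of tensor shapes as follows (NumPy only; the tanh-activated offset-predicting convolution is replaced by a random tanh-normalized field for illustration, and s = 2 is a hypothetical scope value):

```python
import numpy as np

H, W, N, s = 8, 8, 9, 2.0  # feature map size, N = |R| = 9 for a 3x3 kernel, scope s

# Regular grid R for a 3x3 kernel, shape (N, 2)
R = np.array([(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)], dtype=float)

# Stand-in for the tanh-activated offset-predicting convolution:
# a 2N-channel field normalized to [-1, 1], shared across input channels.
rng = np.random.default_rng(0)
offset_field = np.tanh(rng.standard_normal((H, W, 2 * N)))

# Scale by s and add the regular grid to obtain the sampling locations:
# every spatial position gets N (y, x) pairs, one per grid point of R.
offsets = s * offset_field.reshape(H, W, N, 2)
base = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"), axis=-1)
sampling_locations = base[:, :, None, :] + R[None, None, :, :] + offsets

print(sampling_locations.shape)  # (8, 8, 9, 2)
```

Each of the resulting fractional locations would then be read off the input via the bilinear interpolation of Eq. (3).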

Fig. 1.
figure 1

Illustration of the deformable convolution, showing how the fixed sampling locations are adaptively deformed.

Fig. 2.
figure 2

Architecture of the deformable U-Net in this work.

The main architecture of the deformable U-Net is shown in Fig. 2. It consists of two parts: an encoder path and a decoder path. In the encoder path, each layer has two \(3\times 3\) deformable convolutions, which double the number of channels, followed by a \(2\times 2\) max pooling operation with stride 2 that halves the resolution of the feature map for down-sampling. The encoder is followed by two \(3\times 3\) deformable convolutions called the bottom layers. Each step in the decoder path contains a \(3\times 3\) deconvolution with stride 2 followed by two \(3\times 3\) deformable convolutions. The skip connections between encoder and decoder help preserve contextual information for better localization [13]. The proposed deformable U-Net can be trained end-to-end (from the input image to the label map) through back-propagation in the same way as the U-Net architecture.
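The feature-map sizes implied by this description can be traced with a small helper (a sketch assuming a 512×512 input, a hypothetical base channel count of 64, and 4 encoder levels; the paper's figure, not the text, fixes the exact depths):

```python
def unet_shapes(size=512, base_ch=64, levels=4):
    """Trace (stage, resolution, channels) through the encoder and decoder."""
    shapes = []
    ch = base_ch
    # Encoder: two 3x3 deformable convs (channels double), then 2x2 max pool.
    for _ in range(levels):
        shapes.append(("enc", size, ch))
        size, ch = size // 2, ch * 2
    shapes.append(("bottom", size, ch))  # two 3x3 deformable convs
    # Decoder: 3x3 deconv with stride 2 (resolution doubles, channels halve),
    # then two 3x3 deformable convs fused with the skip connection.
    for _ in range(levels):
        size, ch = size * 2, ch // 2
        shapes.append(("dec", size, ch))
    return shapes

for stage, size, ch in unet_shapes():
    print(f"{stage}: {size}x{size}, {ch} channels")
```

Note how the decoder exactly mirrors the encoder, so each skip connection joins feature maps of matching resolution.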

3 Results

In this section, to evaluate the performance of the proposed deformable U-Net on RBC semantic segmentation for SCD, we perform experiments from two different aspects: (1) single-class RBC semantic segmentation, which aims to differentiate cells from the background, and (2) multi-class RBC semantic segmentation, which aims to differentiate the various sub-types of SCD RBCs. The experimental data and implementation details are presented below.

Data and Implementation Details. We use the latest public SCD RBC image dataset from MIT [16], taking 266 raw microscopy images of 4 different SCD patients as our experimental data. The original blood samples were collected at UPMC (University of Pittsburgh Medical Center) and MGH (Massachusetts General Hospital). In the dataset, raw microscopy images were acquired with a Zeiss inverted Axiovert 200 microscope under a 63\(\times \) oil objective lens using an industrial camera (Sony Exmor CMOS color sensor, 1080p resolution); the image resolution is \(1920\times 1080\). Additionally, RBC areas and RBC categories are manually annotated as ground truth by the data provider. Following the coarse RBC labeling strategy of previous work [16], three SCD RBC categories are employed in our experiments: (1) Dic+Ovl, (2) El+Sk, and (3) others. During implementation, we first pre-process the collected raw images by removing the two-side margins and resizing them to the same size of \(512\times 512\). The network is implemented in TensorFlow 1.2.1; we use ReLU activations and batch normalization for the convolution operations, and set the scope parameter s to 2. Furthermore, we train with the Adam optimizer using learning rate \(10^{-3}\), weight decay \(10^{-8}\), batch size 2, and 30000 epochs. Our code can be accessed on GitHub (footnote 1).
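The pre-processing step can be sketched as below (a hedged NumPy illustration: the exact margin widths are not specified in the text, so a hypothetical symmetric crop to a centered square is used, followed by nearest-neighbor resizing to 512×512):

```python
import numpy as np

def preprocess(img, out_size=512):
    """Crop two-side margins to a centered square, then resize (nearest-neighbor)."""
    h, w = img.shape[:2]
    margin = (w - h) // 2                  # assumes landscape frames, e.g. 1920x1080
    square = img[:, margin:margin + h]     # centered h x h crop
    idx = (np.arange(out_size) * h / out_size).astype(int)
    return square[np.ix_(idx, idx)]        # subsample rows and columns

raw = np.zeros((1080, 1920), dtype=np.uint8)  # toy frame at the dataset resolution
print(preprocess(raw).shape)  # (512, 512)
```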

Evaluation of Single-Class RBC Semantic Segmentation Performance. To demonstrate the performance of our method on single-class RBC semantic segmentation, we compare the proposed method with the prevalent U-Net and region growing methods. The preliminary SCD RBC dataset is divided into two parts: 166 random samples for training and the remaining 100 samples for testing. As can be seen from Fig. 3, the proposed method significantly improves over U-Net in SCD RBC semantic segmentation. First, it can effectively separate touching RBCs, as shown in A and D (yellow circles) of Fig. 3. Second, for SCD RBCs with heterogeneous shapes, the deformable U-Net obtains more accurate results than the other two methods; see C and D (blue circles) of Fig. 3. Third, for RBCs with blurred boundaries, our method achieves clearly higher accuracy; see the purple circles in C of Fig. 3. Moreover, Fig. 3 indicates that the proposed method generalizes better to shaded cells at the image edge. Furthermore, the deformable U-Net can effectively avoid the disturbance of various noise sources (e.g. dirt, halos, etc.) during the RBC semantic segmentation procedure. To quantify the overall performance of our method, three main indices are calculated; see Table 1. The proposed network outperforms the other two approaches in terms of accuracy, precision, and F1 score.
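The three indices in Table 1 can be computed pixel-wise as follows (a minimal NumPy sketch on toy binary masks, not the actual evaluation script):

```python
import numpy as np

def binary_scores(pred, gt):
    """Pixel-wise accuracy, precision, and F1 for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)      # cell pixels correctly predicted
    fp = np.sum(pred & ~gt)     # background predicted as cell
    fn = np.sum(~pred & gt)     # cell predicted as background
    tn = np.sum(~pred & ~gt)    # background correctly predicted
    accuracy = (tp + tn) / pred.size
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, f1

pred = np.array([[1, 1], [0, 0]])
gt   = np.array([[1, 0], [0, 0]])
print(binary_scores(pred, gt))  # (0.75, 0.5, 0.666...)
```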

Fig. 3.
figure 3

Representative comparisons of different RBC semantic segmentation methods. (a) patch images, (b) region growing results, (c) U-Net results, (d) proposed method, (e) ground truth.

Table 1. Quantitative performance analysis of different methods in single-class SCD RBC segmentation.

Evaluation of Multi-class RBC Semantic Segmentation Performance. In addition to the single-class segmentation evaluation, we also conduct an experiment on multi-class RBC semantic segmentation for SCD based on the same dataset division scheme as above. The corresponding segmentation results are shown in Fig. 4, where different colors indicate different RBC types: red (Dic+Ovl), blue (El+Sk), and green (others). Specifically, the proposed deformable U-Net is better than the standard U-Net at predicting an intact RBC without any shape prior: in the U-Net prediction, certain cells are segmented out yet assigned two classes simultaneously (the yellow square region in Fig. 4). Additionally, the deformable U-Net is more robust to the background noise present in the microscopy images: the baseline U-Net predicts background objects as RBCs in the blue square of Fig. 4, while the deformable U-Net predicts the correct negative label. Furthermore, we perform a quantitative analysis of our trained model using three metrics: loss, accuracy, and mean IoU (Intersection over Union). The evaluation results in Table 2 indicate that the deformable U-Net achieves superior performance to the standard U-Net.
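Mean IoU over the segmentation classes can be computed as follows (a NumPy sketch with toy label maps; the use of class indices 0–3 for background and the three RBC types is an assumption for illustration):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Average per-class Intersection over Union between two label maps."""
    ious = []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        union = np.sum(p | g)
        if union == 0:           # class absent from both maps: skip it
            continue
        ious.append(np.sum(p & g) / union)
    return float(np.mean(ious))

pred = np.array([[0, 1], [2, 2]])
gt   = np.array([[0, 1], [2, 3]])
print(mean_iou(pred, gt, num_classes=4))  # (1 + 1 + 0.5 + 0) / 4 = 0.625
```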

Fig. 4.
figure 4

Comparisons of multi-class RBC semantic segmentation results. (a) original images, (b) U-Net results, (c) proposed method results, (d) ground truth.

Table 2. Quantitative performance analysis of different methods in multi-class SCD RBC segmentation.

4 Conclusion

In this work, we present an improved U-Net framework (deformable U-Net) for automated SCD RBC semantic segmentation. Experimental results demonstrate that the proposed approach clearly outperforms the baseline U-Net, especially on the key problems of background noise discrimination, segmentation of RBCs with heterogeneous shapes, separation of touching RBCs, and segmentation of blurred RBCs. Moreover, it performs consistently in predicting cell boundaries.