
1 Introduction

The prevalence of myopia and high myopia is increasing globally at an alarming rate, with significant increases in the risk of vision impairment from pathologic conditions associated with high myopia, including retinal damage, cataract and glaucoma [1]. Myopic macular degeneration (MMD) is a major cause of vision impairment in high myopia. Lacquer cracks (LCs), signs of MMD, typically present as yellowish to white lines in the posterior segment of highly myopic eyes and are believed to be breaks in the choroid/retinal pigment epithelium (RPE)/Bruch's membrane complex [2]. The prevalence of LCs is 4.3%–9.2% in highly myopic eyes [3]. Patients with LCs are at high risk of visual impairment because LCs may lead to further adverse changes in the fundus, such as patchy chorioretinal atrophy or myopic choroidal neovascularization [4]. Thus, the segmentation of LCs is quite important in clinical ophthalmology, as it helps doctors diagnose MMD and analyze its development.

Indocyanine green angiography (ICGA) is considered the gold standard for LC detection. It provides details of the choroidal vasculature in highly myopic eyes and allows the location and extent of LCs to be observed much more clearly than with fundus photography or typical fluorescein angiography (FA) [2, 5, 6].

There have been few studies on LC segmentation. To achieve accurate segmentation of LCs in ICGA images, we propose a novel method based on conditional generative adversarial networks (cGANs) [7]. Just as generative adversarial networks (GANs) [8] learn a generative model of data, cGANs learn a conditional generative model. This makes cGANs well suited to image segmentation tasks, where we condition on an input image and generate the corresponding output segmentation image. Previous cGANs have tackled inpainting [9], image prediction from a normal map [10], image manipulation guided by user constraints [11], future frame prediction [12], etc. We are the first to apply a cGAN to LC segmentation. Motivated by the characteristics of ICGA images with LCs, a Dice loss term [13] is added to the cGAN objective to deal with the strong imbalance between the number of object and background pixels, so that the generator achieves better segmentation.

2 Method

2.1 Conditional Generative Adversarial Networks and Improvements

Image-conditional generative adversarial nets consist of two adversarial models: a generative model \( G \) that extracts image features and generates fake images, and a discriminator \( D \) that estimates the probability that an image came from the training data rather than from the generator.

The training process is diagrammed in Fig. 1. cGANs learn a mapping from an original image \( x \) and a random noise vector \( z \) to the ground truth \( y \). The generator \( G \) is trained to produce outputs that cannot be distinguished from real images, while the discriminator \( D \) is trained to detect the fake images produced by the generator.

Fig. 1. Diagram of conditional GAN.

The objective function of a cGAN can be expressed as follows [7]:

$$ L_{cGAN} (G,D) = E_{x,y \sim p_{data} (x,y)} [\log D(x,y)] + E_{x \sim p_{data} (x),\, z \sim p_{z} (z)} [\log (1 - D(x,G(x,z)))] $$
(1)

Since the generator tries to minimize the objective function against the adversarial discriminator that tries to maximize it, the final objective function is:

$$ F = \arg \min_{G} \max_{D} L_{cGAN} (G,D) $$
(2)
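As an illustration of this minimax game, a minimal PyTorch-style training step might look as follows; the model and optimizer objects are assumed, and the two binary cross-entropy terms correspond to the two expectations in Eq. (1). This is a sketch under our own assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def cgan_train_step(G, D, x, y, opt_g, opt_d):
    """One alternating update of discriminator and generator (Eqs. 1-2)."""
    # Discriminator update: real pairs (x, y) -> 1, fake pairs (x, G(x)) -> 0.
    fake = G(x).detach()  # detach so this step does not update G
    d_real, d_fake = D(x, y), D(x, fake)
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: try to make D classify the fake pair as real.
    d_fake = D(x, G(x))
    loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```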

Previous approaches to cGANs have found it beneficial to mix the GAN objective with a traditional loss, such as the L1 or L2 loss [9]. With an added L1 or L2 loss, the discriminator's job remains unchanged, but the generator is tasked with producing not only indistinguishable fake images but also images much closer to the ground truth. Following previous work [7], the L1 loss, which encourages less blurring than the L2 loss, is adopted in this paper.

To apply cGANs to LC segmentation, an improvement is made to the objective function. In ICGA images, LCs usually occupy a relatively small part of the whole image. This data imbalance often causes the learning process to get trapped in a local minimum of the loss function, yielding predictions that are heavily biased toward the background. The net may also mistake vessels, choroidal hemorrhage and the shadows at the edges of images for LCs. To solve this problem, a Dice loss term is added to the objective function. The Dice loss effectively handles the imbalance between the number of object and background pixels and makes the proposed segmentation much more accurate.

Thus, both the L1 loss and the Dice loss [13], given below, are adopted:

$$ L_{L1} (G) = E_{x,y \sim p_{data} (x,y),\, z \sim p_{z} (z)} [\left\| y - G(x,z) \right\|_{1}] $$
(3)
$$ L_{Dice} (G) = E_{x,y \sim p_{data} (x,y),\, z \sim p_{z} (z)} \left[ 1 - \frac{2\sum_{i=1}^{N} y_{i} G(x,z)_{i}}{\sum_{i=1}^{N} y_{i}^{2} + \sum_{i=1}^{N} G(x,z)_{i}^{2}} \right] $$
(4)

where the sums run over all \( N \) pixels of the generated binary segmentation, with pixels \( G(x,z)_{i} \in G(x,z) \), and of the ground truth binary mask, with pixels \( y_{i} \in y \).
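A direct PyTorch transcription of Eqs. (3) and (4) might look as follows; here fake is \( G(x,z) \) and target is \( y \), both of shape (batch, 1, H, W) with values in [0, 1], and the small eps (our addition, not part of Eq. (4)) guards against division by zero on empty masks.

```python
import torch

def l1_loss(fake: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Eq. (3): mean absolute difference between ground truth and output."""
    return torch.mean(torch.abs(target - fake))

def dice_loss(fake: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-7) -> torch.Tensor:
    """Eq. (4): one minus the (soft) Dice coefficient over all N pixels."""
    intersection = torch.sum(target * fake)
    denominator = torch.sum(target ** 2) + torch.sum(fake ** 2)
    return 1.0 - 2.0 * intersection / (denominator + eps)
```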

The final objective function is:

$$ F = \arg \min_{G} \max_{D} L_{cGAN} (G,D) + \mu L_{L1} (G) + \lambda L_{Dice} (G) $$
(5)

Past cGANs [10] provided Gaussian noise as an input to the generator, since without it the net would produce deterministic outputs. In the proposed net, randomness is instead provided in the form of dropout, applied in several layers of the generator, and in the initialization of the kernels.
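Putting the pieces together, the generator side of Eq. (5) could be computed as below, reusing the l1_loss and dice_loss helpers sketched after Eq. (4). The weights mu and lambda_ are placeholders: the paper does not report the values it used, and mu = 100 merely follows the pix2pix default [7].

```python
import torch
import torch.nn.functional as F

def generator_objective(d_fake_logits, fake, target, mu=100.0, lambda_=1.0):
    """Generator part of Eq. (5): adversarial term + mu*L1 + lambda*Dice."""
    # The generator wants the discriminator to label its outputs as real.
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    return adv + mu * l1_loss(fake, target) + lambda_ * dice_loss(fake, target)
```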

2.2 Network Architectures

The architecture of our network, including the generator and the discriminator, is illustrated in Figs. 2 and 3. The generator in Fig. 2 is similar to the traditional encoder-decoder architecture. Each encoding layer is a convolution layer with batch normalization and a ReLU-type activation; each decoding layer consists of a deconvolution, batch normalization and a ReLU activation. Dropout with a rate of 50% is applied in the first three decoding layers to help prevent overfitting during training. In practice, the leaky ReLU function, a variant of ReLU, is adopted with a slope of 0.2; it mitigates the vanishing gradient problem and makes the network converge much faster during training.
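In code, one encoding and one decoding layer of this generator might be sketched as follows; the padding of 1 (which halves or doubles the spatial size with 4 × 4 stride-2 filters) and the channel arguments are our assumptions.

```python
import torch.nn as nn

def encoder_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Convolution + batch norm + leaky ReLU (slope 0.2), halving H and W."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2),
    )

def decoder_block(in_ch: int, out_ch: int, dropout: bool = False) -> nn.Sequential:
    """Deconvolution + batch norm + ReLU, doubling H and W; 50% dropout
    is switched on for the first three decoding layers."""
    layers = [
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
    ]
    if dropout:
        layers.append(nn.Dropout(0.5))
    return nn.Sequential(*layers)
```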

Fig. 2. Architecture of generator.

Fig. 3. Architecture of discriminator.

All convolutions and deconvolutions are \( 4 \times 4 \) spatial filters applied with stride 2. Unlike traditional deep convolutional networks, our network reduces the spatial size of the representation with stride-2 convolutions instead of pooling layers, since discarding pooling layers performs better when training good generative models [14].

We adopt the popular U-Net [15] as the main framework of our generator. In medical image segmentation tasks, the predicted segmentation shares structural information with the original image. Skip connections constrain the output to be aligned with the input and make the segmentation result more reasonable and accurate. In the proposed generator, high-resolution features from the contracting path are combined with the upsampled output via skip connections, so that the subsequent convolution layer can learn to assemble a more precise output from this information. Since LCs in ICGA images are mostly tiny and irregular, skip connections, which help the generator produce images with more details that look similar to real LCs, can improve the accuracy of our segmentation.
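A toy three-level version of such a U-Net generator, reusing the encoder_block and decoder_block helpers above, is sketched below; the depth and channel widths are illustrative, not the exact configuration of the paper.

```python
import torch
import torch.nn as nn

class UNetGenerator(nn.Module):
    """Encoder-decoder with skip connections (channels are illustrative)."""
    def __init__(self):
        super().__init__()
        self.enc1 = encoder_block(1, 64)     # 768 -> 384
        self.enc2 = encoder_block(64, 128)   # 384 -> 192
        self.enc3 = encoder_block(128, 256)  # 192 -> 96
        self.dec3 = decoder_block(256, 128, dropout=True)       # 96 -> 192
        self.dec2 = decoder_block(128 + 128, 64, dropout=True)  # 192 -> 384
        self.dec1 = nn.ConvTranspose2d(64 + 64, 1, 4, 2, 1)     # 384 -> 768

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d3 = self.dec3(e3)
        d2 = self.dec2(torch.cat([d3, e2], dim=1))   # skip connection from enc2
        out = self.dec1(torch.cat([d2, e1], dim=1))  # skip connection from enc1
        return torch.sigmoid(out)  # binary segmentation map in [0, 1]
```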

PatchGAN is adopted in the discriminator, which is shown in Fig. 3. A traditional GAN discriminator for image processing estimates the probability that an image is real or fake and outputs a single scalar. In contrast, patchGAN tries to classify whether each \( N \times N \) patch in an image is real or fake. We run this discriminator convolutionally across the image and average all responses to obtain the final output probability.

We create two copies of the discriminator with the same underlying variables, one for real pairs and one for fake pairs. For real pairs, we first concatenate the input and the ground truth; for fake pairs, we first concatenate the input and the generator output. Both copies then run through five encoding layers. The convolutions are \( 4 \times 4 \) spatial filters with stride 2, except in the last two layers, which use stride 1. These two layers apply zero-padding before convolution to change the size of the representation in every channel from \( 32 \times 32 \) to \( 30 \times 30 \), making the size of the receptive field, i.e. the \( N \) in patchGAN, equal to 70; this yields better image quality than a receptive field covering the full image [7]. In the final \( 30 \times 30 \) map, each pixel represents the probability that a \( 70 \times 70 \) patch of the original image is real.

A discriminator with patchGAN effectively models the image as a Markov random field, assuming independence between pixels separated by more than a patch diameter [16]. It has been demonstrated that the patch can be much smaller than the full image and still produce high-quality results [7]. Compared to traditional discriminators, a patchGAN discriminator has fewer parameters, runs faster and can be applied to arbitrarily large images.
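The discriminator described above could be sketched as follows; channel widths follow the pix2pix 70 × 70 patchGAN [7], and we assume a 256 × 256 input pair, consistent with the 32 × 32 → 30 × 30 sizes quoted above. These details are our assumptions wherever the paper does not specify them.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Five-layer 70x70 patchGAN: strides 2, 2, 2, 1, 1 with 4x4 filters."""
    def __init__(self, in_ch: int = 2):  # ICGA image + segmentation, concatenated
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 512, 4, stride=1, padding=1), nn.BatchNorm2d(512), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 1, 4, stride=1, padding=1),  # one logit per 70x70 patch
        )

    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # Real pairs concatenate (image, ground truth); fake pairs (image, G output).
        return self.net(torch.cat([image, mask], dim=1))
```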

3 Experiments and Results

The proposed network is evaluated on an ICGA data set of patients with LCs. The data set consists of 22 annotated ICGA images of size \( 768 \times 768 \). Because of the small amount of training data, we use extensive data augmentation by flipping images vertically and horizontally. During the experiments, 6 images were randomly chosen as testing images; data augmentation was applied to the remaining 16 images, yielding a training set of 64 images.
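The flip-based augmentation could be as simple as the following sketch, in which each image/mask pair yields four variants (identity, horizontal, vertical and both flips), turning the 16 training pairs into 64; the numpy representation is our assumption.

```python
import numpy as np

def flip_augment(image: np.ndarray, mask: np.ndarray):
    """Return the original pair plus horizontal, vertical and combined flips."""
    return [
        (image, mask),
        (np.fliplr(image), np.fliplr(mask)),                         # horizontal flip
        (np.flipud(image), np.flipud(mask)),                         # vertical flip
        (np.flipud(np.fliplr(image)), np.flipud(np.fliplr(mask))),   # both flips
    ]
```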

To evaluate the segmentation performance, our method is compared with U-Net, a popular network specialized for biomedical image segmentation, with DenseNet [17], a newer network that excels at feature extraction, and with the original cGAN without the Dice loss. A U-Net with 5 layers and a DenseNet with 7 dense blocks, the configurations that performed best on the ICGA data set, are used in the comparison. The segmentation results are shown in Fig. 4.

Fig. 4. Segmentation results of different networks. (a) Original ICGA image. (b) The ground truth. (c) Results of original cGAN. (d) Results of the proposed net. (e) Results of DenseNet. (f) Results of U-Net.

As shown in Fig. 4, the original cGAN, the improved cGAN and DenseNet perform better than U-Net at segmenting LCs. ICGA images with LCs do not contain many obvious features, and the intensity information of LCs is easily confused with that of vessels, the macula and the shadows at the edges of images, so it is quite difficult for U-Net, which lacks a discriminator network, to extract the key features. Both the original and the improved cGAN produce reasonable segmentations that are quite similar to the ground truth. However, the original cGAN and DenseNet segment parts of choroidal hemorrhages and retinal vessels as false positives. This effect is drastically suppressed in our proposed net due to the addition of the Dice loss.

To make the comparison more quantitative, we adopt intersection-over-union (IoU) and pixel accuracy (PA) to evaluate the segmentation results in Table 1. IoU is the standard metric for segmentation: it computes the ratio between the intersection and the union of the ground truth and the predicted segmentation, which can be reformulated as the number of true positives (TP) over the sum of true positives, false positives (FP) and false negatives (FN). Pixel accuracy, as defined in Eq. (7), computes the ratio of true positives to all pixels predicted as LC [18]:

$$ IoU = \frac{TP}{TP + FP + FN} $$
(6)
$$ PA = \frac{TP}{TP + FP} $$
(7)

Table 1. Quantitative segmentation results of different networks.
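Both metrics can be computed directly from binary masks, as in the sketch below (numpy arrays of zeros and ones are assumed); note that PA as written in Eq. (7) is the fraction of predicted LC pixels that are correct.

```python
import numpy as np

def iou_and_pa(pred: np.ndarray, truth: np.ndarray):
    """IoU (Eq. 6) and PA (Eq. 7) from binary prediction/ground-truth masks."""
    tp = np.sum((pred == 1) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    iou = tp / (tp + fp + fn)
    pa = tp / (tp + fp)
    return iou, pa
```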

As shown in Table 1, the three cGAN variants and DenseNet all achieve better segmentation than U-Net in terms of both IoU and PA. The results of DenseNet are overall better than those of the original cGAN but worse than those of the improved cGAN. A cGAN with only the Dice loss is included to reflect the importance of the Dice loss: compared with the original cGAN, it achieves higher PA but lower IoU. In theory, the Dice loss imposes a stricter constraint on the net, so the segmentation contains only the most obvious parts of the LCs and loses the parts that are less obvious to the net. The L1 loss is also important in the cGAN, since it penalizes the difference between the ground truth and the output and encourages the output to be aligned with the input. Finally, the proposed net with both the L1 loss and the Dice loss achieves better IoU and better PA than the other nets, and appears to be the most appropriate method for the LC segmentation problem.

4 Conclusion

We propose an improved conditional GAN to segment LCs in ICGA images. Compared with the original cGAN, U-Net and DenseNet, adding the Dice loss solves the data imbalance problem and improves the segmentation results. According to the experiments on our data set, the segmentation produced by the proposed network is overall better than that of the other nets.