OfGAN: Realistic Rendition of Synthetic Colonoscopy Videos

Xu, Jiabo; Anwar, Saeed; Barnes, Nick; Grimpen, Florian; Salvado, Olivier; Anderson, Stuart; Armin, Mohammad Ali

doi:10.1007/978-3-030-59716-0_70

Jiabo Xu^16,17,
Saeed Anwar^16,17,
Nick Barnes¹⁶,
Florian Grimpen¹⁸,
Olivier Salvado¹⁷,
Stuart Anderson¹⁷ &
…
Mohammad Ali Armin^16,17

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12263))

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

7865 Accesses
6 Citations

Abstract

Data-driven methods usually require a large amount of labelled data for training and generalization, especially in medical imaging. Targeting the colonoscopy field, we develop the Optical Flow Generative Adversarial Network (OfGAN) to transform simulated colonoscopy videos into realistic ones while preserving annotation. The advantages of our method are three-fold: the transformed videos are visually much more realistic; the annotation, such as optical flow of the source video is preserved in the transformed video, and it is robust to noise. The model uses a cycle-consistent structure and optical flow for both spatial and temporal consistency via adversarial training. We demonstrate that the performance of our OfGAN overwhelms the baseline method in relative tasks through both qualitative and quantitative evaluation.

Supported by ANU and CSIRO.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Implicit domain adaptation with conditional generative adversarial networks for depth prediction in endoscopy

Article Open access 15 April 2019

Endora: Video Generation Models as Endoscopy Simulators

Depth Estimation for Colonoscopy Images with Self-supervised Learning from Videos

References

Armanious, K., et al.: Medgan: medical image translation using gans. Comput. Med. Imaging Graph. 79, 101684 (2020)
Article Google Scholar
Bansal, A., Ma, S., Ramanan, D., Sheikh, Y.: Recycle-gan: unsupervised video retargeting. In: Proceedings of the European conference on computer vision (ECCV), pp. 119–135 (2018)
Google Scholar
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Article Google Scholar
Chen, Y., Pan, Y., Yao, T., Tian, X., Mei, T.: Mocycle-gan: unpaired video-to-video translation. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 647–655 (2019)
Google Scholar
Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3213–3223 (2016)
Google Scholar
De Visser, H., et al.: Developing a next generation colonoscopy simulator. Int. J. Image Graph. 10(02), 203–217 (2010)
Article MathSciNet Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. IEEE (2009)
Google Scholar
Dosovitskiy, A., et al.: Flownet: learning optical flow with convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp. 2758–2766 (2015)
Google Scholar
Engelhardt, S., De Simone, R., Full, P.M., Karck, M., Wolf, I.: Improving surgical training phantoms by hyperrealism: deep unpaired image-to-image translation from real surgeries. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11070, pp. 747–755. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00928-1_84
Chapter Google Scholar
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. Int. J. Comput. Vision 88(2), 303–338 (2010)
Article Google Scholar
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2414–2423. IEEE (2016)
Google Scholar
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in neural information processing systems, pp. 2672–2680 (2014)
Google Scholar
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp. 2961–2969. IEEE (2017)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. IEEE (2016)
Google Scholar
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1125–1134. IEEE (2017)
Google Scholar
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Chapter Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lee, K., Jung, H.: Davincigan: unpaired surgical instrument translation for data augmentation (2018)
Google Scholar
Li, C., Wand, M.: Precomputed real-time texture synthesis with markovian generative adversarial networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 702–716. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_43
Chapter Google Scholar
Mahmood, F., Chen, R., Durr, N.J.: Unsupervised reverse domain adaptation for synthetic medical images via adversarial training. IEEE Trans. Med. Imaging 37(12), 2572–2581 (2018)
Article Google Scholar
Oda, M., Tanaka, K., Takabatake, H., Mori, M., Natori, H., Mori, K.: Realistic endoscopic image generation method using virtual-to-real image-domain translation. Healthcare Technology Letters (2019)
Google Scholar
Paszke, A., et al.: Automatic differentiation in pytorch, (2017)
Google Scholar
Rau, A., et al.: Implicit domain adaptation with conditional generative adversarial networks for depth prediction in endoscopy. Int. J. Comput. Assist. Radiol. Surg. 14(7), 1167–1176 (2019). https://doi.org/10.1007/s11548-019-01962-w
Article Google Scholar
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Chapter Google Scholar
Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., Webb, R.: Learning from simulated and unsupervised images through adversarial training. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2107–2116 (2017)
Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sun, D., Yang, X., Liu, M.Y., Kautz, J.: Pwc-net: cnns for optical flow using pyramid, warping, and cost volume. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8934–8943. IEEE (2018)
Google Scholar
Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional gans. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8798–8807. IEEE (2018)
Google Scholar
Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp. 2223–2232. IEEE (2017)
Google Scholar

Download references

Author information

Authors and Affiliations

Research School of Engineering, Australian National University, Canberra, Australia
Jiabo Xu, Saeed Anwar, Nick Barnes & Mohammad Ali Armin
Data61, CSIRO, Canberra, Australia
Jiabo Xu, Saeed Anwar, Olivier Salvado, Stuart Anderson & Mohammad Ali Armin
Department of Gastroenterology and Hepatology, RBWH, Herston, Australia
Florian Grimpen

Authors

Jiabo Xu
View author publications
You can also search for this author in PubMed Google Scholar
Saeed Anwar
View author publications
You can also search for this author in PubMed Google Scholar
Nick Barnes
View author publications
You can also search for this author in PubMed Google Scholar
Florian Grimpen
View author publications
You can also search for this author in PubMed Google Scholar
Olivier Salvado
View author publications
You can also search for this author in PubMed Google Scholar
Stuart Anderson
View author publications
You can also search for this author in PubMed Google Scholar
Mohammad Ali Armin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jiabo Xu .

Editor information

Editors and Affiliations

University of Toronto, Toronto, ON, Canada
Anne L. Martel
The University of British Columbia, Vancouver, BC, Canada
Purang Abolmaesumi
University College London, London, UK
Danail Stoyanov
École Centrale de Nantes, Nantes, France
Diana Mateus
EURECOM, Biot, France
Maria A. Zuluaga
Chinese Academy of Sciences, Beijing, China
S. Kevin Zhou
Sorbonne University, Paris, France
Daniel Racoceanu
The Hebrew University of Jerusalem, Jerusalem, Israel
Leo Joskowicz

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (avi 4233 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Xu, J. et al. (2020). OfGAN: Realistic Rendition of Synthetic Colonoscopy Videos. In: Martel, A.L., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science(), vol 12263. Springer, Cham. https://doi.org/10.1007/978-3-030-59716-0_70

Download citation

DOI: https://doi.org/10.1007/978-3-030-59716-0_70
Published: 29 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59715-3
Online ISBN: 978-3-030-59716-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The Medical Image Computing and Computer Assisted Intervention Society (opens in a new tab)