Abstract
Feature detection and matching is a computer vision problem that underpins several computer-assisted techniques in endoscopy, including anatomy and lesion recognition, camera motion estimation, and 3D reconstruction. The problem is made particularly challenging by the abundance of specular reflections. Most solutions proposed in the literature filter or mask out these regions in an additional processing step; there has been little investigation into explicitly learning robustness to such artefacts with single-step, end-to-end training. In this paper, we propose an augmentation technique (CycleSTTN) that adds temporally consistent and realistic specularities to endoscopic videos. These videos can serve as ground-truth data, since the texture occluded behind the added specularities is known. We demonstrate that our image generation technique produces better results than a standard CycleGAN model. Additionally, we leverage this data augmentation to re-train a deep-learning-based feature extractor (SuperPoint) and show that its performance improves. The CycleSTTN code is made publicly available.
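The core idea of the abstract — adding specularities to frames whose underlying texture is known, so that (augmented, clean) pairs can supervise robustness — can be illustrated with a deliberately crude sketch. The function below paints a soft synthetic highlight onto a frame; it is a hypothetical, hand-crafted stand-in for CycleSTTN's learned, temporally consistent specular generation, not the paper's actual method:

```python
import numpy as np

def add_synthetic_specularity(frame, center, radius, intensity=1.0):
    """Paint a soft circular highlight onto a frame (H, W, 3) in [0, 1].

    A crude stand-in for learned specular augmentation: because the
    original pixels behind the highlight are known, the pair
    (augmented frame, clean frame) provides ground truth for training
    a feature extractor to be robust to specularities.
    """
    h, w = frame.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    dist2 = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    mask = np.exp(-dist2 / (2.0 * radius ** 2))  # soft Gaussian blob
    # Blend towards white under the mask, leaving distant pixels untouched.
    out = frame + intensity * mask[..., None] * (1.0 - frame)
    return np.clip(out, 0.0, 1.0), mask

# Usage: a flat grey frame gains a bright highlight at its centre.
frame = np.full((64, 64, 3), 0.3)
aug, mask = add_synthetic_specularity(frame, center=(32, 32), radius=6)
```

In CycleSTTN this generation step is learned from real endoscopic data and kept consistent across video frames, which a per-frame hand-crafted blob cannot do; the sketch only conveys why the occluded texture is known by construction.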
References
Asif, M., Chen, L., Song, H., Yang, J., Frangi, A.F.: An automatic framework for endoscopic image restoration and enhancement. Appl. Intell. 51(4), 1959–1971 (2021)
Azagra, P., et al.: EndoMapper dataset of complete calibrated endoscopy procedures. arXiv preprint arXiv:2204.14240 (2022)
Barbed, O.L., Chadebecq, F., Morlana, J., Montiel, J.M.M., Murillo, A.C.: SuperPoint features in endoscopy. In: Manfredi, L., et al. (eds.) ISGIE GRAIL 2022. LNCS, vol. 13754, pp. 45–55. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-21083-9_5
Borgli, H., et al.: HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci. Data 7(1), 1–14 (2020)
Chadebecq, F., Lovat, L.B., Stoyanov, D.: Artificial intelligence and automation in endoscopy and surgery. Nat. Rev. Gastroenterol. Hepatol. 20(3), 171–182 (2023)
Chang, Y.L., Liu, Z.Y., Lee, K.Y., Hsu, W.: Free-form video inpainting with 3D gated convolution and temporal PatchGAN. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9066–9075 (2019)
Daher, R., Vasconcelos, F., Stoyanov, D.: A temporal learning approach to inpainting endoscopic specularities and its effect on image correspondence. arXiv preprint arXiv:2203.17013 (2022)
DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperPoint: self-supervised interest point detection and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 224–236 (2018)
Diamantis, D.E., Gatoula, P., Iakovidis, D.K.: EndoVAE: generating endoscopic images with a variational autoencoder. In: 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP), pp. 1–5. IEEE (2022)
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
Funke, I., Bodenstedt, S., Riediger, C., Weitz, J., Speidel, S.: Generative adversarial networks for specular highlight removal in endoscopic images. In: Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, vol. 10576, pp. 8–16. SPIE (2018)
García-Vega, A., et al.: A novel hybrid endoscopic dataset for evaluating machine learning-based photometric image enhancement models. In: Pichardo Lagunas, O., Martínez-Miranda, J., Martínez Seis, B. (eds.) MICAI 2022. LNCS, vol. 13612, pp. 267–281. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19493-1_22
Hegenbart, S., Uhl, A., Vécsei, A.: Impact of endoscopic image degradations on LBP based features using one-class SVM for classification of celiac disease. In: 2011 7th International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 715–720. IEEE (2011)
Mathew, S., Nadeem, S., Kaufman, A.: CLTS-GAN: color-lighting-texture-specular reflection augmentation for colonoscopy. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) MICCAI 2022. LNCS, vol. 13437, pp. 519–529. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16449-1_49
Mathew, S., Nadeem, S., Kumari, S., Kaufman, A.: Augmenting colonoscopy using extended and directional CycleGAN for lossy image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4696–4705 (2020)
Ozyoruk, K.B., et al.: EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos. Med. Image Anal. 71, 102058 (2021)
Rivoir, D., et al.: Long-term temporally consistent unpaired video translation from simulated surgical 3D data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3343–3353 (2021)
Schonberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4104–4113 (2016)
Schönberger, J.L., Zheng, E., Frahm, J.-M., Pollefeys, M.: Pixelwise view selection for unstructured multi-view stereo. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 501–518. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_31
de Souza Jr, L.A., et al.: Assisting Barrett's esophagus identification using endoscopic data augmentation based on generative adversarial networks. Comput. Biol. Med. 126, 104029 (2020)
Xu, J., et al.: OfGAN: realistic rendition of synthetic colonoscopy videos. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12263, pp. 732–741. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59716-0_70
Yamane, H., et al.: Automatic generation of polyp image using depth map for endoscope dataset. Procedia Comput. Sci. 192, 2355–2364 (2021)
Zeng, Y., Fu, J., Chao, H.: Learning joint spatial-temporal transformations for video inpainting. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 528–543. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_31
Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV) (2017)
Acknowledgments
This research was funded, in part, by the Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) [203145/Z/16/Z]; the Engineering and Physical Sciences Research Council (EPSRC) [EP/P027938/1, EP/R004080/1, EP/P012841/1]; the Royal Academy of Engineering Chair in Emerging Technologies Scheme; H2020 FET (GA863146); and the UCL Centre for Digital Innovation through the Amazon Web Services (AWS) Doctoral Scholarship in Digital Innovation 2022/2023. For the purpose of open access, the author has applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Daher, R., Barbed, O.L., Murillo, A.C., Vasconcelos, F., Stoyanov, D. (2023). CycleSTTN: A Learning-Based Temporal Model for Specular Augmentation in Endoscopy. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14229. Springer, Cham. https://doi.org/10.1007/978-3-031-43999-5_54
DOI: https://doi.org/10.1007/978-3-031-43999-5_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43998-8
Online ISBN: 978-3-031-43999-5
eBook Packages: Computer Science (R0)