UNet-eVAE: Iterative Refinement Using VAE Embodied Learning for Endoscopic Image Segmentation

  • Conference paper
Machine Learning in Medical Imaging (MLMI 2022)

Abstract

While endoscopy is routinely used for surveillance, its high operator dependence demands robust automated image analysis methods. Automated segmentation of regions of interest (ROIs), including lesions, inflammation, and instruments, can help mitigate this operator dependence. Most supervised methods are developed by fitting models to the available ground truth mask samples only. This work proposes a joint training approach that couples a UNet with a variational auto-encoder (VAE) to improve endoscopic image segmentation by exploiting original samples, predicted masks, and ground truth masks. In the proposed UNet-eVAE, the VAE uses the masks to constrain ROI-specific feature representations, with reconstruction as an auxiliary task. The fine-grained spatial information from the VAE is fused with the UNet decoder to enrich the feature representations and improve segmentation performance. Our experimental results on both colonoscopy and ureteroscopy datasets demonstrate that the proposed architecture can learn robust representations and generalise to unseen samples while improving on the baseline.
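To make the idea of joint training with an auxiliary reconstruction task concrete, the following is a minimal sketch of one plausible combined objective: a segmentation term plus a weighted VAE term (reconstruction error and a KL penalty). The abstract does not state the loss functions, the weights, or the reconstruction target, so the soft Dice loss, the mean squared error, the names `recon_weight` and `kl_weight`, and their default values are all assumptions made for illustration, not the authors' formulation. Inputs are flat Python lists to keep the example dependency-free.

```python
import math


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between predicted probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def mse(recon, target):
    """Mean squared reconstruction error."""
    return sum((r - t) ** 2 for r, t in zip(recon, target)) / len(target)


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def joint_loss(seg_pred, gt_mask, vae_recon, recon_target, mu, logvar,
               recon_weight=1.0, kl_weight=0.1):
    """Segmentation loss plus a weighted auxiliary VAE loss.

    Assumption: the weights and the choice of recon_target are
    hypothetical; the abstract does not specify them.
    """
    seg_term = dice_loss(seg_pred, gt_mask)
    vae_term = (recon_weight * mse(vae_recon, recon_target)
                + kl_weight * kl_to_standard_normal(mu, logvar))
    return seg_term + vae_term


if __name__ == "__main__":
    mask = [1.0, 0.0, 1.0, 0.0]
    pred = [0.9, 0.2, 0.7, 0.1]
    print(joint_loss(pred, mask, pred, mask, mu=[0.1, -0.2], logvar=[0.0, 0.0]))
```

In a real training loop both terms would be backpropagated together, so the shared features are shaped by segmentation and reconstruction at once; the relative weights would need tuning per dataset.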

Acknowledgement

We would like to thank Boston Scientific for funding this project (Grant No: DFR04690). SG and BT are funded by BSC; BB is funded by EndoMapper Horizon 2020 FET (GA 863146); SA and JR were supported by the NIHR Oxford Biomedical Research Centre.

Author information

Corresponding author

Correspondence to Soumya Gupta.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 184 KB)

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Gupta, S., Ali, S., Xu, Z., Bhattarai, B., Turney, B., Rittscher, J. (2022). UNet-eVAE: Iterative Refinement Using VAE Embodied Learning for Endoscopic Image Segmentation. In: Lian, C., Cao, X., Rekik, I., Xu, X., Cui, Z. (eds) Machine Learning in Medical Imaging. MLMI 2022. Lecture Notes in Computer Science, vol 13583. Springer, Cham. https://doi.org/10.1007/978-3-031-21014-3_17

  • DOI: https://doi.org/10.1007/978-3-031-21014-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21013-6

  • Online ISBN: 978-3-031-21014-3

  • eBook Packages: Computer Science, Computer Science (R0)
