Synthesising Rare Cataract Surgery Samples with Guided Diffusion Models

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Abstract

Cataract surgery is a frequently performed procedure that demands automation and advanced assistance systems. However, gathering and annotating data for training such systems is resource intensive. The publicly available data also exhibits severe imbalances inherent to the surgical process. Motivated by this, we analyse cataract surgery video data for the worst-performing phases of a pre-trained downstream tool classifier. The analysis demonstrates that these imbalances degrade the classifier’s performance on underrepresented cases. To address this challenge, we utilise a conditional generative model based on Denoising Diffusion Implicit Models (DDIM) and Classifier-Free Guidance (CFG). Our model can synthesise diverse, high-quality examples based on complex multi-class multi-label conditions, such as surgical phases and combinations of surgical tools. We confirm that the synthesised samples display tools that the classifier recognises. These samples are hard to differentiate from real images, even for clinical experts with more than five years of experience. Further, our synthetically extended data can alleviate the data sparsity problem for the downstream task of tool classification. The evaluations demonstrate that the model can generate valuable unseen examples, allowing the tool classifier to improve by up to 10% for rare cases. Overall, our approach can facilitate the development of automated assistance systems for cataract surgery by providing a reliable source of realistic synthetic data, which we make available for everyone.
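As a rough illustration of the two techniques named in the abstract, the sketch below shows how classifier-free guidance is commonly combined with a deterministic DDIM update. It is not the authors' implementation: the function names (`multi_hot`, `cfg_noise`, `ddim_step`), the multi-hot encoding of tool labels, and the guidance-scale convention (the form where a scale of 1 reduces to the purely conditional prediction) are assumptions made for this example, and the noise predictions are stand-in arrays rather than outputs of a trained network.

```python
import numpy as np

def multi_hot(active, num_classes):
    """Hypothetical multi-label condition: 1.0 for every active class (e.g. visible tools)."""
    v = np.zeros(num_classes, dtype=np.float32)
    v[list(active)] = 1.0
    return v

def cfg_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance: move from the unconditional noise estimate
    towards the conditional one. scale=0 is unconditional, scale=1 is purely
    conditional, and scale>1 strengthens the condition."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

def ddim_step(x_t, eps, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0).
    abar_t and abar_prev are the cumulative noise-schedule products at the
    current and the next (less noisy) timestep."""
    # Estimate the clean image implied by the current noise prediction.
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps) / np.sqrt(abar_t)
    # Re-noise that estimate to the next timestep using the same prediction.
    return np.sqrt(abar_prev) * x0_pred + np.sqrt(1.0 - abar_prev) * eps

# Toy usage with random stand-ins for the network outputs.
rng = np.random.default_rng(0)
condition = multi_hot([0, 2], num_classes=4)   # e.g. tools 0 and 2 present
x_t = rng.normal(size=(8, 8))
eps_cond = rng.normal(size=(8, 8))             # would come from model(x_t, t, condition)
eps_uncond = rng.normal(size=(8, 8))           # would come from model(x_t, t, null condition)
eps = cfg_noise(eps_cond, eps_uncond, scale=3.0)
x_prev = ddim_step(x_t, eps, abar_t=0.5, abar_prev=0.7)
```

In a full sampler, the two noise estimates would come from the same network evaluated with and without the condition, and `ddim_step` would be applied repeatedly along a decreasing sequence of timesteps.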

Author information

Corresponding author

Correspondence to Yannik Frisch.

1 Electronic supplementary material

Supplementary material 1 (pdf 5354 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Frisch, Y. et al. (2023). Synthesising Rare Cataract Surgery Samples with Guided Diffusion Models. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_34

  • DOI: https://doi.org/10.1007/978-3-031-43996-4_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43995-7

  • Online ISBN: 978-3-031-43996-4

  • eBook Packages: Computer Science, Computer Science (R0)
