Clinically-relevant Summarisation of Cataract Surgery Videos Using Deep Learning

  • Conference paper
Recent Challenges in Intelligent Information and Database Systems (ACIIDS 2022)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1716))

Abstract

Cataract surgery is one of the most frequently performed medical procedures worldwide, with an estimated 20 million such surgeries occurring annually. However, because the procedure is technically challenging, training a competent cataract surgeon takes years, which limits the supply of capable surgeons. In modern cataract surgery, video recordings are routinely taken using microscope cameras, and these recordings can be used to review errors and improve technique throughout surgical training. However, reviewing raw surgery video footage is tedious and may not lead to actionable insights that improve surgeon performance. To tackle this issue, a novel artificial intelligence (AI)-based framework is proposed for extracting detailed summary statistics directly from raw surgery footage. The input to the system is a video of a cataract surgery procedure and the output is a summary report. The approach uses deep learning models (ResNet-50, ResNet-152 and InceptionV3 were tested) to identify and time surgical instrument activity. Additionally, a unique dataset of 57,422 hand-labelled frames was created, extracted from a new locally-sourced video dataset of 29 retrospective cataract surgery recordings. Testing these predictive models with 4-fold cross-validation across ten different surgical instruments resulted in a best mean testing area under the ROC curve of 97.6% and a mean testing sensitivity of 96.6%. Given these high levels of accuracy, the reports generated by our system are of high quality and could be used to provide actionable insight into surgical technique during surgical training.
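The abstract describes identifying and timing surgical instrument activity from per-frame model predictions. The following is only a minimal sketch of how such predictions might be turned into timing statistics, not the authors' implementation. The function name `summarise_instrument_usage`, the 0.5 threshold, the frame rate, and the instrument names `forceps` and `phaco` are all hypothetical and chosen purely for illustration.

```python
from itertools import groupby


def summarise_instrument_usage(frame_probs, fps, threshold=0.5):
    """Turn per-frame instrument probabilities into timing statistics.

    frame_probs: list with one dict per video frame, mapping an
                 instrument name to its predicted probability.
    fps:         frames per second of the analysed video (assumed known).
    threshold:   probability at or above which an instrument counts as
                 visible in a frame (0.5 is an assumption).
    """
    instruments = sorted({name for frame in frame_probs for name in frame})
    report = {}
    for name in instruments:
        # Per-frame on/off decision for this instrument.
        active = [frame.get(name, 0.0) >= threshold for frame in frame_probs]
        # Each contiguous run of active frames is one usage interval.
        runs = [sum(1 for _ in group)
                for is_on, group in groupby(active) if is_on]
        report[name] = {
            "total_seconds": sum(runs) / fps,
            "intervals": len(runs),
            "longest_seconds": max(runs, default=0) / fps,
        }
    return report


# Hypothetical predictions for four frames sampled at 2 frames per second.
frames = [
    {"forceps": 0.9, "phaco": 0.1},
    {"forceps": 0.8, "phaco": 0.2},
    {"forceps": 0.1, "phaco": 0.7},
    {"forceps": 0.6, "phaco": 0.9},
]

for instrument, stats in summarise_instrument_usage(frames, fps=2).items():
    print(instrument, stats)
```

In this sketch a single dropped frame splits one usage interval into two, so a practical system would probably smooth the per-frame decisions before counting intervals.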

Author information

Corresponding author

Correspondence to Michael Mayo.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Whitten, J., McKelvie, J., Mayo, M. (2022). Clinically-relevant Summarisation of Cataract Surgery Videos Using Deep Learning. In: Szczerbicki, E., Wojtkiewicz, K., Nguyen, S.V., Pietranik, M., Krótkiewicz, M. (eds) Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2022. Communications in Computer and Information Science, vol 1716. Springer, Singapore. https://doi.org/10.1007/978-981-19-8234-7_55

  • DOI: https://doi.org/10.1007/978-981-19-8234-7_55

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8233-0

  • Online ISBN: 978-981-19-8234-7

  • eBook Packages: Computer Science (R0)
