Clinically-relevant Summarisation of Cataract Surgery Videos Using Deep Learning

Whitten, Jesse; McKelvie, James; Mayo, Michael

doi:10.1007/978-981-19-8234-7_55

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1716)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Cataract surgery is one of the most frequently performed medical procedures worldwide, with an estimated 20 million such surgeries occurring annually. However, the procedure is technically challenging, so training a competent cataract surgeon takes years, which limits the supply of capable surgeons. In modern cataract surgery, video recordings are routinely taken using microscope cameras, and these recordings can be used to review errors and improve technique throughout surgical training. However, reviewing raw surgery video footage is tedious and may not lead to actionable insights that improve surgeon performance. To tackle this issue, a novel artificial intelligence (AI)-based framework is proposed for extracting detailed summary statistics directly from raw surgery footage. The input to the system is a video of a cataract surgery procedure and the output is a summary report. The approach uses deep learning models (ResNet-50, ResNet-152 and InceptionV3 were tested) to identify and time surgical instrument activity. Additionally, a unique dataset was created, consisting of 57,422 hand-labelled frames extracted from a new locally-sourced video dataset of 29 retrospective cataract surgery recordings. Testing these predictive models with 4-fold cross validation across ten different surgical instruments resulted in a best mean testing area under the ROC curve of 97.6% and a mean testing sensitivity of 96.6%. Given these high levels of accuracy, the reports generated by our system are of high quality and could be used to provide actionable insight into surgical technique during surgical training.
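The abstract does not specify how per-frame predictions are turned into the summary report, so the following is only a minimal sketch of one plausible way to "identify and time" instrument activity. It assumes that a classifier has already produced, for each instrument, one presence probability per video frame. The function names, the 0.5 threshold, the frame rate, the instrument names and the report fields are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def active_segments(
    probs: List[float], fps: float, threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) intervals where the per-frame probability
    is at or above the threshold. The threshold value is an assumption."""
    segments = []
    start = None  # index of the first frame of the current active run
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        # The instrument was still visible when the video ended.
        segments.append((start / fps, len(probs) / fps))
    return segments


def summarise(
    frame_probs: Dict[str, List[float]], fps: float, threshold: float = 0.5
) -> Dict[str, dict]:
    """Build a hypothetical per-instrument report: active intervals,
    total active time in seconds, and number of separate uses."""
    report = {}
    for name, probs in frame_probs.items():
        segs = active_segments(probs, fps, threshold)
        report[name] = {
            "segments": segs,
            "total_seconds": sum(end - start for start, end in segs),
            "uses": len(segs),
        }
    return report


if __name__ == "__main__":
    # Made-up probabilities for two illustrative instrument names.
    demo = {
        "forceps": [0.1, 0.9, 0.8, 0.2, 0.7, 0.6],
        "cannula": [0.0, 0.1, 0.2, 0.1, 0.0, 0.3],
    }
    for instrument, stats in summarise(demo, fps=2.0).items():
        print(instrument, stats)
```

In practice a report of this kind would probably also smooth the per-frame predictions before thresholding, since isolated misclassified frames would otherwise split one use of an instrument into several short segments.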

Author information

Authors and Affiliations

Department of Computer Science, University of Waikato, Hamilton, New Zealand
Jesse Whitten & Michael Mayo
Department of Ophthalmology, University of Auckland, Auckland, New Zealand
James McKelvie

Corresponding author

Correspondence to Michael Mayo.

Editor information

Editors and Affiliations

University of Newcastle Australia, Newcastle, NSW, Australia
Edward Szczerbicki
Wrocław University of Science and Technology, Wrocław, Poland
Krystian Wojtkiewicz
International University - VNU-HCM, Ho Chi Minh City, Vietnam
Sinh Van Nguyen
Wrocław University of Science and Technology, Wrocław, Poland
Marcin Pietranik
Wrocław University of Science and Technology, Wrocław, Poland
Marek Krótkiewicz

About this paper

Cite this paper

Whitten, J., McKelvie, J., Mayo, M. (2022). Clinically-relevant Summarisation of Cataract Surgery Videos Using Deep Learning. In: Szczerbicki, E., Wojtkiewicz, K., Nguyen, S.V., Pietranik, M., Krótkiewicz, M. (eds) Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2022. Communications in Computer and Information Science, vol 1716. Springer, Singapore. https://doi.org/10.1007/978-981-19-8234-7_55

DOI: https://doi.org/10.1007/978-981-19-8234-7_55
Published: 24 November 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8233-0
Online ISBN: 978-981-19-8234-7
eBook Packages: Computer Science, Computer Science (R0)
