DOI: 10.1145/3372278.3391937
research-article

Enabling Relevance-Based Exploration of Cataract Videos

Published: 08 June 2020

Abstract

Training new surgeons is one of the major duties of experienced surgeons and demands a considerable supervisory investment from them. To expedite the training process and reduce the extra workload on their tight schedules, surgeons are seeking a surgical video retrieval system. Automatic workflow analysis approaches can optimize the training procedure by indexing surgical video segments for use in online video exploration. The aim of the doctoral project described in this paper is to provide the basis for a cataract video exploration system that is able to (i) automatically analyze and extract the relevant segments of videos from cataract surgery, and (ii) provide interactive exploration means for browsing archives of cataract surgery videos. In particular, we apply deep-learning-based classification and segmentation approaches to cataract surgery videos to enable automatic phase and action recognition and similarity detection.
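To make the idea of indexing video segments for exploration more concrete, the following is a minimal sketch and not the system described in the paper. It assumes that a phase classifier has already produced one label per frame. The function names, the frame rate of 25 fps, the video identifiers, and the phase names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import groupby


def frames_to_segments(frame_labels, fps=25.0):
    """Collapse per-frame phase labels into (phase, start_s, end_s) segments."""
    segments = []
    position = 0  # index of the first frame of the current run
    for phase, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((phase, position / fps, (position + length) / fps))
        position += length
    return segments


def build_index(videos, fps=25.0):
    """Map each phase to the (video_id, start_s, end_s) segments that show it."""
    index = defaultdict(list)
    for video_id, frame_labels in videos.items():
        for phase, start, end in frames_to_segments(frame_labels, fps):
            index[phase].append((video_id, start, end))
    return dict(index)


# Hypothetical per-frame predictions; labels and lengths are made up.
videos = {
    "case_001": ["incision"] * 50 + ["phacoemulsification"] * 100 + ["idle"] * 25,
    "case_002": ["incision"] * 25 + ["phacoemulsification"] * 75,
}
index = build_index(videos)
```

With such an index, a lookup like `index["phacoemulsification"]` returns the time ranges a trainee could jump to directly in each video. A real system would additionally need to smooth noisy per-frame predictions before segmenting.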

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN: 9781450370875
DOI: 10.1145/3372278

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. action recognition
  2. cataract surgery
  3. deep learning
  4. phase recognition

Qualifiers

  • Research-article

Conference

ICMR '20
Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Cited By

  • (2024) Semantic-Preserving Surgical Video Retrieval With Phase and Behavior Coordinated Hashing. IEEE Transactions on Medical Imaging 43:2 (807-819). DOI: 10.1109/TMI.2023.3321382. Online publication date: Feb-2024
  • (2024) Predicting Postoperative Intraocular Lens Dislocation in Cataract Surgery via Deep Learning. IEEE Access 12 (21012-21025). DOI: 10.1109/ACCESS.2024.3361042. Online publication date: 2024
  • (2024) Cataract-1K Dataset for Deep-Learning-Assisted Analysis of Cataract Surgery Videos. Scientific Data 11:1. DOI: 10.1038/s41597-024-03193-4. Online publication date: 12-Apr-2024
  • (2024) DeepPyramid+: medical image segmentation using Pyramid View Fusion and Deformable Pyramid Reception. International Journal of Computer Assisted Radiology and Surgery 19:5 (851-859). DOI: 10.1007/s11548-023-03046-2. Online publication date: 8-Jan-2024
  • (2024) Event Recognition in Laparoscopic Gynecology Videos with Hybrid Transformers. MultiMedia Modeling (82-95). DOI: 10.1007/978-3-031-56435-2_7. Online publication date: 20-Mar-2024
  • (2023) Domain Adaptation for Medical Image Segmentation Using Transformation-Invariant Self-training. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (331-341). DOI: 10.1007/978-3-031-43907-0_32. Online publication date: 8-Oct-2023
  • (2022) DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos. Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (276-286). DOI: 10.1007/978-3-031-16443-9_27. Online publication date: 18-Sep-2022
  • (2021) Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization. 2020 25th International Conference on Pattern Recognition (ICPR) (10720-10727). DOI: 10.1109/ICPR48806.2021.9412525. Online publication date: 10-Jan-2021
  • (2021) ReCal-Net: Joint Region-Channel-Wise Calibrated Network for Semantic Segmentation in Cataract Surgery Videos. Neural Information Processing (391-402). DOI: 10.1007/978-3-030-92238-2_33. Online publication date: 8-Dec-2021
  • (2020) Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. Proceedings of the 28th ACM International Conference on Multimedia (3577-3585). DOI: 10.1145/3394171.3413658. Online publication date: 12-Oct-2020
