A Polynomial Model of Surgical Gestures for Real-Time Retrieval of Surgery Videos

Quellec, Gwénolé; Lamard, Mathieu; Droueche, Zakarya; Cochener, Béatrice; Roux, Christian; Cazuguel, Guy

doi:10.1007/978-3-642-36678-9_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7723)

Included in the following conference series:

MICCAI International Workshop on Medical Content-Based Retrieval for Clinical Decision Support

Abstract

This paper introduces a novel retrieval framework for surgery videos. Given a query video, the goal is to retrieve videos in which similar surgical gestures appear. In this framework, the motion content of short video subsequences is modeled, in real time, using spatiotemporal polynomials. The retrieval engine needs to be trained: key spatiotemporal polynomials, characterizing semantically relevant surgical gestures, are identified through multiple-instance learning. Videos are then compared in a high-level space spanned by these key spatiotemporal polynomials. The framework was applied to a dataset of 900 manually delimited clips from 100 cataract surgery videos. High classification performance (A_z = 0.816 ± 0.118) and retrieval performance (MAP = 0.358) were observed.
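
As an illustration of what modeling motion with a spatiotemporal polynomial can look like, the sketch below fits the displacement field of a short subsequence by least squares. It is not the authors' implementation: the abstract does not specify the polynomial basis, its order, or how motion is measured, so the monomial basis, the default order of 2, and every name used here (`monomial_exponents`, `design_matrix`, `fit_spatiotemporal_polynomial`) are assumptions made for this example, and the motion samples are synthetic.

```python
import numpy as np


def monomial_exponents(order):
    """Exponents (i, j, k) of all monomials x**i * y**j * t**k with i + j + k <= order."""
    return [(i, j, k)
            for i in range(order + 1)
            for j in range(order + 1 - i)
            for k in range(order + 1 - i - j)]


def design_matrix(x, y, t, order):
    """One row per motion sample, one column per monomial (assumed basis)."""
    return np.stack([x**i * y**j * t**k
                     for i, j, k in monomial_exponents(order)], axis=1)


def fit_spatiotemporal_polynomial(x, y, t, u, v, order=2):
    """Least-squares fit of displacements (u, v) as polynomials in (x, y, t).

    Returns an array of shape (n_terms, 2): column 0 holds the coefficients
    for u, column 1 those for v.
    """
    A = design_matrix(x, y, t, order)
    targets = np.stack([u, v], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coeffs


# Synthetic samples: positions and times normalized to [-1, 1].
rng = np.random.default_rng(0)
x, y, t = rng.uniform(-1, 1, size=(3, 500))
u = 0.5 + 0.2 * x - 0.1 * t   # hypothetical horizontal displacement
v = 0.3 * y * t               # hypothetical vertical displacement

coeffs = fit_spatiotemporal_polynomial(x, y, t, u, v, order=2)
residual = design_matrix(x, y, t, 2) @ coeffs - np.stack([u, v], axis=1)
print(coeffs.shape, float(np.abs(residual).max()))
```

In a sketch like this, the flattened coefficient array would serve as a fixed-length descriptor of the subsequence. Descriptors of this kind are the sort of object that a multiple-instance learner could select key examples from, although the abstract does not say how the authors represent or compare them.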

Author information

Authors and Affiliations

Inserm, UMR 1101, Brest, F-29200, France
Gwénolé Quellec, Mathieu Lamard, Zakarya Droueche, Béatrice Cochener, Christian Roux & Guy Cazuguel
Univ Bretagne Occidentale, Brest, F-29200, France
Mathieu Lamard & Béatrice Cochener
Dpt. ITI, INSTITUT. TELECOM, TELECOM Bretagne, UEB, Brest, F-29200, France
Zakarya Droueche, Christian Roux & Guy Cazuguel
CHU Brest, Service d’Ophtalmologie, Brest, F-29200, France
Béatrice Cochener

Editor information

Editors and Affiliations

The Iby and Aladar Fleischmann Faculty of Engineering, Tel Aviv University, Ramat Aviv, Israel
Hayit Greenspan
Business Information Systems, University of Applied Sciences Western Switzerland (HES-SO), TechnoArk 3, 3960, Sierre, Switzerland
Henning Müller
Multi-modal Mining for Healthcare, IBM Almaden Research Center, 650 Harry Road, 95120, San Jose, CA, USA
Tanveer Syeda-Mahmood

About this paper

Cite this paper

Quellec, G., Lamard, M., Droueche, Z., Cochener, B., Roux, C., Cazuguel, G. (2013). A Polynomial Model of Surgical Gestures for Real-Time Retrieval of Surgery Videos. In: Greenspan, H., Müller, H., Syeda-Mahmood, T. (eds) Medical Content-Based Retrieval for Clinical Decision Support. MCBR-CDS 2012. Lecture Notes in Computer Science, vol 7723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36678-9_2

DOI: https://doi.org/10.1007/978-3-642-36678-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36677-2
Online ISBN: 978-3-642-36678-9
eBook Packages: Computer Science, Computer Science (R0)
