Abstract
This paper introduces a novel retrieval framework for surgery videos. Given a query video, the goal is to retrieve videos in which similar surgical gestures appear. In this framework, the motion content of short video subsequences is modeled, in real time, using spatiotemporal polynomials. The retrieval engine needs to be trained: key spatiotemporal polynomials, characterizing semantically relevant surgical gestures, are identified through multiple-instance learning. Videos are then compared in a high-level space spanned by these key spatiotemporal polynomials. The framework was applied to a dataset of 900 manually delimited clips from 100 cataract surgery videos. High classification performance (A_z = 0.816 ± 0.118) and retrieval performance (MAP = 0.358) were observed.
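To make the idea of a spatiotemporal polynomial motion model concrete, the following minimal sketch fits a low-order polynomial in (x, y, t) to a set of motion vectors by ordinary least squares. It is an illustration only, not the authors' implementation: the abstract does not specify the polynomial order, the basis, or the fitting procedure, so the function names (`monomial_exponents`, `design_matrix`, `fit_spatiotemporal_polynomial`), the monomial basis, and the default order of 2 are all assumptions.

```python
from itertools import product

import numpy as np


def monomial_exponents(order):
    """Exponent triples (i, j, k) with i + j + k <= order, for x^i * y^j * t^k."""
    return [e for e in product(range(order + 1), repeat=3) if sum(e) <= order]


def design_matrix(x, y, t, order):
    """One row per sample, one column per monomial in (x, y, t)."""
    return np.stack(
        [x**i * y**j * t**k for i, j, k in monomial_exponents(order)], axis=1
    )


def fit_spatiotemporal_polynomial(x, y, t, u, v, order=2):
    """Least-squares fit of motion vectors (u, v) sampled at points (x, y, t).

    Hypothetical helper: returns an array of shape (n_monomials, 2) holding
    the coefficients for the horizontal and vertical motion components.
    """
    A = design_matrix(x, y, t, order)
    targets = np.stack([u, v], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coeffs


# Usage on synthetic data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
x, y, t = rng.uniform(-1.0, 1.0, size=(3, 200))
u = 0.5 + 2.0 * x - 1.0 * t * t
v = -0.3 * y + 0.7 * x * t
coeffs = fit_spatiotemporal_polynomial(x, y, t, u, v, order=2)
```

In a sketch like this, the coefficient array would serve as the compact descriptor of a short video subsequence. How the paper builds and compares such descriptors is described in the full text, not in the abstract.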
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Quellec, G., Lamard, M., Droueche, Z., Cochener, B., Roux, C., Cazuguel, G. (2013). A Polynomial Model of Surgical Gestures for Real-Time Retrieval of Surgery Videos. In: Greenspan, H., Müller, H., Syeda-Mahmood, T. (eds) Medical Content-Based Retrieval for Clinical Decision Support. MCBR-CDS 2012. Lecture Notes in Computer Science, vol 7723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36678-9_2
DOI: https://doi.org/10.1007/978-3-642-36678-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36677-2
Online ISBN: 978-3-642-36678-9
eBook Packages: Computer Science, Computer Science (R0)