A Polynomial Model of Surgical Gestures for Real-Time Retrieval of Surgery Videos

  • Conference paper
Medical Content-Based Retrieval for Clinical Decision Support (MCBR-CDS 2012)

Abstract

This paper introduces a novel retrieval framework for surgery videos. Given a query video, the goal is to retrieve videos in which similar surgical gestures appear. In this framework, the motion content of short video subsequences is modeled, in real time, using spatiotemporal polynomials. The retrieval engine must first be trained: key spatiotemporal polynomials, which characterize semantically relevant surgical gestures, are identified through multiple-instance learning. Videos are then compared in a high-level space spanned by these key spatiotemporal polynomials. The framework was applied to a dataset of 900 manually delimited clips from 100 cataract surgery videos. High classification performance (A_z = 0.816 ± 0.118) and retrieval performance (MAP = 0.358) were observed.
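To illustrate what modeling motion content with a spatiotemporal polynomial could look like, the following is a minimal sketch and not the authors' implementation. It assumes that a motion field (u, v) has already been estimated at sample points (x, y, t) of a short subsequence, and it fits each motion component with a polynomial in x, y and t by ordinary least squares. The polynomial order, the monomial basis, the function names and the synthetic data are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np
from itertools import product


def monomial_exponents(order):
    """All (i, j, k) with i + j + k <= order, one per term x**i * y**j * t**k."""
    return [e for e in product(range(order + 1), repeat=3) if sum(e) <= order]


def design_matrix(x, y, t, order):
    """One column per monomial, one row per (x, y, t) sample."""
    return np.stack(
        [x**i * y**j * t**k for i, j, k in monomial_exponents(order)], axis=1
    )


def fit_spatiotemporal_polynomial(x, y, t, u, v, order=2):
    """Least-squares fit of u(x, y, t) and v(x, y, t).

    Returns an array of shape (n_terms, 2): column 0 holds the
    coefficients for u, column 1 those for v.
    """
    A = design_matrix(x, y, t, order)
    targets = np.stack([u, v], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coeffs


# Synthetic motion field standing in for measured flow (illustrative only).
rng = np.random.default_rng(0)
x, y, t = rng.uniform(-1.0, 1.0, size=(3, 500))
u = 0.5 + 1.0 * x - 2.0 * t * y
v = -0.3 * y + 0.8 * t**2

coeffs = fit_spatiotemporal_polynomial(x, y, t, u, v, order=2)

# Flattened coefficients give a fixed-length descriptor of the subsequence.
signature = coeffs.ravel()
print(signature.shape)
```

Under these assumptions, every subsequence is summarized by a vector of the same length regardless of how many motion samples it contains, which is one way such descriptors could be compared against learned key polynomials.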

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Quellec, G., Lamard, M., Droueche, Z., Cochener, B., Roux, C., Cazuguel, G. (2013). A Polynomial Model of Surgical Gestures for Real-Time Retrieval of Surgery Videos. In: Greenspan, H., Müller, H., Syeda-Mahmood, T. (eds) Medical Content-Based Retrieval for Clinical Decision Support. MCBR-CDS 2012. Lecture Notes in Computer Science, vol 7723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36678-9_2

  • DOI: https://doi.org/10.1007/978-3-642-36678-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36677-2

  • Online ISBN: 978-3-642-36678-9

  • eBook Packages: Computer Science, Computer Science (R0)
