Abstract
In this work, we investigate laparoscopic camera motion automation through imitation learning from retrospective videos of laparoscopic interventions. A novel method is introduced that learns to augment a surgeon’s behavior in image space through object motion invariant image registration via homographies. Contrary to existing approaches, no geometric assumptions are made and no depth information is necessary, enabling immediate translation to a robotic setup. Deviating from the dominant approach in the literature, which consists of following a surgical tool, we do not handcraft the objective and impose no priors on the surgical scene, allowing the method to discover unbiased policies. In this new research field, significant improvements are demonstrated over two baselines on the Cholec80 and HeiChole datasets, including an improvement of \(47\%\) over camera motion continuation. The method is further shown to predict camera motion correctly on the public motion classification labels of the AutoLaparo dataset. All code is made accessible on GitHub (https://github.com/RViMLab/homography_imitation_learning).
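To illustrate what describing camera motion as a homography in image space can look like, the sketch below uses a four-point parameterization: the displacement of the four image corners determines a 3x3 homography, which is recovered here with a direct linear transform. This is a minimal sketch under stated assumptions and is not taken from the released code. The function names `homography_from_four_points` and `apply_homography`, and the choice of image coordinates normalized to [-1, 1], are hypothetical and chosen for illustration only.

```python
import numpy as np


def homography_from_four_points(src, dst):
    """Recover the 3x3 homography H that maps src[i] to dst[i].

    Illustrative direct linear transform (DLT); assumes four point
    pairs with no three points collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    # The null vector of A is the last right singular vector.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (and sign) so that H[2, 2] == 1.
    return H / H[2, 2]


def apply_homography(H, points):
    """Map 2D points through H using homogeneous coordinates."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Image corners in normalized coordinates (an assumption of this sketch).
corners = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
# Hypothetical corner displacement: a small shift of the whole view.
displacement = np.array([0.05, -0.03])
H = homography_from_four_points(corners, corners + displacement)
center_after_motion = apply_homography(H, [[0.0, 0.0]])
```

In this example all four corners move by the same offset, so the recovered homography is a pure translation and the image center moves by that same offset. Displacing the corners by different amounts yields zoom, rotation, or perspective changes with the same code.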
C. Bergeles and T. Vercauteren contributed equally to this work.
Acknowledgements
This work was supported by core and project funding from the Wellcome/EPSRC [WT203148/Z/16/Z; NS/A000049/1; WT101957; NS/A000027/1]. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101016985 (FAROS project). TV is supported by a Medtronic/RAEng Research Chair [RCSRF1819\7\34]. SO and TV are co-founders and shareholders of Hypervision Surgical. TV holds shares in Mauna Kea Technologies.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Huber, M., Ourselin, S., Bergeles, C., Vercauteren, T. (2023). Deep Homography Prediction for Endoscopic Camera Motion Imitation Learning. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_21
DOI: https://doi.org/10.1007/978-3-031-43996-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4
eBook Packages: Computer Science, Computer Science (R0)