Imitation Learning from Expert Video Data for Dissection Trajectory Prediction in Endoscopic Surgical Procedure

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Abstract

High-level cognitive assistance, such as predicting dissection trajectories in Endoscopic Submucosal Dissection (ESD), can potentially support and facilitate surgical skills training, yet it has rarely been explored in existing studies. Imitation learning has proven effective for learning skills from expert demonstrations, but it faces challenges in predicting uncertain future movements and in generalizing to varied surgical scenes. In this paper, we formulate the task of suggesting dissection trajectories from expert video demonstrations and introduce imitation learning to it. We propose a novel method, implicit diffusion policy imitation learning (iDiff-IL), to address this problem. Specifically, our approach models expert behaviors implicitly through a joint state-action distribution. This captures the inherent stochasticity of future dissection trajectories and therefore allows robust visual representations across various endoscopic views. By leveraging the diffusion model in policy learning, our implicit policy can be trained and sampled efficiently, giving accurate predictions and good generalizability. To achieve conditional sampling from the implicit policy, we devise a forward-process guided action inference strategy that corrects the state mismatch. We collected a private ESD video dataset of 1032 short clips to validate our method. Experimental results demonstrate that our solution outperforms state-of-the-art imitation learning methods on the formulated task. To the best of our knowledge, this is the first work to apply imitation learning to surgical skill learning with respect to dissection trajectory prediction.
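To make the idea of conditional sampling from a joint state-action diffusion model more concrete, the following is a minimal sketch and not the implementation from the paper. It assumes a standard DDPM forward process with a linear noise schedule, and it assumes that "forward-process guidance" can be approximated by overwriting the state part of the joint vector with a forward-noised copy of the observed state at every reverse step. All names (`make_alpha_bars`, `forward_noise`, `guided_action_sample`, `toy_denoise_step`) are hypothetical, and `toy_denoise_step` is a stand-in for a trained denoising network.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def forward_noise(x0, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I)."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def guided_action_sample(state, action_dim, denoise_step, alpha_bars, rng):
    """Reverse diffusion over the joint [state, action] vector.

    At each step the state part is replaced by a forward-noised copy of the
    observed state, so the denoiser always sees a state at the matching noise
    level. This replacement scheme is an assumption made for illustration.
    """
    action = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    for t in reversed(range(len(alpha_bars))):
        noisy_state = forward_noise(state, alpha_bars[t], rng)
        joint = denoise_step(noisy_state + action, t)
        action = joint[len(state):]  # keep only the action part
    return action


def toy_denoise_step(joint, t):
    """Placeholder for a learned reverse step: simply shrinks the vector."""
    return [0.5 * v for v in joint]


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=10)
observed_state = [0.3, -0.1]  # hypothetical visual feature vector
action = guided_action_sample(observed_state, 4, toy_denoise_step, alpha_bars, rng)
print(len(action))
```

In a real system the placeholder would be replaced by a network trained on expert state-action pairs, and the action vector would encode the predicted dissection trajectory.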

Acknowledgement

This work was supported in part by the Shenzhen Portion of the Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone under HZQB-KCZYB-20200089, in part by Hong Kong Innovation and Technology Commission Project No. ITS/237/21FP, in part by Hong Kong Research Grants Council Project No. T45-401/22-N, and in part by the Science, Technology and Innovation Commission of Shenzhen Municipality Project No. SGDX20220530111201008.

Author information

Corresponding author

Correspondence to Qi Dou.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, J. et al. (2023). Imitation Learning from Expert Video Data for Dissection Trajectory Prediction in Endoscopic Surgical Procedure. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_47

  • DOI: https://doi.org/10.1007/978-3-031-43996-4_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43995-7

  • Online ISBN: 978-3-031-43996-4

  • eBook Packages: Computer Science, Computer Science (R0)
