Abstract
High-level cognitive assistance, such as predicting dissection trajectories in Endoscopic Submucosal Dissection (ESD), can support and facilitate surgical skills training, yet it has rarely been explored in existing studies. Imitation learning has proven effective at learning skills from expert demonstrations, but it faces challenges in predicting uncertain future movements and in generalizing to varied surgical scenes. In this paper, we formulate the task of suggesting dissection trajectories from expert video demonstrations and apply imitation learning to it. We propose a novel method, implicit diffusion policy imitation learning (iDiff-IL), to address this problem. Specifically, our approach models expert behavior implicitly through a joint state-action distribution. This captures the inherent stochasticity of future dissection trajectories and therefore enables robust visual representations across varied endoscopic views. By leveraging the diffusion model in policy learning, our implicit policy can be trained and sampled efficiently, yielding accurate predictions and good generalizability. To achieve conditional sampling from the implicit policy, we devise a forward-process guided action inference strategy that corrects the state mismatch. We collected a private ESD video dataset with 1032 short clips to validate our method. Experimental results demonstrate that our solution outperforms state-of-the-art imitation learning methods on our formulated task. To the best of our knowledge, this is the first work applying imitation learning to surgical skill learning with respect to dissection trajectory prediction.
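To make the idea of conditional sampling from a joint state-action diffusion model more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard DDPM linear noise schedule and an inpainting-style guidance rule: at every reverse step the state component is replaced by a forward-noised copy of the observed state, and only the action component is denoised. The scalar state and action, the function names (`make_schedule`, `forward_noise`, `toy_eps_model`, `guided_action_sample`), and the closed-form denoiser for a toy distribution in which the expert action equals the state are all hypothetical stand-ins for the visual encoder and trained network described above.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (assumed); returns betas and cumulative alpha-bars."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def forward_noise(x0, alpha_bar, rng):
    """Forward process q(x_t | x_0) for a scalar: scale x_0 and add Gaussian noise."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)


def toy_eps_model(state_t, action_t, alpha_bar):
    """Hypothetical stand-in for a trained joint noise predictor.

    Closed-form E[eps | x_t] for toy data where state = action = z, z ~ N(0, 1).
    A real model would be a neural network over image features and trajectories.
    """
    scale = math.sqrt(1.0 - alpha_bar) / (1.0 - alpha_bar ** 2)
    eps_state = scale * (state_t - alpha_bar * action_t)
    eps_action = scale * (action_t - alpha_bar * state_t)
    return eps_state, eps_action


def guided_action_sample(observed_state, betas, alpha_bars, rng, eps_model=toy_eps_model):
    """Sample an action given a state by guiding the reverse process with forward-noised states."""
    action = rng.gauss(0.0, 1.0)  # start the action from pure noise
    for t in reversed(range(len(betas))):
        # Guidance (assumed rule): noise the observed state to the current level
        # so the joint input matches what the denoiser expects at step t.
        state_t = forward_noise(observed_state, alpha_bars[t], rng)
        _, eps_action = eps_model(state_t, action, alpha_bars[t])
        # Standard DDPM posterior mean, applied to the action component only.
        mean = (action - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps_action)
        mean /= math.sqrt(1.0 - betas[t])
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0  # no noise on the last step
        action = mean + math.sqrt(betas[t]) * noise
    return action


if __name__ == "__main__":
    betas, alpha_bars = make_schedule(1000)
    rng = random.Random(0)
    print(guided_action_sample(0.8, betas, alpha_bars, rng))
```

In this toy setting the sampled action should land close to the observed state, because the assumed data distribution ties the two together. With a learned denoiser, the same loop would draw different plausible trajectories on each call, which is how a diffusion policy can represent uncertain future movements.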
Acknowledgement
This work was supported in part by the Shenzhen Portion of the Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone under HZQB-KCZYB-20200089, in part by Hong Kong Innovation and Technology Commission Project No. ITS/237/21FP, in part by Hong Kong Research Grants Council Project No. T45-401/22-N, and in part by Science, Technology and Innovation Commission of Shenzhen Municipality Project No. SGDX20220530111201008.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J. et al. (2023). Imitation Learning from Expert Video Data for Dissection Trajectory Prediction in Endoscopic Surgical Procedure. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_47
DOI: https://doi.org/10.1007/978-3-031-43996-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4
eBook Packages: Computer Science (R0)