SEDSkill: Surgical Events Driven Method for Skill Assessment from Thoracoscopic Surgical Videos

Ding, Xinpeng; Xu, Xiaowei; Li, Xiaomeng

doi:10.1007/978-3-031-43996-4_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14228)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Thoracoscopy-assisted mitral valve replacement (MVR) is a crucial treatment for patients with mitral regurgitation and demands exceptional surgical skills to prevent complications and enhance patient outcomes. Consequently, surgical skill assessment (SKA) for MVR is essential for certifying novice surgeons and for training purposes. However, current automatic SKA approaches have inherent limitations: the absence of public thoracoscopy-assisted surgery datasets, the exclusion of inter-video relationships, and a restriction to assessing a single short surgical action. This paper introduces a novel clinical dataset for MVR, which is, to the best of our knowledge, the first thoracoscopy-assisted long-form surgery dataset. Unlike existing short video clips that contain a single surgical action, our dataset includes videos of the whole MVR procedure that capture multiple complex skill-related surgical events. To tackle the challenges posed by MVR, we propose a novel method called Surgical Events Driven Skill assessment (SEDSkill). Our key idea is to develop a long-form surgical events-driven method for skill assessment, based on the insight that the skill level of a surgeon is closely tied to the occurrence of inappropriate operations such as excessively long suture repairing times. SEDSkill incorporates an event-aware module that automatically localizes skill-related events, thus extracting local semantics from long-form videos. Additionally, we introduce a difference regression block to learn imperceptible discrepancies between videos, which enables precise surgical skill assessment. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art approaches. Our code is available at https://github.com/xmed-lab/SEDSkill.
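
The two ideas in the abstract, event-driven local semantics and difference regression, can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the linked repository). The `Event` type, the event labels, the helper names, and the squared-error form of the loss are all hypothetical, and the sketch assumes that "difference regression" means regressing the score gap between a pair of videos.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A localized skill-related event (hypothetical structure)."""
    label: str    # e.g. "suture_repair"; label names are assumptions
    start: float  # start time in seconds
    end: float    # end time in seconds


def event_durations(events):
    """Sum the time spent in each event type across one long video."""
    totals = {}
    for e in events:
        totals[e.label] = totals.get(e.label, 0.0) + (e.end - e.start)
    return totals


def pairwise_difference_loss(pred_a, pred_b, score_a, score_b):
    """Squared error between predicted and annotated score gaps.

    Assumed form: the model is penalized on the difference between two
    videos' scores rather than on each absolute score.
    """
    predicted_gap = pred_a - pred_b
    true_gap = score_a - score_b
    return (predicted_gap - true_gap) ** 2


# Toy usage with made-up timestamps and scores.
video_events = [
    Event("suture_repair", 10.0, 40.0),
    Event("bleeding", 60.0, 65.0),
    Event("suture_repair", 100.0, 120.0),
]
durations = event_durations(video_events)
loss = pairwise_difference_loss(pred_a=7.0, pred_b=5.0, score_a=8.0, score_b=5.0)
```

Summing durations per event type reflects the stated insight that long suture repairing times signal lower skill. The pairwise loss shows one way a model could be trained on inter-video differences instead of isolated scores.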

Acknowledgement

This work was supported in part by a research grant from HKUST-BICI Exploratory Fund (HCIC-004) and in part by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Reference Number: T45-401/22-N).

Author information

Authors and Affiliations

The Hong Kong University of Science and Technology, Hong Kong SAR, China
Xinpeng Ding & Xiaomeng Li
Guangdong Cardiovascular Institute, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
Xiaowei Xu

Corresponding authors

Correspondence to Xiaowei Xu or Xiaomeng Li.

Editor information

Editors and Affiliations

Icahn School of Medicine, Mount Sinai, NYC, NY, USA, Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Emory University, Atlanta, GA, USA
Anant Madabhushi
Queen’s University, Kingston, ON, Canada
Parvin Mousavi
The University of British Columbia, Vancouver, BC, Canada
Septimiu Salcudean
Yale University, New Haven, CT, USA
James Duncan
IBM Research, San Jose, CA, USA
Tanveer Syeda-Mahmood
Johns Hopkins University, Baltimore, MD, USA
Russell Taylor

About this paper

Cite this paper

Ding, X., Xu, X., Li, X. (2023). SEDSkill: Surgical Events Driven Method for Skill Assessment from Thoracoscopic Surgical Videos. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_4

DOI: https://doi.org/10.1007/978-3-031-43996-4_4
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4
eBook Packages: Computer Science, Computer Science (R0)
