Abstract
Deformable tracking and real-time estimation of 3D tissue motion are essential to enable automation and image guidance applications in robotically assisted surgery. Our model, Sparse Efficient Neural Depth and Deformation (SENDD), extends prior 2D tracking work to estimate flow in 3D space. SENDD introduces the novel contributions of learned detection and sparse per-point depth and 3D flow estimation, all with fewer than half a million parameters. SENDD does this by using graph neural networks over sparse keypoint matches to estimate both depth and 3D flow anywhere. We quantify and benchmark SENDD on a comprehensively labelled tissue dataset and compare it to an equivalent 2D flow model. SENDD performs comparably while enabling applications that 2D flow cannot. SENDD can track points and estimate depth at 10 fps on an NVIDIA RTX 4000 for 1280 tracked (query) points, and its cost scales linearly with the number of query points. SENDD enables multiple downstream applications that require estimation of 3D motion in stereo endoscopy.
This work was supported by Intuitive Surgical.
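To make the pipeline described above concrete, here is a minimal PyTorch sketch of the idea the abstract outlines: a small graph neural network over sparse keypoint matches that regresses per-point depth and 3D flow. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; the module name `SparseFlowGNN`, the k-nearest-neighbour graph construction, and all dimensions are hypothetical.

```python
# A minimal sketch of a GNN over sparse keypoint matches that regresses
# per-point depth and 3D flow. Illustrative only; not the SENDD codebase.
import torch
import torch.nn as nn


def knn_graph(pts: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbours of each 2D keypoint, shape (N, k)."""
    d = torch.cdist(pts, pts)                            # (N, N) pairwise distances
    return d.topk(k + 1, largest=False).indices[:, 1:]   # drop the self-match


class SparseFlowGNN(nn.Module):
    """One message-passing round over keypoints, then per-point output heads."""

    def __init__(self, feat_dim: int = 64, k: int = 8):
        super().__init__()
        self.k = k
        self.encode = nn.Linear(2 + feat_dim, feat_dim)  # position + descriptor
        self.message = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        self.depth_head = nn.Linear(feat_dim, 1)         # per-point depth
        self.flow_head = nn.Linear(feat_dim, 3)          # per-point 3D flow

    def forward(self, kpts: torch.Tensor, desc: torch.Tensor):
        # kpts: (N, 2) matched keypoint positions; desc: (N, feat_dim) descriptors.
        h = self.encode(torch.cat([kpts, desc], dim=-1))  # (N, F) node features
        nbrs = knn_graph(kpts, self.k)                    # (N, k) neighbour indices
        # Mean-aggregate messages from each point's graph neighbours.
        msg = self.message(
            torch.cat([h.unsqueeze(1).expand(-1, self.k, -1), h[nbrs]], dim=-1))
        h = h + msg.mean(dim=1)
        return self.depth_head(h), self.flow_head(h)      # (N, 1), (N, 3)


# Usage with 1280 query points, matching the reported 10 fps benchmark setting.
model = SparseFlowGNN()
kpts, desc = torch.rand(1280, 2), torch.rand(1280, 64)
depth, flow = model(kpts, desc)
print(depth.shape, flow.shape)  # torch.Size([1280, 1]) torch.Size([1280, 3])
```

Because every query point only exchanges messages with a fixed number of graph neighbours, per-point cost is constant, which is consistent with the linear scaling in the number of tracked points that the abstract reports.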
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Schmidt, A., Mohareri, O., DiMaio, S., Salcudean, S.E. (2023). SENDD: Sparse Efficient Neural Depth and Deformation for Tissue Tracking. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_23
DOI: https://doi.org/10.1007/978-3-031-43996-4_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4