SENDD: Sparse Efficient Neural Depth and Deformation for Tissue Tracking

  • Conference paper

Medical Image Computing and Computer Assisted Intervention – MICCAI 2023

Abstract

Deformable tracking and real-time estimation of 3D tissue motion are essential to enable automation and image-guidance applications in robotically assisted surgery. Our model, Sparse Efficient Neural Depth and Deformation (SENDD), extends prior 2D tracking work to estimate flow in 3D space. SENDD introduces the novel contributions of learned detection and sparse per-point depth and 3D flow estimation, all with fewer than half a million parameters. It does this by using graph neural networks over sparse keypoint matches to estimate both depth and 3D flow anywhere. We quantify and benchmark SENDD on a comprehensively labelled tissue dataset and compare it to an equivalent 2D flow model. SENDD performs comparably while enabling applications that 2D flow cannot. SENDD can track points and estimate depth at 10 fps on an NVIDIA RTX 4000 for 1280 tracked (query) points, and its cost scales linearly with the number of points. SENDD enables multiple downstream applications that require estimation of 3D motion in stereo endoscopy.
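The abstract describes the architecture only at a high level. The following PyTorch sketch illustrates the core idea it states: a small graph network over sparse keypoint matches that regresses per-point depth and 3D flow, plus a kernel-interpolation step that extends the sparse predictions to arbitrary query points at a cost linear in the number of queries. This is not the authors' implementation; every module, hyperparameter, and name below is a hypothetical stand-in.

```python
# Hypothetical sketch of the idea stated in the abstract: a graph
# neural network over sparse keypoint matches that predicts per-point
# depth and 3D flow, then interpolates to arbitrary query points.
# Not the authors' code; all names and hyperparameters are invented.
import torch
import torch.nn as nn


class SparseDepthFlowGNN(nn.Module):
    def __init__(self, feat_dim: int = 64, k: int = 8):
        super().__init__()
        self.k = k
        # Edge MLP: fuses a node's descriptor with each neighbour's
        # descriptor and their relative image-plane offset.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # Per-node head: 1 depth value + 3 flow components.
        self.head = nn.Linear(feat_dim, 4)

    def forward(self, xy: torch.Tensor, feats: torch.Tensor):
        # xy: (N, 2) keypoint positions; feats: (N, F) match descriptors.
        dist = torch.cdist(xy, xy)                                  # (N, N)
        knn = dist.topk(self.k + 1, largest=False).indices[:, 1:]   # drop self
        rel = xy[knn] - xy[:, None, :]                              # (N, k, 2)
        msgs = self.edge_mlp(torch.cat(
            [feats[:, None, :].expand(-1, self.k, -1), feats[knn], rel],
            dim=-1))
        node = feats + msgs.mean(dim=1)     # one round of message passing
        out = self.head(node)               # (N, 4)
        return out[:, :1], out[:, 1:]       # depth, 3D flow


def query_points(xy, depth, flow, queries, sigma=16.0):
    """Gaussian-kernel interpolation of sparse predictions to arbitrary
    query points; each query costs O(N), so total cost is linear in the
    number of queries, matching the scaling claimed in the abstract."""
    w = torch.softmax(-torch.cdist(queries, xy) ** 2 / (2 * sigma ** 2), dim=-1)
    return w @ depth, w @ flow


if __name__ == "__main__":
    net = SparseDepthFlowGNN()
    xy = torch.rand(256, 2) * 512    # sparse keypoints in a 512x512 frame
    feats = torch.randn(256, 64)     # their matched descriptors
    depth, flow = net(xy, feats)
    qd, qf = query_points(xy, depth, flow, torch.rand(1280, 2) * 512)
    print(qd.shape, qf.shape)        # (1280, 1) and (1280, 3)
```

This sketch conveys only the sparse message-passing and linear-cost query ideas; the paper's actual network differs in its details (detection, stereo matching, and training).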

This work was supported by Intuitive Surgical.

Author information

Corresponding author

Correspondence to Adam Schmidt.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 14718 KB)

Supplementary material 2 (mp4 33203 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schmidt, A., Mohareri, O., DiMaio, S., Salcudean, S.E. (2023). SENDD: Sparse Efficient Neural Depth and Deformation for Tissue Tracking. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_23

  • DOI: https://doi.org/10.1007/978-3-031-43996-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43995-7

  • Online ISBN: 978-3-031-43996-4

  • eBook Packages: Computer Science, Computer Science (R0)
