
Investigating keypoint descriptors for camera relocalization in endoscopy surgery

  • Original Article
International Journal of Computer Assisted Radiology and Surgery

Abstract

Purpose

Recent advances in computer vision and machine learning have resulted in endoscopic video-based solutions for dense reconstruction of the anatomy. To effectively use these systems in surgical navigation, a reliable image-based technique is required to constantly track the endoscopic camera’s position within the anatomy, despite frequent removal and re-insertion. In this work, we investigate the use of recent learning-based keypoint descriptors for six degree-of-freedom camera pose estimation in intraoperative endoscopic sequences and under changes in anatomy due to surgical resection.

Methods

Our method employs a dense structure from motion (SfM) reconstruction of the preoperative anatomy, obtained with a state-of-the-art patient-specific learning-based descriptor. During the reconstruction step, each estimated 3D point is associated with a descriptor. This information is employed in the intraoperative sequences to establish 2D–3D correspondences for Perspective-n-Point (PnP) camera pose estimation. We evaluate this method on six intraoperative sequences, obtained from two cadaveric subjects, that include anatomical modifications.
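The 2D–3D matching and pose estimation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mutual nearest-neighbour rule, and the `min_similarity` threshold are assumptions made for illustration, and a plain linear (DLT) solver stands in for the PnP solver, which the abstract does not specify. In practice a robust solver such as OpenCV's `solvePnPRansac` would likely be preferable, since real matches contain outliers.

```python
import numpy as np

def match_2d_3d(query_desc, map_desc, min_similarity=0.9):
    """Mutual nearest-neighbour matching (assumed rule, for illustration).

    query_desc: N x D descriptors of keypoints in the intraoperative image.
    map_desc:   M x D descriptors stored with the reconstructed 3D points.
    Returns index arrays (query_idx, map_idx) of accepted matches.
    """
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    m = map_desc / np.linalg.norm(map_desc, axis=1, keepdims=True)
    sim = q @ m.T                      # cosine similarity, N x M
    best_map = sim.argmax(axis=1)      # best 3D point for each keypoint
    best_query = sim.argmax(axis=0)    # best keypoint for each 3D point
    q_idx = np.arange(len(q))
    mutual = best_query[best_map] == q_idx
    strong = sim[q_idx, best_map] >= min_similarity
    keep = mutual & strong
    return q_idx[keep], best_map[keep]

def pose_from_correspondences(points_3d, pixels, K):
    """Linear (DLT) pose from >= 6 non-coplanar, outlier-free 2D-3D matches.

    Returns R (3x3) and t (3,) such that x_cam = R @ X_world + t.
    """
    n = len(points_3d)
    # Normalised image coordinates remove the effect of the intrinsics K.
    pix_h = np.hstack([pixels, np.ones((n, 1))])
    xn = (np.linalg.inv(K) @ pix_h.T).T
    X = np.hstack([points_3d, np.ones((n, 1))])
    # Two linear constraints per correspondence on the 12 entries of [R|t].
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X
    A[0::2, 8:12] = -xn[:, [0]] * X
    A[1::2, 4:8] = X
    A[1::2, 8:12] = -xn[:, [1]] * X
    P = np.linalg.svd(A)[2][-1].reshape(3, 4)   # null vector, up to scale
    # Project the left 3x3 block onto a rotation and recover the scale.
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt
    t = P[:, 3] / S.mean()
    if np.linalg.det(R) < 0:                    # resolve the sign ambiguity
        R, t = -R, -t
    return R, t
```

Given the matches, `points_3d[map_idx]` and `keypoints[query_idx]` (hypothetical arrays holding the map points and the image keypoints) would be passed to the pose solver together with the camera intrinsics `K`.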

Results

Our approach led to translation and rotation errors of 3.9 mm and 0.2 radians, respectively, with 21.86% of cameras localized, averaged over the six sequences. Compared with an additional learning-based descriptor (HardNet++), the selected descriptor achieves a higher percentage of localized cameras with similar pose estimation performance. We further discuss potential error causes and limitations of the proposed approach.
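The abstract does not state how these errors are computed. One common convention, shown here only as an assumed sketch, measures translation error as the Euclidean distance between estimated and reference camera positions, and rotation error as the geodesic angle (in radians) between the two rotation matrices. The function names and the frame counts in the usage line are hypothetical.

```python
import numpy as np

def pose_errors(R_est, t_est, R_ref, t_ref):
    """Translation error (same unit as t) and rotation error (radians).

    Assumes both poses are expressed in the same coordinate frame.
    """
    trans_err = float(np.linalg.norm(t_est - t_ref))
    # Angle of the relative rotation R_est^T R_ref, from its trace.
    cos_angle = (np.trace(R_est.T @ R_ref) - 1.0) / 2.0
    rot_err = float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return trans_err, rot_err

def localized_percentage(n_localized, n_frames):
    """Share of frames for which a pose was obtained, in percent."""
    return 100.0 * n_localized / n_frames
```

For example, `localized_percentage(2, 8)` gives 25.0; a per-sequence value of this kind could then be averaged across sequences.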

Conclusion

Patient-specific learning-based descriptors can relocalize images that are well distributed across the inspected anatomy, even where the anatomy is modified. However, camera relocalization in endoscopic sequences remains a challenging problem, and future research is necessary to increase the robustness and accuracy of this technique.

Code availability

The source code is available at https://github.com/arcadelab/camera-relocalization.

References

  1. Mirota DJ, Ishii M, Hager GD (2011) Vision-based navigation in image-guided interventions. Annu Rev Biomed Eng 13:297–319

  2. Yeung BPM, Gourlay T (2012) A technical review of flexible endoscopic multitasking platforms. Int J Surg 10(7):345–54

  3. Liu X, Zheng Y, Killeen B, Ishii M, Hager GD, Taylor RH, Unberath M (2020) Extremely dense point correspondences using a learned feature descriptor. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4847–4856

  4. Liu X, Stiber M, Huang J, Ishii M, Hager GD, Taylor RH, Unberath M (2020) Reconstructing sinus anatomy from endoscopic video—towards a radiation-free approach for quantitative longitudinal assessment. In: Martel AL, Abolmaesumi P, Stoyanov D, Mateus D, Zuluaga MA, Zhou SK, Racoceanu D, Joskowicz L (eds) Medical image computing and computer assisted intervention—MICCAI 2020. Springer, Cham, pp 3–13

  5. Liu X, Li Z, Ishii M, Hager GD, Taylor RH, Unberath M (2022) SAGE: SLAM with appearance and geometry prior for endoscopy. In: ICRA

  6. Waelkens P, Van Oosterom M, Van den Berg N, Navab N, Leeuwen FWB (2016) Surgical navigation: an overview of the state-of-the-art clinical applications. In: Radioguided surgery

  7. Schonberger JL, Frahm J-M (2016) Structure-from-motion revisited. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4104–4113

  8. Kendall A, Grimes M, Cipolla R (2015) PoseNet: a convolutional network for real-time 6-DOF camera relocalization. In: Proceedings of the IEEE international conference on computer vision, pp 2938–2946

  9. Sattler T, Zhou Q, Pollefeys M, Leal-Taixe L (2019) Understanding the limitations of CNN-based absolute camera pose regression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3302–3312

  10. Mildenhall B, Srinivasan PP, Tancik M, Barron JT, Ramamoorthi R, Ng R (2021) NeRF: representing scenes as neural radiance fields for view synthesis. Commun ACM 65(1):99–106

  11. Sattler T, Leibe B, Kobbelt L (2011) Fast image-based localization using direct 2D-to-3D matching. In: 2011 international conference on computer vision, pp 667–674. IEEE

  12. Lepetit V, Moreno-Noguer F, Fua P (2009) EPnP: an accurate O(n) solution to the PnP problem. Int J Comput Vis. https://doi.org/10.1007/s11263-008-0152-6

  13. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94

  14. Strobl K, Hirzinger G (2006) Optimal hand-eye calibration, pp 4647–4653. https://doi.org/10.1109/IROS.2006.282250

  15. Vagdargi P, Uneri A, Jones C, Wu P, Han R, Luciano M, Anderson W, Hager G, Siewerdsen J (2021) Robot-assisted ventriculoscopic 3D reconstruction for guidance of deep-brain stimulation surgery. In: Medical imaging 2021: image-guided procedures, robotic interventions, and modeling, vol 11598, pp 47–54. SPIE

  16. Moreno-Noguer F, Lepetit V, Fua P (2007) Accurate non-iterative O(n) solution to the PnP problem. In: 11th IEEE international conference on computer vision

  17. Mishchuk A, Mishkin D, Radenovic F, Matas J (2017) Working hard to know your neighbor’s margins: Local descriptor learning loss. In: Advances in neural information processing systems, vol 30

Acknowledgements

Isabela Hernández acknowledges the support of the 2021 Uniandes-DeepMind Scholarship.

Funding

This work was funded in part by Johns Hopkins University internal funds and in part by NIH R01EB030511. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Corresponding author

Correspondence to Isabela Hernández.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Ethics approval

Not necessary for this work.

Consent to participate

This study was performed under the approved IRB00267324 protocol on non-living subjects, for which informed consent was not required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 9766 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hernández, I., Soberanis-Mukul, R., Mangulabnan, J.E. et al. Investigating keypoint descriptors for camera relocalization in endoscopy surgery. Int J CARS 18, 1135–1142 (2023). https://doi.org/10.1007/s11548-023-02918-x

  • DOI: https://doi.org/10.1007/s11548-023-02918-x

Keywords

Navigation