Investigating keypoint descriptors for camera relocalization in endoscopy surgery

Hernández, Isabela; Soberanis-Mukul, Roger; Mangulabnan, Jan Emily; Sahu, Manish; Winter, Jonas; Vedula, Swaroop; Ishii, Masaru; Hager, Gregory; Taylor, Russell H.; Unberath, Mathias

doi:10.1007/s11548-023-02918-x

Original Article
Published: 09 May 2023

Volume 18, pages 1135–1142, (2023)
International Journal of Computer Assisted Radiology and Surgery

Abstract

Purpose

Recent advances in computer vision and machine learning have resulted in endoscopic video-based solutions for dense reconstruction of the anatomy. To effectively use these systems in surgical navigation, a reliable image-based technique is required to constantly track the endoscopic camera’s position within the anatomy, despite frequent removal and re-insertion. In this work, we investigate the use of recent learning-based keypoint descriptors for six degree-of-freedom camera pose estimation in intraoperative endoscopic sequences and under changes in anatomy due to surgical resection.

Methods

Our method employs a dense structure from motion (SfM) reconstruction of the preoperative anatomy, obtained with a state-of-the-art patient-specific learning-based descriptor. During the reconstruction step, each estimated 3D point is associated with a descriptor. This information is employed in the intraoperative sequences to establish 2D–3D correspondences for Perspective-n-Point (PnP) camera pose estimation. We evaluate this method on six intraoperative sequences, obtained from two cadaveric subjects, that include anatomical modifications.
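
The correspondence step described above can be illustrated with a minimal sketch. It assumes that every 3D point of the reconstruction and every 2D keypoint of an intraoperative frame carries a descriptor vector, and it pairs them by mutual nearest neighbour in descriptor space. The matching rule, the function names, and the toy data are assumptions made for illustration; the abstract does not state how the authors match descriptors.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query`."""
    return min(range(len(candidates)), key=lambda i: l2(query, candidates[i]))


def match_2d_3d(keypoints_2d, desc_2d, points_3d, desc_3d):
    """Return [(pixel, point_3d), ...] for mutual nearest-neighbour matches.

    Hypothetical helper: mutual nearest neighbour is an assumed matching rule.
    """
    pairs = []
    for i, d in enumerate(desc_2d):
        j = nearest(d, desc_3d)
        # Keep the pair only if the 3D point's descriptor also prefers keypoint i.
        if nearest(desc_3d[j], desc_2d) == i:
            pairs.append((keypoints_2d[i], points_3d[j]))
    return pairs


# Toy data (made up): three keypoints in a query frame, two reconstructed points.
keypoints_2d = [(120.0, 80.0), (200.0, 150.0), (40.0, 30.0)]
desc_2d = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.4]]
points_3d = [(0.0, 0.0, 50.0), (10.0, 5.0, 55.0)]
desc_3d = [[0.9, 0.1], [0.1, 0.9]]

correspondences = match_2d_3d(keypoints_2d, desc_2d, points_3d, desc_3d)
for pixel, point in correspondences:
    print(pixel, "->", point)
```

The resulting 2D–3D pairs, together with the camera intrinsics, are the input a PnP solver expects. One readily available option is OpenCV's `cv2.solvePnPRansac`, which also rejects outlier matches; whether the authors use that particular solver is not stated here.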

Results

The results show that this approach led to translation and rotation errors of 3.9 mm and 0.2 radians, respectively, with 21.86% of cameras localized, averaged over the six sequences. Compared with an additional learning-based descriptor (HardNet++), the selected descriptor achieves a higher percentage of localized cameras with similar pose estimation performance. We further discuss potential error causes and limitations of the proposed approach.
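
For readers unfamiliar with how such pose errors are typically measured, the sketch below uses one common convention: translation error as the Euclidean distance between estimated and reference camera positions, and rotation error as the geodesic angle between the two rotation matrices. This convention, the function names, and the example values are assumptions; the abstract does not specify the exact error definitions used in the paper.

```python
import math


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and reference camera positions."""
    return math.dist(t_est, t_gt)


def rotation_error(R_est, R_gt):
    """Geodesic angle in radians between two 3x3 rotation matrices."""
    # trace(R_est^T R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against round-off before taking the arccosine.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_theta)


def rot_z(angle):
    """Rotation by `angle` radians about the z axis (helper for the example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Made-up example: the estimate is offset by (3, 0, 4) mm and rotated by 0.2 rad.
t_err = translation_error((3.0, 0.0, 4.0), (0.0, 0.0, 0.0))
r_err = rotation_error(rot_z(0.2), rot_z(0.0))
print(f"translation error: {t_err:.1f} mm, rotation error: {r_err:.2f} rad")
```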

Conclusion

Patient-specific learning-based descriptors can relocalize images that are well distributed across the inspected anatomy, even where the anatomy is modified. However, camera relocalization in endoscopic sequences remains a persistently challenging problem, and future research is necessary to increase the robustness and accuracy of this technique.

Code availability

The source code is available at https://github.com/arcadelab/camera-relocalization.

References

Mirota DJ, Ishii M, Hager GD (2011) Vision-based navigation in image-guided interventions. Annu Rev Biomed Eng 13:297–319
Yeung BPM, Gourlay T (2012) A technical review of flexible endoscopic multitasking platforms. Int J Surg 10(7):345–54
Liu X, Zheng Y, Killeen B, Ishii M, Hager GD, Taylor RH, Unberath M (2020) Extremely dense point correspondences using a learned feature descriptor. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4847–4856
Liu X, Stiber M, Huang J, Ishii M, Hager GD, Taylor RH, Unberath M (2020) Reconstructing sinus anatomy from endoscopic video—towards a radiation-free approach for quantitative longitudinal assessment. In: Martel AL, Abolmaesumi P, Stoyanov D, Mateus D, Zuluaga MA, Zhou SK, Racoceanu D, Joskowicz L (eds) Medical image computing and computer assisted intervention—MICCAI 2020. Springer, Cham, pp 3–13
Liu X, Li Z, Ishii M, Hager GD, Taylor RH, Unberath M (2022) SAGE: SLAM with appearance and geometry prior for endoscopy. In: ICRA
Waelkens P, Van Oosterom M, Van den Berg N, Navab N, Leeuwen FWB (2016) Surgical navigation: an overview of the state-of-the-art clinical applications. In: Radioguided surgery
Schonberger JL, Frahm J-M (2016) Structure-from-motion revisited. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4104–4113
Kendall A, Grimes M, Cipolla R (2015) PoseNet: a convolutional network for real-time 6-DOF camera relocalization. In: Proceedings of the IEEE international conference on computer vision, pp 2938–2946
Sattler T, Zhou Q, Pollefeys M, Leal-Taixe L (2019) Understanding the limitations of CNN-based absolute camera pose regression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3302–3312
Mildenhall B, Srinivasan PP, Tancik M, Barron JT, Ramamoorthi R, Ng R (2021) NeRF: representing scenes as neural radiance fields for view synthesis. Commun ACM 65(1):99–106
Sattler T, Leibe B, Kobbelt L (2011) Fast image-based localization using direct 2D-to-3D matching. In: 2011 international conference on computer vision, pp 667–674. IEEE
Lepetit V, Moreno-Noguer F, Fua P (2009) EPnP: an accurate O(n) solution to the PnP problem. Int J Comput Vis. https://doi.org/10.1007/s11263-008-0152-6
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94
Strobl K, Hirzinger G (2006) Optimal hand-eye calibration. In: 2006 IEEE/RSJ international conference on intelligent robots and systems, pp 4647–4653. https://doi.org/10.1109/IROS.2006.282250
Vagdargi P, Uneri A, Jones C, Wu P, Han R, Luciano M, Anderson W, Hager G, Siewerdsen J (2021) Robot-assisted ventriculoscopic 3D reconstruction for guidance of deep-brain stimulation surgery. In: Medical imaging 2021: image-guided procedures, robotic interventions, and modeling, vol 11598, pp 47–54. SPIE
Moreno-Noguer F, Lepetit V, Fua P (2007) Accurate non-iterative O(n) solution to the PnP problem. In: 11th IEEE international conference on computer vision
Mishchuk A, Mishkin D, Radenovic F, Matas J (2017) Working hard to know your neighbor’s margins: Local descriptor learning loss. In: Advances in neural information processing systems, vol 30

Acknowledgements

Isabela Hernández acknowledges the support of the 2021 Uniandes-DeepMind Scholarship.

Funding

This work was funded in part by Johns Hopkins University internal funds and in part by NIH R01EB030511. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Isabela Hernández and Roger Soberanis-Mukul have contributed equally to this work.

Authors and Affiliations

Johns Hopkins University, Baltimore, 21211, MD, USA
Isabela Hernández, Roger Soberanis-Mukul, Jan Emily Mangulabnan, Manish Sahu, Jonas Winter, Swaroop Vedula, Gregory Hager, Russell H. Taylor & Mathias Unberath
Johns Hopkins Medical Institutions, Baltimore, 21287, MD, USA
Masaru Ishii, Russell H. Taylor & Mathias Unberath

Authors

Isabela Hernández
Roger Soberanis-Mukul
Jan Emily Mangulabnan
Manish Sahu
Jonas Winter
Swaroop Vedula
Masaru Ishii
Gregory Hager
Russell H. Taylor
Mathias Unberath

Corresponding author

Correspondence to Isabela Hernández.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Ethics approval

Not necessary for this work.

Consent to participate

This study was performed under the approved IRB00267324 protocol on non-living subjects, for which informed consent was not required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 9766 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hernández, I., Soberanis-Mukul, R., Mangulabnan, J.E. et al. Investigating keypoint descriptors for camera relocalization in endoscopy surgery. Int J CARS 18, 1135–1142 (2023). https://doi.org/10.1007/s11548-023-02918-x

Received: 09 March 2023
Accepted: 12 April 2023
Published: 09 May 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11548-023-02918-x
