Abstract
There is often a significant gap between research results and their applicability in routine medical practice. This work studies the performance of well-known local features on a medical dataset captured during routine colonoscopy procedures. Local feature extraction and matching is a key step for many computer vision applications, especially those involving 3D modelling. In the medical domain, handcrafted local features such as SIFT, used within public pipelines such as COLMAP, are still the predominant tool for these tasks. We explore the potential of the well-known self-supervised approach SuperPoint [4], present a variation adapted to the endoscopic domain, and propose a challenging evaluation framework. SuperPoint-based models achieve significantly higher matching quality than the local features commonly used in this domain. Our adapted model avoids features within specularity regions, a frequent and problematic artifact in endoscopic images, with consequent benefits for matching and reconstruction results. Training code and models are available at https://github.com/LeonBP/SuperPointEndoscopy.
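To make the matching step concrete: whether the descriptors come from SIFT or from a learned detector such as SuperPoint, candidate correspondences are typically found by nearest-neighbour search over descriptor vectors, filtered with Lowe's ratio test to discard ambiguous matches. The sketch below is a minimal stdlib-only illustration of that generic step, not the authors' code; the function name `match_descriptors` and the toy 2-D descriptors are ours.

```python
import math

def l2(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.

    For each descriptor in desc_a, find its two closest descriptors in
    desc_b and accept the match only if the best distance is clearly
    smaller than the second best. Returns a list of (i, j) index pairs.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((l2(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
        elif len(dists) == 1:
            matches.append((i, dists[0][1]))
    return matches

# Toy example: two descriptors from image A, three from image B.
desc_a = [[1.0, 0.0], [0.0, 1.0]]
desc_b = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
print(match_descriptors(desc_a, desc_b))  # [(0, 0), (1, 1)]
```

In a full pipeline this filtered match set is then passed to a geometric verification stage (e.g. RANSAC-based essential-matrix estimation inside COLMAP) before reconstruction.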
Notes
1. This project has been funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 863146 and Aragon Government FSE-T45_20R.
References
Azagra, P., et al.: EndoMapper dataset of complete calibrated endoscopy procedures. arXiv preprint arXiv:2204.14240 (2022)
Borgli, H., Thambawita, V., Smedsrud, P.H., Hicks, S., Jha, D., et al.: HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci. Data 7(1), 1–14 (2020)
Chadebecq, F., Vasconcelos, F., Mazomenos, E., Stoyanov, D.: Computer vision in the surgical operating room. Visceral Med. 36(6), 456–462 (2020)
DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperPoint: self-supervised interest point detection and description. In: Conference on Computer Vision and Pattern Recognition Workshops. IEEE (2018)
Di Febbo, P., Dal Mutto, C., Tieu, K., Mattoccia, S.: KCNN: extremely-efficient hardware keypoint detection with a compact convolutional neural network. In: CVPR Workshops. IEEE (2018)
Espinel, Y., Calvet, L., Botros, K., Buc, E., Tilmant, C., Bartoli, A.: Using multiple images and contours for deformable 3D-2D registration of a preoperative CT in laparoscopic liver surgery. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12904, pp. 657–666. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87202-1_63
Gómez-Rodríguez, J.J., Lamarca, J., Morlana, J., Tardós, J.D., Montiel, J.M.: SD-DefSLAM: semi-direct monocular SLAM for deformable and intracorporeal scenes. In: International Conference on Robotics and Automation. IEEE (2021)
Jau, Y.Y., Zhu, R., Su, H., Chandraker, M.: Deep keypoint-based camera pose estimation with geometric constraints. In: International Conference on Intelligent Robots and Systems. IEEE (2020). https://github.com/eric-yyjau/pytorch-superpoint
Jiang, W., Trulls, E., Hosang, J., Tagliasacchi, A., Yi, K.M.: COTR: correspondence transformer for matching across images. arXiv preprint arXiv:2103.14167 (2021)
Jin, Y., et al.: Image matching across wide baselines: from paper to practice. Int. J. Comput. Vis. 129(2), 517–547 (2021)
Laguna, A.B., Riba, E., Ponsa, D., Mikolajczyk, K.: Key.Net: keypoint detection by handcrafted and learned CNN filters. In: ICCV. IEEE (2019)
Liao, C., Wang, C., Bai, J., Lan, L., Wu, X.: Deep learning for registration of region of interest in consecutive wireless capsule endoscopy frames. Comput. Meth. Programs Biomed. 208, 106189 (2021)
Liu, X., et al.: Reconstructing sinus anatomy from endoscopic video – towards a radiation-free approach for quantitative longitudinal assessment. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12263, pp. 3–13. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59716-0_1
Liu, X., et al.: Extremely dense point correspondences using a learned feature descriptor. In: Conference on Computer Vision and Pattern Recognition. IEEE (2020)
Ma, J., Jiang, X., Fan, A., Jiang, J., Yan, J.: Image matching from handcrafted to deep features: a survey. Int. J. Comput. Vis. 1–57 (2020)
Ma, R., Wang, R., Pizer, S., Rosenman, J., McGill, S.K., Frahm, J.M.: Real-time 3D reconstruction of colonoscopic surfaces for determining missing regions. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11768, pp. 573–582. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32254-0_64
Mahmoud, N., Collins, T., Hostettler, A., Soler, L., Doignon, C., Montiel, J.M.M.: Live tracking and dense reconstruction for handheld monocular endoscopy. IEEE Trans. Med. Imaging 38(1), 79–89 (2018)
Mishchuk, A., Mishkin, D., Radenović, F., Matas, J.: Working hard to know your neighbor’s margins: local descriptor learning loss. In: International Conference on Neural Information Processing Systems (2017)
Mishkin, D., Radenovic, F., Matas, J.: Repeatability is not enough: learning affine regions via discriminability. In: ECCV (2018)
Ono, Y., Trulls, E., Fua, P., Yi, K.M.: LF-Net: learning local features from images. In: International Conference on Neural Information Processing Systems (2018)
Ozyoruk, K.B., et al.: EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos. Med. Image Anal. 71, 102058 (2021)
Revaud, J., Weinzaepfel, P., de Souza, C.R., Humenberger, M.: R2D2: repeatable and reliable detector and descriptor. In: International Conference on Neural Information Processing Systems (2019)
Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: Conference on Computer Vision and Pattern Recognition. IEEE (2020)
Savinov, N., Seki, A., Ladický, L., Sattler, T., Pollefeys, M.: Quad-networks: unsupervised learning to rank for interest point detection. In: Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: CVPR. IEEE (2016)
Schönberger, J.L., Zheng, E., Pollefeys, M., Frahm, J.M.: Pixelwise view selection for unstructured multi-view stereo. In: European Conference on Computer Vision (2016)
Stoyanov, D., Yang, G.Z.: Removing specular reflection components for robotic assisted laparoscopic surgery. In: International Conference on Image Processing. IEEE (2005)
Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: LoFTR: detector-free local feature matching with transformers. In: CVPR. IEEE (2021)
Tian, Y., Fan, B., Wu, F.: L2-Net: deep learning of discriminative patch descriptor in euclidean space. In: Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Tian, Y., Balntas, V., Ng, T., Barroso-Laguna, A., Demiris, Y., Mikolajczyk, K.: D2D: keypoint extraction with describe to detect approach. In: ACCV (2020)
Yi, K.M., Trulls, E., Lepetit, V., Fua, P.: LIFT: learned Invariant Feature Transform. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 467–483. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_28
Zhang, L., Rusinkiewicz, S.: Learning to detect features in texture images. In: Conference on Computer Vision and Pattern Recognition. IEEE (2018)
Zhang, Z., Xie, Y., Xing, F., McGough, M., Yang, L.: MDNet: a semantically and visually interpretable medical image diagnosis network. In: CVPR. IEEE (2017)
Zhou, Q., Sattler, T., Leal-Taixe, L.: Patch2Pix: epipolar-guided pixel-level correspondences. In: CVPR. IEEE (2021)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Barbed, O.L., Chadebecq, F., Morlana, J., Montiel, J.M.M., Murillo, A.C. (2022). SuperPoint Features in Endoscopy. In: Manfredi, L., et al. Imaging Systems for GI Endoscopy, and Graphs in Biomedical Image Analysis. ISGIE GRAIL 2022 2022. Lecture Notes in Computer Science, vol 13754. Springer, Cham. https://doi.org/10.1007/978-3-031-21083-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21082-2
Online ISBN: 978-3-031-21083-9