Simulated and phantom colon data for 6-DOF camera pose estimation

Citation Author(s):
Min Tan
Wentao Jin
Gaosheng Xie
Zeyang Xia
Jing Xiong
Submitted by:
Min Tan
Last updated:
Mon, 12/11/2023 - 22:15
DOI:
10.21227/af5p-mp40
License:

Abstract 

Because of the complex and unstructured nature of the intestine, 3D reconstruction and visual navigation are essential for clinical endoscopists performing skill-intensive colonoscopy. Unsupervised 3D reconstruction methods, a mainstream paradigm in autonomous-driving scenarios, exploit a warping loss to predict 6-DOF pose and depth jointly. However, owing to illumination inconsistency, repeated texture regions, and non-Lambertian reflection, the geometric warping constraint cannot be applied effectively to the colonic environment. We therefore propose a novel feature-point-based method for pose estimation in the colonic scenario. Specifically, our PoseNet incorporates an attention module that learns feature-point embeddings, enabling the network to concentrate on distinguishable tissue regions. In addition, to complete the visual reconstruction, we propose a DepthNet that adopts a multi-scale structure and a scale-invariant loss to alleviate global-scale ambiguity. We also introduce a clinically oriented colon dataset containing simulated and phantom data to facilitate transfer-learning studies and prospective clinical applications. To validate the framework's generalization ability across datasets, we train our model on a public dataset and test it on the proposed datasets without any fine-tuning; the results demonstrate its superior performance. All datasets are publicly available for quantitative benchmarking.
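
The abstract does not give the exact formulation of the scale-invariant loss used by DepthNet; as a rough illustration only, one common choice is the scale-invariant log-depth loss of Eigen et al., sketched below in PyTorch. The function name, the lambda weighting, and the tensor shapes are illustrative assumptions, not the authors' implementation.

import torch

def scale_invariant_loss(pred_depth, gt_depth, lam=0.5, eps=1e-6):
    # Scale-invariant log-depth loss (Eigen et al., 2014).
    # pred_depth, gt_depth: positive depth maps of shape (B, 1, H, W).
    # lam trades off the variance and mean terms; lam = 1 makes the loss
    # fully invariant to a global scaling of the predicted depth.
    d = torch.log(pred_depth + eps) - torch.log(gt_depth + eps)
    d = d.flatten(start_dim=1)          # per-sample log-depth differences, (B, H*W)
    n = d.shape[1]
    per_sample = d.pow(2).sum(dim=1) / n - lam * d.sum(dim=1).pow(2) / n**2
    return per_sample.mean()

Because the second term subtracts (a fraction of) the squared mean log-difference, multiplying the prediction by any constant factor changes the loss little (not at all when lam = 1), which is why this family of losses is used to cope with the global-scale ambiguity mentioned above.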