Simulated and phantom colon data for 6-DOF camera pose estimation

Citation Author(s):
Min Tan
Wentao Jin
Gaosheng Xie
Zeyang Xia
Jing Xiong
Submitted by:
Min Tan
Last updated:
Mon, 12/11/2023 - 22:15
DOI:
10.21227/af5p-mp40
License:

Abstract 

Because of the complex and unstructured nature of the intestine, 3D reconstruction and visual navigation are essential for clinical endoscopists performing skill-intensive colonoscopy. Unsupervised 3D reconstruction methods, a mainstream paradigm in autonomous-driving scenarios, exploit a warping loss to predict 6-DOF pose and depth jointly. However, owing to illumination inconsistency, repeated texture regions, and non-Lambertian reflection, the geometric warping constraint cannot be applied effectively to the colonic environment. We therefore propose a novel feature-point-based method for pose estimation in the colonic scenario. Specifically, our PoseNet incorporates an attention module that learns feature-point embeddings, enabling the network to concentrate on distinguishable tissue regions. In addition, to complete the visual reconstruction, we propose a DepthNet that adopts a multi-scale structure and a scale-invariant loss to alleviate global-scale ambiguity. We also introduce a clinically oriented colon dataset containing simulated and phantom data to facilitate transfer-learning studies and prospective clinical applications. To validate the framework's generalization ability across datasets, we train our model on a public dataset and test it on the proposed datasets without any fine-tuning; the results demonstrate its superior performance. All datasets are publicly available for quantitative benchmarking.
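
The abstract does not give the exact formulation of the scale-invariant loss used by DepthNet; as a rough illustration only, one common choice is the scale-invariant log-depth loss of Eigen et al., sketched below in PyTorch. The function name, the lambda weighting, and the tensor shapes are illustrative assumptions, not the authors' implementation.

import torch

def scale_invariant_loss(pred_depth, gt_depth, lam=0.5, eps=1e-6):
    # Scale-invariant log-depth loss (Eigen et al., 2014).
    # pred_depth, gt_depth: positive depth maps of shape (B, 1, H, W).
    # lam trades off the variance and mean terms; lam = 1 makes the loss
    # fully invariant to a global scaling of the predicted depth.
    d = torch.log(pred_depth + eps) - torch.log(gt_depth + eps)
    d = d.flatten(start_dim=1)          # per-sample log-depth differences, (B, H*W)
    n = d.shape[1]
    per_sample = d.pow(2).sum(dim=1) / n - lam * d.sum(dim=1).pow(2) / n**2
    return per_sample.mean()

Because the second term subtracts (a fraction of) the squared mean log-difference, multiplying the prediction by any constant factor changes the loss little (not at all when lam = 1), which is why this family of losses is used to cope with the global-scale ambiguity mentioned above.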