Shape-Guided Configuration-Aware Learning for Endoscopic-Image-Based Pose Estimation of Flexible Robotic Instruments

Ma, Yiyao; Chen, Kai; Tong, Hon-Sing; Wei, Ruofeng; Ng, Yui-Lun; Kwok, Ka-Wai; Dou, Qi

doi:10.1007/978-3-031-72670-5_15


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15080)

Included in the following conference series:

European Conference on Computer Vision


Abstract

Accurate estimation of both the external orientation and the internal bending angle is crucial for understanding a flexible robot's state within its environment. However, existing sensor-based methods are limited by cost, environmental constraints, and integration issues, while conventional image-based methods struggle with the shape complexity of flexible robots. In this paper, we propose a novel shape-guided configuration-aware learning framework for image-based flexible robot pose estimation. Inspired by recent advances in 2D-3D joint representation learning, we leverage the 3D shape prior of the flexible robot to enhance its image-based shape representation. We first extract a part-level geometry representation of the 3D shape prior, then adapt this representation to the image by querying the image features corresponding to different robot parts. Furthermore, we present an effective mechanism to dynamically deform the shape prior, which aims to mitigate the shape difference between the adopted shape prior and the flexible robot depicted in the image. This more expressive shape guidance boosts the image-based robot representation and can be effectively used for flexible robot pose refinement. Extensive experiments on a general flexible robot designed for endoluminal surgery demonstrate the advantages of our method over a series of keypoint-based, skeleton-based, and direct regression-based methods. Project homepage: https://poseflex.github.io/.
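The abstract describes adapting part-level geometry features to the image by querying image features per robot part. The sketch below illustrates one common way such querying can be done, scaled dot-product cross-attention, on toy data. The abstract does not state which mechanism the paper uses, so this is an assumption and not the authors' implementation. The names `cross_attend`, `part_queries`, and `image_features` are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(part_queries: Matrix, image_features: Matrix) -> Matrix:
    """For each part query, return an attention-weighted mix of image features.

    part_queries:   P vectors of dimension D (hypothetical part-level shape tokens)
    image_features: N vectors of dimension D (hypothetical image patch features)
    """
    dim = len(part_queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for query in part_queries:
        # Similarity between this part token and every image feature.
        scores = [
            scale * sum(q * f for q, f in zip(query, feat))
            for feat in image_features
        ]
        weights = softmax(scores)
        # Weighted sum of image features, one output vector per part.
        mixed = [
            sum(w * feat[d] for w, feat in zip(weights, image_features))
            for d in range(dim)
        ]
        attended.append(mixed)
    return attended


# Toy data (assumed, not from the paper): 2 part tokens, 3 image features, D = 2.
part_queries = [[4.0, 0.0], [0.0, 4.0]]
image_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
part_image_features = cross_attend(part_queries, image_features)
```

In this toy setup, each part token pulls mostly from the image feature it is most similar to, so every robot part ends up with its own image-derived descriptor. A learned model would typically add projection matrices for queries, keys, and values, which are omitted here for brevity.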

Y. Ma and K. Chen—Equal contributions.



Notes

1. Please refer to the supplementary material to check the depth map.


Acknowledgements

This work was supported in part by Hong Kong Innovation and Technology Commission under Project No. PRP/026/22FX, in part by Agilis Robotics and its subsidiaries, Agilis Robotics Limited and Agilis Robotics Limited (Guangzhou), and in part by a grant from the NSFC/RGC Joint Research Scheme sponsored by the Research Grants Council of the Hong Kong Special Administrative Region, China and the National Natural Science Foundation of China (Project No. N_CUHK410/23).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
Yiyao Ma, Kai Chen, Ruofeng Wei & Qi Dou
Agilis Robotics Limited, Hong Kong, China
Hon-Sing Tong & Yui-Lun Ng
Department of Mechanical Engineering, The University of Hong Kong, Hong Kong, China
Ka-Wai Kwok
Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, China
Ka-Wai Kwok


Corresponding authors

Correspondence to Ka-Wai Kwok or Qi Dou.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7868 KB)


About this paper

Cite this paper

Ma, Y. et al. (2025). Shape-Guided Configuration-Aware Learning for Endoscopic-Image-Based Pose Estimation of Flexible Robotic Instruments. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15080. Springer, Cham. https://doi.org/10.1007/978-3-031-72670-5_15


DOI: https://doi.org/10.1007/978-3-031-72670-5_15
Published: 30 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72669-9
Online ISBN: 978-3-031-72670-5
eBook Packages: Computer Science, Computer Science (R0)
