
Silhouette-Based 6D Object Pose Estimation

  • Conference paper
  • Computational Visual Media (CVM 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14593)


Abstract

For a long time, deep learning-based 6D object pose estimation networks have been unable to handle unknown objects beyond their training datasets, due to the closed-set assumption and the high cost of quality annotation. Conversely, traditional methods struggle to achieve accurate pose estimation for texture-less objects. In this work, we propose a silhouette-based 6D object pose estimation method. As a traditional method, our approach achieves high accuracy without any need for annotated data, demonstrating excellent generalization. Additionally, we employ silhouettes to mitigate texture dependency, so the method remains effective even for texture-less objects. We introduce a dimensionality reduction strategy for the \(\mathrm{SE}(3)\) pose space, accompanied by theoretical proofs, which makes it possible to perform pose estimation through search, rendering, and comparison in a reduced-dimensional space both efficiently and accurately. Experimental results demonstrate the high precision and generalization of the proposed method. Our code is available at https://github.com/worldTester/STI-Pose.
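The search-render-compare pipeline described above can be illustrated with a short sketch. This is a minimal, hypothetical example, not the authors' STI-Pose implementation: the renderer (render_fn), the candidate-pose sampler (sample_poses), and the silhouette comparison via IoU are placeholder assumptions, and the paper's dimensionality-reduction strategy for the \(\mathrm{SE}(3)\) space is not reproduced here.

    import numpy as np

    def silhouette_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
        """Intersection-over-union between two binary silhouette masks
        (a simple stand-in for whatever silhouette similarity is used)."""
        inter = np.logical_and(mask_a, mask_b).sum()
        union = np.logical_or(mask_a, mask_b).sum()
        return float(inter) / float(union) if union > 0 else 0.0

    def search_render_compare(observed_mask, render_fn, sample_poses, n_rounds=50):
        """Generic search-render-compare loop.

        observed_mask : binary silhouette of the target object in the image
        render_fn     : pose -> binary silhouette rendered from the object's 3D model
        sample_poses  : current best pose (or None) -> iterable of candidate poses
        """
        best_pose, best_score = None, -1.0
        for _ in range(n_rounds):
            for pose in sample_poses(best_pose):      # propose candidates around the current best
                score = silhouette_iou(observed_mask, render_fn(pose))
                if score > best_score:                # keep the pose whose silhouette matches best
                    best_pose, best_score = pose, score
        return best_pose, best_score

In the paper itself, the search operates in a reduced-dimensional parameterization of the pose space rather than directly over all six degrees of freedom, which is what keeps this kind of render-and-compare search efficient.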



Author information


Corresponding author

Correspondence to Liang Wan.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Cui, X., Li, N., Zhang, C., Zhang, Q., Feng, W., Wan, L. (2024). Silhouette-Based 6D Object Pose Estimation. In: Zhang, FL., Sharf, A. (eds) Computational Visual Media. CVM 2024. Lecture Notes in Computer Science, vol 14593. Springer, Singapore. https://doi.org/10.1007/978-981-97-2092-7_8


  • DOI: https://doi.org/10.1007/978-981-97-2092-7_8


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2091-0

  • Online ISBN: 978-981-97-2092-7

  • eBook Packages: Computer Science, Computer Science (R0)
