Abstract
For a long time, deep learning-based 6D object pose estimation networks have lacked the ability to address the problem of pose estimation of the unknown objects beyond the training datasets, due to the closed-set assumption and the expensive cost of high-quality annotation. Conversely, traditional methods struggle to achieve accurate pose estimation for texture-less objects. In this work, we propose a silhouette-based 6D object pose estimation method. being a conventional method As a traditional method, our approach achieves high accuracy without any need of annotation data, demonstrating excellent generalization. Additionally, we employ silhouette to mitigate texture dependency issues, ensuring effectiveness even in the case of textureless objects. In the method, we introduce a dimensionality reduction strategy for \(\textrm{SE}\left( 3\right) \) pose space, accompanied by theoretical proofs, which make it possible to perform pose estimation through search, rendering, and comparison in a reduced-dimensional space efficiently and accurately. Experimental results demonstrate the high precision and generalization of the proposed method. Our code is available at https://github.com/worldTester/STI-Pose.
References
Bay, H., Tuytelaars, T., Van Gool, L.: Surf: speeded up robust features. In: Proceedings of the 9th European Conference on Computer Vision. vol. Part I, pp. 404–417 (2006)
Besl, P.J., McKay, N.D.: Method for registration of 3-D shapes. In: Proceedings of the International Society for Optical Engineering. vol. 14, pp. 239–256 (1992)
Brachmann, E., Krull, A., Michel, F., Gumhold, S., Shotton, J., Rother, C.: Learning 6D object pose estimation using 3D object coordinates. In: Proceedings of the European Conference on Computer Vision. vol. Part II, pp. 536–551 (2014)
Busam, B., Esposito, M., Che’Rose, S., Navab, N., Frisch, B.: A stereo vision approach for cooperative robotic movement therapy. In: Proceedings of the IEEE International Conference on Computer Vision workshops, pp. 127–135 (2015)
Calli, B., Singh, A., Walsman, A., Srinivasa, S., Abbeel, P., Dollar, A.M.: The ycb object and model set: towards common benchmarks for manipulation research. In: Proceedings of thr IEEE International Conference on Advanced Robotics, pp. 510–517 (2015)
Di, Y., Manhardt, F., Wang, G., Ji, X., Navab, N., Tombari, F.: SO-Pose: exploiting self-occlusion for direct 6D pose estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12396–12405 (2021)
Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model globally, match locally: efficient and robust 3D object recognition. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 998–1005 (2010)
Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In: Proceedings of the IEEE International Symposium on Micro Machine and Human Science, pp. 39–43 (1995)
Ghazaei, G., Laina, I., Rupprecht, C., Tombari, F., Navab, N., Nazarpour, K.: Dealing with ambiguity in robotic grasping via multiple predictions. In: Proceedings of the Asian Conference on Computer Vision, pp. 38–55 (2019)
Hinterstoisser, S., et al.: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 858–865 (2011)
Hodan, T., et al.: T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, pp. 880–888. IEEE (2017)
Hu, Y., Fua, P., Wang, W., Salzmann, M.: Single-stage 6D object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2930–2939 (2020)
Kendall, A., Grimes, M., Cipolla, R.: Posenet: a convolutional network for real-time 6-dof camera relocalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2938–2946 (2015)
Kirillov, A., et al.: Segment Anything. arXiv:2304.02643 (2023)
Lepetit, V., Moreno-Noguer, F., Fua, P.: Ep\(n\)p: an accurate o (\(n\)) solution to the p\(n\)p problem. Int. J. Comput. Vision 81, 155–166 (2009)
Li, Z., Wang, G., Ji, X.: CDPN: coordinates-based disentangled pose network for real-time RGB-based 6-DoF object pose estimation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 7678–7687 (2019)
Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference On Computer Vision, vol. 2, pp. 1150–1157. IEEE (1999)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004)
Marchand, E., Uchiyama, H., Spindler, F.: Pose estimation for augmented reality: a hands-on survey. IEEE Trans. Visual Comput. Graphics 22(12), 2633–2651 (2015)
Olson, C.F., Huttenlocher, D.P.: Automatic target recognition by matching oriented edge pixels. IEEE Trans. Image Process. 6(1), 103–113 (1997)
Peng, S., Liu, Y., Huang, Q., Zhou, X., Bao, H.: PVNet: pixel-wise voting network for 6DoF pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4561–4570 (2019)
Pérez, L., Rodríguez, Í., Rodríguez, N., Usamentiaga, R., García, D.F.: Robot guidance using machine vision techniques in industrial environments: a comparative review. Sensors 16(3), 335 (2016)
Rambach, J., Pagani, A., Schneider, M., Artemenko, O., Stricker, D.: 6DoF object tracking based on 3D scans for augmented reality remote live support. Computers 7(1), 6 (2018)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: An efficient alternative to SIFT or SURF. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2564–2571 (2011)
Rusu, R.B., Blodow, N., Marton, Z.C., Beetz, M.: Aligning point cloud views using persistent feature histograms. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3384–3391 (2008)
Su, Y., et al.: ZebraPose: coarse to fine surface encoding for 6DoF object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6738–6748 (2022)
Suzuki, S., et al.: Topological structural analysis of digitized binary images by border following. Comput. Vision, Graph. Image Process. 30(1), 32–46 (1985)
Tian, Z., Shen, C., Chen, H., He, T.: FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9627–9636 (2019)
Wang, C., et al.: DenseFusion: 6D object pose estimation by iterative dense fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3343–3352 (2019)
Wang, G., Manhardt, F., Tombari, F., Ji, X.: GDR-Net: geometry-guided direct regression network for monocular 6D object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16611–16621 (2021)
Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.: PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes. In: Robotics: Science and Systems XIV (2018)
Zhang, X., Jiang, Z., Zhang, H., Wei, Q.: Vision-based pose estimation for textureless space objects by contour points matching. IEEE Trans. Aerosp. Electron. Syst. 54(5), 2342–2355 (2018)
Zou, X., et al.: Segment everything everywhere all at once. arXiv preprint arXiv:2304.06718 (2023)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cui, X., Li, N., Zhang, C., Zhang, Q., Feng, W., Wan, L. (2024). Silhouette-Based 6D Object Pose Estimation. In: Zhang, FL., Sharf, A. (eds) Computational Visual Media. CVM 2024. Lecture Notes in Computer Science, vol 14593. Springer, Singapore. https://doi.org/10.1007/978-981-97-2092-7_8
Download citation
DOI: https://doi.org/10.1007/978-981-97-2092-7_8
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2091-0
Online ISBN: 978-981-97-2092-7
eBook Packages: Computer ScienceComputer Science (R0)