
Pose estimation for workpieces in complex stacking industrial scene based on RGB images

Published in Applied Intelligence

Abstract

Pose estimation has been widely studied in recent years. Although investigated by many prior works, pose estimation for heavily stacked workpieces, such as parts piled in bins, remains a challenge in industrial applications. Moreover, stacked-workpiece scenes are far more difficult than general scenes for robots performing routine tasks such as object detection and picking. This paper addresses pose estimation for stacked symmetrical workpieces in the presence of partial occlusions and cluttered backgrounds. To tackle these problems, we propose a novel pose estimation method for industrial stacking scenes based on RGB images. Specifically, exploiting the common symmetry of industrial workpieces, we first present a new standardized spatial representation that auto-encodes the 2D-3D correspondences of symmetrical workpieces. We then introduce a novel GAN-based deep neural network to reconstruct this representation for stacked workpieces. From the reconstructed representation, the pose of the target workpiece is predicted with an improved RANSAC-PnP algorithm. Finally, comprehensive experiments demonstrate that the proposed method outperforms state-of-the-art methods, especially in complex stacking scenes.
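The last stage of the pipeline, recovering a pose from 2D-3D correspondences via RANSAC-PnP, can be sketched in plain NumPy. This is not the paper's improved algorithm: it is a minimal baseline that fits a 3x4 projection matrix with a DLT solver inside a standard RANSAC loop, and all function names, thresholds, and iteration counts here are illustrative assumptions.

```python
import numpy as np

def dlt_projection(pts3d, pts2d):
    """Direct Linear Transform: estimate a 3x4 projection matrix P
    from n >= 6 2D-3D correspondences (the minimal solver for RANSAC)."""
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)

def reproj_error(P, pts3d, pts2d):
    """Per-point reprojection error (pixels) of P on the correspondences."""
    Xh = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    proj = (P @ Xh.T).T
    proj = proj[:, :2] / proj[:, 2:3]          # homogeneous division
    return np.linalg.norm(proj - pts2d, axis=1)

def ransac_pnp(pts3d, pts2d, iters=200, thresh=2.0, seed=0):
    """Robustly fit P: sample minimal sets, score by inlier count,
    then refit on the best inlier set."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pts3d), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(pts3d), 6, replace=False)
        P = dlt_projection(pts3d[idx], pts2d[idx])
        inliers = reproj_error(P, pts3d, pts2d) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on all inliers of the best model.
    best_P = dlt_projection(pts3d[best_inliers], pts2d[best_inliers])
    return best_P, best_inliers
```

In practice one would decompose P into intrinsics and a rotation/translation, or use a calibrated solver such as OpenCV's `solvePnPRansac`; the paper's specific improvement to RANSAC-PnP is not reproduced here.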


Figures 1–6 are not shown in this preview.


Acknowledgment

This work was supported by the Natural Science Fund of China (NSFC) under Grant No. 51575186, the Major Program of the National Natural Science Foundation of China under Grant No. 61690214, the Shanghai Science and Technology Action Plan under Grant Nos. 18DZ1204000, 18510745500, 18510750100, and 18510730600, the Shanghai Aerospace Science and Technology Innovation Fund (SAST) under Grant Nos. 2019-080 and 2019-116, and the Shanghai Sailing Program under Grant No. 20YF1417300.

Author information

Correspondence to Yajun Zhang or Jianjun Yi.

Ethics declarations

Conflict of Interests

Jianjun Yi has received research grants from the Shanghai Economic and Information Commission, the Shanghai Science and Technology Action Plan, the Natural Science Fund of China, the Fundamental Research Funds for the Central Universities, the Shanghai Pujiang Program, and the Shanghai Software and IC Industry Development Special Fund.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, Y., Yi, J., Chen, Y. et al. Pose estimation for workpieces in complex stacking industrial scene based on RGB images. Appl Intell 52, 8757–8769 (2022). https://doi.org/10.1007/s10489-021-02857-7
