Fast Organization of Objects’ Spatial Positions in Manipulator Space from Single RGB-D Camera

Sun, Yangchang; Yang, Minghao; Li, Jialing; Qiang, Baohua; Chen, Jinlong; Jia, Qingyu

doi:10.1007/978-3-030-92238-2_15

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13110)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

For grasping tasks in a physical environment, the manipulator needs to know objects’ spatial positions in real time, using as few sensors as possible. This work proposes an effective framework that organizes objects’ spatial positions in the manipulator’s 3D workspace quickly and robustly with a single RGB-D camera. It consists of two main steps: (1) a 3D reconstruction strategy for the object contours extracted from the environment; (2) a distance-restricted outlier elimination strategy that reduces the reconstruction errors caused by sensor noise. The first step ensures fast object extraction and 3D reconstruction from the scene image, and the second step yields more accurate reconstructions by eliminating outlier points from the initial result of the first step. We validated the proposed method on a physical system containing a Kinect 2.0 RGB-D camera and a Mico2 robot. Experiments show that the proposed method runs in quasi real time on a common PC and outperforms traditional 3D reconstruction methods.
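The abstract does not give the details of either step, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard pinhole camera model for back-projecting contour pixels with depth into camera-frame 3D points, and it interprets "distance-restricted" as discarding points farther than a threshold from the per-axis median of the reconstructed set. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), the threshold `max_dist`, and the sample data are all hypothetical. The transform from the camera frame to the manipulator frame is omitted.

```python
import math
from statistics import median


def backproject(pixels, depths, fx, fy, cx, cy):
    """Map pixels (u, v) with depth z (metres) to camera-frame 3D points.

    Assumes a pinhole model; fx, fy, cx, cy are hypothetical intrinsics.
    """
    points = []
    for (u, v), z in zip(pixels, depths):
        if z <= 0:  # skip invalid readings (assumed to be encoded as <= 0)
            continue
        points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def remove_outliers(points, max_dist):
    """Keep points within max_dist of the per-axis median of the set.

    This is one assumed reading of a distance-restricted filter.
    """
    if not points:
        return []
    center = tuple(median(p[i] for p in points) for i in range(3))
    return [p for p in points if math.dist(p, center) <= max_dist]


def centroid(points):
    """Mean position of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Hypothetical intrinsics and contour samples for one object.
fx = fy = 500.0
cx, cy = 320.0, 240.0
pixels = [(320, 240), (330, 240), (320, 250), (330, 250), (325, 245)]
depths = [1.00, 1.01, 0.99, 1.00, 3.50]  # last reading simulates a depth spike

cloud = backproject(pixels, depths, fx, fy, cx, cy)
clean = remove_outliers(cloud, max_dist=0.10)
position = centroid(clean)  # object position in the camera frame
```

In this toy data the simulated depth spike lies about 2.5 m from the median and is dropped, so the centroid is computed from the four consistent contour points only. Using the median rather than the mean as the reference keeps the reference point itself from being pulled toward the noisy reading.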

Acknowledgments

This work is supported by the National Key Research & Development Program of China (No. 2018AAA0102902), the National Natural Science Foundation of China (NSFC) (No. 61873269), the Beijing Natural Science Foundation (No. L192005), the CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ-20202-027A), the Guangxi Key Research and Development Program (AB18221011, AB21075004, AD18281002, AD19110137), the Natural Science Foundation of Guangxi of China (No. 2020GXNSFAA297061, 2019GXNSFDA185006, 2019GXNSFDA185007), the Guangxi Key Laboratory of Intelligent Processing of Computer Images and Graphics (No. GIIP201702) and the Guangxi Key Laboratory of Trusted Software (No. kx201621, kx201715).

Author information

Authors and Affiliations

School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
Yangchang Sun & Minghao Yang
Research Center for Brain-inspired Intelligence (BII), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, 100190, China
Yangchang Sun & Minghao Yang
School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
Jialing Li, Baohua Qiang, Jinlong Chen & Qingyu Jia

Corresponding author

Correspondence to Minghao Yang.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

About this paper

Cite this paper

Sun, Y., Yang, M., Li, J., Qiang, B., Chen, J., Jia, Q. (2021). Fast Organization of Objects’ Spatial Positions in Manipulator Space from Single RGB-D Camera. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13110. Springer, Cham. https://doi.org/10.1007/978-3-030-92238-2_15

DOI: https://doi.org/10.1007/978-3-030-92238-2_15
Published: 05 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92237-5
Online ISBN: 978-3-030-92238-2
eBook Packages: Computer Science, Computer Science (R0)
