Abstract
Automatic object grasping is a challenging problem with numerous applications across various fields. Current models rely only on 3D point cloud data, which is insufficient to capture an object's complete grasping ability (graspability), because many visual features of objects are missing from the 3D points. We therefore propose an auxiliary convolutional neural network (CNN) pipeline that models graspability by simultaneously using visual information from RGB-D images and 3D point clouds. To train the auxiliary CNN, we created a new dataset in which the most graspable objects are assigned to class 5 and the least graspable objects to class 1. Our graspability model uses 12 object features: 9 are extracted from elliptic Fourier descriptors, and the other 3 are the Euclidean distance from the centroid, the compactness of the object, and the object category. We thoroughly evaluated the proposed approach by incorporating it into the state-of-the-art grasping method Graspnet [8], further improving the overall average grasp precision. Additionally, we performed an ablation study over network elements and loss functions (cross-entropy and mean squared error) to obtain the best accuracy and graspability scores.
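Two of the shape features named above have standard closed-form definitions: the elliptic Fourier descriptors of Kuhl and Giardina [22] and compactness (4πA/P², which is 1 for a circle). The sketch below, a minimal NumPy illustration rather than the paper's actual feature extractor, computes raw EFD coefficients for a closed 2-D contour and the compactness measure; how the paper normalizes the coefficients down to exactly 9 features is not specified here, so the code simply returns the raw harmonics.

```python
import numpy as np

def elliptic_fourier_descriptors(contour, n_harmonics=3):
    """Elliptic Fourier coefficients (Kuhl & Giardina, 1982) of a closed 2-D contour.

    contour: (N, 2) array of boundary points, traversed in order.
    Returns an (n_harmonics, 4) array of rows (a_n, b_n, c_n, d_n).
    """
    # Close the contour and take per-segment displacements and arc lengths.
    d = np.diff(np.vstack([contour, contour[:1]]), axis=0)
    dt = np.hypot(d[:, 0], d[:, 1])
    t = np.concatenate([[0.0], np.cumsum(dt)])
    T = t[-1]
    phi = 2.0 * np.pi * t / T  # normalized arc-length parameter

    coeffs = np.zeros((n_harmonics, 4))
    for n in range(1, n_harmonics + 1):
        scale = T / (2.0 * n**2 * np.pi**2)
        dcos = np.cos(n * phi[1:]) - np.cos(n * phi[:-1])
        dsin = np.sin(n * phi[1:]) - np.sin(n * phi[:-1])
        coeffs[n - 1] = scale * np.array([
            np.sum(d[:, 0] / dt * dcos),  # a_n
            np.sum(d[:, 0] / dt * dsin),  # b_n
            np.sum(d[:, 1] / dt * dcos),  # c_n
            np.sum(d[:, 1] / dt * dsin),  # d_n
        ])
    return coeffs

def compactness(area, perimeter):
    """4*pi*A / P^2: equals 1.0 for a perfect circle, smaller for elongated shapes."""
    return 4.0 * np.pi * area / perimeter**2

# Illustration on a synthetic circular contour of radius 2: the first harmonic
# recovers the radius (a_1 ~ d_1 ~ 2, b_1 ~ c_1 ~ 0).
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.column_stack([2 * np.cos(theta), 2 * np.sin(theta)])
efd = elliptic_fourier_descriptors(circle, n_harmonics=3)
```

In practice the boundary would come from an instance mask of the RGB-D image (e.g. an OpenCV contour), and the flattened coefficients would be concatenated with the centroid distance, compactness, and category features to form the 12-dimensional input.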
References
Asif, U., Tang, J., Harrer, S.: Ensemblenet: improving grasp detection using an ensemble of convolutional neural networks. In: BMVC, p. 10 (2018)
Atzmon, M., Maron, H., Lipman, Y.: Point convolutional neural networks by extension operators. arXiv preprint arXiv:1803.10091 (2018)
Bailey, S.E., Lynch, J.M.: Diagnostic differences in mandibular P4 shape between Neandertals and anatomically modern humans. Am. J. Phys. Anthropol. 126(3), 268–277 (2005)
Calli, B., et al.: Yale-cmu-berkeley dataset for robotic manipulation research. Int. J. Robot. Res. 36(3), 261–268 (2017)
Chen, S.Y., Lestrel, P.E., Kerr, W.J.S., McColl, J.H.: Describing shape changes in the human mandible using elliptical Fourier functions. Eur. J. Orthod. 22(3), 205–216 (2000)
Chu, F.J., Xu, R., Vela, P.A.: Real-world multiobject, multigrasp detection. IEEE Robot. Autom. Lett. 3(4), 3355–3362 (2018)
Detry, R., Ek, C.H., Madry, M., Kragic, D.: Learning a dictionary of prototypical grasp-predicting parts from grasping experience. In: 2013 IEEE International Conference on Robotics and Automation, pp. 601–608. IEEE (2013)
Fang, H.S., Wang, C., Gou, M., Lu, C.: Graspnet-1billion: a large-scale benchmark for general object grasping. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11444–11453 (2020)
Freeman, H.: Computer processing of line-drawing images. ACM Comput. Surv. (CSUR) 6(1), 57–97 (1974)
Godefroy, J.E., Bornert, F., Gros, C.I., Constantinesco, A.: Elliptical Fourier descriptors for contours in three dimensions: a new tool for morphometrical analysis in biology. C.R. Biol. 335(3), 205–213 (2012)
Granlund, G.H.: Fourier preprocessing for hand print character recognition. IEEE Trans. Comput. 100(2), 195–201 (1972)
Guo, D., Sun, F., Liu, H., Kong, T., Fang, B., Xi, N.: A hybrid deep architecture for robotic grasp detection. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 1609–1614. IEEE (2017)
Jiang, Y., Moseson, S., Saxena, A.: Efficient grasping from rgbd images: Learning using a new rectangle representation. In: 2011 IEEE International Conference on Robotics and Automation, pp. 3304–3311. IEEE (2011)
Jordan, J.: An overview of semantic image segmentation. Data Science, pp. 1–21 (2018)
Kappler, D., Bohg, J., Schaal, S.: Leveraging big data for grasp planning. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 4304–4311. IEEE (2015)
Kuhl, F.P., Giardina, C.R.: Elliptic Fourier features of a closed contour. Comput. Graphics Image Process. 18(3), 236–258 (1982)
Lenz, I., Lee, H., Saxena, A.: Deep learning for detecting robotic grasps. Int. J. Robot. Res. 34(4–5), 705–724 (2015)
Lestrel, P.E.: Fourier descriptors and their applications in biology. Cambridge University Press (1997)
Lestrel, P.E., Kerr, W.J.S.: Quantification of function regulator therapy using elliptical Fourier functions. Eur. J. Orthod. 15(6), 481–491 (1993)
Levine, S., Pastor, P., Krizhevsky, A., Ibarz, J., Quillen, D.: Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. Int. J. Robot. Res. 37(4–5), 421–436 (2018)
Li, Y., Bu, R., Sun, M., Wu, W., Di, X., Chen, B.: Pointcnn: convolution on X-transformed points. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Liang, H., et al.: Pointnetgpd: detecting grasp configurations from point sets. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 3629–3635. IEEE (2019)
Mahler, J., et al.: Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics. arXiv preprint arXiv:1703.09312 (2017)
Morrison, D., Corke, P., Leitner, J.: Closing the loop for robotic grasping: a real-time, generative grasp synthesis approach. arXiv preprint arXiv:1804.05172 (2018)
Park, D., Seo, Y., Shin, D., Choi, J., Chun, S.Y.: A single multi-task deep neural network with post-processing for object detection with reasoning and robotic grasp detection. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 7300–7306. IEEE (2020)
ten Pas, A., Gualtieri, M., Saenko, K., Platt, R.: Grasp pose detection in point clouds. Int. J. Robot. Res. 36(13–14), 1455–1473 (2017)
Pinto, L., Gupta, A.: Supersizing self-supervision: learning to grasp from 50k tries and 700 robot hours. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 3406–3413. IEEE (2016)
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)
Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Su, H., et al.: Splatnet: sparse lattice networks for point cloud processing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2530–2539 (2018)
Tatarchenko, M., Dosovitskiy, A., Brox, T.: Octree generating networks: efficient convolutional architectures for high-resolution 3d outputs. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2088–2096 (2017)
Ten Pas, A., Platt, R.: Using geometry to detect grasp poses in 3d point clouds. In: Robotics Research, pp. 307–324. Springer (2018)
Wallace, T.P., Wintz, P.A.: An efficient three-dimensional aircraft recognition algorithm using normalized fourier descriptors. Comput. Graphics Image Process. 13(2), 99–126 (1980)
Wang, J., et al.: Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3349–3364 (2020)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Varun, P., Behera, L., Sandhan, T. (2023). Auxiliary CNN for Graspability Modeling with 3D Point Clouds and Images for Robotic Grasping. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1777. Springer, Cham. https://doi.org/10.1007/978-3-031-31417-9_41
Print ISBN: 978-3-031-31416-2
Online ISBN: 978-3-031-31417-9