Abstract
Point cloud segmentation underpins many 3D perception tasks, such as intelligent driving, object detection and recognition, and scene recognition and understanding. In this paper, we present an improved PointNet for 3D object part segmentation, which we name Deep Residual Neural Network Based PointNet (DResNet-PointNet). The architecture of DResNet-PointNet is designed around the idea of residual networks, which allow the depth of the network to be increased without degradation. DResNet-PointNet is twice as deep as the original PointNet model. The added depth improves the network's ability to express and approximate complex functions and to generalize on complex classification problems, thereby improving segmentation accuracy. Experimental results on part segmentation verify the feasibility and effectiveness of DResNet-PointNet.
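To make the residual idea concrete, the following is a minimal sketch of a per-point residual block in plain Python. It assumes the standard residual form y = ReLU(x + F(x)), with F a two-layer MLP whose weights are shared across all points. The abstract does not specify the layer widths, normalization, or exact skip placement used in DResNet-PointNet, so the function names, the feature dimension, and the number of stacked blocks here are illustrative assumptions rather than the authors' architecture.

```python
import random


def relu(v):
    """Element-wise rectified linear unit on a feature vector."""
    return [max(0.0, x) for x in v]


def linear(v, weight, bias):
    """Affine map; `weight` is a list of rows (out_dim x in_dim)."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weight, bias)]


def residual_block(points, w1, b1, w2, b2):
    """Apply y = relu(x + F(x)) to every point with shared weights.

    F is a two-layer MLP (an assumption; the paper's F may differ).
    Input and output feature dimensions must match for the skip sum.
    """
    out = []
    for x in points:
        hidden = relu(linear(x, w1, b1))
        branch = linear(hidden, w2, b2)
        out.append(relu([xi + fi for xi, fi in zip(x, branch)]))
    return out


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)
dim = 4  # hypothetical per-point feature width
# Five points with non-negative features, as they would be after a ReLU.
points = [[abs(rng.gauss(0, 1)) for _ in range(dim)] for _ in range(5)]

# With the residual branch zeroed out, the block is an identity mapping.
identity_out = residual_block(
    points, zeros(dim, dim), [0.0] * dim, zeros(dim, dim), [0.0] * dim)

# Stacking several blocks deepens the network while keeping the skip path.
features = points
for _ in range(8):
    features = residual_block(
        features,
        random_matrix(dim, dim, rng), [0.0] * dim,
        random_matrix(dim, dim, rng), [0.0] * dim)
```

The zero-weight case illustrates why added depth need not degrade the network: a block whose residual branch learns nothing still passes its input through unchanged, so a deeper stack can in principle do at least as well as a shallower one.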
Acknowledgements
This research is partially supported by the Natural Science Foundation Project of the Science and Technology Department of Jilin Province under Grant No. 20200201165JC.
About this article
Cite this article
Li, B., Zhang, Y. & Sun, F. Deep residual neural network based PointNet for 3D object part segmentation. Multimed Tools Appl 81, 11933–11947 (2022). https://doi.org/10.1007/s11042-020-09609-8
DOI: https://doi.org/10.1007/s11042-020-09609-8